Python Language Basics 4 (Word Frequency Counting)

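# coding: utf-8
import time
import string

num = [6, 2, 7, 4, 1, 3, 5]
letters = 'dfjyfhbs'  # renamed from str to avoid shadowing the built-in
print(sorted(num, reverse=True))
for a, b in zip(num, letters):
    print(b, 'is', a)

# Time building a list with a for loop.
# time.clock() was removed in Python 3.8; time.perf_counter() replaces it.
a = []
t1 = time.perf_counter()
for i in range(1, 20000):
    a.append(i)  # assumed loop body: the source shows an empty loop
print(time.perf_counter() - t1)

# Time building a list with a list comprehension
t1 = time.perf_counter()
b = [i for i in range(1, 200)]
# print(b)
print(time.perf_counter() - t1)

# List comprehension: in the schematic below, the part after the '|' is the
# for-loop expression, and the part before the '|' is the element we want to
# put in the list. The '|' is only a visual divider, not Python syntax.
# list = [item | for item in iterable]
c = [n for n in range(1, 10) if n % 2 == 0]
z = [letter.lower() for letter in 'abcdefg']
# c is [2, 4, 6, 8]
# z is ['a', 'b', 'c', 'd', 'e', 'f', 'g']
# print(c, '\n', z)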
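
# Word frequency counting
path = r'c:\users\administrator\desktop\s.txt'
with open(path, 'r') as text:
    # strip(string.punctuation) removes punctuation attached to the start and
    # end of each word (not punctuation inside it), and lower() converts
    # capitalized words to lowercase
    words = [raw_word.strip(string.punctuation).lower() for raw_word in text.read().split()]
# Converting the list to a set automatically removes all duplicate elements
words_index = set(words)
# Build a dictionary with each word as the key and its frequency as the value
counts_dict = {index: words.count(index) for index in words_index}
print(words)
# Print the sorted result. key=lambda x: counts_dict[x] is a lambda expression;
# for now, read it as "sort by the values stored in the dictionary"
for word in sorted(counts_dict, key=lambda x: counts_dict[x], reverse=True):
    print('{}---{} times'.format(word, counts_dict[word]))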
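
The listing above counts each distinct word with words.count(), which rescans the whole word list once per word. As a hedged alternative, and not the approach used in the listing itself, the same tally can be sketched with collections.Counter from the standard library. The function name count_words and the sample sentence are hypothetical, chosen only for illustration; the example works on an in-memory string so it does not depend on the file path above.

```python
import string
from collections import Counter


def count_words(text):
    """Return a Counter mapping each lowercased word to its frequency."""
    # Same cleaning step as the listing: strip punctuation from both ends
    # of each whitespace-separated token, then lowercase it.
    words = [raw.strip(string.punctuation).lower() for raw in text.split()]
    # Skip tokens that consisted only of punctuation and are now empty.
    return Counter(word for word in words if word)


# Hypothetical sample text standing in for the contents of a file.
sample = "The cat saw the dog. The dog ran!"
counts = count_words(sample)

# most_common() yields (word, count) pairs sorted by count, highest first.
for word, n in counts.most_common():
    print('{}---{} times'.format(word, n))
```

Counter makes a single pass over the words, so it scales better on long texts than one count() call per distinct word, and most_common() replaces the sorted(..., key=lambda ...) step.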

