Basic usage of jieba in Python

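Function                               Meaning
jieba.cut(string)                      Accurate mode; returns an iterable (a generator)
jieba.cut(string, cut_all=True)        Full mode; outputs every possible word in the text string
jieba.cut_for_search(string)           Search engine mode; segmentation suited to building a search engine index
jieba.lcut(string)                     Accurate mode; returns a list
jieba.lcut(string, cut_all=True)       Full mode; returns a list
jieba.lcut_for_search(string)          Search engine mode; returns a list
jieba.add_word(word)                   Adds a new word to the segmentation dictionary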

# -*- coding: utf-8 -*-
import jieba
words = jieba.cut("山東的氣候屬暖溫帶季風氣候型別", cut_all=True)  # full mode
print("Full mode:", '/ '.join(words))
words = jieba.lcut("山東的氣候屬暖溫帶季風氣候型別", cut_all=True)  # full mode, returns a list
print("Full mode:", words)
words = jieba.cut("山東的氣候屬暖溫帶季風氣候型別", cut_all=False)  # accurate mode
print("Accurate mode:", '/ '.join(words))
words = jieba.lcut("山東的氣候屬暖溫帶季風氣候型別", cut_all=False)  # accurate mode, returns a list
print("Accurate mode:", words)
words = jieba.cut("山東的氣候屬暖溫帶季風氣候型別")  # accurate mode is the default
print("Accurate mode:", '/ '.join(words))
words = jieba.cut_for_search("山東的氣候屬暖溫帶季風氣候型別")  # search engine mode
print("Search engine mode:", '/ '.join(words))
words = jieba.lcut_for_search("山東的氣候屬暖溫帶季風氣候型別")  # search engine mode, returns a list
print("Search engine mode:", words)


Next, use jieba to segment a whole text and find the words that occur most often, taking the novel 盜墓筆記 (Daomu Biji) as an example.

# -*- coding: utf-8 -*-
import jieba

# Read the whole novel; adjust the path to wherever the file is stored.
with open('c:/users/dell/desktop/test/盜墓筆記.txt', 'r', encoding='utf-8') as f:
    text = f.read()
words = jieba.cut(text)
word_counts = {}
for word in words:
    if len(word) < 2:  # skip single characters and punctuation
        continue
    word_counts[word] = word_counts.get(word, 0) + 1  # add 1 each time the word appears
word_counts_items = list(word_counts.items())
word_counts_items.sort(key=lambda x: x[1], reverse=True)  # sort by count, descending
for item in word_counts_items[:5]:  # print the five most frequent words
    print(item)
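
The counting and sorting steps above can also be sketched with collections.Counter from the standard library. This is an alternative to the dictionary loop, not the method used in the original post; top_words is a hypothetical helper, and sample_tokens stands in for the output of jieba.lcut(text) so the snippet runs without the novel file.

```python
from collections import Counter


def top_words(tokens, n=5, min_len=2):
    """Return the n most frequent tokens that have at least min_len characters."""
    counts = Counter(tok for tok in tokens if len(tok) >= min_len)
    return counts.most_common(n)


# Stand-in for jieba.lcut(text); a hand-written token list for illustration.
sample_tokens = ["山東", "的", "氣候", "屬", "暖溫帶", "季風", "氣候"]
print(top_words(sample_tokens, n=2))
```

With real data the call would be top_words(jieba.lcut(text)), which returns a list of (word, count) pairs and also copes with texts that contain fewer than five distinct words.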

