Python PyEnchant (Spell Checking)

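This post records how to install the pyenchant package and covers its basic usage. The package spell-checks English words and can suggest likely correct spellings for a misspelled word.

It can be installed directly with pip, using the following command: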

pip install pyenchant

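If the command finishes without errors, the installation succeeded. On most systems (macOS, Ubuntu, and so on) this works without trouble. On an Amazon EC2 machine, however, the command reports an error, mainly because the system lacks the enchant component. It has to be installed on the EC2 machine first, with the following command: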

sudo yum install enchant

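Once enchant is installed, run pip again to install pyenchant. When you then use enchant from Python, you will find that the freshly installed enchant comes with no default dictionary. A common English dictionary has to be installed separately so that pyenchant works properly, using the following commands: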

sudo yum install aspell-en

sudo yum install enchant-aspell
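
After the steps above, a quick check from Python can confirm that both the library and a dictionary are visible. This is only a minimal sketch: it assumes the English dictionary is registered under the tag "en_US", which may differ on your machine, and the helper name installed_languages is made up for illustration.

```python
# Sanity check for the installation steps above.
# Assumption: the English dictionary is registered under the tag "en_US".
try:
    import enchant
except ImportError as exc:
    # PyEnchant raises ImportError when the package or the enchant C library is missing.
    enchant = None
    print("PyEnchant is not usable yet:", exc)


def installed_languages():
    """Return the dictionary tags enchant can see, or an empty list if it is unavailable."""
    if enchant is None:
        return []
    return enchant.list_languages()


languages = installed_languages()
print("Available dictionaries:", languages)
if "en_US" not in languages:
    print("No en_US dictionary found; a dictionary package may still be missing.")
```

If the list comes back empty even though enchant imports, the dictionary packages are the first thing to look at.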

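The most important thing in pyenchant is the Dict object. It can check whether a word is spelled correctly and can also offer several possible correct spellings for a misspelled word.

First, here is how to create a Dict object and use it to check the spelling of a word: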

>>> import enchant
>>> d = enchant.Dict("en_US")
>>> d.check("hello")
True
>>> d.check("helo")
False

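A Dict object can be created in any of the following ways:

Method                                         Description
d = enchant.Dict(language)                     Create a Dict object for the given language
d = enchant.request_dict(language)             Create a Dict object for the given language
d = enchant.request_pwl_dict(filename)         Create a Dict object using only the words in a local file
d = enchant.DictWithPWL(language, filename)    Create a Dict object that combines a built-in language with the words in a local file

Note that some of these methods take a local file, filename. Each line of that file holds exactly one word.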
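
To make the file format concrete, here is a small sketch that writes such a word list and, if PyEnchant and an "en_US" dictionary happen to be available, loads it with DictWithPWL. The file name my_words.txt, the sample words, and the helper write_word_list are all hypothetical and chosen only for illustration.

```python
import os
import tempfile


def write_word_list(path, words):
    """Write a personal word list: plain text, one word per line."""
    with open(path, "w", encoding="utf-8") as f:
        for word in words:
            f.write(word + "\n")


# Hypothetical file name and words, used only for illustration.
pwl_path = os.path.join(tempfile.mkdtemp(), "my_words.txt")
write_word_list(pwl_path, ["pyenchant", "aspell"])

try:
    import enchant
except ImportError:
    enchant = None  # PyEnchant (or the enchant C library) is not installed

# Assumption: an "en_US" dictionary is installed; skip the demo otherwise.
if enchant is not None and enchant.dict_exists("en_US"):
    # Combine the en_US dictionary with the personal word list.
    d = enchant.DictWithPWL("en_US", pwl_path)
    print(d.check("pyenchant"))
```

Words in the personal list are accepted in addition to everything the language dictionary already knows.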

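The enchant module also provides the following language-related methods:

Method                            Description
enchant.dict_exists(language)     Check whether the current enchant installation supports a given language
enchant.list_languages()          List all languages the current enchant installation supports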

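A Dict object offers the following methods and attributes:

Method or attribute              Description
d = enchant.Dict(language)       Create a Dict object for the given language
d.tag                            The language used by this Dict
d.check(word)                    Check whether word is spelled correctly
d.suggest(word)                  Suggest several correctly spelled words for a misspelled word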

>>> import enchant
>>> d = enchant.Dict("en_US")
>>> d.tag
'en_US'
>>> d.check("hello")
True
>>> d.check("helo")
False
>>> d.suggest("helo")
['hole', 'hello', 'helot', 'halo', 'hero', 'hell', 'held', 'helm', 'help', 'he lo']
>>> enchant.dict_exists("aa")
False
>>> enchant.dict_exists("en_US")
True
>>> enchant.list_languages()
['de_DE', 'en_AU', 'en_GB', 'en_US', 'fr_FR']
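
The check and suggest methods combine naturally into a small helper. The sketch below is one possible way to do it, not something PyEnchant ships: the function name correct_word is made up, it works with any object that has check and suggest, and the demo at the end assumes an "en_US" dictionary is installed.

```python
def correct_word(d, word):
    """Return word if d accepts it, else the first suggestion, else word unchanged."""
    if d.check(word):
        return word
    suggestions = d.suggest(word)
    return suggestions[0] if suggestions else word


try:
    import enchant
except ImportError:
    enchant = None  # PyEnchant (or the enchant C library) is not installed

# Assumption: an "en_US" dictionary is installed; skip the demo otherwise.
if enchant is not None and enchant.dict_exists("en_US"):
    d = enchant.Dict("en_US")
    for w in ["hello", "helo"]:
        print(w, "->", correct_word(d, w))
```

Keep in mind that the first suggestion is not necessarily the word the writer meant, so blindly taking it is only suitable for rough clean-up.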

To spell-check every word in a whole passage of text, use the SpellChecker class from enchant.checker:

>>> from enchant.checker import SpellChecker
>>> chkr = SpellChecker("en_US")
>>> chkr.set_text("This is sme sample txt with erors.")
>>> for err in chkr:
...     print("ERROR", err.word)
...
ERROR sme
ERROR txt
ERROR erors
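
Besides the word itself, each error also carries its offset in the text as err.wordpos, which helps when the misspellings need to be located later. The following is a rough sketch with a made-up helper name, collect_errors; the demo at the end assumes an "en_US" dictionary is installed.

```python
def collect_errors(chkr, text):
    """Run a SpellChecker-like object over text and return (word, position) pairs."""
    chkr.set_text(text)
    # Iterating the checker yields one error at a time, exposing .word and .wordpos.
    return [(err.word, err.wordpos) for err in chkr]


try:
    import enchant
    from enchant.checker import SpellChecker
except ImportError:
    enchant = None  # PyEnchant (or the enchant C library) is not installed

# Assumption: an "en_US" dictionary is installed; skip the demo otherwise.
if enchant is not None and enchant.dict_exists("en_US"):
    chkr = SpellChecker("en_US")
    print(collect_errors(chkr, "This is sme sample txt with erors."))
```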

The tokenizer splits English text into words and returns results in the form (word, pos), where pos is the position of word within the whole text:

>>> from enchant.tokenize import get_tokenizer
>>> tknzr = get_tokenizer("en_US")
>>> [w for w in tknzr("this is some simple text")]
[('this', 0), ('is', 5), ('some', 8), ('simple', 13), ('text', 20)]
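
To show what the (word, pos) format means without needing enchant at all, here is a rough approximation built on the standard re module. It is not the PyEnchant tokenizer and handles far fewer cases; the function name simple_tokenize and the regular expression are assumptions made for this sketch.

```python
import re


def simple_tokenize(text):
    """Yield (word, pos) pairs, where pos is the character offset of word in text."""
    # Letters, optionally joined by apostrophes (e.g. "don't"); everything else is skipped.
    for match in re.finditer(r"[A-Za-z]+(?:'[A-Za-z]+)*", text):
        yield match.group(), match.start()


tokens = list(simple_tokenize("this is some simple text"))
print(tokens)
# [('this', 0), ('is', 5), ('some', 8), ('simple', 13), ('text', 20)]
```

Because each pos is an offset into the original string, text[pos:pos + len(word)] gives back the word.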
