Pinyin in Python: splitting pinyin into syllables

2021-10-10 19:16:17

I often need to turn a name such as zhangwei into forms like ZhangWei, zw, or zhangw. That calls for a pinyin-splitting algorithm, so here is a demo I wrote to share.

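My approach is to first convert the initials (shengmu) to uppercase, and then split the string into individual syllables at the uppercase letters.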

Conversion code

def sm(strs):
    # Letters that can serve as a pinyin initial (shengmu).
    smlist = 'bpmfdtnlgkhjqxrzcsyw'
    for s in smlist:
        strs = strs.replace(s, s.upper())
    return strs

Then a problem shows up: finals (yunmu) also contain letters that double as initials, so zhangwei becomes ZHaNGWei.

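There are really two problems. One is that zh, ch, and sh contain the initial h. The other is that finals such as er, an, en, in, un, vn, ang, eng, ing, and ong contain the initials r, n, and g.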
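To make the over-marking concrete, here is a minimal sketch of the naive pass on two names. The names `INITIALS` and `mark_naive` are illustrative and do not appear in the original post.

```python
# Letters that can act as a pinyin initial (same table as smlist above).
INITIALS = 'bpmfdtnlgkhjqxrzcsyw'

def mark_naive(text):
    """Uppercase every letter that can act as an initial, with no exceptions."""
    for ch in INITIALS:
        text = text.replace(ch, ch.upper())
    return text

print(mark_naive('zhangwei'))     # ZHaNGWei
print(mark_naive('chenguiying'))  # CHeNGuiYiNG
```

Every h, n, and g is uppercased, including the ones that belong to zh/ch/sh or to the end of a final.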

So I add another conversion step.

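def sm(strs):
    smlist = 'bpmfdtnlgkhjqxrzcsyw'
    # Endings as they look after the uppercase pass; they are lowered again below.
    nosm = ['eR', 'aN', 'eN', 'iN', 'uN', 'vN', 'nG', 'NG']
    # The mapping was lost in the source; restoring zh/ch/sh is assumed from the text above.
    rep = {'ZH': 'Zh', 'CH': 'Ch', 'SH': 'Sh'}
    for s in smlist:
        strs = strs.replace(s, s.upper())
    for s in nosm:
        strs = strs.replace(s, s.lower())
    for s in rep.keys():
        strs = strs.replace(s, rep[s])
    return strs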

At this point zhangwei is converted to ZhangWei.
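The sketch below records the string after each of the three passes, which makes it easier to see what each pass contributes. The casing of `ENDINGS` and the contents of `DIGRAPHS` are assumptions reconstructed from the surrounding text, and `trace` is a hypothetical helper, not part of the original code.

```python
INITIALS = 'bpmfdtnlgkhjqxrzcsyw'
# Assumed: endings as they appear right after the uppercase pass.
ENDINGS = ['eR', 'aN', 'eN', 'iN', 'uN', 'vN', 'nG', 'NG']
# Assumed: two-letter initials that should keep only their first letter uppercase.
DIGRAPHS = {'ZH': 'Zh', 'CH': 'Ch', 'SH': 'Sh'}

def trace(text):
    """Return the input followed by the string after each pass."""
    stages = [text]
    for ch in INITIALS:            # pass 1: uppercase every possible initial
        text = text.replace(ch, ch.upper())
    stages.append(text)
    for ending in ENDINGS:         # pass 2: lower r/n/g where they end a final
        text = text.replace(ending, ending.lower())
    stages.append(text)
    for old, new in DIGRAPHS.items():  # pass 3: repair zh/ch/sh
        text = text.replace(old, new)
    stages.append(text)
    return stages

print(trace('zhangwei'))
# ['zhangwei', 'ZHaNGWei', 'ZHangWei', 'ZhangWei']
```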

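During batch conversion I ran into another problem: a string like chenguiying (陳桂英) is converted to ChenguiYing. This happens because r, n, and g can both end a syllable and start one. So I go through the nosm list once more: whenever one of these endings is found, look at the character after it and check whether it is in the initials table. If it is not (that is, a vowel follows), the last letter of the ending actually starts the next syllable and is uppercased again.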

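def sm(strs):
    smlist = 'bpmfdtnlgkhjqxrzcsyw'
    # Endings as they look after the uppercase pass; they are lowered again below.
    nosm = ['eR', 'aN', 'eN', 'iN', 'uN', 'vN', 'nG', 'NG']
    # The mapping was lost in the source; restoring zh/ch/sh is assumed from the text above.
    rep = {'ZH': 'Zh', 'CH': 'Ch', 'SH': 'Sh'}
    for s in smlist:
        strs = strs.replace(s, s.upper())
    for s in nosm:
        strs = strs.replace(s, s.lower())
    for s in rep.keys():
        strs = strs.replace(s, rep[s])
    # Second pass: an ending followed by a vowel really starts the next syllable.
    for s in nosm:
        tmp_num = 0
        isok = False
        while (tmp_num < len(strs)) and not isok:
            try:
                tmp_num = strs.index(s.lower(), tmp_num)
            except ValueError:
                # No further occurrence of this ending.
                isok = True
            else:
                tmp_num = tmp_num + len(s)
                # At the end of the string the slice is '', which counts as "in smlist".
                if strs[tmp_num:tmp_num+1].lower() not in smlist:
                    strs = strs[:tmp_num-1] + strs[tmp_num-1:tmp_num].upper() + strs[tmp_num:]
    return strs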
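The look-ahead check can also be restated as a single regular expression. This is only a sketch of the same idea, not the author's code, and it is not guaranteed to behave identically on every input. `promote_before_vowel` is a hypothetical name, and the input is assumed to be a string already processed by the first three passes.

```python
import re

# r after e, n after a/e/i/u/v, or g after n, each directly followed by a vowel:
# such a consonant starts the next syllable instead of ending the current one.
_BOUNDARY = re.compile(
    r'(?<=e)r(?=[aeiouv])|(?<=[aeiuv])n(?=[aeiouv])|(?<=n)g(?=[aeiouv])'
)

def promote_before_vowel(marked):
    """Uppercase ending consonants that are followed by a vowel."""
    return _BOUNDARY.sub(lambda m: m.group().upper(), marked)

print(promote_before_vowel('ChenguiYing'))  # ChenGuiYing
print(promote_before_vowel('Keren'))        # KeRen
```

A consonant at the very end of the string has nothing after it, so the look-ahead fails and it stays lowercase.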

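Now the initials can be picked out, and the rest is simple: every uppercase letter marks the start of a syllable, and the abbreviated pinyin (jianpin) is just the uppercase letters.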

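Splitting

def onep(strs):
    restr = ''
    strs = sm(strs)
    for s in strs:
        # An uppercase letter marks the start of a new syllable.
        if 'A' <= s and s <= 'Z':
            restr = restr + ' ' + s
        else:
            restr = restr + s
    # strip() rather than [1:], so a name that starts with a vowel keeps its first letter.
    restr = restr.strip()
    restr = restr.lower()
    return restr.split(' ')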

For chenguiying this returns ['chen', 'gui', 'ying'].
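As an alternative sketch, the split at uppercase letters can be done with `re.findall`. The helper name `split_marked` is hypothetical, and its input is assumed to be a string that already has its syllable starts uppercased, such as the output of sm.

```python
import re

def split_marked(marked):
    """Split a case-marked pinyin string into lowercase syllables."""
    # First alternative: an uppercase letter plus the lowercase letters after it.
    # Second alternative: a leading syllable with no initial, e.g. 'an' in 'anHui'.
    parts = re.findall(r'[A-Z][a-z]*|^[a-z]+', marked)
    return [p.lower() for p in parts]

print(split_marked('ChenGuiYing'))  # ['chen', 'gui', 'ying']
print(split_marked('anHui'))        # ['an', 'hui']
```

A vowel-initial syllable in the middle of a name carries no uppercase marker, so neither this sketch nor the marker-based approach can separate it.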

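Extracting the abbreviation (jianpin)

# The function name is masked in the source ("******p"); simplep is an assumed stand-in.
def simplep(strs):
    restr = ''
    strs = sm(strs)
    for s in strs:
        # Keep only the uppercase letters, i.e. the first letter of each syllable.
        if 'A' <= s and s <= 'Z':
            restr = restr + s
    restr = restr.lower()
    return restr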

For chenguiying this returns cgy.

From here there is a lot you can do with it.
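For instance, once a name is split into syllables, the forms mentioned at the start (ZhangWei, zw, zhangw) can be assembled directly. This is a minimal sketch; `name_variants` is a hypothetical helper and takes an already-split list such as the one returned by onep.

```python
def name_variants(syllables):
    """Build common written forms from an already-split pinyin name."""
    if not syllables:
        return []
    full = ''.join(syllables)                            # zhangwei
    camel = ''.join(s.capitalize() for s in syllables)   # ZhangWei
    initials = ''.join(s[0] for s in syllables)          # zw
    # First syllable in full, then the first letter of each remaining syllable.
    first_plus = syllables[0] + ''.join(s[0] for s in syllables[1:])  # zhangw
    return [full, camel, initials, first_plus]

print(name_variants(['zhang', 'wei']))
# ['zhangwei', 'ZhangWei', 'zw', 'zhangw']
```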

Attached is a script that generates a weak-password dictionary from pinyin.

#!/usr/bin/python
# author : wkong
# crack

def clearchar(chars):
    # Remove whitespace characters from a line read from the user file.
    restr = ['\n', '\r', '\t', ' ']
    for res in restr:
        chars = chars.replace(res, '')
    return chars

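def sm(strs):
    smlist = 'bpmfdtnlgkhjqxrzcsyw'
    # Endings as they look after the uppercase pass; they are lowered again below.
    nosm = ['eR', 'aN', 'eN', 'iN', 'uN', 'vN', 'nG', 'NG']
    # The mapping was lost in the source; restoring zh/ch/sh is assumed from the text above.
    rep = {'ZH': 'Zh', 'CH': 'Ch', 'SH': 'Sh'}
    for s in smlist:
        strs = strs.replace(s, s.upper())
    for s in nosm:
        strs = strs.replace(s, s.lower())
    for s in rep.keys():
        strs = strs.replace(s, rep[s])
    # Second pass: an ending followed by a vowel really starts the next syllable.
    for s in nosm:
        tmp_num = 0
        isok = False
        while (tmp_num < len(strs)) and not isok:
            try:
                tmp_num = strs.index(s.lower(), tmp_num)
            except ValueError:
                # No further occurrence of this ending.
                isok = True
            else:
                tmp_num = tmp_num + len(s)
                # At the end of the string the slice is '', which counts as "in smlist".
                if strs[tmp_num:tmp_num+1].lower() not in smlist:
                    strs = strs[:tmp_num-1] + strs[tmp_num-1:tmp_num].upper() + strs[tmp_num:]
    return strs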

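# The function name is masked in the source ("******p"); simplep is an assumed stand-in.
def simplep(strs):
    restr = ''
    strs = sm(strs)
    for s in strs:
        if 'A' <= s and s <= 'Z':
            restr = restr + s
    restr = restr.lower()
    # In this script the abbreviation is capitalized, e.g. zw becomes Zw.
    restr = restr.capitalize()
    return restr

def repass(name):
    # The contents of ulist and pwdlist and the loop body were lost in the source.
    # Assumed here: pair the full name and its abbreviation with every suffix.
    ulist = [name, simplep(name)]
    pwdlist = []
    ce = ['!@#123','123!@#','@123','@1234','@12345','@123456','123','1234','12345','123456','123.','1234.','12345.','123456.','123123','abc','abc@123','qwer!@#','!@#qwer','qwe!@#','!@#qwe','!qaz2wsx','1q2w3e']
    for s in ce:
        for u in ulist:
            pwdlist.append(u + s)
    return pwdlist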

def autocrack(username, password):
    # Placeholder: only prints the candidate pair.
    print(username + ':' + password)

if __name__ == '__main__':
    userfile = 'zhangwei.txt'
    # One pinyin user name per line.
    with open(userfile, 'r') as puserfile:
        userlist = puserfile.readlines()
    for user in userlist:
        user = clearchar(user)
        pwd = repass(user)
        for pw in pwd:
            autocrack(user, pw)

