Handling Chinese text in Python (to be expanded)

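Python represents strings internally as unicode. Converting between encodings therefore normally uses unicode as the intermediate form: first decode the string from its source encoding into unicode, then encode the unicode string into the target encoding.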
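The decode-then-encode route can be sketched as a small helper. The name `transcode` and its parameters are hypothetical, chosen only for illustration. The sample text is written with `\u` escapes so the snippet does not depend on the encoding of its own source file, and it uses explicit byte strings so that it should run on both Python 2.7 and Python 3.

```python
def transcode(data, src_encoding, dst_encoding):
    """Convert a byte string from src_encoding to dst_encoding via unicode."""
    text = data.decode(src_encoding)    # bytes in the source encoding -> unicode
    return text.encode(dst_encoding)    # unicode -> bytes in the target encoding


# u'\u4e2d\u6587' is the two-character sample text used throughout this post.
gb_bytes = u'\u4e2d\u6587'.encode('gb2312')
utf8_bytes = transcode(gb_bytes, 'gb2312', 'utf-8')
```

The two encodings never meet directly: the only thing the helper needs to know is the name of the codec on each side.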

decode converts a string in some other encoding into unicode. For example, str1.decode('gb2312') converts the gb2312-encoded string str1 into unicode.

encode converts a unicode string into a string in some other encoding. For example, str2.encode('gb2312') converts the unicode string str2 into gb2312.
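To make the direction of each call concrete, the following sketch round-trips a sample value and compares lengths. The variable names are illustrative, and the text is written with `\u` escapes so the result does not depend on how the snippet itself is saved.

```python
text = u'\u4e2d\u6587'                # unicode text, two characters

gb_bytes = text.encode('gb2312')      # encode: unicode -> gb2312 bytes
utf8_bytes = text.encode('utf-8')     # encode: unicode -> utf-8 bytes

# decode reverses encode when it is given the matching codec name.
assert gb_bytes.decode('gb2312') == text
assert utf8_bytes.decode('utf-8') == text

# Length is counted in characters for unicode and in bytes for encoded strings.
lengths = (len(text), len(gb_bytes), len(utf8_bytes))
print(lengths)  # (2, 4, 6)
```

The same two characters take four bytes in gb2312 and six in utf-8, which is why length checks are only meaningful after decoding.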

In source code, the default encoding of a string literal matches the encoding of the source file itself.

For example: s='中文'

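If the file is saved as utf8, the string is utf8-encoded; if the file is saved as gb2312, the string is gb2312-encoded. In either case, converting it means first calling decode to turn it into unicode and then calling encode to produce the target encoding.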
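The dependence on file encoding can be imitated by saving the same text under two encodings and reading the raw bytes back, which is roughly what a plain string literal holds in each case. This is only a sketch: the temporary file names and the `raw` dictionary are made up for illustration, and `io.open` is used so that the snippet should behave the same on Python 2.7 and Python 3.

```python
import io
import os
import tempfile

text = u'\u4e2d\u6587'
raw = {}
tmp_dir = tempfile.mkdtemp()
for codec in ('utf-8', 'gb2312'):
    path = os.path.join(tmp_dir, 'sample_' + codec + '.txt')
    # Save the same text under a different file encoding each time.
    with io.open(path, 'w', encoding=codec) as f:
        f.write(text)
    # Read the file back as raw bytes, without any decoding.
    with io.open(path, 'rb') as f:
        raw[codec] = f.read()
    os.remove(path)
os.rmdir(tmp_dir)

# The stored bytes differ, but each decodes to the same unicode text
# once the matching codec is named.
same_text = all(raw[c].decode(c) == text for c in raw)
```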

If the string is instead defined like this: s=u'中文'

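then the string is unicode, Python's internal encoding, regardless of the encoding of the source file. Converting such a string only requires calling encode directly with the target encoding.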
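A minimal sketch of the unicode-literal case: since the value is already unicode, each target encoding is a single encode call away. The `encoded` dictionary and the choice of codecs are illustrative only.

```python
s = u'\u4e2d\u6587'     # same value as the u'...' literal, written with escapes

# No decode step is needed first: encode straight into each target encoding.
encoded = dict((codec, s.encode(codec)) for codec in ('utf-8', 'gb2312', 'gbk'))
```

For these two characters the gb2312 and gbk results are identical, while the utf-8 result is a different byte sequence.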

Taking a date as an example:

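# raw_date is a gbk-encoded string such as "6月25日"
def parse_date(raw_date):
    # Decode to unicode so that each Chinese character counts as one character.
    entry_date = raw_date.decode("gbk")
    # Assumes a single-digit month, as in the examples below.
    month = int(entry_date[0])
    # In unicode a Chinese character has length 1, so "6月2日" has length 4
    # and "6月25日" has length 5.
    if len(entry_date) == 5:
        day = 10 * int(entry_date[2]) + int(entry_date[3])
    else:
        day = int(entry_date[2])
    return 2013, month, day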
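The index-based parser above relies on a single-digit month. A more tolerant variant, sketched here as an assumption rather than the author's method, matches the digits with a regular expression after decoding. The names `DATE_PATTERN` and `parse_date_regex`, the `year` parameter, and its default of 2013 (taken from the example above) are illustrative.

```python
import re

# Digits, U+6708 (the "month" character), digits, U+65E5 (the "day" character).
DATE_PATTERN = re.compile(u'^(\\d{1,2})\u6708(\\d{1,2})\u65e5$')


def parse_date_regex(raw_date, year=2013):
    """Parse a gbk-encoded month/day byte string into (year, month, day)."""
    entry_date = raw_date.decode('gbk')     # gbk bytes -> unicode
    match = DATE_PATTERN.match(entry_date)
    if match is None:
        raise ValueError('unrecognized date: %r' % (entry_date,))
    return year, int(match.group(1)), int(match.group(2))


# A two-digit month and day, encoded as gbk the way the input is assumed to arrive.
sample = u'12\u670825\u65e5'.encode('gbk')
result = parse_date_regex(sample)
print(result)  # (2013, 12, 25)
```

Matching on the decoded text keeps the benefit the original relies on, namely that each Chinese character is one unit, without depending on fixed positions.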
