Encoding and decoding strings in Python

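Why do so many kinds of garbled-text problems come up when using Python? Why do Chinese characters show up as "\xe4\xb8\xad\xe6\x96\x87"? And why does Python raise "UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)"? This article looks into these questions.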
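Both symptoms can be reproduced in a few lines. This is a minimal sketch that assumes Python 3, where text (str) and encoded data (bytes) are separate types. The variable names are illustrative only.

```python
# Assumption: Python 3, where str holds Unicode text and bytes holds encoded data.
text = "中文"

# Symptom 1: the repr of the UTF-8 bytes shows escape sequences, not characters.
utf8_bytes = text.encode("utf-8")
print(repr(utf8_bytes))  # b'\xe4\xb8\xad\xe6\x96\x87'

# Symptom 2: ASCII has no code points above 127, so encoding Chinese text fails.
try:
    text.encode("ascii")
except UnicodeEncodeError as exc:
    error_message = str(exc)
    print(error_message)
```

The escaped form is not corruption: it is how Python displays the raw bytes of the text "中文" once it has been encoded as UTF-8.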

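Python represents strings internally as Unicode. For that reason, Unicode usually serves as the intermediate form when converting between encodings: first decode the string from its original encoding into Unicode, then encode the Unicode into the target encoding.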

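decode converts a string in some other encoding into Unicode. For example, str1.decode('gb2312') converts the GB2312-encoded string str1 into Unicode. (This spelling is Python 2 style; in Python 3 only bytes objects have a decode method.)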
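As a rough illustration of decode, the sketch below assumes Python 3, so the GB2312 data is a bytes object rather than a Python 2 str. The names gb_bytes and decode_failed are made up for the example.

```python
# Assumption: Python 3. These four bytes are "中文" encoded as GB2312.
gb_bytes = b"\xd6\xd0\xce\xc4"

# Decoding with the correct codec yields Unicode text.
text = gb_bytes.decode("gb2312")
print(text)  # 中文

# Decoding with the wrong codec fails (or, for some inputs, silently
# produces garbage), which is why the source encoding must be known.
try:
    gb_bytes.decode("utf-8")
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True
```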

encode converts a Unicode string into a string in another encoding. For example, str2.encode('gb2312') converts the Unicode string str2 into GB2312.
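The opposite direction might look like this. Again this is a sketch assuming Python 3, where encode is a method of str and returns bytes. It shows that the same text produces different byte sequences under different encodings.

```python
# Assumption: Python 3, where str is Unicode text and encode() returns bytes.
text = "中文"

gb_bytes = text.encode("gb2312")   # 2 bytes per character here
utf8_bytes = text.encode("utf-8")  # 3 bytes per character here

print(len(gb_bytes), len(utf8_bytes))  # 4 6

# The optional errors argument controls what happens to characters the
# target encoding cannot represent; "replace" substitutes a question mark.
ascii_bytes = "中文abc".encode("ascii", errors="replace")
print(ascii_bytes)  # b'??abc'
```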

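So before transcoding, first make sure you know which encoding the string str is in, then decode it into Unicode, and only then encode it into the target encoding.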
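The decode-then-encode rule can be wrapped in a small helper. The function name transcode and its parameters are hypothetical, chosen for this sketch, and the code assumes Python 3 bytes input.

```python
def transcode(data: bytes, source_encoding: str, target_encoding: str) -> bytes:
    """Re-encode data, using Unicode text as the intermediate form.

    Hypothetical helper for illustration; the caller must already know
    the source encoding, because it cannot be reliably guessed.
    """
    text = data.decode(source_encoding)   # bytes -> Unicode
    return text.encode(target_encoding)   # Unicode -> bytes


# Convert "中文" from GB2312 to UTF-8.
gb_bytes = b"\xd6\xd0\xce\xc4"
utf8_bytes = transcode(gb_bytes, "gb2312", "utf-8")
print(utf8_bytes)  # b'\xe4\xb8\xad\xe6\x96\x87'
```

Going through Unicode keeps the helper general: any pair of codecs that can both represent the text will work, without a dedicated converter for each pair.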
