Python Encoding Conversion

# coding: utf-8

s = 'abc'

print type(s) # str(utf-8)

print len(s) # 3

s = unicode(s) # str -> unicode; every byte in the str must be below 128 (plain ASCII), because the default codec is ascii

print type(s) # unicode

print len(s) # 3
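The "below 128" restriction can be seen directly. The sketch below is an illustration rather than part of the original script: it assumes the implicit conversion behaves like an explicit `decode('ascii')`, which is the default codec in Python 2, and it is written so that it also runs on Python 3. The variable names `utf8_bytes` and `failed` are introduced here for the example.

```python
# coding: utf-8

utf8_bytes = u'中國'.encode('utf-8')   # six bytes, all with values >= 128

try:
    utf8_bytes.decode('ascii')        # what an implicit ASCII conversion attempts
    failed = False
except UnicodeDecodeError:
    failed = True

print(failed)                         # True

# Naming the real encoding makes the conversion succeed.
print(utf8_bytes.decode('utf-8') == u'中國')   # True
```

Passing the encoding explicitly, as the rest of the script does with `decode('utf-8')`, avoids relying on the ASCII default.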

s = u'abc'

print type(s) # unicode

print len(s) # 3

s = s.encode('utf-8') # unicode -> str(utf-8)

print type(s) # str

print len(s) # 3

s = s.decode('utf-8') # str(utf-8) -> unicode; here the bytes are not limited to ASCII, they only need to form valid UTF-8

print type(s) # unicode

print len(s) # 3

s = '中國'

# the whole file is saved as UTF-8, so this literal holds UTF-8 bytes (3 per Chinese character)

print type(s) # str(utf-8)

print len(s) # 6

s = u'中國'

print type(s) # unicode

print len(s) # 2

s = s.encode('utf-8')

print type(s) # str(utf-8)

print len(s) # 6

s = s.decode('utf-8')

print type(s) # unicode

print len(s) # 2

s = raw_input(u'輸入：') # on Windows the console appears to encode Chinese as GBK, 2 bytes per Chinese character

print type(s) # str(gbk)

print len(s) # 4 (assuming two Chinese characters, such as 中國, were typed)

s = s.decode('gbk') # to convert GBK to UTF-8, first decode the GBK bytes to unicode

print type(s) # unicode

print len(s) # 2

s = s.encode('gbk')

print type(s) # str(gbk)

print len(s) # 4
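The script decodes GBK to unicode and encodes it back, but never shows the last step of the GBK to UTF-8 conversion it mentions. A minimal sketch of the full path follows. The helper name `gbk_to_utf8` is hypothetical (it is not in the original), and the GBK bytes are built in code instead of being read with `raw_input`, so the example does not depend on the console encoding. It is written to run on both Python 2 and Python 3.

```python
# coding: utf-8

def gbk_to_utf8(data):
    """Re-encode GBK bytes as UTF-8 bytes, using unicode as the pivot.

    `gbk_to_utf8` is a hypothetical helper name, not part of the original script.
    """
    text = data.decode('gbk')      # GBK bytes -> unicode
    return text.encode('utf-8')    # unicode -> UTF-8 bytes


gbk_bytes = u'中國'.encode('gbk')   # stands in for input typed on a GBK console
utf8_bytes = gbk_to_utf8(gbk_bytes)

print(len(gbk_bytes))    # 4: two bytes per character in GBK
print(len(utf8_bytes))   # 6: three bytes per character in UTF-8
```

Going the other way only swaps the two codec names.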

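# Conclusions from the checks above

# Any encoding can be converted to any other by going through unicode. Think of unicode as a lookup table of characters in which virtually every character used anywhere in the world can be found,

# Chinese included. Each character maps to one number (its code point), e.g. 'a' -> 0x61, '中' -> 0x4e2d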

# unicode -> utf-8: unicode.encode('utf-8')

# utf-8 -> unicode: str.decode('utf-8')

# gbk -> unicode: str.decode('gbk')

# unicode -> gbk: unicode.encode('gbk')
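These conclusions can be checked in a few lines. The sketch below is an illustration, not part of the original script: it prints the code points mentioned in the comments and round-trips one string through both encodings. It relies only on `ord`, `encode` and `decode`, so it should behave the same on Python 2 and Python 3. The variable names are introduced here for the example.

```python
# coding: utf-8

text = u'a中'

# Each character maps to one code point in the Unicode table.
code_points = [hex(ord(ch)) for ch in text]
print(code_points)        # ['0x61', '0x4e2d']

# The same unicode text yields different byte sequences per encoding.
utf8_bytes = text.encode('utf-8')
gbk_bytes = text.encode('gbk')
print(len(utf8_bytes))    # 4: 1 byte for 'a' + 3 bytes for '中'
print(len(gbk_bytes))     # 3: 1 byte for 'a' + 2 bytes for '中'

# Decoding with the matching codec recovers the original text.
print(utf8_bytes.decode('utf-8') == text)   # True
print(gbk_bytes.decode('gbk') == text)      # True
```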
