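Reading and writing text files in Python: fixing garbled-text problems

Iterating over an ordinary text file

Suppose the text file has the following content: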

hello world， this is a text file to test.

maybe is a very ****** example.

import os

def readfile(filename):
    # Text mode; the platform's default encoding is used for decoding
    fopen = open(filename, 'r')
    for eachline in fopen:
        print(eachline)
    fopen.close()

if __name__ == '__main__':
    filepath = "d:\\documents and settings\\desktop\\python\\hello.txt"
    readfile(filepath)

Running it produces:

hello world，

this is a text file to test.

maybe is a very ****** example.

When the text contains garbled bytes, running the script raises an error

The text now looks like this:

hello world， this is a text file to test. now i will add some gibberish co7(n聜蔸?c鄭o, i want to skip the last line.

maybe is a very ****** example.

The error message is:

Traceback (most recent call last):
  File "d:/documents and settings/desktop/python/hello.py", line 13, in <module>
    readfile(filepath)
  File "d:/documents and settings/desktop/python/hello.py", line 6, in readfile
    for eachline in fopen:
UnicodeDecodeError: 'gbk' codec can't decode byte 0xa6 in position 95: illegal multibyte sequence

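Generally, if the problem is only an encoding mismatch, for example the file is UTF-8 while Python is decoding it as GBK, passing the encoding explicitly is enough. Note that encoding must be given as a keyword argument, because the third positional parameter of open() is buffering:

fopen = open(filename, 'r', encoding='utf-8')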
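As a minimal sketch of the encoding-mismatch case, the snippet below writes a small UTF-8 file into a temporary directory and reads it back with the encoding passed by keyword. The helper name read_lines and the sample file are made up for illustration and are not part of the original script.

```python
import os
import tempfile

def read_lines(filename, encoding="utf-8"):
    """Return the file's lines, decoded with an explicitly chosen encoding."""
    # encoding must be a keyword argument; the third positional
    # parameter of open() is buffering, not encoding.
    with open(filename, "r", encoding=encoding) as f:
        return [line.rstrip("\n") for line in f]

# Build a small UTF-8 sample file in a temporary directory (hypothetical data).
sample_path = os.path.join(tempfile.mkdtemp(), "hello_utf8.txt")
with open(sample_path, "w", encoding="utf-8") as f:
    f.write("hello world\n你好，世界\n")

lines = read_lines(sample_path)
print(len(lines), "lines read")
```

Using a with block also closes the file automatically, so no explicit close() call is needed.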

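But our text really does contain garbage bytes, so the only option is exception handling

To iterate over the text while ignoring the errors, the script was modified as follows: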

import os

def readfile(filename):
    fopen = open(filename)
    while True:
        try:
            eachline = fopen.readline()
            if eachline == '':  # an empty string means end of file
                break
            print(eachline)
        except UnicodeDecodeError as err:
            # Ignore whatever could not be decoded and keep reading
            pass
    fopen.close()

if __name__ == '__main__':
    filepath = "d:\\documents and settings\\desktop\\python\\logfile.txt"
    readfile(filepath)

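This code runs on my own data without errors, but for the text above it prints nothing at all.

When I tested it with my own data file, I found that many lines at the start of the text were silently skipped. A likely cause is that text mode decodes the file in buffered chunks rather than one line at a time, so a single undecodable byte can discard every line in that chunk.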
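One alternative to catching the exception, sketched here on the assumption that replacing or dropping the undecodable bytes is acceptable, is the errors argument of the built-in open(). With errors="replace" each bad byte becomes the U+FFFD replacement character, and with errors="ignore" it is dropped, so no line is lost. The helper name read_lines_lenient and the sample bytes are hypothetical.

```python
import os
import tempfile

def read_lines_lenient(filename, encoding="utf-8", errors="replace"):
    """Read all lines, substituting or dropping bytes that cannot be decoded."""
    with open(filename, "r", encoding=encoding, errors=errors) as f:
        return [line.rstrip("\n") for line in f]

# Hypothetical sample: the second line contains two bytes that are invalid UTF-8.
sample_path = os.path.join(tempfile.mkdtemp(), "broken.txt")
with open(sample_path, "wb") as f:
    f.write(b"first line\nbad bytes: \xff\xfe here\nlast line\n")

replaced = read_lines_lenient(sample_path)                  # bad bytes become U+FFFD
ignored = read_lines_lenient(sample_path, errors="ignore")  # bad bytes are dropped

print(len(replaced), "lines read")
print(ascii(replaced[1]))
print(ascii(ignored[1]))
```

Either mode loses the original bytes, so this fits cases like log scanning where the damaged part does not matter.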

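There is an article worth consulting here:

"Reading strings from a file with Python (solving the garbled-text problem)"

It explains in some detail how text file encodings are stored and how to set the text encoding in Python, so I will not go into that here.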

As far as I currently know, the only option left is to open the text in binary mode.

fopen = open(filename, 'rb')
while True:
    try:
        # Binary mode returns raw bytes, so nothing is decoded here
        eachline = fopen.read(200)
        if not eachline:
            break
        print(eachline)
    except UnicodeDecodeError as err:
        pass
fopen.close()

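The code above prints all of the text, except that every chunk is printed with a b prefix. That in itself is not a problem, just unpleasant to look at. But eachline is now a bytes object rather than a str, so as soon as string operations are involved, a statement like the following cannot run at all:

if "skip the last line" in eachline: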
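The failure can be reproduced in isolation. The sketch below uses a hard-coded bytes value standing in for one line read in 'rb' mode (the variable names are illustrative), and shows two ways around it: search with a bytes literal, or decode the line first.

```python
# Stand-in for what readline() returns when the file is opened with 'rb'.
eachline = b"i want to skip the last line.\n"

# Testing a str against bytes raises TypeError.
try:
    "skip the last line" in eachline
    str_in_bytes_failed = False
except TypeError:
    str_in_bytes_failed = True

# Option 1: compare bytes with bytes.
found_as_bytes = b"skip the last line" in eachline

# Option 2: decode to str first, then use ordinary string operations.
found_as_str = "skip the last line" in eachline.decode("ascii")

print(str_in_bytes_failed, found_as_bytes, found_as_str)
```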

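Besides, what I really wanted was to read line by line. Testing showed that readline still works in binary mode, so the only remaining problem is how to decode each line. The modified code is as follows: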

import os

def readfile(filename):
    fopen = open(filename, 'rb')
    while True:
        try:
            eachline = fopen.readline()
            if not eachline:
                break
            # Decode one line at a time, so a bad line only loses itself
            string = eachline.decode('utf-8')
            print(string)
            if "skip the last line" in string:
                print("true")
        except UnicodeDecodeError as err:
            # This line could not be decoded; skip it
            pass
    fopen.close()

if __name__ == '__main__':
    filepath = "d:\\documents and settings\\desktop\\python\\hello.txt"
    readfile(filepath)

Now there is finally some output:

this is a text file to test.

now i will add some gibberish

i want to skip the last line.

true

maybe is a very ****** example.

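The code above decodes each line it reads into a string and then processes it. The decoding step can still fail, and when it does, each reported exception has to be handled case by case.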
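One way to handle those failures per line, rather than dropping the line, is to try a short list of encodings and fall back to a lossy decode. This is only a sketch: the helper decode_line is hypothetical, and the choice of UTF-8 followed by GBK is an assumption based on the 'gbk' codec named in the error above.

```python
def decode_line(raw, encodings=("utf-8", "gbk")):
    """Decode raw bytes, trying each candidate encoding in order."""
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue  # try the next candidate encoding
    # Nothing worked: keep the line, replacing bad bytes with U+FFFD.
    return raw.decode(encodings[0], errors="replace")

utf8_line = decode_line("hello world".encode("utf-8"))
gbk_line = decode_line("你好".encode("gbk"))  # not valid UTF-8, so GBK is tried
lossy_line = decode_line(b"ok \xff", encodings=("utf-8",))

print(ascii(utf8_line), ascii(gbk_line), ascii(lossy_line))
```

Guessing encodings this way can silently produce wrong characters when bytes happen to be valid in more than one encoding, so it suits cases where keeping the readable part of each line matters more than exact fidelity.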
