Four ways to read documents in Python

1: Code

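# coding=utf-8
"""
@author: jiajiknag
Purpose: read a txt file
"""
# Imports
from bs4 import BeautifulSoup
from urllib.request import urlopen

# The document to read (the URL is left blank in the source)
textpage = urlopen("")
# Decode the raw bytes as UTF-8 and print them
print(str(textpage.read(), 'utf-8'))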
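
Because the URL above is blank, the snippet cannot run as written. The following is a minimal sketch of the same idea that works offline: it reads a temporary local file through a file:// URL, which urlopen also accepts. The helper name read_text and the sample text are assumptions made for illustration and do not come from the original post.

```python
import tempfile
from pathlib import Path
from urllib.request import urlopen


def read_text(url, encoding="utf-8"):
    """Fetch url and return its body decoded as text (hypothetical helper)."""
    # urlopen returns a binary response; the with-block closes it afterwards
    with urlopen(url) as response:
        return response.read().decode(encoding)


# Demo on a temporary local file so that no network access is needed
with tempfile.TemporaryDirectory() as tmpdir:
    sample = Path(tmpdir) / "sample.txt"
    # Write bytes directly so the newlines are stored exactly as given
    sample.write_bytes("第一行\nsecond line\n".encode("utf-8"))
    # Path.as_uri() produces a file:// URL that urlopen can open
    text = read_text(sample.as_uri())

print(text)
```

Passing the encoding explicitly matters for non-ASCII text: the bytes returned by read() only become readable characters once they are decoded with the encoding the file was saved in.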

2: Result

"""
Purpose: read a csv file
"""
from urllib.request import urlopen
from io import StringIO
import csv

# The CSV file (the URL is left blank in the source)
data = urlopen("").read().decode('ascii', 'ignore')
# Wrap the downloaded text in a StringIO object so csv can treat it as a file
datafile = StringIO(data)
# csvreader = csv.reader(datafile)
# Read each row as a dict keyed by the header row
dictreader = csv.DictReader(datafile)
# Output
print(dictreader.fieldnames)
for row in dictreader:
    print(row)
    # print("\nThe album \"" + row[0] + "\" was released in " + str(row[1]))
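
To see what csv.DictReader yields without downloading anything, here is a small sketch that feeds it an in-memory string. The column names and the two rows are made-up sample data, not the file the original post used.

```python
import csv
from io import StringIO

# Hypothetical sample data standing in for the downloaded CSV text
data = (
    "Name,Year\n"
    "Monty Python's Flying Circus,1970\n"
    "Another Monty Python Record,1971\n"
)

# StringIO makes the string look like an open text file
datafile = StringIO(data)
dictreader = csv.DictReader(datafile)

# The first line of the input becomes the field names
fieldnames = dictreader.fieldnames
rows = list(dictreader)

print(fieldnames)
for row in rows:
    # Each row is a mapping from column name to the cell value (always a string)
    print(row["Name"], "was released in", row["Year"])
```

With DictReader the cells are looked up by column name, so the positional access row[0] and row[1] in the commented-out line only applies to the plain csv.reader variant.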

2: Result

1: Code

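# coding=utf-8
"""
@author: jiajiknag
Purpose: read a PDF file
Note: the biggest advantage of the readpdf function is that, if the PDF file is
on your computer, you can simply replace the pdffile object returned by urlopen
with an ordinary open() file object:
pdffile = open("../pages/warandpeace/chapter1.pdf", 'rb')
"""
from urllib.request import urlopen
# process_pdf is the helper shipped with the pdfminer3k package
from pdfminer.pdfinterp import PDFResourceManager, process_pdf
from pdfminer.converter import TextConverter
from pdfminer.layout import LAParams
from io import StringIO
from io import open

# Define a function that reads a PDF
def readpdf(pdffile):
    # Resource manager used while parsing the PDF
    rsrcmgr = PDFResourceManager()
    # StringIO buffer that collects the extracted text
    retstr = StringIO()
    laparams = LAParams()
    device = TextConverter(rsrcmgr, retstr, laparams=laparams)
    process_pdf(rsrcmgr, device, pdffile)
    # Close the converter
    device.close()
    # retstr.getvalue() returns the extracted text as a string
    content = retstr.getvalue()
    # Close the buffer once the conversion is finished
    retstr.close()
    return content

# The PDF to read (the URL is left blank in the source)
pdffile = urlopen("")
# Read the file
outputstring = readpdf(pdffile)
# Output
print(outputstring)
# Close
pdffile.close()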
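
The note in the docstring relies on the fact that a urlopen response and a file opened with 'rb' are both binary file-like objects. The sketch below illustrates only that interchangeability with the standard library; it does not call pdfminer. The helper open_binary and the placeholder bytes are assumptions for illustration.

```python
import tempfile
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlopen


def open_binary(location):
    """Return a binary file-like object for a URL or a local path (hypothetical helper)."""
    # Only these schemes count as URLs, so Windows drive letters are not misread
    if urlparse(location).scheme in ("http", "https", "file"):
        return urlopen(location)
    return open(location, "rb")


with tempfile.TemporaryDirectory() as tmpdir:
    path = Path(tmpdir) / "sample.bin"
    # Placeholder bytes only; this is not a valid PDF
    path.write_bytes(b"%PDF-1.4 placeholder bytes")

    # Same content, once through open() and once through urlopen()
    with open_binary(str(path)) as local_file:
        from_path = local_file.read()
    with open_binary(path.as_uri()) as url_file:
        from_url = url_file.read()

print(from_path == from_url)
```

Since either object supports read(), a function written like readpdf, which only expects a binary file object, does not need to know where the bytes came from.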

2: Result

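"""
Purpose: read a Word (.docx) file
"""
from zipfile import ZipFile
from urllib.request import urlopen
from io import BytesIO
from bs4 import BeautifulSoup

"""
This code reads a remote Word document into a binary file object (BytesIO is
similar to the StringIO used earlier in this chapter), unzips it with Python's
standard zipfile library (all .docx files are compressed to save space), and
then reads the unzipped file, which is XML.
"""
# The Word document (the URL is left blank in the source)
wordfile = urlopen("").read()
wordfile = BytesIO(wordfile)
document = ZipFile(wordfile)
xml_content = document.read('word/document.xml')

wordobj = BeautifulSoup(xml_content.decode('utf-8'))
# Every run of text in the document lives in a w:t element
textstrings = wordobj.findAll("w:t")
for textelem in textstrings:
    closetag = ""
    try:
        # The paragraph style, if any, sits in the sibling before the text run
        style = textelem.parent.previousSibling.find("w:pstyle")
        if style is not None and style["w:val"] == "Title":
            # The tag strings were stripped from the source; <h1> is a reconstruction
            print("<h1>")
            closetag = "</h1>"
    except AttributeError:
        # No style information, so print no tags
        pass
    print(textelem.text)
    print(closetag)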
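
The unzip-then-parse steps can be tried offline with a sketch like the one below. It builds a tiny archive in memory that contains only word/document.xml (not a complete .docx), and it swaps BeautifulSoup for the standard library's xml.etree.ElementTree so that nothing needs to be installed. The sample XML and the choice to wrap titles in h1 tags are assumptions for illustration.

```python
import xml.etree.ElementTree as ET
from io import BytesIO
from zipfile import ZipFile

# WordprocessingML namespace that the w: prefix refers to
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hypothetical minimal document.xml: one title paragraph and one body paragraph
xml = (
    f'<w:document xmlns:w="{W_NS}"><w:body>'
    '<w:p><w:pPr><w:pStyle w:val="Title"/></w:pPr>'
    '<w:r><w:t>A Word Document</w:t></w:r></w:p>'
    '<w:p><w:r><w:t>Some body text.</w:t></w:r></w:p>'
    '</w:body></w:document>'
)

# Build the zip archive in memory, standing in for the downloaded bytes
buffer = BytesIO()
with ZipFile(buffer, "w") as archive:
    archive.writestr("word/document.xml", xml)
buffer.seek(0)

# Same steps as above: open the archive and read the main XML part
with ZipFile(buffer) as document:
    xml_content = document.read("word/document.xml")

root = ET.fromstring(xml_content)
lines = []
for paragraph in root.iter(f"{{{W_NS}}}p"):
    # The paragraph style, if present, is stored under w:pPr/w:pStyle
    style = paragraph.find(f"{{{W_NS}}}pPr/{{{W_NS}}}pStyle")
    text = "".join(t.text or "" for t in paragraph.iter(f"{{{W_NS}}}t"))
    is_title = style is not None and style.get(f"{{{W_NS}}}val") == "Title"
    lines.append(f"<h1>{text}</h1>" if is_title else text)

print("\n".join(lines))
```

Unlike the BeautifulSoup version, ElementTree is namespace-aware, so elements and attributes are addressed by the full namespace URI rather than by the w: prefix.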

2: Result
