Life Is Short: A Python Web-Scraping Study Cycle

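from urllib import request


def get_data():
    url = ' '  # target URL (left blank in the original)
    # Build the Request object from the URL and the request headers
    headers = {}  # value missing in the original; add a 'User-Agent' entry here
    req = request.Request(url, headers=headers)  # pass the User-Agent
    response = request.urlopen(req)
    if response.getcode() == 200:  # confirm the request succeeded
        data = response.read()  # read the response body (bytes)
        data = str(data, encoding='utf-8')  # convert to str
        # Write the data to a file
        with open('index.html', mode='w', encoding='utf-8') as f:
            f.write(data)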
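
The two steps that are easiest to get wrong in get_data() are attaching the User-Agent header and turning the raw bytes into str. The sketch below isolates both without touching the network. The helper names build_request and decode_body, the example URL, and the User-Agent string are all assumptions made for illustration; they are not part of the original notes.

```python
from urllib import request


def build_request(url, user_agent):
    """Return a Request carrying a User-Agent header (nothing is sent yet)."""
    headers = {'User-Agent': user_agent}
    return request.Request(url, headers=headers)


def decode_body(raw, encoding='utf-8'):
    """Convert the bytes returned by response.read() into str."""
    return str(raw, encoding=encoding)


# Hypothetical URL and User-Agent, used only to show the calls.
req = build_request('http://example.com/', 'my-crawler/0.1')
# urllib stores header names capitalized, so look the header up as 'User-agent'.
print(req.get_header('User-agent'))
print(decode_body('index page'.encode('utf-8')))
```

Passing the finished Request to request.urlopen() is the only step left out here, since that is the part that needs a real, reachable URL.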

from bs4 import BeautifulSoup


def parse_data():
    with open('index.html', mode='r', encoding='utf-8') as f:
        html = f.read()
    bs = BeautifulSoup(html, 'html.parser')  # use the built-in HTML parser
    # 1. find(): get the first matching tag
    # div = bs.find('div')  # find the matching element
    # print(div)  # print the matching element
    # print(type(div))  # check the type of the result
    # 2. find_all(): get all matching tags
    # metas = bs.find_all('meta')  # returns a collection of every match
    # print(metas[0])
    # print(bs.find_all(id='hello'))  # look up by id; returns a collection
    # print(bs.find_all(class_='itany'))  # look up by class
    # 3. select(): get data with CSS selectors
    # print(bs.select('#hello'))
    # print(bs.select('.itany'))
    # print(bs.select('p#world span'))
    # print(bs.select('[title]'))
    # Get the text
    # print(bs.select('.div')[0].get_text())
    # print(bs.find_all('article'))
    value = bs.select('#article')[0].get_text(strip=True)
    # print(len(value))
    print(value)
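
To see what find(), find_all() and select() actually return, it helps to run them against a small inline document instead of a downloaded page. The sketch below does that. The HTML snippet is made up for illustration and only reuses the ids and class names that appear in the notes above (hello, itany, world, article). It assumes the third-party beautifulsoup4 package is installed.

```python
from bs4 import BeautifulSoup

# Made-up HTML that reuses the ids and classes mentioned in the notes.
html = '''
<div id="hello" class="itany">first div</div>
<p id="world"><span title="t1">inside p</span></p>
<div id="article">  some article text  </div>
'''

bs = BeautifulSoup(html, 'html.parser')

first_div = bs.find('div')              # a single Tag: the first <div>
all_divs = bs.find_all('div')           # a list-like collection of every <div>
by_id = bs.find_all(id='hello')         # filter by id attribute
by_class = bs.find_all(class_='itany')  # filter by class (note the underscore)
spans = bs.select('p#world span')       # CSS selector: <span> inside <p id="world">
titled = bs.select('[title]')           # CSS selector: any tag with a title attribute
# strip=True removes the whitespace around the text.
article_text = bs.select('#article')[0].get_text(strip=True)

print(first_div.get_text())
print(len(all_divs), len(by_id), len(by_class), len(spans), len(titled))
print(article_text)
```

Both find_all() and select() return a collection even when only one element matches, which is why the notes index into the result with [0] before calling get_text().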
