Python Web Scraping and Function Debugging

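Part 1: Function debugging

Debug with the try...except construct covered earlier.

Test the gameover function from the ball-game program.

The test prints True.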
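The original gameover function is not shown in this post, so the following is only a minimal sketch of how such a test could look. The name `game_over`, the 15-point rule, and the helper `check_game_over` are all assumptions made for illustration.

```python
def game_over(score_a, score_b):
    """Return True once either side has reached 15 points (assumed rule)."""
    return score_a >= 15 or score_b >= 15


def check_game_over(score_a, score_b, expected):
    """Run one test case inside try...except and report whether it passed."""
    try:
        result = game_over(score_a, score_b)
        # A failed assertion is caught below, like any other exception.
        assert result == expected, f"expected {expected}, got {result}"
    except Exception as exc:
        print(f"game_over({score_a!r}, {score_b!r}) failed: {exc}")
        return False
    return True


if __name__ == "__main__":
    print(check_game_over(15, 3, True))    # a finished game
    print(check_game_over(7, 9, False))    # a game still in progress
    print(check_game_over("x", 3, True))   # bad input: the TypeError is caught
```

Wrapping each case separately means one bad input is reported without stopping the remaining cases.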

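Part 2: Python web scraping

requests is a simple, concise third-party library for handling HTTP requests.

get() corresponds to the HTTP GET method and is the most common way to fetch a web page. Passing timeout=n sets the timeout for each request to n seconds.

text is the HTTP response body as a string, that is, the content of the page at the given URL. It is an attribute (r.text), not a method.

content is the HTTP response body in binary (bytes) form. It is also an attribute (r.content).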
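The difference between the two attributes can be illustrated without a network call. In requests, r.content holds the raw bytes and r.text is those bytes decoded with the response's encoding. The sketch below uses plain Python stand-ins; the variable names and the sample body are assumptions, not part of the requests API.

```python
# Stand-in for r.content: the raw bytes a server might send back.
raw_body = "<title>搜狗</title>".encode("utf-8")

# Stand-in for r.text: the same bytes decoded with the response encoding.
encoding = "utf-8"
decoded_body = raw_body.decode(encoding)

print(type(raw_body).__name__, len(raw_body))          # bytes and its byte count
print(type(decoded_body).__name__, len(decoded_body))  # str and its character count
```

Use content for binary data such as images, and text when the page is to be read or parsed as a string.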

Open the Sogou homepage 20 times with requests.get():

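    import requests

    # The URL was lost in the source; per the text it is the Sogou homepage.
    url = "..."

    try:
        for i in range(20):
            r = requests.get(url, timeout=30)
            print(type(r))   # the response object's type
            print(r.text)    # the page content as a string
    except requests.RequestException:
        print('error')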
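Because the try block wraps the whole loop, the first failed request ends all remaining attempts. A possible variant, sketched below, handles each request separately and counts successes and failures. The function name `fetch_repeatedly` and its injectable `get` parameter are assumptions added for illustration so the loop can be exercised without network access; by default it falls back to requests.get.

```python
def fetch_repeatedly(url, times=20, timeout=30, get=None):
    """Request url `times` times and return (successes, failures)."""
    if get is None:
        import requests  # imported lazily so the helper can be tested offline
        get = requests.get

    successes, failures = 0, 0
    for attempt in range(1, times + 1):
        try:
            response = get(url, timeout=timeout)
            response.raise_for_status()  # treat 4xx/5xx responses as failures
            successes += 1
        except Exception as exc:
            failures += 1
            print(f"attempt {attempt} failed: {exc}")
    return successes, failures
```

Calling `fetch_repeatedly(url)` with the page's URL would then report how many of the 20 requests succeeded instead of stopping at the first error.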

Part 3: Parsing HTML with BeautifulSoup

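    from bs4 import BeautifulSoup

    # The tags inside this string were stripped in the source; the markup
    # below is a minimal reconstruction around the surviving text.
    html = """
    <html>
    <body>
    <p>My first paragraph.</p>
    <table>
    <tr><td>row 1, cell 1</td><td>row 1, cell 2</td></tr>
    <tr><td>row 2, cell 1</td><td>row 2, cell 2</td></tr>
    </table>
    </body>
    </html>
    """
    soup = BeautifulSoup(html, "html.parser")
    print(soup.prettify())
    print("(b). The body tag content of this HTML is\n{}".format(soup.body.prettify()))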

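Output of the second print (the source shows only fragments of the tags; they are restored here in prettified form):

    (b). The body tag content of this HTML is
    <body>
     <p>
      My first paragraph.
     </p>
     <table>
      <tr>
       <td>
        row 1, cell 1
       </td>
       <td>
        row 1, cell 2
       </td>
      </tr>
      <tr>
       <td>
        row 2, cell 1
       </td>
       <td>
        row 2, cell 2
       </td>
      </tr>
     </table>
    </body>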
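The "html.parser" argument tells BeautifulSoup to use Python's built-in HTML parser. As a rough, dependency-free illustration of walking the same kind of table, the sketch below uses the standard-library html.parser module directly instead of BeautifulSoup. The class name `CellCollector` and the sample markup are assumptions made for this example.

```python
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []          # one list of cell strings per <tr>
        self._in_cell = False   # True while inside a <td> element

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag == "td" and self.rows:
            self._in_cell = True
            self.rows[-1].append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        # Text outside <td> elements (such as newlines) is ignored.
        if self._in_cell:
            self.rows[-1][-1] += data


# Assumed sample markup, modelled on the table in this section.
SAMPLE = """
<table>
<tr><td>row 1, cell 1</td><td>row 1, cell 2</td></tr>
<tr><td>row 2, cell 1</td><td>row 2, cell 2</td></tr>
</table>
"""

collector = CellCollector()
collector.feed(SAMPLE)
print(collector.rows)
```

With BeautifulSoup itself, the cells could be reached more directly through soup.find_all("td"), which returns the matching tags.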
