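Example 1: crawl the page directly, with no need to modify the request headers
import requests

url = ""  # target URL (left blank in the source)
try:
    r = requests.get(url)
    r.raise_for_status()  # raises HTTPError on a 4xx/5xx response
    print(r.text[:1000])  # show only the first 1000 characters
except requests.RequestException:
    print("Crawl failed")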
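The try/except pattern above can be wrapped in a small reusable function. This is only a minimal sketch: the name fetch_text and the timeout value are my own assumptions, not something from the original notes.

```python
import requests


def fetch_text(url, timeout=10):
    """Return the page text, or None if the request fails (hypothetical helper)."""
    try:
        r = requests.get(url, timeout=timeout)
        r.raise_for_status()  # raises HTTPError on a 4xx/5xx response
        return r.text
    except requests.RequestException:
        # Covers invalid URLs, connection errors, timeouts and HTTP errors
        return None


# An empty URL is rejected by requests before any network traffic happens,
# so the helper returns None here.
result = fetch_text("")
print(result)
```

Returning None instead of printing keeps the decision about what to do on failure with the caller.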
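Example 2: some sites may inspect the header 'user-agent': 'python-requests/2.11.1' and recognise that the request comes from a crawler, so the header needs to be modified
import requests

url = ""  # target URL (left blank in the source)
try:
    kv = {}  # header dict (value missing in the source); set 'user-agent' to a browser-like string here
    r = requests.get(url, headers=kv)
    r.raise_for_status()  # raises HTTPError on a 4xx/5xx response
    print(r.text[:1000])  # show only the first 1000 characters
except requests.RequestException:
    print("Crawl failed")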
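To see what the headers argument actually changes, the request can be prepared without being sent. The sketch below is an illustration only: the URL https://example.com/ and the value "Mozilla/5.0" are assumptions chosen for the example, not values from the original notes.

```python
import requests

url = "https://example.com/"  # hypothetical URL, never actually requested here
kv = {"user-agent": "Mozilla/5.0"}  # illustrative browser-like value (an assumption)

session = requests.Session()

# prepare_request merges the session's default headers with the ones we pass in,
# but does not send anything over the network.
default_req = session.prepare_request(requests.Request("GET", url))
custom_req = session.prepare_request(requests.Request("GET", url, headers=kv))

default_ua = default_req.headers["User-Agent"]  # header lookup is case-insensitive
custom_ua = custom_req.headers["User-Agent"]

print(default_ua)  # starts with "python-requests/"
print(custom_ua)
```

The default value is the one a site can use to spot a crawler; passing kv replaces it.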
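Example 3: submit a search keyword
import requests

keyword = "python"
try:
    kv = {}  # query-parameter dict (missing in the source); map the engine's search parameter name to keyword
    r = requests.get("", params=kv)  # search URL left blank in the source
    print(r.request.url)  # the URL that was actually requested, query string included
    r.raise_for_status()
    print(len(r.text))
except requests.RequestException:
    print("Crawl failed")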
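The params argument is what turns the keyword into a query string. A minimal sketch of that step, built without sending the request; the address https://example.com/s and the parameter name "q" are hypothetical, since each search engine uses its own.

```python
import requests

keyword = "python"
kv = {"q": keyword}  # "q" is a hypothetical parameter name

# Prepare the request only; nothing is sent over the network.
prepared = requests.Request("GET", "https://example.com/s", params=kv).prepare()
built_url = prepared.url

print(built_url)  # https://example.com/s?q=python
```

This is the same value that r.request.url reports after a real call.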
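360 Search, full code
import requests

keyword = "python"
try:
    kv = {}  # query-parameter dict (missing in the source); map the search parameter name to keyword
    r = requests.get("", params=kv)  # search URL left blank in the source
    print(r.request.url)  # the URL that was actually requested, query string included
    r.raise_for_status()
    print(len(r.text))
except requests.RequestException:
    print("Crawl failed")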
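Example 4: crawl and save an image
import os
import requests

url = ""   # image address (missing in the source)
root = ""  # local directory to save into (missing in the source)
path = ""  # local file path (missing in the source)
# + ".jpg" makes the saved file be stored in jpg format
try:
    if not os.path.exists(root):
        os.mkdir(root)
    if not os.path.exists(path):
        r = requests.get(url)
        with open(path, "wb") as f:
            f.write(r.content)  # the with block closes the file, so no f.close() is needed
        print("File saved")
    else:
        print("File already exists")
except (requests.RequestException, OSError):
    print("Crawl failed")
url is the address of the file to download; root is the local directory where it should be stored.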
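Example 5: look up an IP address
import requests

url = ""  # lookup URL prefix (left blank in the source)
try:
    r = requests.get(url + "202.204.80.112")
    r.raise_for_status()  # raises HTTPError on a 4xx/5xx response
    print(r.text[-500:])  # show only the last 500 characters
except requests.RequestException:
    print("Crawl failed")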
This last one did not run for me, and I am not sure where the problem lies.
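One way to find out where a failing request goes wrong is to report the exception instead of only printing a fixed message. A minimal sketch, assuming a hypothetical helper named describe_failure; it is not a diagnosis of the specific failure above.

```python
import requests


def describe_failure(url, timeout=5):
    """Return a short description of why the request failed, or None on success."""
    try:
        r = requests.get(url, timeout=timeout)
        r.raise_for_status()
    except requests.RequestException as exc:
        # The class name distinguishes e.g. MissingSchema, ConnectionError,
        # Timeout and HTTPError.
        return f"{type(exc).__name__}: {exc}"
    return None


# An empty URL fails before any network traffic, with a MissingSchema error.
reason = describe_failure("")
print(reason)
```

Swapping the fixed "Crawl failed" message for this description shows whether the URL, the connection, or the HTTP status is at fault.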