The urllib library and URLError exception handling

2021-08-14 02:36:45

Example: fetching a web page

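import urllib.request

# The URL is blank in the source; put the page you want to fetch here.
file = urllib.request.urlopen('')
data = file.readlines()  # list of bytes objects, one per line

# Write the raw bytes to a local HTML file.
with open('c:/users/python/desktop/myhtml/my1.html', 'wb') as f:
    for i in data:
        f.write(i)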

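file.read() reads the entire response. Unlike readlines(), which returns a list with one entry per line, read() returns everything as a single value (a bytes object in Python 3, which can be decoded to a str).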

file.readline() reads a single line of the response.
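The difference between the three calls is easiest to see on an in-memory stream. The sketch below uses io.BytesIO as a stand-in for the object returned by urlopen() (an assumption, so that the example runs without a network connection); the sample HTML bytes are made up for illustration.

```python
import io

# Made-up page content standing in for a real HTTP response body.
SAMPLE = b"<html>\n<body>hi</body>\n</html>\n"

# Each call consumes the stream, so a fresh stream is used for each one.
whole = io.BytesIO(SAMPLE).read()        # one bytes object with everything
lines = io.BytesIO(SAMPLE).readlines()   # list of bytes objects, one per line
first = io.BytesIO(SAMPLE).readline()    # only the first line

print(type(whole).__name__, len(whole))
print(len(lines), lines[0])
print(first)

# Decode the bytes when a str is needed (UTF-8 is assumed here).
text = whole.decode("utf-8")
```

Joining the result of readlines() gives back exactly what read() returns, so the choice is mostly about whether line-by-line processing is needed.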

Simulating a browser: the headers attribute (when a 403 error occurs)

Method 1: modify the request headers with build_opener()

import urllib.request

url = ""  # the URL is blank in the source
# (header name, header value) pair that imitates a desktop browser
headers = ("User-Agent",
           "Mozilla/5.0 (Windows NT 6.1; WOW64) "
           "(KHTML, like Gecko) Chrome/38.0.2125.122 "
           "Safari/537.36 SE 2.X MetaSr 1.0")
opener = urllib.request.build_opener()
opener.addheaders = [headers]  # replaces the default Python-urllib User-Agent
data = opener.open(url).readlines()

with open('c:/users/python/desktop/myhtml/my2.html', 'wb') as f:
    for i in data:
        f.write(i)

Method 2: add the header with add_header()

import urllib.request

url = ""  # the URL is blank in the source
req = urllib.request.Request(url)
# Attach the browser-like User-Agent to this one request.
req.add_header("User-Agent",
               "Mozilla/5.0 (Windows NT 6.1; WOW64) "
               "(KHTML, like Gecko) Chrome/38.0.2125.122 "
               "Safari/537.36 SE 2.X MetaSr 1.0")
data = urllib.request.urlopen(req).readlines()

with open('c:/users/python/desktop/myhtml/my2.html', 'wb') as f:
    for i in data:
        f.write(i)
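The two methods differ in where the header is stored: on the opener (applied to every request it sends) or on a single Request object. The sketch below builds both without sending anything, so the stored headers can be inspected offline. The URL and the User-Agent string are hypothetical placeholders, not values from the original post.

```python
import urllib.request

# Hypothetical values, used only for illustration.
URL = "http://example.com/"
UA = "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0"

# Method 1: the header lives on the opener and applies to all its requests.
opener = urllib.request.build_opener()
opener.addheaders = [("User-Agent", UA)]

# Method 2: the header lives on one Request object.
req = urllib.request.Request(URL)
req.add_header("User-Agent", UA)

# Shortcut for method 2: pass the headers to the constructor.
req2 = urllib.request.Request(URL, headers={"User-Agent": UA})

# Request stores header names capitalized, i.e. as "User-agent".
print(req.get_header("User-agent"))
print(req.header_items())
```

Method 1 is convenient when many pages are fetched with the same identity; method 2 keeps the header local to one request.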

Setting a timeout

import urllib.request

for i in range(1, 100):
    try:
        # timeout sets the maximum wait in seconds (the URL is blank in the source)
        file = urllib.request.urlopen("", timeout=0.1)
        data = file.readlines()
        print(len(data))
    except Exception as e:
        print("Exception occurred --> " + str(e))
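Catching the generic Exception hides which kind of failure occurred. A more specific variant, sketched below, separates urllib.error.HTTPError (the server answered with an error status such as 403) from urllib.error.URLError (the request could not be completed at all). The helper name fetch and its return format are assumptions made for this illustration; the unknown URL scheme is used only to trigger a URLError without touching the network.

```python
import socket
import urllib.error
import urllib.request


def fetch(url, timeout=5.0):
    """Hypothetical helper: return a (status, detail) tuple instead of raising."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return ("ok", resp.read())
    except urllib.error.HTTPError as e:
        # HTTPError is a subclass of URLError, so it must be caught first.
        return ("http_error", e.code)
    except urllib.error.URLError as e:
        # DNS failure, refused connection, unknown scheme, connect timeout...
        return ("url_error", str(e.reason))
    except socket.timeout:
        # A timeout while reading the body is raised directly.
        return ("timeout", None)


# An unsupported scheme fails inside urllib itself, before any network I/O.
status, detail = fetch("nosuchscheme://example")
print(status, detail)
```

Ordering the except clauses from most to least specific matters here: with URLError listed first, the HTTP status code would never be reported separately.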

HTTP requests

1. GET requests

import urllib.request

keywd = "hello"
url = "/s?wd=" + keywd  # the host part of the URL is missing in the source
req = urllib.request.Request(url)
data = urllib.request.urlopen(req).readlines()

with open('c:/users/python/desktop/myhtml/my3.html', 'wb') as f:
    for i in data:
        f.write(i)

import urllib.parse
import urllib.request

keywd = "國家"  # when keywd contains Chinese characters, it must be URL-encoded
key_code = urllib.parse.quote(keywd)
url = "/s?wd=" + key_code  # the host part of the URL is missing in the source
req = urllib.request.Request(url)
data = urllib.request.urlopen(req).readlines()

with open('c:/users/python/desktop/myhtml/my4.html', 'wb') as f:
    for i in data:
        f.write(i)
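To see what the encoding step actually produces, the sketch below percent-encodes the same keyword and builds the query string offline. It uses urllib.parse.quote and urllib.parse.urlencode from the standard library; the "/s?wd=" path comes from the example above, and the second sample string is made up.

```python
import urllib.parse

keywd = "國家"

# quote() percent-encodes the UTF-8 bytes of each non-ASCII character.
encoded = urllib.parse.quote(keywd)
print(encoded)  # %E5%9C%8B%E5%AE%B6

# urlencode() builds a whole query string and encodes the values for you.
query = urllib.parse.urlencode({"wd": keywd})
print("/s?" + query)  # /s?wd=%E5%9C%8B%E5%AE%B6

# unquote() reverses the encoding.
decoded = urllib.parse.unquote(encoded)

# quote() and quote_plus() differ in how they treat spaces.
spaced = "hello world"  # made-up sample
print(urllib.parse.quote(spaced))       # hello%20world
print(urllib.parse.quote_plus(spaced))  # hello+world
```

For more than one parameter, urlencode() is usually the simpler choice because it also joins the pairs with "&".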
