Scraping 19lou rental listings with a Python crawler

Open 19lou.com in Chrome and press F12 to open the developer tools and inspect the page.

If the request does not carry the site's cookie, it is redirected during crawling and no content is retrieved.

headers =
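The value assigned to headers is not shown here. As a rough sketch only, a headers dictionary that carries a cookie might look like the one below. The User-Agent and Cookie strings are placeholders, the helper names build_request and was_redirected are hypothetical, and the example uses Python 3's urllib.request rather than the urllib2 module used in the rest of this post.

```python
import urllib.request

# Placeholder values (assumptions): copy the real Cookie and User-Agent
# strings from the request headers shown in the browser's developer tools.
headers = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64)",
    "Cookie": "name=value",
}


def build_request(url, headers):
    """Attach the headers to a request object without sending it."""
    return urllib.request.Request(url, headers=headers)


def was_redirected(requested_url, response):
    """Return True if the final URL differs from the one requested."""
    return response.geturl() != requested_url


req = build_request("https://www.19lou.com/", headers)
print(req.get_header("Cookie"))
```

After calling urllib.request.urlopen(req), passing the response to was_redirected is one way to notice that the cookie was rejected and the crawler landed on a different page.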

The URL we request is:

page=1 is the first page.
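Since only the page parameter changes between listing pages, the URLs can be generated in a loop. The sketch below assumes a hypothetical base URL (the real listing URL is not shown in this post), and the helper name page_url is made up for illustration.

```python
from urllib.parse import urlencode

# Hypothetical base URL (assumption): substitute the listing URL that the
# crawler actually requests.
BASE_URL = "https://example.com/rentals"


def page_url(page, base_url=BASE_URL):
    """Return the listing URL for a 1-based page number."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    return base_url + "?" + urlencode({"page": page})


# Build the URLs of the first three listing pages.
for url in (page_url(n) for n in range(1, 4)):
    print(url)
```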

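Loop over the link tags to collect the URL of every child page, then request each detail page to get the rental information and images.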

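# Python 2 code (urllib2, raw_input). `soup` is the parsed listing page
# and `headers` is the dictionary defined above.
import sys
import urllib2
from bs4 import BeautifulSoup

for child in soup.table.find_all('a'):
    url = child.get('href')
    try:
        request = urllib2.Request(url, headers=headers)
        response = urllib2.urlopen(request)
    except Exception:
        print 'server connect failed'
        raw_input('press enter key to exit')
        sys.exit()

    html = response.read()
    # Parse the detail page, telling BeautifulSoup the bytes are gb18030
    detail_soup = BeautifulSoup(html, "html.parser", from_encoding="gb18030")
    # The attribute filters of the find() calls below are omitted in the source
    ul = detail_soup.find('ul',)
    tr = detail_soup.find('table',).find_all('tr')
    hx = tr[2].td.get_text()  # floor plan (戶型) of the listing
    # Loop over the list items to get the listing's image addresses
    for row in ul.find_all('li'):
        image = row.find('a',).img.get('src')
        image_big = row.find('p').img.get('src')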
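The link-collection step at the top of the loop can be tried offline on a small HTML sample. The sketch below is not the BeautifulSoup code from this post: it swaps in the standard library's html.parser so that it runs on Python 3 without third-party packages. It also collects links from every table rather than only the first one. The class name, function name and sample markup are all made up for illustration.

```python
from html.parser import HTMLParser


class TableLinkCollector(HTMLParser):
    """Collect the href of every <a> that appears inside a <table>."""

    def __init__(self):
        super().__init__()
        self.table_depth = 0  # greater than 0 while inside a <table>
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.table_depth += 1
        elif tag == "a" and self.table_depth > 0:
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "table" and self.table_depth > 0:
            self.table_depth -= 1


def collect_table_links(raw_bytes, encoding="gb18030"):
    """Decode the raw page bytes and return the links found in tables."""
    parser = TableLinkCollector()
    parser.feed(raw_bytes.decode(encoding, errors="replace"))
    parser.close()
    return parser.links


# Made-up sample page, encoded the same way the crawler expects (gb18030).
sample = (
    '<a href="/ignored">nav</a>'
    '<table><tr><td><a href="http://example.com/detail-1.html">租房 1</a></td></tr>'
    '<tr><td><a href="http://example.com/detail-2.html">租房 2</a></td></tr></table>'
).encode("gb18030")

links = collect_table_links(sample)
print(links)
```

The link outside the table is skipped, which mirrors the intent of restricting the search to soup.table in the original loop.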

The above covers only part of the code; below is the information scraped from the link.

The complete version is on GitHub.
