2. Learning Python: Scraping a Single Douban Movie Page

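The code is as follows:

# -*- coding: utf-8 -*-
import requests
from lxml import etree

url = ''  # fill in the URL of the movie page
html = requests.get(url).text  # fetch the HTML of the page
s = etree.HTML(html)  # parse the HTML into an element tree (note the uppercase HTML)

# Each xpath() call returns a list, so take [0] or join the items.
film_name = s.xpath('//*[@id="content"]/h1/span[1]/text()')[0]
director = s.xpath('//*[@id="info"]/span[1]/span[2]/a/text()')
directors = '，'.join(director)
actor = s.xpath('//*[@id="info"]/span[3]/span[2]/a/text()')
actors = '，'.join(actor)
cls = s.xpath('//*[@id="info"]/span[@property="v:genre"]/text()')
clss = '，'.join(cls)
print('Title:', film_name, '\nDirector:', directors, '\nStarring:', actors, '\nGenre:', clss)

# normalize-space() returns only the first match, as a whitespace-trimmed string.
actor = s.xpath('normalize-space(//*[@id="info"]/span[3]/span[2]/a/text())')
print('Starring:', actor, ' //', 'normalize-space() returns the first match, like [0]')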

The output is as follows:

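Summary:

1. An xpath() query returns a list of matches (nodes or text strings), even when only one item matches. To get at the content you have to index into the list or iterate over it.

2. To get a single value, you can either wrap the expression in normalize-space() or append [0]. Both give the first match, but normalize-space() also strips leading and trailing whitespace (newlines, tabs, spaces) and collapses internal runs of whitespace. To keep every match, combine them with join().

3. Remember: an XPath copied from the browser is not completely reliable. Check the page source to confirm that the element really sits at that level.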
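The first two points can be tried offline with a small sketch. The HTML below is a made-up, simplified stand-in (the `SAMPLE_HTML` name and the actor names are assumptions for illustration, not Douban's real markup); it only shows how a list result, `[0]`, `normalize-space()` and `join()` differ.

```python
from lxml import etree

# Hypothetical, simplified markup; the real page structure may differ.
SAMPLE_HTML = """
<div id="info">
  <span class="actor">
    <a>
      Actor A
    </a>
    <a>Actor B</a>
  </span>
</div>
"""

tree = etree.HTML(SAMPLE_HTML)

# 1. xpath() returns a list of matches (here, two text strings).
names = tree.xpath('//*[@id="info"]/span/a/text()')

# 2a. [0] takes the first match and keeps its surrounding whitespace.
first_raw = names[0]

# 2b. normalize-space() takes the first match and trims its whitespace.
first_clean = tree.xpath('normalize-space(//*[@id="info"]/span/a/text())')

# 2c. join() keeps every match; strip each item to drop stray whitespace.
all_names = ', '.join(name.strip() for name in names)

print(repr(first_raw))   # shows the newlines and spaces around the name
print(first_clean)       # Actor A
print(all_names)         # Actor A, Actor B
```

If an expression may match nothing, `[0]` raises an IndexError, while `normalize-space()` returns an empty string, which can be the more forgiving choice for optional fields.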
