Self-Study Python Data Processing (2): Data Parsing

2021-10-17 10:32:49 · 1,656 words · 7,939 views

The data comes from the first page of the Douban "Top 250 Books" list (豆瓣讀書 Top 250).

# -*- coding: utf-8 -*-
# @time: 2021/1/24 13:52
# @file: booklist.py
# @software: pycharm

import requests

from bs4 import BeautifulSoup

# Request the data
url = ''  # the URL is left blank in the source
headers = {}  # the header values are omitted in the source
data = requests.get(url, headers=headers)
print(data.text)

# Parse the data
soup = BeautifulSoup(data.text, 'lxml')
print(soup)

books = soup.find('div')  # the attribute filter that followed 'div' is missing in the source
books = books.find_all('table')
book_list = list(books)

# Collect information about each book
img_urls = []
titles = []
ratings = []
authors = []
details = []

for book in books:
    # Cover image URL
    img_url = book.find_all('a')[0].find('img').get('src')
    img_urls.append(img_url)
    # Title
    title = book.find_all('a')[1].get_text()
    titles.append(title)
    # Rating
    rating = book.find('div').get_text()  # attribute filter missing in the source
    rating = rating.replace('\n', '').replace(' ', '')
    ratings.append(rating)
    # Author
    author = book.find('p').get_text()  # attribute filter missing in the source
    author = author.replace('\n', '').replace(' ', '')
    authors.append(author)
    # Other details
    detail = book.find_all('p')[1].get_text()
    detail = detail.replace('\n', '').replace(' ', '')
    details.append(detail)

# print(details)  # used to verify that the data was read and parsed successfully
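The parsing steps can also be tried offline, without sending any request. The following is only a minimal sketch: SAMPLE_HTML is made-up markup written for illustration and is not Douban's real page structure, clean() and parse_books() are hypothetical helper names, and the built-in 'html.parser' is used instead of 'lxml' so that no extra parser needs to be installed. It follows the same find / find_all / get_text pattern as the script, but gathers each book into one dictionary instead of five parallel lists, and it also cleans the title, which the script does not do.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; the real page differs.
SAMPLE_HTML = """
<div>
  <table>
    <tr>
      <td><a href="/book/1"><img src="https://example.com/cover1.jpg"></a></td>
      <td>
        <a href="/book/1">
          Walden
        </a>
        <p>Thoreau / 1854</p>
        <div> 9.1 </div>
        <p>Classic</p>
      </td>
    </tr>
  </table>
  <table>
    <tr>
      <td><a href="/book/2"><img src="https://example.com/cover2.jpg"></a></td>
      <td>
        <a href="/book/2">
          Emma
        </a>
        <p>Austen / 1815</p>
        <div> 8.7 </div>
        <p>Novel</p>
      </td>
    </tr>
  </table>
</div>
"""


def clean(text):
    """Remove newlines and spaces, like the chained replace() calls in the script."""
    return text.replace("\n", "").replace(" ", "")


def parse_books(html):
    """Return one dict per <table>, using the same lookups as the script."""
    soup = BeautifulSoup(html, "html.parser")
    container = soup.find("div")
    records = []
    for book in container.find_all("table"):
        links = book.find_all("a")       # links[0] wraps the cover, links[1] is the title
        paragraphs = book.find_all("p")  # paragraphs[0] is the author, paragraphs[1] the details
        records.append({
            "img_url": links[0].find("img").get("src"),
            "title": clean(links[1].get_text()),
            "rating": clean(book.find("div").get_text()),
            "author": clean(paragraphs[0].get_text()),
            "detail": clean(paragraphs[1].get_text()),
        })
    return records


books = parse_books(SAMPLE_HTML)
for record in books:
    print(record)
```

Keeping one record per book makes it harder for the fields to drift out of step if one lookup fails partway through the loop.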
