Parsing Data with BeautifulSoup

4. Basic operations

# coding: utf-8
__author__ = "wengwenyu"
from bs4 import BeautifulSoup
with open('soup_text.html', encoding='utf-8') as fp:
    soup = BeautifulSoup(fp, 'lxml')
# print(soup)
# Look up by tag name: locates the head tag. Note: only the first matching tag is returned
# print(soup.head)
# print(soup.div)  # the first div that appears
# Get a tag's attributes (as a dict)
# print(soup.a.attrs)  # all attributes and their values for the tag
# print(soup.a.attrs["target"])  # the value of one specific attribute
# Get content
# print(soup.p.string)  # direct text only, similar to XPath /text()
# print(soup.p.text)  # all text content, similar to XPath //text()
# print(soup.body.get_text())  # all text content, similar to XPath //text()
# find returns the first tag that matches
# print(soup.find('div', class_="song"))
# print(soup.find('div', class_="tang"))
# Find all matching tags
# print(soup.find_all('a'))
# print(soup.find_all('a', limit=2))  # only the first two
# print(soup.select('.song'))  # CSS selectors can be used
# Hierarchical selectors with select
# print(soup.select('div > img'))  # returns a list
print(soup.select('div li'))  # '>' matches direct children; a space matches descendants at any depth
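
The operations above can be tried without soup_text.html, which the post does not show. The sketch below is only an illustration: the HTML string and its tag contents are made up, and it uses Python's built-in "html.parser" instead of lxml so that nothing beyond bs4 is required. It also shows one detail worth knowing: .string gives None when a tag has more than one child, while .text still joins all the text.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for soup_text.html (the real file is not shown).
HTML = """
<html><head><title>demo</title></head>
<body>
<div class="song">
  <p>first paragraph</p>
  <a href="http://example.com/1" target="_blank">link one</a>
  <img src="a.png">
</div>
<div class="tang">
  <ul>
    <li><a href="http://example.com/2">link two</a></li>
    <li><b>bold</b> and plain</li>
  </ul>
</div>
</body></html>
"""

# Assumption: "html.parser" is swapped in for the post's 'lxml' parser.
soup = BeautifulSoup(HTML, "html.parser")

# Tag-name access returns only the first matching tag.
first_link = soup.a
attrs = first_link.attrs          # dict of every attribute on the tag
target = attrs["target"]          # a single attribute value

# .string works when the tag has exactly one child ...
direct = soup.p.string
# ... but is None when there are several children; .text joins everything.
mixed_li = soup.find_all("li")[1]
mixed_string = mixed_li.string
mixed_text = mixed_li.text

# find gives the first match, find_all gives a list (optionally limited).
tang_div = soup.find("div", class_="tang")
all_links = soup.find_all("a")
one_link = soup.find_all("a", limit=1)

# '>' selects direct children only; a space selects descendants at any depth.
direct_li = soup.select("div > li")   # li sits under ul, so nothing matches
nested_li = soup.select("div li")     # both li elements match

print(target, direct, mixed_string, mixed_text)
print(len(all_links), len(one_link), len(direct_li), len(nested_li))
```

Because the li elements sit inside a ul rather than directly under the div, 'div > li' finds nothing here while 'div li' finds both, which is the difference the last comment in the original code describes.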

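Data Parsing: BeautifulSoup

bs4 data parsing: instantiate a BeautifulSoup object and load the page source into it, then locate tags and extract data by calling the object's attributes and methods. Install with pip install bs4 and pip install lxml (the parser). The following describes one approach, loading from a local HTML document into beautifulsou...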
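
As a rough illustration of instantiating a BeautifulSoup object, the sketch below builds one from an open local file and one from a markup string. It is not taken from the truncated excerpt above: the markup, the file name page.html, and the use of the built-in "html.parser" (rather than lxml) are all assumptions made for the example.

```python
import os
import tempfile

from bs4 import BeautifulSoup

# Hypothetical markup used for both variants.
markup = "<html><body><p class='greeting'>hello</p></body></html>"

# Variant 1: pass the markup string directly.
soup_from_string = BeautifulSoup(markup, "html.parser")

# Variant 2: pass an open file handle for a local HTML document.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "page.html")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as fp:
        fp.write(markup)
    with open(path, encoding="utf-8") as fp:
        # The document is parsed here, so the soup stays usable
        # after the file is closed.
        soup_from_file = BeautifulSoup(fp, "html.parser")

print(soup_from_string.p.text)
print(soup_from_file.p["class"])
```

Both objects expose the same attributes and methods afterwards, so the choice only depends on where the page source comes from.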

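Getting Started with BeautifulSoup for Parsing XML Files

Using the open API of Lashou (拉手網), fetch the day's data for a given city, print the response, and get each shop's short title and purchase count: print each.data.display.shorttitle.text,each.data.display.bought.text if name main fetch It does not have, like etree.elem...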

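Parsing HTML with BeautifulSoup

Tags can be retrieved by their CSS attributes. For example, with the two tags below, all the red text on a page can be grabbed via the class attribute, as follows: from urllib.request import urlopen from bs4 import beautifulsoup html urlopen bsobj beautifulsou...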
