Installing Beautiful Soup in Python


If you are using a recent version of Debian or Ubuntu, you can install Beautiful Soup through the system package manager:

$ apt-get install python-bs4

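Beautiful Soup 4 is published on PyPI, so if you cannot install it with the system package manager, you can use easy_install or pip instead. The package name is beautifulsoup4, and the same package works on both Python 2 and Python 3.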

$ easy_install beautifulsoup4

$ pip install beautifulsoup4

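(PyPI also has a package named beautifulsoup, but that is probably not the one you want. It is the Beautiful Soup 3 release, and it remains available because many projects still use BS3. If you are writing a new project, install beautifulsoup4.)

If neither easy_install nor pip is available, you can download the Beautiful Soup 4 source tarball and install it with setup.py:

$ python setup.py install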

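If none of the installation methods above work, Beautiful Soup's license allows you to package the bs4 source code with your project, so you can use it without installing anything.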

The author develops Beautiful Soup under Python 2.7 and Python 3.2. In theory, Beautiful Soup should work with all current Python versions.

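Beautiful Soup supports the HTML parser in Python's standard library, and it also supports several third-party parsers. One of them is lxml. Depending on your operating system, you can install lxml with one of the following commands: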

$ apt-get install python-lxml

$ easy_install lxml

$ pip install lxml

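Another option is html5lib, a parser implemented in pure Python that parses pages the same way a web browser does. You can install html5lib with one of the following commands: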

$ apt-get install python-html5lib

$ easy_install html5lib

$ pip install html5lib
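After running the commands above, it can help to confirm which packages Python can actually see. The following is a minimal sketch using only the standard library (Python 3.8 or later, for importlib.metadata). The helper name installed_version is made up for illustration and is not part of Beautiful Soup.

```python
from importlib import metadata, util


def installed_version(dist_name, module_name):
    """Return the installed version of a package, or None if it is missing.

    dist_name is the name used with pip (e.g. "beautifulsoup4");
    module_name is the name used with import (e.g. "bs4").
    Hypothetical helper, written for this example only.
    """
    # find_spec returns None when the top-level module cannot be imported
    if util.find_spec(module_name) is None:
        return None
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # The module is importable but has no install metadata,
        # e.g. bs4 copied directly into the project directory
        return "unknown"


if __name__ == "__main__":
    for dist, module in [("beautifulsoup4", "bs4"),
                         ("lxml", "lxml"),
                         ("html5lib", "html5lib")]:
        print(dist, "->", installed_version(dist, module))
```

A result of None means the package still needs to be installed. A result of "unknown" is what this sketch reports when bs4 has been bundled with the project rather than installed.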

To parse an ordinary web page (HTML), the following is enough:

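from bs4 import BeautifulSoup
from urllib.request import urlopen  # Python 3 counterpart of urllib2

# Placeholder: put the URL of the page to fetch here
url_header = "******"

# Download the raw HTML of the page
webpage = urlopen(url_header).read()

# Parse with Python's built-in parser and print the indented tree
soup = BeautifulSoup(webpage, "html.parser")
print(soup.prettify())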

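Parsing shtml pages is different. Beautiful Soup supports three parsers: lxml, html5lib, and html.parser (the HTMLParser from the standard library). Of these, only html5lib can parse shtml pages, so you need to pass one extra argument when creating the BeautifulSoup object: soup = BeautifulSoup(webpage, "html5lib"). Otherwise, the content inside the tags of an shtml page will not be parsed.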
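The parser argument described above can be wrapped in a small helper that falls back to the built-in parser when html5lib is not installed. This is only a sketch: make_soup is a hypothetical name, and the inline markup stands in for a downloaded page. FeatureNotFound is the exception bs4 raises when the requested parser is unavailable.

```python
from bs4 import BeautifulSoup, FeatureNotFound


def make_soup(markup, preferred="html5lib"):
    """Parse markup with the preferred parser, falling back to html.parser.

    Hypothetical helper for illustration; not part of Beautiful Soup.
    """
    try:
        return BeautifulSoup(markup, preferred)
    except FeatureNotFound:
        # The preferred parser is not installed, so use the standard-library one
        return BeautifulSoup(markup, "html.parser")


if __name__ == "__main__":
    # Stand-in for the bytes returned by urlopen(...).read()
    page = "<html><body><p>hello</p></body></html>"
    soup = make_soup(page)
    print(soup.p.get_text())
```

Falling back silently is a design choice that suits quick scripts. For shtml pages, where the post reports that only html5lib works, it may be better to let the exception propagate so the missing parser is noticed.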

For more details, see the official Beautiful Soup 4.2.0 documentation.
