Learning Python Web Scraping: Using pdfkit

1. What the pdfkit module is for

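2. How to use pdfkit

In short, there are three ways to call it. In each case the first argument is the HTML to convert (a URL, a file path, or a string of markup), and the second argument is the name of the PDF file to save locally.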

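import pdfkit

pdfkit.from_url('', 'out1.pdf')           # convert a web page; the URL is left blank here
pdfkit.from_file('123.html', 'out2.pdf')  # convert a local HTML file
pdfkit.from_string('hello!', 'out3.pdf')  # convert a string of HTML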
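
pdfkit is a wrapper around the wkhtmltopdf command-line tool, which has to be installed separately, so a call can fail even when the Python package imports cleanly. The sketch below shows one way to check for the binary and to pass a few wkhtmltopdf options. The helper names pdf_options and html_to_pdf are made up for this illustration and are not part of pdfkit; the option values are only examples.

```python
import shutil


def pdf_options(page_size="A4"):
    """Build a wkhtmltopdf options dict (hypothetical helper, not a pdfkit API).

    Keys are wkhtmltopdf flags without the leading dashes; an empty string
    value means the flag takes no argument.
    """
    return {
        "page-size": page_size,
        "encoding": "UTF-8",
        "quiet": "",
    }


def html_to_pdf(html, out_path):
    """Convert an HTML string to a PDF file (hypothetical helper)."""
    import pdfkit  # imported here so pdf_options() works without pdfkit installed

    # Look for the wkhtmltopdf executable on PATH before calling pdfkit.
    wkhtmltopdf_path = shutil.which("wkhtmltopdf")
    if wkhtmltopdf_path is None:
        raise RuntimeError("wkhtmltopdf was not found on PATH")

    config = pdfkit.configuration(wkhtmltopdf=wkhtmltopdf_path)
    return pdfkit.from_string(
        html, out_path, options=pdf_options(), configuration=config
    )
```

With wkhtmltopdf installed, html_to_pdf('hello!', 'out3.pdf') would produce the same file as the third call above. from_url and from_file accept the same options and configuration keyword arguments.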

3. Example

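import requests
from bs4 import BeautifulSoup
import pdfkit

# Minimal page skeleton (assumed; the original template body is not shown).
# The {content} placeholder is required by the format() call below.
html_template = """
<!DOCTYPE html>
<html>
<head><meta charset="UTF-8"></head>
<body>
{content}
</body>
</html>
"""
res = requests.get('')  # URL of the page to scrape; left blank here
soup = BeautifulSoup(res.content, 'html.parser')
# Take the first element that matches the article selector
article = soup.select('.article.article_16')[0]
article = str(article)
html = html_template.format(content=article)
html = html.encode('utf-8')
with open('news_china.html', 'wb') as f:
    f.write(html)
pdfkit.from_file('news_china.html', 'news.pdf')
print('succeed')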
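
The same flow can be written without the intermediate HTML file by handing the wrapped markup straight to pdfkit.from_string. This is a sketch under assumptions, not the method used above: wrap_fragment and page_fragment_to_pdf are hypothetical helper names, the template is a minimal stand-in, and the URL and selector are whatever the caller supplies.

```python
# Minimal stand-in template; the {content} placeholder is filled by format().
HTML_TEMPLATE = """<!DOCTYPE html>
<html>
<head><meta charset="UTF-8"></head>
<body>
{content}
</body>
</html>
"""


def wrap_fragment(fragment):
    """Wrap an HTML fragment in a full page (hypothetical helper)."""
    return HTML_TEMPLATE.format(content=fragment)


def page_fragment_to_pdf(url, selector, out_path):
    """Fetch a page, pick one element by CSS selector, save it as a PDF."""
    # Third-party imports are kept local so wrap_fragment() has no dependencies.
    import requests
    from bs4 import BeautifulSoup
    import pdfkit

    res = requests.get(url, timeout=10)
    res.raise_for_status()  # stop early on HTTP errors

    soup = BeautifulSoup(res.content, "html.parser")
    node = soup.select_one(selector)  # None when nothing matches
    if node is None:
        raise ValueError("no element matches selector: " + selector)

    return pdfkit.from_string(wrap_fragment(str(node)), out_path)
```

Using select_one with an explicit check avoids the IndexError that soup.select(...)[0] raises when the selector matches nothing. If the template ever contains literal braces, for example in inline CSS, they have to be doubled ({{ and }}) so that format() does not treat them as placeholders.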

Keep the following points in mind:
