Writing a Simple Novel-Scraping Crawler in Python 3 (Part 2)

import re

import requests
from bs4 import BeautifulSoup

# Fetch the whole page for a novel on Biquge (the URL is left blank here)
url = ''
response = requests.get(url)
html = response.text

# Extract the data
# NOTE: the HTML that should surround each capture group is missing here;
# supply the markup of the target page, since a bare '(.*?)' only matches ''
title = re.findall('(.*?)', html, re.S)[0]  # novel title
chapter_list_html = re.findall('(.*?)', html, re.S)[0]  # HTML of the whole chapter list
soup = BeautifulSoup(chapter_list_html, 'html.parser')  # parse the chapter list
chapter_list = soup.find_all('a')  # each <a> tag holds one chapter's link and title

'''Print the tag name, href, and text of each <a> tag
for link in chapter_list:
    print(link.name, link['href'], link.get_text())
'''

# Save everything to a file named after the novel
with open('%s.txt' % title, 'w', encoding='utf-8') as fb:
    for chapter in chapter_list:
        chapter_url = chapter['href']
        chapter_title = chapter.get_text()
        chapter_response = requests.get(chapter_url)
        chapter_html = chapter_response.text
        chapter_soup = BeautifulSoup(chapter_html, 'html.parser')
        chapter_content_list = chapter_soup.find_all('p')  # the novel text sits in <p> tags
        fb.write(chapter_title)  # chapter title
        fb.write('\n')
        for chapter_content in chapter_content_list:
            print(chapter_content.get_text())  # print to the console
            fb.write(chapter_content.get_text())  # write to the file
            fb.write('\n')
        fb.write('\n')
        fb.write('\n')
        # exit()  # scrape a single chapter first to check for errors; use exit() to test
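
The loop above passes chapter['href'] straight to requests.get, which only works when the chapter list contains absolute links. The post does not say which form the site uses, so treat the following as a minimal sketch under the assumption that some hrefs may be relative. The function name resolve_chapter_urls and the example.com addresses are hypothetical and only serve as an illustration.

```python
from urllib.parse import urljoin


def resolve_chapter_urls(index_url, hrefs):
    """Turn each chapter href into an absolute URL, relative to the index page."""
    return [urljoin(index_url, href) for href in hrefs]


# Hypothetical values, for illustration only
index_url = 'https://example.com/book/123/'
hrefs = [
    '456.html',                               # relative to the index page
    '/book/123/457.html',                     # relative to the site root
    'https://example.com/book/123/458.html',  # already absolute, left unchanged
]
chapter_urls = resolve_chapter_urls(index_url, hrefs)
for chapter_url in chapter_urls:
    print(chapter_url)
```

In the crawler this would mean calling urljoin(url, chapter['href']) before requests.get, which is harmless when the href is already absolute.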
