Scraping Data from Xueqiu (雪球網) and Storing It in a Database

2021-08-23 14:36:26 | Words: 1617 | Reads: 1284

First, a small helper class that wraps the pymysql connection and commits each write:

```python
from urllib import request
import json
import pymysql


class mysql_connect(object):
    # constructor: open the connection and get a cursor
    def __init__(self):
        self.db = pymysql.connect(host='127.0.0.1', user='root', password='yao123',
                                  port=3306, database='pachong')
        self.cursor = self.db.cursor()

    # execute a write statement and commit it
    def mysql_do(self, sql):
        self.cursor.execute(sql)
        self.db.commit()

    # destructor: release the cursor and the connection
    def __del__(self):
        self.cursor.close()
        self.db.close()
```
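The scraper below builds its SQL by string formatting, which is exactly what breaks when `description` contains quotes. A safer pattern is a parameterized query, where the driver escapes the values itself. A minimal sketch using the standard-library sqlite3 module as a stand-in for pymysql (so it runs without a MySQL server; with pymysql the placeholder would be `%s` instead of `?`, and the table layout here is assumed to mirror the `snowball` table used below):

```python
import sqlite3

# in-memory database standing in for the MySQL 'pachong' database
db = sqlite3.connect(':memory:')
cursor = db.cursor()
cursor.execute('CREATE TABLE snowball (id INTEGER, title TEXT, description TEXT, target TEXT)')

# the driver escapes the bound values, so quotes in description are harmless
row = (101, 'Example post', 'a "quoted" description', '/link/101')
cursor.execute('INSERT INTO snowball VALUES (?, ?, ?, ?)', row)
db.commit()

print(cursor.execute('SELECT description FROM snowball').fetchone()[0])
```

With placeholders, there is no need to replace `description` with `None` the way the original code does.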

The scraping function itself requests one page at a time, parses the JSON response, inserts each entry, then recurses with the last entry's id as the next `max_id`:

```python
url = ''  # Xueqiu list API URL (left blank in the source post)


# start scraping from the first page by default
def xueqiu(number=1, max_id=None, count=None):
    if max_id is None:
        full_url = url.format(-1, 10)
    else:
        full_url = url.format(max_id, count)
        count = 15
    headers = {}  # request headers (User-Agent, Cookie, ...)
    # maximum page count
    if number <= 4:
        print('Page %d:' % number)
        number += 1
        req = request.Request(full_url, headers=headers)
        response = request.urlopen(req)
        result = response.read().decode('utf-8')
        # parse the JSON response
        j = json.loads(result)
        m = mysql_connect()
        for i in j['list']:
            detail = json.loads(i['data'])
            print(i['id'], detail['title'])
            description = detail['description']
            # special characters in description would escape out of the formatted
            # SQL statement and only the first few rows would go through,
            # so None is stored in its place
            sql = 'insert into snowball values ("{}","{}","{}","{}");'.format(
                detail['id'], detail['title'], None, detail['target'])
            m.mysql_do(sql)
        print(j['list'][0])
        xueqiu(number, j['list'][-1]['id'], count)


if __name__ == '__main__':
    xueqiu(1, -1, 10)
```
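The recursive call at the end of `xueqiu` is really a pagination loop: each response's last id becomes the `max_id` of the next request, and each item's `data` field is itself a JSON string that must be decoded a second time. A self-contained sketch of that flow, with a stubbed `fetch_page` in place of the real Xueqiu request (the field names mirror the ones above, but the pages themselves are made up):

```python
import json

# stand-in for the HTTP request: fake "pages" keyed by max_id
FAKE_PAGES = {
    -1: {'list': [{'id': 3, 'data': json.dumps({'title': 'post 3'})},
                  {'id': 2, 'data': json.dumps({'title': 'post 2'})}]},
    2:  {'list': [{'id': 1, 'data': json.dumps({'title': 'post 1'})}]},
    1:  {'list': []},  # an empty page ends the crawl
}


def fetch_page(max_id):
    return FAKE_PAGES[max_id]


def crawl(max_id=-1, max_pages=4):
    titles = []
    for _ in range(max_pages):                 # page cap, like number <= 4 above
        page = fetch_page(max_id)
        if not page['list']:
            break
        for item in page['list']:
            detail = json.loads(item['data'])  # 'data' is a nested JSON string
            titles.append(detail['title'])
        max_id = page['list'][-1]['id']        # last id drives the next request
    return titles


print(crawl())  # → ['post 3', 'post 2', 'post 1']
```

Writing the loop iteratively instead of recursively also avoids hitting Python's recursion limit if the page cap is ever raised.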
