Simulating a Login and Scraping GitHub

The code that senior Cui provided raised errors when I ran it, so I modified and simplified it slightly.

This is an example from the book, so I won't explain it in detail.

import requests
from lxml import etree


class Login(object):
    def __init__(self):
        # Request headers; left empty here
        self.headers = {}
        # Login page URL
        self.login_url = ''
        # URL that the login form is POSTed to
        self.post_url = ''
        # A session keeps state and handles cookies automatically, so requests
        # to other pages stay logged in and can be scraped
        self.session = requests.session()

    def token(self):
        # Fetch the login page
        response = self.session.get(self.login_url, headers=self.headers)
        # Extract the authenticity_token we need from the page and return it
        selector = etree.HTML(response.text)
        token = selector.xpath('//input[@name="authenticity_token"]/@value')
        return token

    def login(self, email, password):
        # Form fields for the login request; left empty here
        post_data = {}
        # Simulate the login with a POST request
        response = self.session.post(self.post_url, data=post_data, headers=self.headers)
        # On success, print the page returned after login and save it to d:/github.txt
        if response.status_code == 200:
            print(response.text)
            with open('d:/github.txt', 'w', encoding='utf-8') as f:
                f.write(response.text)
        else:
            print("error!!!")


if __name__ == "__main__":
    login = Login()
    login.login(email='[email protected]', password='password')  # Enter your own account and password
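The form data sent with the POST request is not spelled out in the listing, so here is a minimal sketch of how it might be assembled. The field names `commit`, `authenticity_token`, `login` and `password`, the `"Sign in"` value, and both helper functions are assumptions for illustration; check the actual request in your browser's developer tools before relying on them. The sketch also handles the fact that lxml's `xpath()` returns a list rather than a single string.

```python
from typing import Dict, List, Union


def first_token(xpath_result: Union[str, List[str]]) -> str:
    """Return one token string from an XPath result.

    lxml's xpath() returns a list of matches, so the first match is used.
    """
    if isinstance(xpath_result, str):
        return xpath_result
    if not xpath_result:
        raise ValueError("authenticity_token not found on the login page")
    return xpath_result[0]


def build_login_form(token: Union[str, List[str]], email: str, password: str) -> Dict[str, str]:
    """Assemble the login form fields (all field names are assumptions)."""
    return {
        "commit": "Sign in",                       # assumed submit-button value
        "authenticity_token": first_token(token),  # token scraped from the login page
        "login": email,                            # assumed name of the account field
        "password": password,                      # assumed name of the password field
    }
```

With something like this in place, the dictionary returned by `build_login_form(self.token(), email, password)` could be passed as `data=` to `self.session.post(...)`.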
The output can also be saved in web page form and viewed in a browser.
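One way to do that, as a rough sketch: write the response text to a file with an `.html` extension and open it in a browser. The helper name `save_as_html` and the target path are hypothetical, not part of the original script.

```python
from pathlib import Path


def save_as_html(page_text: str, target: str) -> str:
    """Write page_text to target as UTF-8 and return a file:// URI for it."""
    path = Path(target).resolve()                   # as_uri() needs an absolute path
    path.parent.mkdir(parents=True, exist_ok=True)  # create the folder if it is missing
    path.write_text(page_text, encoding="utf-8")
    return path.as_uri()
```

The returned URI can then be handed to the standard library's `webbrowser.open(...)`, for example `webbrowser.open(save_as_html(response.text, 'd:/github.html'))`.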