Python Pandas: Updating a Database with the Latest Data

In data analysis we often use dataframe.to_sql to write computed or analyzed data into a database. A plain insert is easy to do; you only need to set one parameter:

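import pandas as pd
import sqlite3 as sq3

# db_file_name, table_name and dataframe are assumed to be defined earlier
conn = sq3.connect(db_file_name, detect_types=sq3.PARSE_DECLTYPES)
dataframe.to_sql(
    table_name,
    con=conn,
    if_exists='append',  # append the rows if the table already exists
    index=False
)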

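If the table already exists, the rows are appended to the end. But what if every row in our data has a unique key? Take updating user information as an example: what should we do when a user is already present in the table?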
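To make the problem concrete, here is a minimal sketch of what plain appending does to keyed data. The in-memory database, the "userinfo" table, and its id and name columns are assumptions made for illustration only.

```python
import sqlite3

import pandas as pd

# Assumption: an in-memory database and a hypothetical "userinfo" table.
conn = sqlite3.connect(":memory:")

first = pd.DataFrame({"id": [1, 2], "name": ["alice", "bob"]})
update = pd.DataFrame({"id": [2, 3], "name": ["bobby", "carol"]})

# The first write creates the table; the second appends to it.
first.to_sql("userinfo", conn, index=False)
update.to_sql("userinfo", conn, if_exists="append", index=False)

rows = pd.read_sql_query('SELECT id, name FROM "userinfo" ORDER BY id, name', conn)

# ids that now appear more than once in the table
duplicate_ids = rows.loc[rows["id"].duplicated(), "id"].tolist()
conn.close()
```

After the append, the user with id 2 appears twice: the old row was not updated, a second row was simply added next to it.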

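A naive approach is to read all the data out, modify it, and then write everything back in one go. With a large amount of data this becomes very slow.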
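The naive approach might look roughly like the sketch below. It is not taken from the original post: the in-memory database, the "userinfo" table, and the id key column are hypothetical, and the merge is done with pandas concat and drop_duplicates.

```python
import sqlite3

import pandas as pd

# Assumption: an in-memory database with a hypothetical "userinfo" table keyed by id.
conn = sqlite3.connect(":memory:")
pd.DataFrame({"id": [1, 2], "name": ["alice", "bob"]}).to_sql(
    "userinfo", conn, index=False
)

new_rows = pd.DataFrame({"id": [2, 3], "name": ["bobby", "carol"]})

# 1. Read the whole table into memory.
existing = pd.read_sql_query('SELECT * FROM "userinfo"', conn)

# 2. Merge in pandas: for a repeated id, keep the newest row.
merged = (
    pd.concat([existing, new_rows], ignore_index=True)
    .drop_duplicates(subset="id", keep="last")
    .sort_values("id")
    .reset_index(drop=True)
)

# 3. Rewrite the entire table.
merged.to_sql("userinfo", conn, if_exists="replace", index=False)

result = pd.read_sql_query('SELECT id, name FROM "userinfo" ORDER BY id', conn)
conn.close()
```

Every update reads and rewrites the full table, so the cost grows with the table size rather than with the number of changed rows.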

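Beyond that, the only option is to rely on a feature of the database itself: REPLACE.

The idea is to first write the users to be updated, together with the new users, into a temporary table, and then use a REPLACE statement to replace them in or add them to the user table.

The approach is as follows: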

user_table_name = 'userinfo'
# dft is a DataFrame holding the new and updated rows
dft.to_sql('temp', conn, if_exists='replace', index=False)  # write the new data to the temporary table "temp"
cursor = conn.cursor()
# Statement that replaces existing rows or inserts new ones
args1 = f"""REPLACE INTO "{user_table_name}"
SELECT * FROM "temp"
"""
cursor.execute(args1)
args2 = """DROP TABLE IF EXISTS "temp" """  # drop the temporary table
cursor.execute(args2)
cursor.close()
conn.commit()

This way the new data is updated into the table.
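One caveat worth noting: in SQLite, REPLACE only overwrites an existing row when the new row conflicts with a PRIMARY KEY or UNIQUE constraint on the target table; without such a constraint it behaves like a plain insert. The sketch below runs the whole flow end to end under that assumption. The in-memory database, the table schema, and the sample rows are hypothetical.

```python
import sqlite3

import pandas as pd

# Assumption: an in-memory database; "userinfo" declares id as its PRIMARY KEY
# so that REPLACE has a uniqueness conflict to resolve.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "userinfo" (id INTEGER PRIMARY KEY, name TEXT)')
conn.executemany(
    'INSERT INTO "userinfo" (id, name) VALUES (?, ?)',
    [(1, "alice"), (2, "bob")],
)

# New data: id 2 already exists (update), id 3 is new (insert).
dft = pd.DataFrame({"id": [2, 3], "name": ["bobby", "carol"]})
dft.to_sql("temp", conn, if_exists="replace", index=False)

# Replace-or-insert from the temporary table, then clean up.
conn.execute('REPLACE INTO "userinfo" (id, name) SELECT id, name FROM "temp"')
conn.execute('DROP TABLE IF EXISTS "temp"')
conn.commit()

final_rows = conn.execute(
    'SELECT id, name FROM "userinfo" ORDER BY id'
).fetchall()
conn.close()
```

Listing the columns explicitly in both the REPLACE and the SELECT avoids depending on the two tables having the same column order, which SELECT * silently assumes.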
