Common pandas features for data processing

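mean()
skipna=False can also be set; the default is True, which skips missing values
count(), min(), sum(), median()
quantile(q=0.75)  quantile
std(), var(), skew(), kurt()  standard deviation, variance, skewness, kurtosis
cumsum(), cumprod(), cummax(), cummin()  cumulative sum, product, maximum, minimum
Unique values: series.unique()
Value counts: series.value_counts(sort=False)  sort defaults to True
Membership: isin()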
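A minimal sketch of the statistics methods listed above. The Series s and t and their values are made up for illustration; they are not from the original notes.

```python
import pandas as pd

# Illustrative data with one missing value
s = pd.Series([1.0, 2.0, None, 4.0])

mean_skip = s.mean()               # NaN is skipped by default: (1 + 2 + 4) / 3
mean_keep = s.mean(skipna=False)   # NaN propagates, so the result is NaN
n_valid = s.count()                # counts non-missing values only
q75 = s.quantile(q=0.75)           # 75th percentile of the non-missing values
running_total = s.cumsum()         # the NaN position stays NaN, the sum continues after it

t = pd.Series(["a", "b", "a", "c"])
uniques = t.unique()                   # distinct values in order of appearance
counts = t.value_counts(sort=False)    # occurrences per value, not sorted by count
mask = t.isin(["a", "c"])              # boolean Series: is each element in the list?
```

The mask returned by isin() can be used directly for filtering, for example t[mask].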

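Access string methods through .str; missing (NA) values are skipped automatically and stay NA
s.str.count('a')  # count() requires a pattern
df['key1'].str.upper()  # works on a column (Series); a DataFrame itself has no attribute 'str'
df.columns.str.upper()
Common string methods: each operates on every element of the Series
1. lower, upper, len, startswith, endswith
2. strip(), replace(pat, repl, n=1)  # n is the maximum number of replacements, split() with parameters expand and n
3. String indexing: take the first n characters of each element, e.g. 'abcd' --> s.str[:2] --> 'ab'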
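The .str methods above can be sketched as follows. The sample strings and the column names key1 and key2 are assumptions chosen for illustration.

```python
import pandas as pd

# Illustrative data; the None entry shows how missing values are handled
s = pd.Series(["  abcd", "Bab ", None])
df = pd.DataFrame({"key1": ["a-b-c", "d-e-f"], "key2": [1, 2]})

upper = s.str.upper()                 # missing values stay missing
a_count = s.str.count("a")            # occurrences of the pattern in each element
cleaned = s.str.strip()               # remove leading and trailing whitespace
first_two = cleaned.str[:2]           # slice each string, not the Series
replaced = cleaned.str.replace("b", "X", n=1)   # replace at most one match per element

# expand=True returns a DataFrame; n=1 splits at most once
parts = df["key1"].str.split("-", expand=True, n=1)

# .str also works on an Index such as the column labels
cols = df.columns.str.upper()
```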

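pd.merge(left, right, how='inner', on=None,
left_on=None, right_on=None,
left_index=False, right_index=False,
sort=False, suffixes=('_x', '_y'), copy=True, indicator=False)
Parameter how: inner, outer, left, right
sort: sorts the result by the join keys in lexicographic (dictionary) order. The default is False; leaving it False can improve performance considerably
Another way to sort: call sort_index() or sort_values() on the result.
df.join() --> joins directly on the index
Parameter on: a key column of the calling DataFrame can also be specified, to be matched against the other frame's index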
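A small sketch of merge and join with the parameters described above. The frames left and right and the columns key, lval and rval are invented for illustration.

```python
import pandas as pd

# Illustrative frames that share the column "key"
left = pd.DataFrame({"key": ["b", "a", "c"], "lval": [1, 2, 3]})
right = pd.DataFrame({"key": ["a", "b", "d"], "rval": [10, 20, 30]})

# inner: only keys present in both frames
inner = pd.merge(left, right, on="key", how="inner")

# outer: union of keys; indicator=True adds a "_merge" column showing each row's origin
outer = pd.merge(left, right, on="key", how="outer", indicator=True)

# Sorting after the merge instead of passing sort=True
left_sorted = pd.merge(left, right, on="key", how="left").sort_values("key")

# join aligns on the index by default
joined = left.set_index("key").join(right.set_index("key"), how="inner")

# on= matches a column of the caller against the index of the other frame
joined_on = left.join(right.set_index("key"), on="key")
```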

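Concatenation - combine objects along an axis
pd.concat(objs, axis=0, join='outer', join_axes=None, ignore_index=False,
keys=None, levels=None, names=None, verify_integrity=False,
copy=True)
Parameter axis: 0 stacks rows, 1 places columns side by side
Parameter join: outer, inner
Parameter join_axes: specifies the index to align on (removed in pandas 1.0; reindex the result instead)
Parameter keys: with axis=0 the result gets a MultiIndex; with axis=1 the keys replace the column names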
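A minimal sketch of concat with the axis, join and keys parameters described above. All frames, Series and labels here are made up for illustration, and join_axes is left out because current pandas no longer accepts it.

```python
import pandas as pd

# Illustrative frames that share only column "b"
df1 = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
df2 = pd.DataFrame({"b": [5, 6], "c": [7, 8]})

# axis=0 stacks rows; join="outer" keeps all columns and fills gaps with NaN
rows_outer = pd.concat([df1, df2], axis=0, join="outer", ignore_index=True)

# join="inner" keeps only the columns common to every input
rows_inner = pd.concat([df1, df2], axis=0, join="inner", ignore_index=True)

# keys with axis=0 adds an outer index level, giving a MultiIndex
keyed = pd.concat([df1, df2], keys=["first", "second"])

# keys with axis=1 on Series become the column names
s1 = pd.Series([1, 2])
s2 = pd.Series([3, 4])
side_by_side = pd.concat([s1, s2], axis=1, keys=["x", "y"])
```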
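Patching: combine_first() (a method of Series/DataFrame, not a pd function)
df1.combine_first(df2)
Aligned on the index, missing values in df1 are filled with values from df2
If df2 has index labels that df1 lacks, those rows are added to the result
Overwriting: df1.update(df2) overwrites df1 in place with the non-NA values of df2, aligned on the index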
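The difference between patching and overwriting can be sketched like this. The frames and the column name v are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative frames: df1 has a gap at "b", df2 has an extra label "d"
df1 = pd.DataFrame({"v": [1.0, np.nan, 3.0]}, index=["a", "b", "c"])
df2 = pd.DataFrame({"v": [10.0, 20.0, 40.0]}, index=["a", "b", "d"])

# combine_first: df1 wins where it has a value; df2 only fills gaps
# and contributes rows whose labels df1 does not have
patched = df1.combine_first(df2)

# update: df2 wins wherever it has a non-NA value for a label in df1;
# it modifies the frame in place, returns None, and adds no new rows
updated = df1.copy()
updated.update(df2)
```

Here patched holds 1, 20, 3, 40 for labels a, b, c, d, while updated holds 10, 20, 3 for labels a, b, c.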

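Deduplication: .duplicated() and .drop_duplicates() work directly on both Series and DataFrame
Replacement: .replace() works directly on both Series and DataFrame (sr.str.replace() can replace a single character, such as the 'a' in 'bab')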
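A short sketch of deduplication and of the two kinds of replacement. The values and the column names k and v are made up for illustration.

```python
import pandas as pd

sr = pd.Series(["bab", "bab", "cat"])

dup_mask = sr.duplicated()        # True for every repeat of an earlier value
deduped = sr.drop_duplicates()    # keeps the first occurrence of each value

# Series.replace matches whole values
whole = sr.replace("bab", "xyz")

# Series.str.replace works inside each string
partial = sr.str.replace("a", "o")

# The same methods exist on DataFrame, where whole rows are compared
df = pd.DataFrame({"k": [1, 1, 2], "v": ["x", "x", "y"]})
df_deduped = df.drop_duplicates()
```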

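df.groupby(by=None, axis=0, level=None, as_index=True, sort=True, group_keys=True, squeeze=False, **kwargs)
(squeeze was removed in pandas 2.0)
Parameter level: for a single-level index use level=0, which groups rows that share the same index label
# A groupby object is iterable
list(df.groupby('x'))
# Extract one group after grouping
df.groupby('x').get_group('a')
df.groupby('x').size() # size of each group
# axis=1 groups columns instead of rows, for example by dtype (the axis argument is deprecated since pandas 2.1)
# A dict or a Series can be used as the grouping key
# Group by a function: df.groupby(len).sum()
# Aggregation methods on groups
first, last # first and last non-NaN value
sum, mean, median, count, min, std, prod # prod is the product
# Several functions at once: agg()
# Apply a function after grouping, then convert the result back to a DataFrame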
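A minimal sketch of the groupby operations noted above. The frame, the column names x and y, and the dict of group names are assumptions for illustration.

```python
import pandas as pd

df = pd.DataFrame({"x": ["a", "b", "a", "b", "a"], "y": [1, 2, 3, 4, 5]})
grouped = df.groupby("x")

# Iterating yields (group name, sub-DataFrame) pairs
names = [name for name, _ in grouped]

group_a = grouped.get_group("a")   # rows where x == "a"
sizes = grouped.size()             # number of rows per group
totals = grouped["y"].sum()        # one aggregate per group

# agg with a list applies several functions at once
stats = grouped["y"].agg(["mean", "max"])

labelled = pd.Series([1, 2, 3], index=["aa", "b", "cc"])

# A function is applied to each index label; here labels are grouped by length
by_len = labelled.groupby(len).sum()

# A dict maps index labels to group names
by_map = labelled.groupby({"aa": "g1", "b": "g1", "cc": "g2"}).sum()
```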

# .add_prefix('mean_'): adds a prefix to the column labels

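transform: the calculation is still done per group, but the result keeps the shape of the original data, so every row shows its own group's value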
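The contrast between an ordinary aggregation (with add_prefix) and transform can be sketched as follows. The frame and the new column name y_group_mean are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"x": ["a", "b", "a", "b"], "y": [1, 2, 3, 4]})

# Aggregation collapses each group to one row; add_prefix renames the columns
group_mean = df.groupby("x")[["y"]].mean().add_prefix("mean_")

# transform returns one value per original row, aligned to df's index,
# so it can be assigned straight back as a new column
df["y_group_mean"] = df.groupby("x")["y"].transform("mean")
```

group_mean has one row per group with a column named mean_y, while y_group_mean repeats each group's mean on every row of that group.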
