22 Core Pandas Operations

Read a CSV dataset

pd.read_csv("csv_file")
# The older pd.DataFrame.from_csv("csv_file") was removed in pandas 1.0; use pd.read_csv instead

Read an Excel dataset

pd.read_excel("excel_file")

Write a DataFrame directly to a CSV file

df.to_csv("data.csv", sep=",", index=False)
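As a quick check of the read and write calls above, here is a minimal round-trip sketch. The sample data and the in-memory buffer (used in place of a file on disk) are assumptions made for illustration.

```python
import io

import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({"name": ["a", "b"], "size": [1, 2]})

# Write to an in-memory text buffer instead of a file on disk
buffer = io.StringIO()
df.to_csv(buffer, sep=",", index=False)

# Rewind the buffer and read the same CSV text back
buffer.seek(0)
restored = pd.read_csv(buffer)
```

With index=False the row index is not written, so the restored frame has only the name and size columns.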

Basic dataset feature information

df.info()

Basic dataset statistics

df.describe()

Print a DataFrame as a table

from tabulate import tabulate  # third-party package

print(tabulate(print_table, headers=headers))

List all column names

df.columns

Drop missing data

df.dropna(axis=0, how='any')
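The how argument decides how strict the drop is. A small sketch with made-up data (the column names and values are assumptions) compares 'any' with 'all':

```python
import numpy as np
import pandas as pd

# Hypothetical data: row 0 is complete, row 1 is partly missing, row 2 is fully missing
df = pd.DataFrame({
    "name": ["a", "b", None],
    "size": [1.0, np.nan, np.nan],
})

# how="any" drops a row if at least one value is missing
any_missing = df.dropna(axis=0, how="any")

# how="all" drops a row only if every value is missing
all_missing = df.dropna(axis=0, how="all")
```

Here any_missing keeps only row 0, while all_missing keeps rows 0 and 1.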

Replace missing data

# Supply the value to look for and the value to put in its place
df.replace(to_replace=None, value=None)

Check for NaN values

pd.isnull(object)
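pd.isnull accepts scalars as well as whole DataFrames. The sketch below, on assumed sample data, builds a boolean mask and counts the missing values per column:

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in each column
df = pd.DataFrame({"name": ["a", None], "size": [1.0, np.nan]})

# Element-wise check: True wherever a value is missing
mask = pd.isnull(df)

# Summing the boolean mask counts the missing values in each column
missing_per_column = mask.sum()
```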

Drop a feature

df.drop('feature_variable_name', axis=1)
axis=1 refers to columns
axis=0 refers to rows
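To make the axis argument concrete, here is a short sketch on assumed sample data that drops one column and one row. Note that drop returns a new DataFrame and leaves the original untouched unless inplace=True is passed.

```python
import pandas as pd

# Hypothetical data, used only for illustration
df = pd.DataFrame({"name": ["a", "b"], "size": [1, 2], "height": [10, 20]})

# axis=1: drop the column labelled "height"
without_column = df.drop("height", axis=1)

# axis=0: drop the row whose index label is 0
without_row = df.drop(0, axis=0)
```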

Convert an object-typed column to float

pd.to_numeric(df["feature_name"], errors='coerce')
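With errors='coerce', values that cannot be parsed become NaN instead of raising an exception. A minimal sketch, assuming a string column that contains one unparseable entry:

```python
import pandas as pd

# Hypothetical column of strings; "n/a" cannot be parsed as a number
df = pd.DataFrame({"feature_name": ["1.5", "2", "n/a"]})

# Unparseable entries are turned into NaN rather than raising an error
df["feature_name"] = pd.to_numeric(df["feature_name"], errors="coerce")
```

The column ends up with a float dtype, because NaN is itself a float.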

Convert a DataFrame to a NumPy array

df.to_numpy()
# df.as_matrix() was removed in pandas 1.0; use df.to_numpy() instead

Get the first n rows of a DataFrame

df.head(n)

Get data by feature name

df.loc[:, feature_name]
# df.loc[feature_name] on its own would select a row by index label

Apply a function to a DataFrame

def multiply(x):
    return x * 2

df["height"].apply(multiply)
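apply returns a new Series rather than modifying the column in place, so the result has to be assigned back if it should persist. A small sketch, with assumed sample heights:

```python
import pandas as pd


def multiply(x):
    """Double a single value."""
    return x * 2


# Hypothetical data, used only for illustration
df = pd.DataFrame({"height": [150, 160]})

# The function is called once per element; assign the result back to keep it
df["height"] = df["height"].apply(multiply)
```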

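Rename columns

# "old_name" and "new_name" are placeholders; the mapping was missing from the original snippet
df.rename(columns={"old_name": "new_name"}, inplace=True)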
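The columns argument takes a mapping from current names to new names. In the sketch below the column names "sz" and "size" are assumptions chosen for illustration:

```python
import pandas as pd

# Hypothetical data with an abbreviated column name
df = pd.DataFrame({"name": ["a"], "sz": [1]})

# Without inplace=True, rename returns a new DataFrame and leaves df unchanged
renamed = df.rename(columns={"sz": "size"})
```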

Select a sub-DataFrame

new_df = df[["name", "size"]]

Summarize data

df.sum()
df.min()
df.max()
df.idxmin()
df.idxmax()
df.mean()
df.median()
df.corr()
df["size"].median()
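idxmin and idxmax return index labels rather than values, which is easy to overlook. A brief sketch on assumed numeric data with string index labels:

```python
import pandas as pd

# Hypothetical numeric data with string index labels
df = pd.DataFrame(
    {"size": [3, 1, 5], "height": [10, 30, 20]},
    index=["x", "y", "z"],
)

smallest_label = df["size"].idxmin()  # index label of the smallest size
largest_label = df["size"].idxmax()   # index label of the largest size
median_size = df["size"].median()     # median of a single column
column_totals = df.sum()              # one total per column
```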

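Sorting

# DataFrame.sort_values requires the column to sort by; "size" is used here as an example
df.sort_values(by="size", ascending=False)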
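A minimal sorting sketch, assuming a DataFrame with a "size" column. sort_values returns a sorted copy and keeps the original index labels attached to their rows.

```python
import pandas as pd

# Hypothetical data, used only for illustration
df = pd.DataFrame({"name": ["a", "b", "c"], "size": [2, 5, 1]})

# Sort rows by the "size" column, largest first
sorted_df = df.sort_values(by="size", ascending=False)
```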

Boolean indexing

df[df["size"] == 5]
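The comparison inside the brackets produces a boolean Series, and indexing with it keeps only the rows marked True. A sketch on assumed sample data:

```python
import pandas as pd

# Hypothetical data, used only for illustration
df = pd.DataFrame({"name": ["a", "b", "c"], "size": [5, 3, 5]})

# Step 1: the comparison yields one True/False value per row
mask = df["size"] == 5

# Step 2: indexing with the mask keeps only the rows where it is True
matches = df[mask]
```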

Select values by row and column

df.loc[[0], ['size']]
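loc is indexed with square brackets, and the shape of the result depends on whether lists or single labels are passed. A short sketch on assumed sample data:

```python
import pandas as pd

# Hypothetical data with the default integer index
df = pd.DataFrame({"name": ["a", "b"], "size": [1, 2]})

# Lists of labels return a DataFrame, here with one row and one column
sub = df.loc[[0], ["size"]]

# Single labels return the scalar value itself
value = df.loc[0, "size"]
```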
