Python Learning (2): Pandas (Part 1)

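pandas is a Python library for data analysis and processing, built on top of NumPy; its core structure is the DataFrame. This post walks through the basic pandas operations: reading a .csv file (pandas.read_csv("path")); viewing the first rows (.head()) or the last rows (.tail()); showing the column names, i.e. the header (.columns, which is an attribute and not a method); showing the size of the data, i.e. how many rows and columns it has (.shape); selecting one or several rows (.loc); selecting columns that match a condition; doing arithmetic between columns; getting the maximum of a column; and finally sorting the data (.sort_values("energ_kcal", inplace=True) sorts in ascending order, .sort_values("energ_kcal", inplace=True, ascending=False) in descending order).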
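Two of these points are easy to get backwards, so here is a minimal sketch on a small made-up DataFrame. The data and the `name` column are assumptions for illustration and do not come from food_info.csv. It shows that a .loc slice includes both endpoints and that sort_values sorts in ascending order unless ascending=False is passed.

```python
import pandas as pd

# Hypothetical toy data; only the column name energ_kcal mirrors the post.
df = pd.DataFrame({
    "name": ["a", "b", "c", "d", "e"],
    "energ_kcal": [717, 50, 902, 120, 300],
})

# .loc slices by index label and includes BOTH endpoints.
middle = df.loc[1:3]
print(len(middle))  # 3 rows: labels 1, 2 and 3

# sort_values sorts ascending by default; ascending=False reverses the order.
ascending = df.sort_values("energ_kcal")
descending = df.sort_values("energ_kcal", ascending=False)
print(ascending["energ_kcal"].tolist())   # [50, 120, 300, 717, 902]
print(descending["energ_kcal"].tolist())  # [902, 717, 300, 120, 50]
```

Leaving out inplace=True, as here, returns a new DataFrame and keeps the original untouched, which makes the two orderings easy to compare side by side.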

e:\python_code_test\food_info.csv

The code is as follows:

# pandas: a library for data analysis and processing
# pandas wraps many functions and is built on top of NumPy.
# Reading data: load a .csv file with pandas.read_csv('')
import pandas
food_info = pandas.read_csv('e:\\python_code_test\\food_info.csv')
print(type(food_info))  # DataFrame, the core pandas structure
print(food_info.dtypes)  # dtype of each column: int64, float64, object (typically used for strings)
# help(pandas.read_csv)  # prints the function's help text, useful for learning more about it
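# Take rows from the start or from the end
print(food_info.head())  # shows the first five rows as a table, including the header
print(food_info.tail())  # shows the last five rows [5 rows x 36 columns]
print(food_info.tail(3))  # [3 rows x 36 columns]
# Show the column names, i.e. the header
print(food_info.columns)
print(food_info.shape)  # (8618, 36): 8618 records, each with 36 attributes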
# Selecting rows: .loc looks rows up by index label
print(food_info.loc[0])  # the first row (index label 0)
# ndb_no 1001
# shrt_desc butter with salt
# water_(g) 15.87
# energ_kcal 717
# ……
#…………………………………………………………… Selecting data ……………………………………………………………………………………………………………
# Selecting rows with a slice
print(food_info.loc[3:6])  # full rows with index labels 3 through 6, both ends included: [3, 6]
# Selecting whole columns
get_col = food_info["water_(g)"]  # locate a column by its header name
print(get_col)  # name: water_(g), length: 8618, dtype: float64
# Or locate it through a variable
# col_name = "water_(g)"
# get_col = food_info[col_name]
two_col_name = ["energ_kcal", "water_(g)"]
get_two_col = food_info[two_col_name]
print(get_two_col)  # prints both columns [8618 rows x 2 columns]
#…………………………………………………………………… Selecting by condition ………………………………………………………………………………
# Find the columns in the .csv file whose names end with (g)
col_names = food_info.columns.tolist()  # store the column names as a list
gram_columns = []  # starts empty
# Collect the column names that end with (g) into gram_columns
for c in col_names:
    if c.endswith("(g)"):
        gram_columns.append(c)
gram_df = food_info[gram_columns]  # the columns whose names matched the condition
print(gram_df.head(3))
# water_(g) protein_(g) ... fa_mono_(g) fa_poly_(g)
# 0 15.87 0.85 ... 21.021 3.043
# 1 15.87 0.85 ... 23.426 3.012
# 2 0.24 0.28 ... 28.732 3.694
# …………………………………………………………… Arithmetic on columns ………………………………………………………………………
div_1000 = food_info["cholestrl_(mg)"] / 1000
print(div_1000)  # equivalent to converting mg to g
# Combining two columns: when the shapes match, the operation is applied element by element
water_energy = food_info["water_(g)"] * food_info["energ_kcal"]
# Add a new column named iron_(g)
print(food_info.shape)  # (8618, 36)
iron_gram = food_info["iron_(mg)"] / 1000
food_info["iron_(g)"] = iron_gram
print(food_info.shape)  # (8618, 37)
# Get the maximum value of a column
max_calories = food_info["energ_kcal"].max()
print(max_calories)  # 902
# Normalize by dividing each column by its maximum
normalized_calories = food_info["energ_kcal"] / max_calories
normalized_fat = food_info["lipid_tot_(g)"] / food_info["lipid_tot_(g)"].max()
food_info["normalized_fat"] = normalized_fat
print(food_info.shape)  # (8618, 38)
#…………………………………………………………………… Sorting …………………………………………………………………………………………
food_info.sort_values("energ_kcal", inplace=True)  # smallest to largest: ascending order, the default
print(food_info["energ_kcal"])
food_info.sort_values("energ_kcal", inplace=True, ascending=False)  # largest to smallest: descending order
print(food_info["energ_kcal"])
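As a follow-up to the filtering and sorting steps, the sketch below shows two variations on a made-up stand-in for food_info. The values are invented and only the column names mirror the post. The column filter is written as a list comprehension, and the sort is followed by reset_index, because sorting moves the rows but leaves each row's original index label attached to it.

```python
import pandas as pd

# Hypothetical stand-in for food_info; all values are made up.
food = pd.DataFrame({
    "shrt_desc": ["butter", "cheese", "oil"],
    "water_(g)": [15.87, 42.41, 0.0],
    "lipid_tot_(g)": [81.11, 28.74, 100.0],
    "iron_(mg)": [0.02, 0.31, 0.0],
    "energ_kcal": [717, 353, 902],
})

# Same column filter as the for loop, written as a list comprehension.
gram_columns = [c for c in food.columns if c.endswith("(g)")]
print(gram_columns)  # ['water_(g)', 'lipid_tot_(g)']

# Without inplace=True, sort_values returns a new DataFrame.
by_energy = food.sort_values("energ_kcal", ascending=False)
print(by_energy.index.tolist())  # [2, 0, 1]: the original labels travel with the rows

# reset_index(drop=True) renumbers the rows 0..n-1 in the sorted order.
renumbered = by_energy.reset_index(drop=True)
print(renumbered.loc[0, "shrt_desc"])  # oil
```

This matters when .loc is used after sorting: by_energy.loc[0] still returns the row that carried label 0 before the sort (butter here), while renumbered.loc[0] returns the row with the highest energ_kcal.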
