pandas Sorting and Statistics

Reference: "Python for Data Analysis"

sort_index()

Sorts an object by its row or column index and returns a new, sorted object.

In [1]: import pandas as pd
In [2]: from pandas import DataFrame, Series
In [3]: obj = Series(range(4), index=['d', 'a', 'b', 'c'])
In [4]: obj
Out[4]:
d    0
a    1
b    2
c    3
dtype: int64
In [5]: obj.sort_index()
Out[5]:
a    1
b    2
c    3
d    0
dtype: int64
In [6]: import numpy as np
In [8]: frame = DataFrame(np.arange(8).reshape((2, 4)), index=['three', 'one'],
   ...:                   columns=['d', 'a', 'b', 'c'])
In [9]: frame
Out[9]:
       d  a  b  c
three  0  1  2  3
one    4  5  6  7
In [10]: frame.sort_index()
Out[10]:
       d  a  b  c
one    4  5  6  7
three  0  1  2  3
In [11]: frame.sort_index(axis=1)
Out[11]:
       a  b  c  d
three  1  2  3  0
one    5  6  7  4
In [12]: frame.sort_index(axis=1, ascending=False)
Out[12]:
       d  c  b  a
three  0  3  2  1
one    4  7  6  5

sort_values

Sorts a Series by its values. By default, any missing values are placed at the end of the Series.

In [18]: obj = Series([4, np.nan, 6, np.nan, -3, 2])
In [19]: obj
Out[19]:
0    4.0
1    NaN
2    6.0
3    NaN
4   -3.0
5    2.0
dtype: float64
In [21]: obj.sort_values()
Out[21]:
4   -3.0
5    2.0
0    4.0
2    6.0
1    NaN
3    NaN
dtype: float64
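The placement of missing values and the sort direction can both be changed. The following is a minimal sketch, assuming the `ascending` and `na_position` parameters of `Series.sort_values`; neither is used in the session above.

```python
import numpy as np
import pandas as pd

obj = pd.Series([4, np.nan, 6, np.nan, -3, 2])

# Descending order; NaN values are still placed last by default.
desc = obj.sort_values(ascending=False)

# Ascending order, but with the NaN values placed first.
nan_first = obj.sort_values(na_position='first')

print(desc)
print(nan_first)
```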

On a DataFrame, you can sort by the values in one or more columns. To do so, pass one or more column names to the by option:

In [16]: frame.sort_values(by='b')
Out[16]:
       d  a  b  c
three  0  1  2  3
one    4  5  6  7
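Sorting by several columns works the same way, with a list of names passed to `by`. The sketch below uses a small hypothetical frame (`frame2`, not the `frame` from the session) so that ties in the first column are visible.

```python
import pandas as pd

# Hypothetical data for illustration only.
frame2 = pd.DataFrame({'b': [4, 7, -3, 2], 'a': [0, 1, 0, 1]})

# Sort by column 'a' first, then break ties using column 'b'.
result = frame2.sort_values(by=['a', 'b'])

print(result)
```

Rows with equal values in `a` are ordered by `b`, so the order of names in the list determines sort priority.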

sum, mean, max

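Options for reduction methods

Option    Description
axis      The axis to reduce over: 0 for DataFrame rows, 1 for columns.
skipna    Exclude missing values; defaults to True.
level     If the axis is hierarchically indexed (MultiIndex), reduce grouped by level.

Note: the level option was deprecated in pandas 1.3 and removed in pandas 2.0; in current versions, use groupby(level=...) instead.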
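The effect of `axis` and `skipna` is easiest to see on a frame that contains missing values. This is a minimal sketch on hypothetical data; the frame `df` and its values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data with missing values, for illustration only.
df = pd.DataFrame([[1.4, np.nan], [7.1, -4.5], [np.nan, np.nan], [0.75, -1.3]],
                  index=['a', 'b', 'c', 'd'], columns=['one', 'two'])

col_sums = df.sum()                         # axis=0 (default): one value per column
row_sums = df.sum(axis=1)                   # axis=1: one value per row
row_means = df.mean(axis=1, skipna=False)   # NaN if any value in the row is missing

print(col_sums)
print(row_sums)
print(row_means)
```

With the default `skipna=True`, missing values are ignored, so the sum of the all-NaN row `c` is 0.0 rather than NaN.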

idxmin, idxmax: the index labels at which the minimum or maximum value is attained.

cumsum: the cumulative sum of the values.
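As a brief sketch of these methods on a hypothetical Series (the data and variable names are illustrative):

```python
import pandas as pd

# Hypothetical data for illustration only.
s = pd.Series([3, -1, 4, 2], index=['w', 'x', 'y', 'z'])

min_label = s.idxmin()   # index label of the smallest value
max_label = s.idxmax()   # index label of the largest value
running = s.cumsum()     # running total, with the same index as s

print(min_label, max_label)
print(running)
```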

df.describe(): produces summary statistics; the statistics reported differ for numeric and non-numeric data.
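The difference can be seen by calling `describe` on a numeric Series and on a Series of strings. This is a small sketch with made-up data.

```python
import pandas as pd

# Hypothetical data for illustration only.
numeric = pd.Series([1, 2, 3, 4])
text = pd.Series(['a', 'a', 'b', 'c'])

num_summary = numeric.describe()   # count, mean, std, min, quartiles, max
txt_summary = text.describe()      # count, unique, top, freq

print(num_summary)
print(txt_summary)
```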

corr(): correlation coefficient

cov(): covariance
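A minimal sketch of both methods on two hypothetical Series, where `y` is constructed as an exact linear function of `x` so the correlation is easy to check:

```python
import pandas as pd

# Hypothetical data: y is an exact linear function of x.
x = pd.Series([1.0, 2.0, 3.0, 4.0])
y = 2 * x + 1

r = x.corr(y)   # Pearson correlation coefficient (the default method)
c = x.cov(y)    # sample covariance, normalized by N - 1

print(r, c)
```

Called on a DataFrame without arguments, `corr()` and `cov()` return the full pairwise matrix over the columns.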

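unique: returns an array of the unique values in a Series.

isin: performs a vectorized set-membership check.

value_counts: computes how often each value occurs in a Series (occurrence counts by default, not probabilities; pass normalize=True for relative frequencies).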
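The three methods can be sketched together on a single hypothetical Series (the data is made up for illustration):

```python
import pandas as pd

# Hypothetical data for illustration only.
s = pd.Series(['c', 'a', 'd', 'a', 'a', 'b', 'b', 'c', 'c'])

uniques = s.unique()                      # distinct values, in order of first appearance
mask = s.isin(['b', 'c'])                 # boolean Series: True where the value is 'b' or 'c'
counts = s.value_counts()                 # number of occurrences of each value
shares = s.value_counts(normalize=True)   # relative frequencies that sum to 1

print(uniques)
print(s[mask])
print(counts)
print(shares)
```

The boolean result of `isin` is typically used to filter the Series, as in `s[mask]`.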
