Pandas: A Basic Introduction

This article is a basic introduction to pandas.

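If we compare them with Python's lists and dictionaries, NumPy is the list-like one (values are addressed by integer position, with no labels), while pandas is the dictionary-like one (values are addressed by label).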

pandas is built on top of NumPy and makes NumPy-centric applications simpler.
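The list-versus-dictionary comparison can be sketched as follows. This is only an illustration: the values and the labels 'x', 'y', 'z' are made up and do not come from the original article.

```python
import numpy as np
import pandas as pd

# A NumPy array is addressed by integer position, like a list.
arr = np.array([10, 20, 30])
second = arr[1]

# A pandas Series can be addressed by label, like a dictionary.
# The labels 'x', 'y', 'z' are hypothetical.
ser = pd.Series([10, 20, 30], index=['x', 'y', 'z'])
by_label = ser['y']

# The Series still stores its data in a NumPy array underneath.
underlying = ser.to_numpy()

print(second, by_label, type(underlying))
```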

To use pandas, you first need to understand its two main data structures: Series and DataFrame.

The string representation of a Series shows the index on the left and the values on the right.

If you do not specify an index for the data, an integer index from 0 to n-1 (where n is the length) is created automatically.

import pandas as pd
import numpy as np

# Build a Series from a list; np.nan marks a missing value
s = pd.Series([1, 3, 6, np.nan, 44, 1])
print(s)
"""
0     1.0
1     3.0
2     6.0
3     NaN
4    44.0
5     1.0
dtype: float64
"""

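For contrast with the default integer index, here is a minimal sketch of a Series given an explicit index. The labels 'a', 'b', 'c' and the values are assumptions chosen for illustration.

```python
import pandas as pd

# Default case: no index given, so the labels are 0..n-1.
s_default = pd.Series([1.5, 2.5, 3.5])

# Explicit case: the labels 'a', 'b', 'c' are hypothetical.
s_labeled = pd.Series([1.5, 2.5, 3.5], index=['a', 'b', 'c'])

print(list(s_default.index))
print(list(s_labeled.index))
print(s_labeled['b'])
```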
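A DataFrame is a table-like data structure containing an ordered set of columns, where each column can hold a different value type (numeric, string, boolean, and so on).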

A DataFrame has both a row index and a column index, and can be thought of as a large dictionary of Series.
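The "dictionary of Series" view can be sketched like this. The column names, row labels, and values are invented for the example; only the idea of keys mapping to Series comes from the text.

```python
import pandas as pd

# Two Series sharing the same index; names and values are made up.
height = pd.Series([1.70, 1.82], index=['ann', 'bob'])
age = pd.Series([31, 45], index=['ann', 'bob'])

# Each dict key becomes a column label, each Series becomes a column.
people = pd.DataFrame({'height': height, 'age': age})

# Selecting a column by key returns a Series, much like a dict lookup.
col = people['age']
print(type(col).__name__)
print(people.loc['bob', 'age'])
```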

# Six consecutive days used as the row index
dates = pd.date_range('20160101', periods=6)
# 6x4 random values, with dates as row labels and a-d as column labels
df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=['a', 'b', 'c', 'd'])
print(df)
"""
                   a         b         c         d
2016-01-01 -0.253065 -2.071051 -0.640515  0.613663
2016-01-02 -1.147178  1.532470  0.989255 -0.499761
2016-01-03  1.221656 -2.390171  1.862914  0.778070
2016-01-04  1.473877 -0.046419  0.610046  0.204672
2016-01-05 -1.584752 -0.700592  1.487264 -1.778293
2016-01-06  0.633675 -1.414157 -0.277066 -0.442545
"""

print(df['b'])
"""
2016-01-01   -2.071051
2016-01-02    1.532470
2016-01-03   -2.390171
2016-01-04   -0.046419
2016-01-05   -0.700592
2016-01-06   -1.414157
Freq: D, Name: b, dtype: float64
"""

Create a DataFrame df1 without giving row or column labels; both then default to an integer index starting from 0.

# 0..11 arranged as 3 rows by 4 columns, with default labels
df1 = pd.DataFrame(np.arange(12).reshape((3, 4)))
print(df1)
"""
   0  1   2   3
0  0  1   2   3
1  4  5   6   7
2  8  9  10  11
"""

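As a small sketch of what those default labels look like, and how they might be replaced afterwards: the variable name `grid` and the column names 'w' to 'z' are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Same shape of data as above, built without any labels.
grid = pd.DataFrame(np.arange(12).reshape((3, 4)))

# Both axes fall back to integers counting from 0.
row_labels = list(grid.index)
col_labels = list(grid.columns)
print(row_labels)
print(col_labels)

# Labels can be assigned later; these names are hypothetical.
grid.columns = ['w', 'x', 'y', 'z']
print(grid.loc[2, 'z'])
```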
There is another way to build a DataFrame, which lets you configure the data of each column individually.

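# Each dictionary key is a column name; each value sets that column's data and dtype
df2 = pd.DataFrame({
    'a': 1.,
    'b': pd.Timestamp('20130102'),
    'c': pd.Series(1, index=list(range(4)), dtype='float32'),
    'd': np.array([3] * 4, dtype='int32'),
    'e': pd.Categorical(['test', 'train', 'test', 'train']),
    'f': 'foo',
})
print(df2)
"""
     a          b    c  d      e    f
0  1.0 2013-01-02  1.0  3   test  foo
1  1.0 2013-01-02  1.0  3  train  foo
2  1.0 2013-01-02  1.0  3   test  foo
3  1.0 2013-01-02  1.0  3  train  foo
"""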

Use the dtypes attribute to view the data type of each column:

print(df2.dtypes)
"""
a           float64
b    datetime64[ns]
c           float32
d             int32
e          category
f            object
dtype: object
"""

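Because dtypes is itself a Series keyed by column name, individual entries can be looked up directly. The sketch below uses a made-up frame (`frame`, with columns 'n' and 'label') and also shows `astype`, which returns a converted copy.

```python
import pandas as pd

# A small made-up frame with one integer and one string column.
frame = pd.DataFrame({'n': [1, 2, 3], 'label': ['x', 'y', 'z']})

# dtypes maps column name -> dtype; kind 'i' means signed integer.
n_kind = frame.dtypes['n'].kind

# astype returns a converted copy and leaves the original untouched.
as_float = frame['n'].astype('float32')

print(frame.dtypes)
print(as_float.dtype)
```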
index shows the row labels:

print(df2.index)
# Int64Index([0, 1, 2, 3], dtype='int64')
# (pandas 2.x prints Index([0, 1, 2, 3], dtype='int64') instead)

columns shows the column names:

print(df2.columns)
# Index(['a', 'b', 'c', 'd', 'e', 'f'], dtype='object')

To view only the values of df2:

print(df2.values)
"""
array([[1.0, Timestamp('2013-01-02 00:00:00'), 1.0, 3, 'test', 'foo'],
       [1.0, Timestamp('2013-01-02 00:00:00'), 1.0, 3, 'train', 'foo'],
       [1.0, Timestamp('2013-01-02 00:00:00'), 1.0, 3, 'test', 'foo'],
       [1.0, Timestamp('2013-01-02 00:00:00'), 1.0, 3, 'train', 'foo']], dtype=object)
"""

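The dtype=object in that output comes from the columns having different types. A minimal sketch, using two made-up frames, of how the array dtype depends on whether the columns share one type:

```python
import pandas as pd

# All columns are float, so the array keeps a numeric dtype.
numeric = pd.DataFrame({'x': [1.0, 2.0], 'y': [3.0, 4.0]})

# Mixing floats and strings forces a generic object array.
mixed = pd.DataFrame({'x': [1.0, 2.0], 'tag': ['p', 'q']})

print(numeric.values.dtype)
print(mixed.values.dtype)
print(numeric.values.shape)
```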
Use describe() to view a statistical summary of the data:

print(df2.describe())
"""
         a    c    d
count  4.0  4.0  4.0
mean   1.0  1.0  3.0
std    0.0  0.0  0.0
min    1.0  1.0  3.0
25%    1.0  1.0  3.0
50%    1.0  1.0  3.0
75%    1.0  1.0  3.0
max    1.0  1.0  3.0
"""

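Since every column of df2 holds a single repeated value, the summary above is not very informative. Here is a sketch on a made-up frame (`scores`) with varying numbers, which also shows that the string column is left out by default:

```python
import pandas as pd

# Hypothetical data: one numeric column and one string column.
scores = pd.DataFrame({
    'score': [1.0, 2.0, 3.0, 4.0],
    'name': ['a', 'b', 'c', 'd'],
})

# By default only numeric columns are summarised.
summary = scores.describe()
print(summary)

# The result is itself a DataFrame, so single statistics can be read out.
mean_score = summary.loc['mean', 'score']
print(mean_score)
```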
Transpose the table, swapping its rows and columns:

print(df2.T)
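A small sketch of what transposing does to the shape and the labels, using a made-up frame named `small`:

```python
import pandas as pd

# Three rows, two columns; the data is invented for illustration.
small = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})

# .T swaps the axes: column labels become row labels and vice versa.
flipped = small.T

print(small.shape, flipped.shape)
print(list(flipped.index))
print(flipped.loc['b', 2])
```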

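Sort the data by its index labels. With axis=1 the column labels are sorted, and ascending=False puts them in descending order:

print(df2.sort_index(axis=1, ascending=False))
"""
     f      e  d    c          b    a
0  foo   test  3  1.0 2013-01-02  1.0
1  foo  train  3  1.0 2013-01-02  1.0
2  foo   test  3  1.0 2013-01-02  1.0
3  foo  train  3  1.0 2013-01-02  1.0
"""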

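To make the difference between the two axes concrete, here is a sketch with a made-up frame whose row and column labels both start out of order:

```python
import pandas as pd

# Row labels 2, 0, 1 and column labels a, c, b are deliberately unsorted.
frame = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    index=[2, 0, 1],
    columns=['a', 'c', 'b'],
)

# axis=0 (the default) sorts by row labels.
by_rows = frame.sort_index()

# axis=1 sorts by column labels; ascending=False reverses the order.
by_cols_desc = frame.sort_index(axis=1, ascending=False)

print(list(by_rows.index))
print(list(by_cols_desc.columns))
```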
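Sort by the values of a column and print the result. Every value in column b is identical here, so the output looks the same as the original table:

print(df2.sort_values(by='b'))
"""
     a          b    c  d      e    f
0  1.0 2013-01-02  1.0  3   test  foo
1  1.0 2013-01-02  1.0  3  train  foo
2  1.0 2013-01-02  1.0  3   test  foo
3  1.0 2013-01-02  1.0  3  train  foo
"""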

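Because the values in df2 do not vary, a frame with different values shows the effect of sort_values more clearly. The frame below and its column names are assumptions for illustration:

```python
import pandas as pd

# Hypothetical data with distinct scores.
frame = pd.DataFrame({'name': ['x', 'y', 'z'], 'score': [2, 3, 1]})

# Rows are reordered by the chosen column; the original labels travel with them.
ascending = frame.sort_values(by='score')
descending = frame.sort_values(by='score', ascending=False)

print(list(ascending['name']))
print(list(descending['name']))
print(list(ascending.index))
```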
Reference: 莫煩Python (Morvan Python), which is simple and easy to follow. Highly recommended!
