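The Series data structure in pandas

pandas is an open-source, BSD-licensed Python library that provides high-performance, easy-to-use data structures and data analysis tools for the Python programming language. Python with pandas is used in a wide range of fields, both academic and commercial, including finance, economics, statistics, and analytics. In this tutorial we will learn the various features of pandas and how to use them in practice.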

Installation

pip install pandas

Importing

import pandas as pd
from pandas import Series, DataFrame

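>>> import pandas as pd
>>> obj = pd.Series([4, 7, -5, 3])
>>> obj
0    4
1    7
2   -5
3    3
dtype: int64

A Series is displayed with the index on the left and the values on the right. Because we did not specify an index for the data, a default integer index from 0 to N-1 (where N is the length of the data) is created automatically. You can get the array representation and the index object of a Series through its values and index attributes: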

>>> import pandas as pd
>>> obj.values
array([ 4,  7, -5,  3], dtype=int64)
>>> obj.index
RangeIndex(start=0, stop=4, step=1)
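As a minimal sketch of the two attributes just described, the snippet below pulls the data and the index out of the same Series. It uses to_numpy(), which pandas documents as the preferred way to get the underlying array; the variable names data and labels are chosen here only for illustration.

```python
import numpy as np
import pandas as pd

obj = pd.Series([4, 7, -5, 3])

data = obj.to_numpy()   # the values as a NumPy array
labels = obj.index      # the default index, one label per value

print(type(data).__name__)   # ndarray
print(list(labels))          # [0, 1, 2, 3]
print(len(data) == len(labels))   # True
```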

Often we want to create a Series with an index that labels each data point. Index labels and values correspond one to one:

>>> obj2 = pd.Series([4, 7, -5, 3], index=['d', 'b', 'a', 'c'])
>>> obj2
d    4
b    7
a   -5
c    3
dtype: int64

>>> obj2['a']
-5
>>> obj2['d']
4
>>> obj2['c','a','d']   # raises an error: a tuple is not a valid label here; pass a list instead
>>> obj2[['c','a','d']]
c    3
a   -5
d    4
dtype: int64

>>> obj2[obj2 > 0]
d    4
b    7
c    3
dtype: int64

>>> obj2 * 2
d     8
b    14
a   -10
c     6
dtype: int64

>>> import numpy as np
>>> np.exp(obj2)
d      54.598150
b    1096.633158
a       0.006738
c      20.085537
dtype: float64
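The three operations above (boolean filtering, scalar multiplication, and applying a NumPy function) all keep each value attached to its index label. The following sketch checks that explicitly; the names positive, doubled, and exp are illustrative only.

```python
import numpy as np
import pandas as pd

obj2 = pd.Series([4, 7, -5, 3], index=['d', 'b', 'a', 'c'])

positive = obj2[obj2 > 0]   # the boolean mask drops the label 'a'
doubled = obj2 * 2          # element-wise, labels unchanged
exp = np.exp(obj2)          # NumPy ufunc returns a Series with the same labels

print(list(positive.index))   # ['d', 'b', 'c']
print(doubled['a'])           # -10
print(list(exp.index))        # ['d', 'b', 'a', 'c']
```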

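A Series can also be thought of as a fixed-length, ordered dict, since it maps index labels to data values. It can be used in many functions that expect a dict argument:

>>> 'b' in obj2
True
>>> 'e' in obj2
False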
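A short sketch of the dict-like behaviour, using the same obj2 as above. Note that the in operator tests index labels, not values, just as it tests keys on a dict. get() and to_dict() are standard Series methods; the variable names are illustrative.

```python
import pandas as pd

obj2 = pd.Series([4, 7, -5, 3], index=['d', 'b', 'a', 'c'])

has_label = 'b' in obj2          # membership is checked against the labels
has_value = 4 in obj2            # 4 is a value, not a label
fallback = obj2.get('e', 0)      # like dict.get: default when the label is absent
as_dict = obj2.to_dict()         # convert back to a plain dict

print(has_label, has_value)   # True False
print(fallback)               # 0
print(as_dict)                # {'d': 4, 'b': 7, 'a': -5, 'c': 3}
```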

1. Creating a Series from a dict

>>> sdata = {'ohio': 35000, 'texas': 71000, 'oregon': 16000, 'utah': 5000}
>>> obj3 = pd.Series(sdata)
>>> obj3
ohio      35000
texas     71000
oregon    16000
utah       5000
dtype: int64

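2. Passing a new index to change the order of the dict keys

Because the newly added california has no corresponding value in sdata, it appears as missing data (NaN). utah is dropped because it is not in the new index.

>>> states = ['california', 'ohio', 'oregon', 'texas']
>>> obj4 = pd.Series(sdata, index=states)
>>> obj4
california        NaN
ohio          35000.0
oregon        16000.0
texas         71000.0
dtype: float64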
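The same selection and reordering can be sketched with reindex() on an existing Series, which may be handy when the Series was built earlier. The lowercase state keys follow the output shown in this tutorial; the name reordered is illustrative.

```python
import pandas as pd

sdata = {'ohio': 35000, 'texas': 71000, 'oregon': 16000, 'utah': 5000}
states = ['california', 'ohio', 'oregon', 'texas']

obj3 = pd.Series(sdata)
reordered = obj3.reindex(states)   # keep only these labels, in this order

print(list(reordered.index))             # ['california', 'ohio', 'oregon', 'texas']
print(pd.isna(reordered['california']))  # True: no value existed for this label
print('utah' in reordered)               # False: not requested, so dropped
```

The result becomes float64 because NaN is a floating-point value, which is why the integers are shown as 35000.0 and so on.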

3. Detecting missing data

>>> pd.isnull(obj4)
california     True
ohio          False
oregon        False
texas         False
dtype: bool
>>> pd.notnull(obj4)
california    False
ohio           True
oregon         True
texas          True
dtype: bool
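Once missing entries have been detected, a common next step is to count, drop, or fill them. The sketch below uses the method forms isna(), dropna(), and fillna(); the fill value 0 is an arbitrary choice for illustration, not a recommendation.

```python
import numpy as np
import pandas as pd

obj4 = pd.Series([np.nan, 35000.0, 16000.0, 71000.0],
                 index=['california', 'ohio', 'oregon', 'texas'])

n_missing = int(obj4.isna().sum())   # True counts as 1 when summed
cleaned = obj4.dropna()              # remove labels whose value is missing
filled = obj4.fillna(0)              # replace missing values with 0 (illustrative)

print(n_missing)             # 1
print(list(cleaned.index))   # ['ohio', 'oregon', 'texas']
print(filled['california'])  # 0.0
```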

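In arithmetic operations, a Series aligns automatically on index labels. Put simply, values with the same index label are added together, and a label that is present in only one of the two Series produces NaN: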

>>> obj3
ohio      35000
texas     71000
oregon    16000
utah       5000
dtype: int64
>>> obj4
california        NaN
ohio          35000.0
oregon        16000.0
texas         71000.0
dtype: float64
>>> obj3 + obj4
california         NaN
ohio           70000.0
oregon         32000.0
texas         142000.0
utah               NaN
dtype: float64
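If NaN for one-sided labels is not what you want, one option is the add() method with fill_value, which substitutes a value for a label missing from one side before adding. This is a sketch with the same data as above; whether 0 is a sensible substitute depends on your data.

```python
import numpy as np
import pandas as pd

obj3 = pd.Series({'ohio': 35000, 'texas': 71000, 'oregon': 16000, 'utah': 5000})
obj4 = pd.Series([np.nan, 35000.0, 16000.0, 71000.0],
                 index=['california', 'ohio', 'oregon', 'texas'])

plain = obj3 + obj4                    # one-sided labels become NaN
total = obj3.add(obj4, fill_value=0)   # a one-sided label is treated as 0 on the other side

print(pd.isna(plain['utah']))        # True
print(total['utah'])                 # 5000.0
print(pd.isna(total['california']))  # True: missing on both sides stays missing
```

california stays NaN because it has no usable value in either Series: it is absent from obj3 and NaN in obj4.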

Both the Series itself and its index have a name attribute:

>>> obj4.name = 'population'
>>> obj4.index.name = 'state'
>>> obj4
state
california        NaN
ohio          35000.0
oregon        16000.0
texas         71000.0
Name: population, dtype: float64

The index of a Series can be changed in place by assignment:

>>> obj
0    4
1    7
2   -5
3    3
dtype: int64
>>> obj.index = ['bob', 'steve', 'jeff', 'ryan']
>>> obj
bob      4
steve    7
jeff    -5
ryan     3
dtype: int64
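Assigning a new index only works when the number of labels matches the number of values. The sketch below shows both the accepted case and the rejected one; the name mismatch_error is used here only to capture the exception for inspection.

```python
import pandas as pd

obj = pd.Series([4, 7, -5, 3])
obj.index = ['bob', 'steve', 'jeff', 'ryan']   # four labels for four values: accepted
print(obj['jeff'])   # -5

mismatch_error = None
try:
    obj.index = ['bob', 'steve']   # two labels for four values: rejected
except ValueError as err:
    mismatch_error = err
    print('rejected:', type(err).__name__)   # rejected: ValueError
```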
