Python Financial Data Processing with the Pandas Package

The pandas package in Python provides two data structures that make it easy to store complex data: Series and DataFrame.

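1. Series

Let's start with Series, which is the foundation of DataFrame. A Series can be thought of as a one-dimensional array with an index, and it relates to another common programming concept, the hash (hash table).

The basic form for creating a Series is s = pd.Series(data, index=index, name=name). We can begin with the simplest possible Series: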

import pandas as pd
import numpy as np
a = np.random.rand(5)
s = pd.Series(a)
print(s)

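The code first imports pandas and numpy. The output shows that the index on the left starts from 0 by default. Note that if you supply your own index, its length must match the length of data. To inspect the index, use s.index.

s = pd.Series(a, index=['1', '2', '3', '4', '5'])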
print(s)
print(s.index)
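
To make the length requirement concrete, here is a minimal sketch of what happens when the index and the data differ in length. The variable name mismatch_raised is introduced only for this illustration.

```python
import numpy as np
import pandas as pd

a = np.random.rand(5)

# Five labels for five values: this works.
s = pd.Series(a, index=['1', '2', '3', '4', '5'])
print(list(s.index))

# Four labels for five values: pandas raises a ValueError.
try:
    pd.Series(a, index=['1', '2', '3', '4'])
    mismatch_raised = False
except ValueError as exc:
    mismatch_raised = True
    print("ValueError:", exc)
```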

Another optional argument is name, which can be read back through s.name.

s = pd.Series(a, index=['a', 'b', 'c', 'd', 'e'], name='my_data')
print(s)
print(s.name)

A Series can also be created from a dict:

d = 
s = pd.Series(d)
print(s)
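
As an illustration of building a Series from a dict, here is a small sketch. The keys and values are hypothetical and chosen only for demonstration. The dict keys become the index, and passing an explicit index selects and reorders them.

```python
import pandas as pd

# Hypothetical data for illustration only.
d = {'a': 1.0, 'b': 2.5, 'c': 3.0}

# The dict keys become the index labels.
s = pd.Series(d)
print(s)

# With an explicit index, labels are looked up in the dict;
# a label that is not a key ('z' here) gets NaN.
s2 = pd.Series(d, index=['c', 'a', 'z'])
print(s2)
```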

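When accessing a Series, you can use positional subscripts 0, 1, 2, 3 ... n as with an array, use index labels such as a, b, c, d, e as with a hash, or filter by a condition. For explicit positional access, s.iloc is also available.

s = pd.Series(np.random.rand(5), index=['a', 'b', 'c', 'd', 'e'])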
print(s[0:2])
print(s['a'])
print(s[s>0.5])
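
The three access styles can also be written with the explicit .iloc and .loc accessors, which avoids ambiguity between positions and labels. The sketch below uses fixed values (assumed here for reproducibility) instead of random ones.

```python
import pandas as pd

# Fixed values so the results are reproducible.
s = pd.Series([0.1, 0.7, 0.4, 0.9, 0.6], index=['a', 'b', 'c', 'd', 'e'])

first_two = s.iloc[0:2]        # by position, end exclusive
by_label = s.loc['a']          # single label lookup
label_slice = s.loc['a':'c']   # label slice, end inclusive
above_half = s[s > 0.5]        # boolean mask keeps matching rows

print(first_two)
print(by_label)
print(label_slice)
print(above_half)
```

Note that a positional slice excludes its end point, while a label slice includes it.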

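2. DataFrame

A DataFrame is essentially a collection of Series columns, which makes it convenient to work with columns of different data types. However, when all the data are floating-point numbers and you need operations such as matrix inversion, a matrix is more convenient.

A DataFrame can be created from a dict whose keys are column names and whose values are Series. When creating the DataFrame, index specifies the row labels you want and columns specifies the column names you want; any element that does not exist is shown as NaN.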
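
As a sketch of the all-float case, one possible approach is to convert the DataFrame to a NumPy array and invert that. The 2x2 data below is hypothetical, and using np.linalg.inv is an assumption about what "matrix" means here rather than something the text specifies.

```python
import numpy as np
import pandas as pd

# Hypothetical all-float DataFrame (a diagonal, hence invertible, matrix).
df = pd.DataFrame({'x': [2.0, 0.0], 'y': [0.0, 4.0]})

# Convert to a plain NumPy array, then invert it.
m = df.to_numpy()
inv = np.linalg.inv(m)
print(inv)

# Multiplying the inverse by the original should give the identity matrix.
print(np.allclose(inv @ m, np.eye(2)))
```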

d = 
df = pd.DataFrame(d)
print(df)
df = pd.DataFrame(d, index=['e', 'f', 'g', 'h'], columns=['three', 'four'])
print(df)
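
A minimal sketch of this construction follows. The column names and values are hypothetical, chosen only to show how missing elements turn into NaN.

```python
import pandas as pd

# Hypothetical dict of Series with different index lengths.
d = {
    'one': pd.Series([1.0, 2.0, 3.0], index=['a', 'b', 'c']),
    'two': pd.Series([1.0, 2.0, 3.0, 4.0], index=['a', 'b', 'c', 'd']),
}

# The row index is the union of the Series indexes;
# 'one' has no value for 'd', so that cell is NaN.
df = pd.DataFrame(d)
print(df)

# Selecting rows and columns explicitly; 'three' is not a key in d,
# so the whole column is NaN.
df2 = pd.DataFrame(d, index=['d', 'b', 'a'], columns=['two', 'three'])
print(df2)
```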
