Python Data Analysis Basics: CSV Files (4)

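Sometimes we do not need every column in a file. There are two ways to select specific columns from a CSV file:

1. Using column index values;

2. Using column headers.

1. Basic Python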

#!/usr/bin/env python3
import csv
import sys

input_file = sys.argv[1]
output_file = sys.argv[2]
# Index values of the columns to keep
my_columns = [0, 3]
with open(input_file, 'r', newline='') as csv_in_file:
    with open(output_file, 'w', newline='') as csv_out_file:
        filereader = csv.reader(csv_in_file)
        filewriter = csv.writer(csv_out_file)
        for row_list in filereader:
            row_list_output = []
            for index_value in my_columns:
                row_list_output.append(row_list[index_value])
            filewriter.writerow(row_list_output)

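Let's walk through the code above.

my_columns = [0, 3]

This line creates a list variable, my_columns, that holds the index values of the two columns we want to keep. They correspond to the supplier name and cost columns.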

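for row_list in filereader:
    row_list_output = []
    for index_value in my_columns:
        row_list_output.append(row_list[index_value])
    filewriter.writerow(row_list_output)

For each row in the input file, the loop starts with an empty list, row_list_output, appends the value found at each index listed in my_columns, and then writes the resulting row to the output file.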
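To see the loop in isolation, here is a minimal sketch that applies the same index-based selection to an in-memory CSV string rather than to files on disk. The sample rows and the helper name select_columns_by_index are made up for illustration; they are not taken from the original input file.

```python
import csv
import io

# Hypothetical sample data; the real input file is not shown in this post.
SAMPLE_CSV = (
    "supplier name,invoice number,part number,cost,purchase date\n"
    "Supplier X,001-1001,2341,$500.00,1/20/14\n"
    "Supplier Y,50-9501,7009,$250.00,1/30/14\n"
)


def select_columns_by_index(csv_text, column_indexes):
    """Return the rows of csv_text, keeping only the given column positions."""
    reader = csv.reader(io.StringIO(csv_text, newline=''))
    selected_rows = []
    for row_list in reader:
        # Build the output row one kept column at a time, as in the script.
        row_list_output = [row_list[index_value] for index_value in column_indexes]
        selected_rows.append(row_list_output)
    return selected_rows


rows = select_columns_by_index(SAMPLE_CSV, [0, 3])
for row in rows:
    print(row)
```

Because the header is just the first row, it is filtered by the same indexes as the data rows, so no separate header handling is needed in this approach.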

2. pandas

The code using the pandas module is as follows:

#!/usr/bin/env python3
import pandas as pd
import sys

input_file = sys.argv[1]
output_file = sys.argv[2]
data_frame = pd.read_csv(input_file)
# Keep all rows (:) and the columns at positions 0 and 3
data_frame_column_by_index = data_frame.iloc[:, [0, 3]]
data_frame_column_by_index.to_csv(output_file, index=False)

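In the code above, the iloc indexer selects columns by their integer position: the colon keeps every row, and the list [0, 3] keeps the first and fourth columns.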
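The following sketch shows the same positional selection on a small in-memory table, so the result can be inspected without an input file. The sample data is hypothetical, and the code assumes pandas is installed.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for the input file.
SAMPLE_CSV = (
    "supplier name,invoice number,part number,cost,purchase date\n"
    "Supplier X,001-1001,2341,$500.00,1/20/14\n"
    "Supplier Y,50-9501,7009,$250.00,1/30/14\n"
)

data_frame = pd.read_csv(io.StringIO(SAMPLE_CSV))

# ":" keeps every row; [0, 3] keeps the first and fourth columns by position.
by_position = data_frame.iloc[:, [0, 3]]

print(list(by_position.columns))
print(by_position.shape)
```

Since the selection is purely positional, it would silently pick different columns if the column order in the input changed.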

The output is omitted here.

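Selecting columns by header works well when the headers of the columns you want to keep are easy to identify, or when you are processing several input files in which the column positions vary but the headers stay the same.

1. Basic Python

For example, suppose we only want to keep the invoice number and purchase date columns from the CSV file used earlier. The code is as follows: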

#!/usr/bin/env python3
import csv
import sys

input_file = sys.argv[1]
output_file = sys.argv[2]
# Names must match the header row exactly (the comparison is case-sensitive)
my_columns = ['invoice number', 'purchase date']
my_columns_index = []
with open(input_file, 'r', newline='') as csv_in_file:
    with open(output_file, 'w', newline='') as csv_out_file:
        filereader = csv.reader(csv_in_file)
        filewriter = csv.writer(csv_out_file)
        header = next(filereader, None)
        # Record the position of every header that we want to keep
        for index_value in range(len(header)):
            if header[index_value] in my_columns:
                my_columns_index.append(index_value)
        filewriter.writerow(my_columns)
        for row_list in filereader:
            row_list_output = []
            for index_value in my_columns_index:
                row_list_output.append(row_list[index_value])
            filewriter.writerow(row_list_output)

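Let's walk through the code above.

my_columns = ['invoice number', 'purchase date']
my_columns_index = []

This creates a list variable, my_columns, containing two strings: the names of the two columns to keep. The empty list created next, my_columns_index, will be filled with the index values of those two columns.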

header = next(filereader, None)

This line uses the next() function to read the first row from the input file and stores it in the list variable header.
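The second argument to next() is a default that is returned when the iterator has nothing left. A small sketch of that behaviour, using made-up in-memory inputs in place of real files:

```python
import csv
import io

# An empty input: there is no header row to read.
empty_reader = csv.reader(io.StringIO(""))
empty_header = next(empty_reader, None)
print(empty_header)  # None, instead of raising StopIteration

# A hypothetical input with a header row and one data row.
reader = csv.reader(io.StringIO("invoice number,cost\n001-1001,$500.00\n"))
header = next(reader, None)
remaining_rows = list(reader)  # the header has already been consumed

print(header)
print(remaining_rows)
```

Note that the script above calls len(header) right afterwards, so it would still fail on an empty file; a check such as `if header is None` would be needed to handle that case.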

for index_value in range(len(header)):
    if header[index_value] in my_columns:
        my_columns_index.append(index_value)
filewriter.writerow(my_columns)
for row_list in filereader:
    row_list_output = []
    for index_value in my_columns_index:
        row_list_output.append(row_list[index_value])
    filewriter.writerow(row_list_output)

These lines follow the same idea as the code in the previous method, so we will not repeat the explanation here.
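As an alternative to tracking indexes by hand, the standard library's csv.DictReader and csv.DictWriter can select columns by header directly. This is a different technique from the script above, shown only as a sketch; the sample data is hypothetical and is read from a string rather than a file.

```python
import csv
import io

# Hypothetical sample data; the real input file is not shown in this post.
SAMPLE_CSV = (
    "supplier name,invoice number,part number,cost,purchase date\n"
    "Supplier X,001-1001,2341,$500.00,1/20/14\n"
    "Supplier Y,50-9501,7009,$250.00,1/30/14\n"
)
my_columns = ['invoice number', 'purchase date']

in_buffer = io.StringIO(SAMPLE_CSV, newline='')
out_buffer = io.StringIO(newline='')

# DictReader uses the first row as keys, so each row becomes a dict.
reader = csv.DictReader(in_buffer)
# extrasaction='ignore' drops every key that is not listed in fieldnames.
writer = csv.DictWriter(out_buffer, fieldnames=my_columns, extrasaction='ignore')
writer.writeheader()
for row_dict in reader:
    writer.writerow(row_dict)

output_text = out_buffer.getvalue()
print(output_text)
```

With this approach the output columns always follow the order given in my_columns, regardless of where those columns sit in the input.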

Run this script from a command-line window, then open the output file to check the result.

2. pandas

The code using the pandas module is as follows:

#!/usr/bin/env python3
import pandas as pd
import sys

input_file = sys.argv[1]
output_file = sys.argv[2]
data_frame = pd.read_csv(input_file)
# Keep all rows (:) and the two columns named in the list
data_frame_column_by_name = data_frame.loc[:, ['invoice number', 'purchase date']]
data_frame_column_by_name.to_csv(output_file, index=False)

In the code above, the loc indexer selects columns by their labels, here the two column headers.
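A short sketch of label-based selection on hypothetical in-memory data, assuming pandas is installed. It also shows that plain bracket indexing with a list of labels gives the same result as loc for this case.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for the input file.
SAMPLE_CSV = (
    "supplier name,invoice number,part number,cost,purchase date\n"
    "Supplier X,001-1001,2341,$500.00,1/20/14\n"
    "Supplier Y,50-9501,7009,$250.00,1/30/14\n"
)

data_frame = pd.read_csv(io.StringIO(SAMPLE_CSV))

# ":" keeps every row; the list names the columns to keep by label.
by_name = data_frame.loc[:, ['invoice number', 'purchase date']]

# Plain bracket indexing with a list of labels selects the same columns.
by_brackets = data_frame[['invoice number', 'purchase date']]

print(list(by_name.columns))
print(by_name.equals(by_brackets))
```

Both forms raise a KeyError if a requested label is missing from the header, which makes a misspelled or differently capitalized column name easy to spot.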

The output is omitted here.
