Basic Data Processing

Import the basic modules

import numpy as np
np.set_printoptions(suppress=True)  # print floats without scientific notation
import pandas as pd
pd.set_option('display.max_columns', None)  # show every column
pd.set_option('display.max_rows', None)  # show every row
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns

Read the data

housing.isnull().any()  # True for each column that contains a missing value
database.isnull().sum()  # number of missing values per column
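
The loading call itself is not shown in this section. A minimal sketch, assuming the data comes from a CSV file: the in-memory sample below and its values are invented so the snippet runs on its own, and in practice `pd.read_csv` would receive a file path instead.

```python
import io
import pandas as pd

# Hypothetical stand-in for a CSV file on disk; the values are illustrative only.
csv_text = io.StringIO(
    "median_house_value,ocean_proximity\n"
    "100000,INLAND\n"
    "250000,NEAR BAY\n"
    ",INLAND\n"
)

# pd.read_csv accepts a path or any file-like object.
housing = pd.read_csv(csv_text)
print(housing.shape)
```

The empty field in the last row is read as NaN, which is what the missing-value checks report.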

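Inspect the data

housing_tr.head()  # preview the first rows
housing.info()  # column dtypes and non-null counts
housing.describe()  # summary statistics of the numeric columns
housing['ocean_proximity'].unique()  # distinct values in one column
housing['ocean_proximity'].value_counts()  # how many rows hold each value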
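
To make the difference between `unique()` and `value_counts()` concrete, here is a small sketch on an invented frame. Only the column name `ocean_proximity` comes from the snippets above; the rows are illustrative.

```python
import pandas as pd

# Illustrative rows; not taken from the real dataset.
housing = pd.DataFrame({"ocean_proximity": ["INLAND", "NEAR BAY", "INLAND"]})

categories = housing["ocean_proximity"].unique()    # distinct values, in order of first appearance
counts = housing["ocean_proximity"].value_counts()  # rows per value, most frequent first

print(categories)
print(counts)
```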

Check for missing values

housing.isnull().any()  # True for each column that contains a missing value
database.isnull().sum()  # number of missing values per column
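
A short sketch of what the two checks return, using an invented frame (the column names `total_bedrooms` and `households` are assumptions for illustration):

```python
import numpy as np
import pandas as pd

# Illustrative data with two missing entries in one column.
housing = pd.DataFrame({
    "total_bedrooms": [3.0, np.nan, 5.0, np.nan],
    "households": [1.0, 2.0, 3.0, 4.0],
})

has_missing = housing.isnull().any()    # one boolean per column
missing_count = housing.isnull().sum()  # one integer per column

print(has_missing)
print(missing_count)
```

`any()` answers whether a column needs attention at all; `sum()` shows how much of it is missing, which helps when deciding between dropping and filling.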

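Handle missing values

# "" stands for the column name to fill in
database.drop([""], axis=1, inplace=True)  # drop a column
database = database.dropna(subset=['']).reset_index(drop=True)  # drop rows where the column is missing
housing[""] = housing[""].fillna(housing[""].median())  # fill a column with its median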
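
The three options can be tried side by side on a toy frame. This is a sketch with invented column names (`a`, `b`, `c`) and values; it fills with the column's own median, since passing the function `np.median` itself to `fillna` does not compute one.

```python
import numpy as np
import pandas as pd

# Illustrative data; names and values are placeholders.
df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0, 5.0],
    "b": [np.nan, 2.0, 2.0, 2.0],
    "c": ["x", "y", "z", "w"],
})

without_c = df.drop(["c"], axis=1)                          # drop a whole column
rows_kept = df.dropna(subset=["b"]).reset_index(drop=True)  # drop rows where b is missing
filled_a = df["a"].fillna(df["a"].median())                 # fill a with its median

print(without_c.columns.tolist())
print(len(rows_kept))
print(filled_a.tolist())
```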

Bin a continuous variable into categories

housing["price_range"] = pd.cut(housing["median_house_value"],
                                bins=[0, 150000, 300001, np.inf],
                                labels=[0, 1, 2])
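
A quick check of where the bin edges fall, using the same `bins` and `labels` on a few invented prices. By default `pd.cut` builds right-closed intervals, so 150000 lands in bin 0 and 300001 in bin 1, while a value of exactly 0 lies outside the first interval and becomes NaN unless `include_lowest=True` is passed.

```python
import numpy as np
import pandas as pd

# Illustrative prices chosen to sit on and around the bin edges.
values = pd.Series([100000, 150000, 200000, 300001, 500001])
price_range = pd.cut(values, bins=[0, 150000, 300001, np.inf], labels=[0, 1, 2])
print(price_range.tolist())  # [0, 0, 1, 1, 2]

# The left edge of the first bin is excluded by default.
zero_bin = pd.cut(pd.Series([0]), bins=[0, 150000, 300001, np.inf], labels=[0, 1, 2])
print(zero_bin.isna().tolist())  # [True]
```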

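Standardization (normalization)

from sklearn.preprocessing import StandardScaler
stander = StandardScaler()
stander.fit(housing_ex)  # learn each column's mean and standard deviation
housing_st = stander.transform(housing_ex)  # rescale to zero mean and unit variance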
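
As a sanity check, the scaled columns should have mean 0 and standard deviation 1. A minimal sketch on an invented array, reusing the names `housing_ex`, `stander` and `housing_st` from above:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Illustrative numeric features; the values are placeholders.
housing_ex = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])

stander = StandardScaler()
housing_st = stander.fit_transform(housing_ex)  # equivalent to fit() followed by transform()

col_means = housing_st.mean(axis=0)
col_stds = housing_st.std(axis=0)
print(col_means, col_stds)
```

`StandardScaler` expects numeric input, so a text column such as `ocean_proximity` would have to be dropped or encoded first.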

Split into training and test sets

from sklearn.model_selection import train_test_split
train_set, test_set = train_test_split(housing_st, test_size=0.3, random_state=42)
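
The sketch below shows the resulting sizes on an invented 10-row array, followed by an alternative ordering that is an assumption of this example rather than the post's method: split first, then fit the scaler on the training rows only, so that test-set statistics do not influence the scaling.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Illustrative data: 10 rows, 2 features (values invented for this sketch).
housing_ex = np.arange(20, dtype=float).reshape(10, 2)

# test_size=0.3 puts 3 of the 10 rows in the test set; random_state fixes the shuffle.
train_set, test_set = train_test_split(housing_ex, test_size=0.3, random_state=42)
print(train_set.shape, test_set.shape)  # (7, 2) (3, 2)

# Alternative ordering: fit on the training rows only, then apply to both sets.
scaler = StandardScaler().fit(train_set)
train_st = scaler.transform(train_set)
test_st = scaler.transform(test_set)
```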
