Classifying the MNIST dataset with sklearn

2021-10-06 19:19:50

深度之眼: Watermelon Book (西瓜書) after-class **

import time

import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import fetch_openml
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.utils import check_random_state

print(__doc__)

t0 = time.time()
train_sample = 5000

# Download MNIST (70,000 images, 28x28 = 784 pixels each) from OpenML.
# as_frame=False returns NumPy arrays, which the integer indexing below needs.
x, y = fetch_openml('mnist_784', version=1, return_X_y=True, as_frame=False)

random_state = check_random_state(0)  # fix the random seed
permutation = random_state.permutation(x.shape[0])
x = x[permutation]
y = y[permutation]  # shuffle x and y with the same permutation
# x = x.reshape((x.shape[0], -1))

x_train, x_test, y_train, y_test = train_test_split(
    x, y, train_size=train_sample, test_size=10000)

# Standardize the data: compute each feature's mean and variance
scaler = StandardScaler()
x_train = scaler.fit_transform(x_train)  # fit first, then transform
x_test = scaler.transform(x_test)  # transform only, using the training statistics

# Newer scikit-learn releases deprecate multi_class; omitting it there
# gives the same multinomial model with the saga solver.
clf = LogisticRegression(C=50. / train_sample, multi_class='multinomial',
                         penalty='l1', solver='saga', tol=0.1)
clf.fit(x_train, y_train)
sparsity = np.mean(clf.coef_ == 0) * 100  # percentage of zero coefficients
score = clf.score(x_test, y_test)

print('sparsity with l1 penalty: %.2f%%' % sparsity)  # 76.82%
print('sparsity with l1 penalty {:.2f}%'.format(sparsity))  # 76.82%
print('test score with l1 penalty: %.4f' % score)  # 0.8287
print('test score with l1 penalty {:.4f}'.format(score))  # 0.8287

# Plot the coefficient vector of each class as a 28x28 image
coef = clf.coef_.copy()
plt.figure(figsize=(10, 5))
scale = np.abs(coef).max()
for i in range(10):
    l1_plot = plt.subplot(2, 5, i + 1)
    l1_plot.imshow(coef[i].reshape(28, 28), interpolation='nearest',
                   cmap=plt.cm.RdBu, vmin=-scale, vmax=scale)
    l1_plot.set_xticks(())
    l1_plot.set_yticks(())
    l1_plot.set_xlabel('class %i' % i)
plt.suptitle('classification vector for .....')

run_time = time.time() - t0
print('example run in %.3f s' % run_time)
plt.show()

I have not fully figured this out; the usage of many of the functions in it is still unclear to me.
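Two calls in the script that are easy to misread are `fit_transform` versus `transform` on the scaler, and the sparsity line `np.mean(clf.coef_ == 0) * 100`. The following is a minimal sketch on made-up toy arrays (the numbers are illustrative assumptions, not MNIST data) showing what each one computes:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy data, made up for illustration only.
X_train = np.array([[0.0, 10.0], [2.0, 20.0], [4.0, 30.0]])
X_test = np.array([[2.0, 40.0]])

scaler = StandardScaler()
# fit_transform learns each column's mean and standard deviation from the
# training data, then applies (x - mean) / std to that same data.
X_train_std = scaler.fit_transform(X_train)
# transform reuses the statistics learned above; the test set is never fitted.
X_test_std = scaler.transform(X_test)

print(scaler.mean_)               # column means learned from X_train
print(X_train_std.mean(axis=0))   # close to 0 for every column
print(X_test_std)                 # test row scaled with training statistics

# Sparsity: `coef == 0` is a boolean array, and the mean of booleans is the
# fraction of True entries, so multiplying by 100 gives a percentage.
coef = np.array([[0.0, 1.5, 0.0, 0.0],
                 [0.2, 0.0, 0.0, -0.7]])
sparsity = np.mean(coef == 0) * 100
print("%.2f%%" % sparsity)        # prints 62.50%
```

Fitting the scaler only on the training split keeps information about the test split out of the preprocessing step, which is why the script calls `transform` rather than `fit_transform` on `x_test`.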

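If you want to use fetch_openml, going through a proxy or VPN is recommended, because the dataset is downloaded from openml.org.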
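If the download is not reachable, one way to try the same workflow offline is to swap in `load_digits`, a much smaller 8x8 digits dataset bundled with scikit-learn. This is a stand-in, not MNIST, and the split size, `random_state`, and `max_iter` below are assumptions chosen for the sketch rather than values from the script:

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# load_digits ships with scikit-learn: 1797 images of 8x8 pixels, no download.
X, y = load_digits(return_X_y=True)

# Hold out a quarter of the samples for testing (assumed split for this sketch).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Same preprocessing pattern as the script: fit on train, reuse on test.
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Plain logistic regression with default penalty; max_iter raised so the
# default solver has room to converge.
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)
print("test accuracy: %.4f" % clf.score(X_test, y_test))
```

Because the images are 8x8 rather than 28x28, any coefficient plot on this stand-in would need `reshape(8, 8)` instead of `reshape(28, 28)`.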

The code also compares two ways of formatting output: the % operator and str.format.
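The two formatting styles can be compared side by side in a short sketch. The values are the ones reported in the script's comments; the f-string line is an extra added here for comparison and is not part of the original code:

```python
sparsity = 76.82  # value reported in the script's comments
score = 0.8287    # value reported in the script's comments

# %-formatting: %.2f means two decimals, and %% is a literal percent sign.
old_style = "sparsity with l1 penalty: %.2f%%" % sparsity
# str.format: the spec goes inside braces after a colon; % needs no escaping.
new_style = "sparsity with l1 penalty: {:.2f}%".format(sparsity)
# f-string (Python 3.6+): same format spec, with the variable named inline.
f_style = f"sparsity with l1 penalty: {sparsity:.2f}%"

print(old_style)  # sparsity with l1 penalty: 76.82%
print(new_style)  # sparsity with l1 penalty: 76.82%
print(f_style)    # sparsity with l1 penalty: 76.82%

print("test score: %.4f" % score)         # test score: 0.8287
print("test score: {:.4f}".format(score)) # test score: 0.8287
```

A `str.format` call on a string with no `{}` placeholder returns the string unchanged, so the value is silently dropped from the output; the placeholder is what makes the two styles equivalent.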
