scikit-learn ID3 handwritten digit recognition

2021-08-03 15:13:09

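A decision tree is a tree structure that resembles a flowchart: each internal node represents a test on one attribute, each branch represents one outcome of that test, and each leaf node represents a class or a class distribution. The topmost node of the tree is the root.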
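The structure described above can be sketched with plain dictionaries. This is only an illustration: the attribute names (`outlook`, `humidity`, `windy`) and the tree itself are hypothetical and hand-built, not learned from MNIST.

```python
# A hand-built toy tree (hypothetical attributes, not learned from data).
# Internal node: {"attribute": name, "branches": {value: subtree}}
# Leaf node: a plain string holding the class label.
toy_tree = {
    "attribute": "outlook",  # root node
    "branches": {
        "sunny": {
            "attribute": "humidity",
            "branches": {"high": "no", "normal": "yes"},
        },
        "overcast": "yes",
        "rainy": {
            "attribute": "windy",
            "branches": {"true": "no", "false": "yes"},
        },
    },
}


def classify(tree, sample):
    """Walk from the root to a leaf, following the branch that matches the sample."""
    node = tree
    while isinstance(node, dict):          # still at an internal node
        value = sample[node["attribute"]]  # test this node's attribute
        node = node["branches"][value]     # follow the branch for that outcome
    return node                            # reached a leaf: the class label


print(classify(toy_tree, {"outlook": "sunny", "humidity": "normal", "windy": "false"}))
```

Classification is a single walk from the root to a leaf, so its cost is bounded by the depth of the tree.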

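The ID3 algorithm chooses the attribute to split on by information gain: Gain(A) = Info(D) - Info_A(D), where Info(D) is the entropy of the data set D and Info_A(D) is the weighted entropy that remains after D is partitioned on attribute A.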
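As a small worked example of the formula, the sketch below computes Info(D), Info_A(D) and Gain(A) on a four-row toy data set. The attribute names `pixel_a` and `pixel_b` and the data are made up for illustration, and entropy is measured in bits (log base 2), which is an assumption since the post does not state the base.

```python
from collections import Counter
from math import log2


def info(labels):
    """Info(D): entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def info_after_split(rows, labels, attribute):
    """Info_A(D): entropy of each partition, weighted by partition size."""
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    total = len(labels)
    return sum(len(part) / total * info(part) for part in partitions.values())


def gain(rows, labels, attribute):
    """Gain(A) = Info(D) - Info_A(D)."""
    return info(labels) - info_after_split(rows, labels, attribute)


# Hypothetical toy data: pixel_a separates the classes perfectly, pixel_b not at all.
rows = [
    {"pixel_a": 1, "pixel_b": 0},
    {"pixel_a": 1, "pixel_b": 1},
    {"pixel_a": 0, "pixel_b": 0},
    {"pixel_a": 0, "pixel_b": 1},
]
labels = ["one", "one", "zero", "zero"]

for attribute in ("pixel_a", "pixel_b"):
    print(attribute, gain(rows, labels, attribute))
```

Here splitting on `pixel_a` gains one full bit, while `pixel_b` gains nothing, so ID3 would split on `pixel_a` first.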

# coding: utf-8
"""
python 3
sklearn 0.18
"""
from sklearn.model_selection import GridSearchCV  # imported in the original, not used below
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
import input_data  # MNIST loader module; input_data.py must be on the import path
import numpy as np
import pickle

mnist = input_data.read_data_sets('mnist/', one_hot=False)
x = mnist.train.images
y = mnist.train.labels

# Hold out 20% of the training set for validation
# (a single random split, not k-fold cross-validation)
train_data, validation_data, train_labels, validation_labels = train_test_split(x, y, test_size=0.2)

# Train a DecisionTreeClassifier. criterion='entropy' chooses splits by
# information gain, as ID3 does (scikit-learn's tree itself is a CART implementation).
clf = DecisionTreeClassifier(random_state=0, splitter='best', criterion='entropy')
clf.fit(train_data, train_labels)

predictions = []
for i in range(1000):
    if i % 100 == 0:
        print('= = = = = = > > > > > >', 'epoch:', int(i / 100))
    # predict() expects a 2-D array, so wrap the single image in a list
    output = clf.predict([mnist.test.images[i]])
    # Store the predicted label in predictions
    predictions.append(output[0])

print(confusion_matrix(mnist.test.labels[0:1000], predictions))
print(classification_report(mnist.test.labels[0:1000], np.array(predictions)))
print('test accuracy is:', accuracy_score(mnist.test.labels[0:1000], predictions))

# Save the trained classifier to disk
with open('id3.pickle', 'wb') as f:
    pickle.dump(clf, f)
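A pickled classifier can be loaded back later and used without retraining. The sketch below is a minimal round trip under some assumptions: it uses scikit-learn's bundled 8x8 `load_digits` data as a stand-in for MNIST so that it runs without `input_data`, and it writes to a temporary directory instead of the working directory.

```python
import os
import pickle
import tempfile

from sklearn.datasets import load_digits
from sklearn.tree import DecisionTreeClassifier

# Stand-in data set (assumption): the small digits set shipped with scikit-learn.
digits = load_digits()
clf = DecisionTreeClassifier(criterion='entropy', random_state=0)
clf.fit(digits.data, digits.target)

# Save the trained tree, then load it back from disk.
path = os.path.join(tempfile.mkdtemp(), 'id3.pickle')
with open(path, 'wb') as f:
    pickle.dump(clf, f)
with open(path, 'rb') as f:
    restored = pickle.load(f)

# The restored model should give exactly the same predictions as the original.
same = (restored.predict(digits.data) == clf.predict(digits.data)).all()
print('restored model agrees with original:', same)
```

Pickle files should only be loaded from trusted sources, and a model is best loaded with the same scikit-learn version that saved it.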

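For MNIST handwritten digit recognition, the ID3 approach reaches a test accuracy of 87%. A drawback of decision trees in classification is that errors grow quickly as the number of classes increases, and the deeper the tree, the more prone it is to overfitting.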
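One way to observe the effect of depth is to limit `max_depth` and compare training accuracy with held-out accuracy. The sketch below is illustrative only: it assumes scikit-learn's bundled `load_digits` data as a small stand-in for MNIST, and the depth values are arbitrary choices rather than settings from the post.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Stand-in data set (assumption): 8x8 digits bundled with scikit-learn.
digits = load_digits()
train_x, val_x, train_y, val_y = train_test_split(
    digits.data, digits.target, test_size=0.2, random_state=0
)

scores = {}
for depth in (3, 6, 10, None):  # None lets the tree grow until its leaves are pure
    tree = DecisionTreeClassifier(criterion='entropy', max_depth=depth, random_state=0)
    tree.fit(train_x, train_y)
    # Record (training accuracy, validation accuracy) for this depth
    scores[depth] = (tree.score(train_x, train_y), tree.score(val_x, val_y))
    print('max_depth =', depth, 'train = %.3f' % scores[depth][0], 'validation = %.3f' % scores[depth][1])
```

A widening gap between training and validation accuracy as the depth grows is the usual sign of overfitting. Capping `max_depth` or raising `min_samples_leaf` are common ways to limit it.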
