MNIST Study Notes

English tutorial:

Main code:

# Number of full mini-batches per epoch in each data split
n_train_batches = train_set_x.get_value(borrow=True).shape[0] // batch_size
n_valid_batches = valid_set_x.get_value(borrow=True).shape[0] // batch_size
n_test_batches = test_set_x.get_value(borrow=True).shape[0] // batch_size

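Take n_train_batches as an example. train_set_x.get_value(borrow=True).shape[0] is the number of samples in the training set: shape[0] is the number of rows of the matrix, i.e. the number of training samples, while shape[1] is the number of columns, i.e. the number of features per sample. batch_size is the number of samples in one mini-batch. Dividing the first by the second (floor division) gives the number of mini-batches to train on in one epoch.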
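The batch count can be checked with plain arithmetic. The sketch below uses illustrative numbers (a 50,000-sample set and batch_size = 600 are assumptions, not values stated in these notes) and shows that floor division leaves out any final partial batch.

```python
# Illustrative values only; neither number comes from these notes.
n_samples = 50000
batch_size = 600

# Floor division: number of full mini-batches per epoch.
n_train_batches = n_samples // batch_size

# Samples left over that do not fill a complete mini-batch.
n_leftover = n_samples - n_train_batches * batch_size

print(n_train_batches, n_leftover)
```

If the sample count is an exact multiple of batch_size, n_leftover is 0 and every sample is visited once per epoch.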

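import theano
import theano.tensor as T

index = T.lscalar()   # index of the current mini-batch
x = T.matrix('x')     # one flattened image per row
y = T.ivector('y')    # one integer class label per image
# LogisticRegression is the classifier class defined in the tutorial code
classifier = LogisticRegression(input=x, n_in=28 * 28, n_out=10)
cost = classifier.negative_log_likelihood(y)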

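index is the mini-batch index: it selects which mini-batch of the epoch is being processed. x is a matrix in which each row holds one image of 28*28 = 784 pixels, so the number of rows equals the number of samples supplied. y is the label vector. Because it is declared as an ivector, it holds one integer class label per sample, for example [0, 2] for a digit 0 followed by a digit 2; written in one-hot form, these labels correspond to [1,0,0,0,0,0,0,0,0,0] for the digit 0 and [0,0,1,0,0,0,0,0,0,0] for the digit 2. classifier is an instance of LogisticRegression, and cost is defined as the classifier's loss function. The goal of training is to make the value of cost as small as possible, using stochastic gradient descent.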
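To make the cost concrete, here is a minimal NumPy sketch of a mean negative log-likelihood on a toy batch. The function body, the toy probabilities, and the use of NumPy in place of Theano are assumptions for illustration; this is not the tutorial's implementation.

```python
import numpy as np

def negative_log_likelihood(p_y_given_x, y):
    """Mean negative log-probability assigned to the correct class.

    p_y_given_x: (n_samples, n_classes) array of class probabilities.
    y: (n_samples,) array of integer class labels.
    """
    rows = np.arange(y.shape[0])
    # Pick, for each row, the probability of that row's true class.
    return -np.mean(np.log(p_y_given_x[rows, y]))

# Toy batch: 2 samples, 3 classes (hypothetical numbers).
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.1, 0.8]])
labels = np.array([0, 2])

cost = negative_log_likelihood(probs, labels)
print(cost)
```

The cost shrinks toward 0 as the probability assigned to each true class approaches 1, which is why minimizing it improves the classifier.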

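# Each function returns the error on the index-th mini-batch of its data set.
# The givens dicts are missing in the source; they are reconstructed from the
# description below, and test_set_y / valid_set_y are assumed names.
test_model = theano.function(
    inputs=[index],
    outputs=classifier.errors(y),
    givens={x: test_set_x[index * batch_size:(index + 1) * batch_size],
            y: test_set_y[index * batch_size:(index + 1) * batch_size]}
)

validate_model = theano.function(
    inputs=[index],
    outputs=classifier.errors(y),
    givens={x: valid_set_x[index * batch_size:(index + 1) * batch_size],
            y: valid_set_y[index * batch_size:(index + 1) * batch_size]}
)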

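The input is index, and the output is the return value of the classifier's errors method, which takes y as its argument; the classifier itself takes x as its input. The givens keyword replaces the variable before each colon with the expression after it. Here, x and y are replaced by the index-th batch of the test data (each batch has batch_size samples). In plain words, test_model is a function that takes the image data x and the expected output y of the index-th test batch and returns the error. Inside errors, theano.tensor.neq(self.y_pred, y) flags, element by element, the samples where self.y_pred and y differ; in the tutorial code the mean of these flags is returned, i.e. the fraction of misclassified samples.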
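The batch selection and the error measure described above can be sketched in plain Python. The helper names batch_slice and error_rate and the sample data are hypothetical; the sketch only mirrors the slicing and the "mean of mismatches" idea and does not use Theano.

```python
def batch_slice(data, index, batch_size):
    """Return the index-th mini-batch of data."""
    return data[index * batch_size:(index + 1) * batch_size]

def error_rate(y_pred, y_true):
    """Fraction of positions where prediction and label differ."""
    mismatches = [int(p != t) for p, t in zip(y_pred, y_true)]
    return sum(mismatches) / len(mismatches)

# Hypothetical labels and predictions for six test samples.
test_labels = [3, 1, 4, 1, 5, 9]
predictions = [3, 1, 4, 0, 5, 2]
batch_size = 2

for index in range(len(test_labels) // batch_size):
    y_batch = batch_slice(test_labels, index, batch_size)
    p_batch = batch_slice(predictions, index, batch_size)
    print(index, error_rate(p_batch, y_batch))
```

Averaging the per-batch error rates over all batches gives the overall error on the data set when every batch has the same size.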

# T is theano.tensor; gradients of the cost with respect to each parameter
g_w = T.grad(cost=cost, wrt=classifier.W)
g_b = T.grad(cost=cost, wrt=classifier.b)

These lines compute the gradients used by the learning algorithm: T.grad(y, x) computes the gradient of y with respect to x.
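What "the gradient of y with respect to x" means can be checked numerically on a toy cost. Theano derives gradients symbolically; the finite-difference estimate below is a different technique, used here only as a sanity check, and the cost function is a hypothetical example.

```python
def cost(w):
    """Toy scalar cost with its minimum at w = 3 (hypothetical example)."""
    return (w - 3.0) ** 2

def numerical_grad(f, w, eps=1e-6):
    """Central-difference estimate of df/dw at w."""
    return (f(w + eps) - f(w - eps)) / (2.0 * eps)

w = 1.0
g_numeric = numerical_grad(cost, w)
g_exact = 2.0 * (w - 3.0)  # analytic derivative of (w - 3)^2

print(g_numeric, g_exact)
```

The negative sign of the gradient at w = 1 says that increasing w lowers the cost, which is the direction a gradient-descent step takes.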

# Gradient-descent step: each pair is (shared variable, its new value)
updates = [(classifier.W, classifier.W - learning_rate * g_w),
           (classifier.b, classifier.b - learning_rate * g_b)]

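updates is a list of length 2 in which every element is a tuple. Each time the function compiled by theano.function is called, the second element of each tuple is used to update the first.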
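The update rule can be traced by hand with scalars. In this sketch the parameter values, gradients, and learning rate are made-up numbers, and a plain dict stands in for Theano's shared variables.

```python
learning_rate = 0.1

# Hypothetical current parameter values and their gradients.
params = {"w": 2.0, "b": -1.0}
grads = {"w": 4.0, "b": -2.0}

# Same shape as the updates list: (name, new value) pairs.
# All new values are computed from the old parameters first.
updates = [(name, params[name] - learning_rate * grads[name])
           for name in ("w", "b")]

# Applying the pairs mirrors what happens on each call of the compiled function.
for name, new_value in updates:
    params[name] = new_value

print(params)
```

Each parameter moves against its gradient by learning_rate times the gradient, so w decreases and b increases here.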

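# The givens dict is missing in the source; it is reconstructed by analogy with
# the description of test_model, and train_set_y is an assumed name.
train_model = theano.function(
    inputs=[index],
    outputs=cost,
    updates=updates,
    givens={x: train_set_x[index * batch_size:(index + 1) * batch_size],
            y: train_set_y[index * batch_size:(index + 1) * batch_size]}
)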

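This is similar to test_model and validate_model, with two differences. It adds the updates argument, which specifies how certain parameters (W and b) are modified on every call to train_model, and its outputs is now cost, the loss function that training has to minimize.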
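The combination of "return the cost" and "update the parameters as a side effect" can be imitated with a closure. This is a sketch under stated assumptions: make_train_model, the one-parameter toy cost, and the step count are all hypothetical and stand in for the compiled Theano function.

```python
def make_train_model(w0, learning_rate):
    """Build a function that returns the cost and then updates w."""
    state = {"w": w0}

    def train_model():
        w = state["w"]
        cost = (w - 3.0) ** 2                  # toy cost, minimum at w = 3
        grad = 2.0 * (w - 3.0)                 # its derivative
        state["w"] = w - learning_rate * grad  # plays the role of updates
        return cost                            # plays the role of outputs

    return train_model, state

train_model, state = make_train_model(w0=0.0, learning_rate=0.1)
costs = [train_model() for _ in range(50)]

print(costs[0], costs[-1], state["w"])
```

Each call reports the cost at the current parameter value and then takes one gradient step, so the returned costs fall from call to call while w approaches 3.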

Notes on selected statements:

1. theano.shared: shared variables

# Weight matrix W, initialised to zeros and stored as a shared variable
self.W = theano.shared(
    value=numpy.zeros(
        (n_in, n_out),
        dtype=theano.config.floatX
    ),
    name='W',
    borrow=True
)

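The shared function turns the value into a shared variable: its value persists between calls and can be read and updated by every function that uses it;

numpy.zeros returns a two-dimensional zero matrix of shape (n_in, n_out), where n_in is the number of rows and n_out the number of columns;

dtype has to be set to theano.config.floatX so that the data can be used on the GPU;

The name parameter is a string used to identify the variable: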

import numpy
import theano

np_array = numpy.zeros(2, dtype='float32')
s_default = theano.shared(np_array, name='s_default')
print('s_default.name:', s_default.name)

Output:

s_default.name: s_default

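The borrow=True/False parameter: whether later in-place changes to the original array do or do not show up in the shared variable. With borrow=False (the default) the shared variable keeps its own copy; with borrow=True it may reuse the array's memory: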

import numpy
import theano

np_array = numpy.zeros(2, dtype='float32')
s_default = theano.shared(np_array)
s_false = theano.shared(np_array, borrow=False)
s_true = theano.shared(np_array, borrow=True)
np_array += 1  # in-place change to the original array
print('s_default:', s_default.get_value())
print('s_false:', s_false.get_value())
print('s_true:', s_true.get_value())

Output:

s_default: [ 0.  0.]
s_false: [ 0.  0.]
s_true: [ 1.  1.]
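The same copy-versus-alias distinction can be reproduced in plain NumPy, without Theano. This is only an analogy: the names copied and aliased are hypothetical, and the mapping to borrow=False and borrow=True reflects the behaviour shown in the output above.

```python
import numpy as np

np_array = np.zeros(2, dtype='float32')

copied = np_array.copy()   # independent buffer, like borrow=False
aliased = np_array         # same buffer, like borrow=True in the run above

np_array += 1              # in-place change to the original array

print('copied:', copied)
print('aliased:', aliased)
print('shares memory:', np.shares_memory(np_array, aliased))
```

Because aliased and np_array refer to the same memory, the in-place addition is visible through both names, while copied keeps the original zeros.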
