Backpropagation and Gradient Descent


Backpropagation: starting from the output layer and working backward, compute layer by layer the partial derivatives of the loss function with respect to each layer's neuron parameters, and use them to iteratively update all parameters.
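To make "layer by layer" concrete, here is a minimal sketch (not from the original post) of a tiny two-parameter "network" y = w2*(w1*x), where tf.GradientTape applies the chain rule backward through both layers; the values of w1, w2, and x are arbitrary illustrations:

import tensorflow as tf

w1 = tf.Variable(2.0)   # "hidden layer" weight
w2 = tf.Variable(3.0)   # "output layer" weight
x = tf.constant(1.5)    # a single input sample

with tf.GradientTape() as tape:
    h = w1 * x                    # hidden-layer output
    y = w2 * h                    # output-layer output
    loss = tf.square(y - 1.0)     # squared error against a target of 1.0

# tape.gradient propagates the error backward through y and h (the chain rule),
# yielding the partial derivative of the loss with respect to each layer's parameter.
grads = tape.gradient(loss, [w1, w2])
print([g.numpy() for g in grads])  # [72.0, 48.0]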

The goal of training a network is to keep optimizing its parameters so that the loss function is minimized. We achieve this with gradient descent (a function decreases fastest along the negative gradient direction):

In this example the loss function is loss(w) = (w + 1)^2, so its gradient is d(loss)/dw = 2w + 2 and each update is w = w - lr*(2w + 2). Initialize the parameter w to 5 with learning rate lr = 0.2; the first few updates are:

Iteration 1: w = 5,     5 - 0.2*(2*5 + 2) = 2.6

Iteration 2: w = 2.6,   2.6 - 0.2*(2*2.6 + 2) = 1.16

Iteration 3: w = 1.16,  1.16 - 0.2*(2*1.16 + 2) = 0.296

Iteration 4: w = 0.296, 0.296 - 0.2*(2*0.296 + 2) = -0.2224
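As a check on the arithmetic, the same updates can be reproduced in plain Python (a minimal sketch; the loss (w + 1)^2, the initial value 5, and the learning rate 0.2 are taken from the example above):

w = 5.0
lr = 0.2
for step in range(1, 5):
    grad = 2 * w + 2              # derivative of (w + 1)^2 with respect to w
    w = w - lr * grad             # gradient descent update
    print(step, round(w, 4))      # 2.6, 1.16, 0.296, -0.2224

The TensorFlow version of the same loop follows.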

import tensorflow as tf

tf.enable_eager_execution()  # enable eager execution (needed under TensorFlow 1.x; TensorFlow 2.x runs eagerly by default)

w = tf.Variable(tf.constant(5, dtype=tf.float32))
lr = 0.2
epoch = 40

for epoch in range(epoch):  # top-level training loop; the "dataset" here is just the single parameter w, initialized to 5, iterated 40 times
    with tf.GradientTape() as tape:  # the with block records the operations needed to compute the gradient
        loss = tf.square(w + 1)
    grads = tape.gradient(loss, w)  # .gradient tells the tape what to differentiate with respect to what

    w.assign_sub(lr * grads)  # .assign_sub subtracts in place: w -= lr*grads, i.e. w = w - lr*grads
    print("after %s epoch,w is %f,loss is %f" % (epoch, w.numpy(), loss))  # loss is the value before this step's update

# initial lr: 0.2; try changing the learning rate to 0.001 or 0.999 and observe the convergence behavior
# goal: find the parameter w that minimizes loss, i.e. w = -1

# Program output:

after 0 epoch,w is 2.600000,loss is 36.000000

after 1 epoch,w is 1.160000,loss is 12.959999

after 2 epoch,w is 0.296000,loss is 4.665599

after 3 epoch,w is -0.222400,loss is 1.679616

after 4 epoch,w is -0.533440,loss is 0.604662

after 5 epoch,w is -0.720064,loss is 0.217678

after 6 epoch,w is -0.832038,loss is 0.078364

after 7 epoch,w is -0.899223,loss is 0.028211

after 8 epoch,w is -0.939534,loss is 0.010156

after 9 epoch,w is -0.963720,loss is 0.003656

after 10 epoch,w is -0.978232,loss is 0.001316

after 11 epoch,w is -0.986939,loss is 0.000474

after 12 epoch,w is -0.992164,loss is 0.000171

after 13 epoch,w is -0.995298,loss is 0.000061

after 14 epoch,w is -0.997179,loss is 0.000022

after 15 epoch,w is -0.998307,loss is 0.000008

after 16 epoch,w is -0.998984,loss is 0.000003

after 17 epoch,w is -0.999391,loss is 0.000001

after 18 epoch,w is -0.999634,loss is 0.000000

after 19 epoch,w is -0.999781,loss is 0.000000

after 20 epoch,w is -0.999868,loss is 0.000000

after 21 epoch,w is -0.999921,loss is 0.000000

after 22 epoch,w is -0.999953,loss is 0.000000

after 23 epoch,w is -0.999972,loss is 0.000000

after 24 epoch,w is -0.999983,loss is 0.000000

after 25 epoch,w is -0.999990,loss is 0.000000

after 26 epoch,w is -0.999994,loss is 0.000000

after 27 epoch,w is -0.999996,loss is 0.000000

after 28 epoch,w is -0.999998,loss is 0.000000

after 29 epoch,w is -0.999999,loss is 0.000000

after 30 epoch,w is -0.999999,loss is 0.000000

after 31 epoch,w is -1.000000,loss is 0.000000

after 32 epoch,w is -1.000000,loss is 0.000000

after 33 epoch,w is -1.000000,loss is 0.000000

after 34 epoch,w is -1.000000,loss is 0.000000

after 35 epoch,w is -1.000000,loss is 0.000000

after 36 epoch,w is -1.000000,loss is 0.000000

after 37 epoch,w is -1.000000,loss is 0.000000

after 38 epoch,w is -1.000000,loss is 0.000000

after 39 epoch,w is -1.000000,loss is 0.000000
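The comment in the code suggests trying other learning rates. Below is a minimal sketch of such a sweep (the rates 0.001 and 0.999 come from that comment; everything else mirrors the loop above, and it assumes eager execution, i.e. TensorFlow 2.x or tf.enable_eager_execution() as before):

import tensorflow as tf

for lr in (0.001, 0.2, 0.999):
    w = tf.Variable(5.0)
    for epoch in range(40):
        with tf.GradientTape() as tape:
            loss = tf.square(w + 1)
        w.assign_sub(lr * tape.gradient(loss, w))
    print("lr=%.3f final w=%f" % (lr, w.numpy()))

With lr = 0.2, w reaches -1 within the 40 epochs; with lr = 0.001 it moves toward -1 only very slowly; and with lr = 0.999 each step overshoots, so w oscillates around -1 and also converges very slowly.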

If there are any errors in the above, corrections and feedback are welcome. Thank you!
