Udacity Deep Learning: Dropout

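Dropout is used in deep learning to prevent overfitting. So what exactly is overfitting in machine learning? For details, see Chapter 2, "Model Evaluation and Selection", of Zhou Zhihua's Machine Learning (the "watermelon book"), together with the explanation of why L1 regularization can perform feature selection by driving coefficients to 0 (with L2 regularization discussed along similar lines).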

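Shown as a picture, it looks like this:

Dropout is a regularization technique for reducing overfitting. It temporarily drops some units (neurons) from the network, along with all of their incoming and outgoing connections. Figure 1 illustrates how dropout works.

TensorFlow provides the tf.nn.dropout() function, which you can use to implement dropout.

Let's look at an example of how to use tf.nn.dropout().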

keep_prob = tf.placeholder(tf.float32) # probability to keep units
hidden_layer = tf.add(tf.matmul(features, weights[0]), biases[0])
hidden_layer = tf.nn.relu(hidden_layer)
hidden_layer = tf.nn.dropout(hidden_layer, keep_prob)
logits = tf.add(tf.matmul(hidden_layer, weights[1]), biases[1])

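The code above shows how to apply dropout in a neural network.

In this example, tf.nn.dropout() takes two arguments:

hidden_layer: the tensor to which dropout is applied

keep_prob: the probability that any given unit is kept (that is, not dropped)

keep_prob lets you adjust how many units are dropped. To compensate for the dropped units, tf.nn.dropout() multiplies every unit that is kept by 1/keep_prob.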
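To make the 1/keep_prob rescaling concrete, here is a minimal pure-Python sketch of the same idea. It is only an illustration under simple assumptions (a flat list of activations, one independent coin flip per unit), not TensorFlow's actual implementation; the function name dropout and the rng argument are made up for this example.

```python
import random

def dropout(units, keep_prob, rng):
    """Illustrative dropout on a list of activations (not TensorFlow's code)."""
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    out = []
    for u in units:
        if rng.random() < keep_prob:
            out.append(u / keep_prob)  # kept unit: scaled up by 1/keep_prob
        else:
            out.append(0.0)            # dropped unit: contributes nothing
    return out

rng = random.Random(0)
hidden = [1.0, 2.0, 3.0, 4.0]

# With keep_prob = 0.5, each entry is either 0.0 or twice the original value.
print(dropout(hidden, 0.5, rng))

# With keep_prob = 1.0, nothing is dropped and nothing is rescaled.
print(dropout(hidden, 1.0, rng))
```

Which units end up as zero depends on the random draws, so only the pattern (zero or doubled) is fixed, not the positions.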

During training, a good starting value for keep_prob is 0.5.

During testing, set keep_prob to 1.0. This keeps all units and maximizes the power of the model.
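One way to see why the training and testing settings fit together: because kept units are scaled by 1/keep_prob, the average of many noisy training-time passes is close to the single deterministic pass obtained with keep_prob = 1.0. The sketch below checks this empirically in plain Python. The helper names dropout and mean_dropout_output are assumptions made for this example, and the number of trials is arbitrary.

```python
import random

def dropout(units, keep_prob, rng):
    """Illustrative dropout: keep each unit with probability keep_prob and rescale it."""
    return [u / keep_prob if rng.random() < keep_prob else 0.0 for u in units]

def mean_dropout_output(units, keep_prob, trials, seed=0):
    """Average the dropout output over many independent passes."""
    rng = random.Random(seed)
    totals = [0.0] * len(units)
    for _ in range(trials):
        for i, value in enumerate(dropout(units, keep_prob, rng)):
            totals[i] += value
    return [t / trials for t in totals]

hidden = [1.0, 2.0, 3.0]

# Training-style passes: noisy individually, close to `hidden` on average.
train_mean = mean_dropout_output(hidden, keep_prob=0.5, trials=20000)

# Test-style pass: keep_prob = 1.0 returns the activations unchanged.
test_pass = dropout(hidden, 1.0, random.Random(0))

print(train_mean)
print(test_pass)
```

The first printed list matches the second only up to sampling noise, which shrinks as the number of trials grows.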

What is wrong with the code below?

The syntax is fine, but the test accuracy is very low.

keep_prob = tf.placeholder(tf.float32) # probability to keep units
hidden_layer = tf.add(tf.matmul(features, weights[0]), biases[0])
hidden_layer = tf.nn.relu(hidden_layer)
hidden_layer = tf.nn.dropout(hidden_layer, keep_prob)
logits = tf.add(tf.matmul(hidden_layer, weights[1]), biases[1])
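with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for epoch_i in range(epochs):
        for batch_i in range(batches):
            sess.run(optimizer, feed_dict={
                features: batch_features,
                labels: batch_labels,
                keep_prob: 0.5})
    validation_accuracy = sess.run(accuracy, feed_dict={
        features: test_features,
        labels: test_labels,
        keep_prob: 0.5})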

Reason: keep_prob should be set to 1.0 when evaluating validation or test accuracy.
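A simple way to avoid this mistake is to decide keep_prob in one place, based on whether the run is a training step or an evaluation. The sketch below is framework-free: plain strings stand in for the TensorFlow placeholders, and the names build_feed, TRAIN_KEEP_PROB and EVAL_KEEP_PROB are hypothetical helpers invented for this illustration.

```python
TRAIN_KEEP_PROB = 0.5  # starting value suggested in the text for training
EVAL_KEEP_PROB = 1.0   # keep every unit when validating or testing

def build_feed(batch_features, batch_labels, training):
    """Build a feed dictionary; string keys stand in for TensorFlow placeholders."""
    return {
        "features": batch_features,
        "labels": batch_labels,
        "keep_prob": TRAIN_KEEP_PROB if training else EVAL_KEEP_PROB,
    }

# Toy data, only to show the two call sites.
train_feed = build_feed([[0.1, 0.2]], [[1.0, 0.0]], training=True)
valid_feed = build_feed([[0.3, 0.4]], [[0.0, 1.0]], training=False)

print(train_feed["keep_prob"], valid_feed["keep_prob"])
```

In a real TensorFlow 1.x script the dictionary keys would be the placeholder objects themselves, and the result would be passed as feed_dict to sess.run().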

# Quiz Solution
# Note: You can't run code in this tab
# Uses the TensorFlow 1.x graph API (tf.placeholder, tf.Session)
import tensorflow as tf

hidden_layer_weights = [
    [0.1, 0.2, 0.4],
    [0.4, 0.6, 0.6],
    [0.5, 0.9, 0.1],
    [0.8, 0.2, 0.8]]
out_weights = [
    [0.1, 0.6],
    [0.2, 0.1],
    [0.7, 0.9]]

# Weights and biases
weights = [
    tf.Variable(hidden_layer_weights),
    tf.Variable(out_weights)]
biases = [
    tf.Variable(tf.zeros(3)),
    tf.Variable(tf.zeros(2))]

# Input
features = tf.Variable([[0.0, 2.0, 3.0, 4.0], [0.1, 0.2, 0.3, 0.4], [11.0, 12.0, 13.0, 14.0]])

# TODO: Create model with dropout
keep_prob = tf.placeholder(tf.float32)
hidden_layer = tf.add(tf.matmul(features, weights[0]), biases[0])
hidden_layer = tf.nn.relu(hidden_layer)
hidden_layer = tf.nn.dropout(hidden_layer, keep_prob)
logits = tf.add(tf.matmul(hidden_layer, weights[1]), biases[1])

# TODO: Print logits from a session
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(logits, feed_dict={keep_prob: 0.5}))

Output (dropout is random, so the exact values change from run to run):
[[ 1.1 6.6000004]
[ 0.9100001 1.016 ]
[33.74 43.38 ]]
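The printed logits can be reproduced by hand, which is a useful check on how the 1/keep_prob scaling works. The pure-Python sketch below reuses the quiz weights and inputs, leaves out the biases because they are all zero, and assumes keep_prob = 0.5 together with one particular drop pattern per input row. The masks are inferred from the printed numbers, not taken from TensorFlow, and the helper names matmul and relu are made up for this example.

```python
hidden_layer_weights = [
    [0.1, 0.2, 0.4],
    [0.4, 0.6, 0.6],
    [0.5, 0.9, 0.1],
    [0.8, 0.2, 0.8]]
out_weights = [
    [0.1, 0.6],
    [0.2, 0.1],
    [0.7, 0.9]]
features = [
    [0.0, 2.0, 3.0, 4.0],
    [0.1, 0.2, 0.3, 0.4],
    [11.0, 12.0, 13.0, 14.0]]

def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def relu(m):
    """Element-wise max(0, v)."""
    return [[max(0.0, v) for v in row] for row in m]

keep_prob = 0.5
# Assumed masks, 1 = kept and 0 = dropped, one row per input example.
masks = [
    [1, 0, 0],
    [0, 1, 1],
    [0, 0, 1]]

hidden = relu(matmul(features, hidden_layer_weights))
# Zero the dropped units and scale the kept ones by 1/keep_prob.
dropped = [[h * m / keep_prob for h, m in zip(h_row, m_row)]
           for h_row, m_row in zip(hidden, masks)]
logits = matmul(dropped, out_weights)

for row in logits:
    print([round(v, 3) for v in row])
```

With these masks the three rows agree with the output shown above to within floating-point rounding; a different random mask would give different logits.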
