TensorFlow notes: softmax and cross-entropy

The softmax function

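The softmax function takes an n-dimensional vector as input and maps each component to a real number in the interval (0, 1). Suppose the fully connected layer of the model outputs a for c classes, that is, a_1, a_2, ..., a_c. For each sample, the output probability of class i is:

$$y_i = \frac{e^{a_i}}{\sum_{k=1}^{c} e^{a_k}}$$

The probabilities over all classes sum to 1.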

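Consider an illustrative example.

The original inputs 3, 1, -3 are mapped by softmax to values in (0, 1) whose sum is 1. They therefore satisfy the properties of a probability distribution and can be read as probabilities. When choosing the output node, we pick the node with the largest probability as the predicted class.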
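The example above can be reproduced in a few lines of NumPy. This is a minimal sketch of the definition only; the names softmax, scores, and probs are chosen here for illustration.

```python
import numpy as np

def softmax(a):
    """Map a vector of raw scores to probabilities that sum to 1."""
    exp_a = np.exp(a)
    return exp_a / np.sum(exp_a)

scores = np.array([3.0, 1.0, -3.0])   # the three inputs from the example
probs = softmax(scores)

print(np.round(probs, 2))      # roughly 0.88, 0.12, 0.00
print(probs.sum())             # 1.0 up to floating-point rounding
print(int(np.argmax(probs)))   # index of the predicted class: 0
```

The largest input (3) receives most of the probability mass, so index 0 is selected as the prediction.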

Derivative of softmax

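Writing $y_i = e^{a_i} / \sum_k e^{a_k}$ for the softmax output, differentiating softmax means computing $\partial y_i / \partial a_j$.

When i = j:

$$\frac{\partial y_i}{\partial a_j} = y_i (1 - y_i)$$

When i ≠ j:

$$\frac{\partial y_i}{\partial a_j} = -y_i y_j$$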
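Both cases combine into one matrix, J[i, j] = y_i (δ_ij - y_j), where y is the softmax output. The sketch below checks that closed form against central finite differences. The helper names softmax_jacobian and numeric_jacobian are hypothetical and exist only for this check.

```python
import numpy as np

def softmax(a):
    exp_a = np.exp(a)
    return exp_a / np.sum(exp_a)

def softmax_jacobian(a):
    """Analytic Jacobian: J[i, j] = y_i * (delta_ij - y_j)."""
    y = softmax(a)
    return np.diag(y) - np.outer(y, y)

def numeric_jacobian(a, eps=1e-6):
    """Central finite differences, used only as a cross-check."""
    n = a.size
    jac = np.zeros((n, n))
    for j in range(n):
        step = np.zeros(n)
        step[j] = eps
        # Column j holds the change of every output when input j moves.
        jac[:, j] = (softmax(a + step) - softmax(a - step)) / (2 * eps)
    return jac

a = np.array([3.0, 1.0, -3.0])
analytic = softmax_jacobian(a)
numeric = numeric_jacobian(a)
print(np.allclose(analytic, numeric, atol=1e-8))
```

The diagonal entries are y_i (1 - y_i), the off-diagonal entries are -y_i y_j, and each column sums to zero because the outputs always sum to 1.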

Numerical stability of softmax

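With the input [1, 2, 3, 4, 5], a direct implementation of softmax returns valid probabilities.

With the input [1000, 2000, 3000, 4000, 5000], the output is nan.

The nan comes from overflow: exp(x) exceeds the floating-point range when x is large. A common fix is to subtract the maximum of x before exponentiating, which leaves the result unchanged but keeps every exponent at or below 0. For example: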

import numpy as np

def softmax(x):
    # Subtract the maximum so the largest exponent is exp(0) = 1 (no overflow).
    shift_x = x - np.max(x)
    exp_x = np.exp(shift_x)
    return exp_x / np.sum(exp_x)
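The shift is safe because multiplying the numerator and the denominator by e^(-max) does not change their ratio. The following sketch runs both inputs from this section through a direct version and the shifted version. The names naive_softmax and stable_softmax are chosen here for the comparison.

```python
import numpy as np

def naive_softmax(x):
    """Direct definition; overflows once exp(x) exceeds the float range."""
    exp_x = np.exp(x)
    return exp_x / np.sum(exp_x)

def stable_softmax(x):
    """Same result mathematically, but the largest exponent is exp(0) = 1."""
    shift_x = x - np.max(x)
    exp_x = np.exp(shift_x)
    return exp_x / np.sum(exp_x)

small = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
large = np.array([1000.0, 2000.0, 3000.0, 4000.0, 5000.0])

# Silence the overflow/invalid warnings that the naive version triggers.
with np.errstate(over="ignore", invalid="ignore"):
    naive_large = naive_softmax(large)

print(naive_softmax(small))    # valid probabilities
print(naive_large)             # all nan (inf / inf)
print(stable_softmax(large))   # [0. 0. 0. 0. 1.]
```

On the small input the two versions agree. On the large input only the shifted version returns a usable distribution.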

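Cross-entropy measures how close the actual output is to the expected output. It characterizes the distance between the two: the smaller the cross-entropy, the closer the two probability distributions. Let the distribution p be the expected output, the distribution q the actual output, and H(p, q) the cross-entropy. Then:

$$H(p, q) = -\sum_x p(x) \log q(x)$$

Or, equivalently:

$$H(p, q) = \sum_x p(x) \log \frac{1}{q(x)}$$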
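A small numeric example makes the "smaller means closer" reading concrete. This sketch assumes the natural logarithm and a one-hot expected output. The vectors q_close and q_far are made-up values for illustration.

```python
import numpy as np

def cross_entropy(p, q):
    """H(p, q) = -sum_x p(x) * log(q(x)) with the natural logarithm."""
    return -np.sum(p * np.log(q))

p = np.array([1.0, 0.0, 0.0])          # expected output (one-hot)
q_close = np.array([0.7, 0.2, 0.1])    # actual output near p
q_far = np.array([0.1, 0.2, 0.7])      # actual output far from p

print(cross_entropy(p, q_close))   # -log(0.7), about 0.357
print(cross_entropy(p, q_far))     # -log(0.1), about 2.303
```

With a one-hot p, only the probability that q assigns to the true class contributes, so the loss reduces to the negative log of that single entry.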

TensorFlow offers two ways to compute cross-entropy.

1. Manual implementation:

import tensorflow as tf  # TensorFlow 1.x API (tf.placeholder)

# Inputs: flattened 28x28 images and one-hot labels for 10 classes.
input = tf.placeholder(dtype=tf.float32, shape=[None, 28*28])
output = tf.placeholder(dtype=tf.float32, shape=[None, 10])
# First fully connected layer.
w_fc1 = tf.Variable(tf.truncated_normal([28*28, 1024], stddev=0.1))
b_fc1 = tf.Variable(tf.constant(0.1, shape=[1024]))
h_fc1 = tf.matmul(input, w_fc1) + b_fc1
# Second fully connected layer.
w_fc2 = tf.Variable(tf.truncated_normal([1024, 10], stddev=0.1))
b_fc2 = tf.Variable(tf.constant(0.1, shape=[10]))
# Despite its name, logits here already holds softmax probabilities.
logits = tf.nn.softmax(tf.matmul(h_fc1, w_fc2) + b_fc2)
cross_entropy = -tf.reduce_sum(output * tf.log(logits))

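Here output holds the one-hot ground-truth labels, and logits is the prediction obtained by converting the fully connected output to probabilities with softmax. The cross-entropy is then computed as cross_entropy = -tf.reduce_sum(output * tf.log(logits)).

2. tf.nn.softmax_cross_entropy_with_logits: TensorFlow already wraps softmax and cross-entropy in a single function: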

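import tensorflow as tf  # TensorFlow 1.x API (tf.placeholder)

# Inputs: flattened 28x28 images and one-hot labels for 10 classes.
input = tf.placeholder(dtype=tf.float32, shape=[None, 28*28])
output = tf.placeholder(dtype=tf.float32, shape=[None, 10])
# First fully connected layer.
w_fc1 = tf.Variable(tf.truncated_normal([28*28, 1024], stddev=0.1))
b_fc1 = tf.Variable(tf.constant(0.1, shape=[1024]))
h_fc1 = tf.matmul(input, w_fc1) + b_fc1
# Second fully connected layer; its raw scores are the logits (no softmax here).
w_fc2 = tf.Variable(tf.truncated_normal([1024, 10], stddev=0.1))
b_fc2 = tf.Variable(tf.constant(0.1, shape=[10]))
logits = tf.matmul(h_fc1, w_fc2) + b_fc2
# The op already returns the non-negative per-example cross-entropy,
# so the results are summed without an extra minus sign.
cross_entropy = tf.reduce_sum(tf.nn.softmax_cross_entropy_with_logits(labels=output, logits=logits))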

The function applies softmax to its logits argument internally, so the value passed in must be the raw scores, not the output of softmax.
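To see what goes wrong when probabilities are passed in by mistake, the following NumPy sketch (not TensorFlow itself) compares the loss with softmax applied once against the loss with softmax applied twice. The label and logits values are made up for illustration.

```python
import numpy as np

def softmax(x):
    exp_x = np.exp(x - np.max(x))
    return exp_x / np.sum(exp_x)

def cross_entropy(p, q):
    return -np.sum(p * np.log(q))

label = np.array([1.0, 0.0, 0.0])      # one-hot target (illustrative)
logits = np.array([3.0, 1.0, -3.0])    # raw network output (illustrative)

correct = cross_entropy(label, softmax(logits))
double = cross_entropy(label, softmax(softmax(logits)))

print(correct)   # loss with softmax applied once
print(double)    # larger here: the second softmax flattens the distribution
```

No error is raised in the double-softmax case; the loss is simply computed on a distorted distribution, which makes the mistake easy to miss.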

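The official wrapper handles numerical instability and similar issues internally. With method 1, you have to add the stabilizing trick to the softmax computation yourself.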
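One common form of that trick is to compute the log of softmax directly, using log_softmax(z) = z - max(z) - log(sum(exp(z - max(z)))), so that neither exp nor log sees an extreme value. The NumPy sketch below illustrates the idea under that assumption. It is not a description of TensorFlow's internal implementation, and naive_loss and stable_loss are hypothetical names.

```python
import numpy as np

def naive_loss(labels, logits):
    """Method 1 in NumPy: softmax, then log, then the weighted sum."""
    exp_l = np.exp(logits)
    probs = exp_l / np.sum(exp_l)
    return -np.sum(labels * np.log(probs))

def stable_loss(labels, logits):
    """Fused form: log_softmax(z) = z - max(z) - log(sum(exp(z - max(z))))."""
    shifted = logits - np.max(logits)
    log_probs = shifted - np.log(np.sum(np.exp(shifted)))
    return -np.sum(labels * log_probs)

labels = np.array([0.0, 0.0, 0.0, 1.0, 0.0])   # true class is index 3
logits = np.array([1000.0, 2000.0, 3000.0, 4000.0, 5000.0])

# Silence the overflow/invalid warnings raised by the naive version.
with np.errstate(all="ignore"):
    print(naive_loss(labels, logits))   # nan
print(stable_loss(labels, logits))      # 1000.0
```

Working in log space also avoids taking log(0) when a probability underflows to zero, which is the other way a hand-written loss typically fails.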
