Advantages and disadvantages of softmax

Advantages of the exponential form

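The exponential in the softmax function widens the gaps between values: the larger the difference between two inputs, the further apart their outputs end up.

In deep learning, gradients are usually computed by backpropagation and the parameters are then updated by gradient descent, and the exponential function is convenient to differentiate. The example below compares plain normalization with softmax on the same input: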

import tensorflow as tf
print(tf.__version__)
# 2.0.0
a = tf.constant([2, 3, 5], dtype=tf.float32)
# Normalize without the exponential
b1 = a / tf.reduce_sum(a)
print(b1)
# tf.Tensor([0.2 0.3 0.5], shape=(3,), dtype=float32)
# Softmax, which uses the exponential
b2 = tf.nn.softmax(a)
print(b2)
# tf.Tensor([0.04201007 0.11419519 0.8437947 ], shape=(3,), dtype=float32)
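To make the point about differentiation concrete: because the derivative of e^x is e^x itself, the derivative of softmax output s_i with respect to input z_j reduces to s_i * (delta_ij - s_j), where delta_ij is 1 when i = j and 0 otherwise. The following is a minimal sketch in plain Python, not taken from the original post, that checks this closed form against finite differences. The function names are made up for illustration.

```python
import math

def softmax(z):
    """Softmax of a list of floats (no overflow guard; small inputs only)."""
    exps = [math.exp(v) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_jacobian(z):
    """Closed-form Jacobian: J[i][j] = s_i * (delta_ij - s_j)."""
    s = softmax(z)
    n = len(s)
    return [[s[i] * ((1.0 if i == j else 0.0) - s[j]) for j in range(n)]
            for i in range(n)]

def numeric_jacobian(z, eps=1e-6):
    """Central finite differences, used only to check the closed form."""
    n = len(z)
    jac = [[0.0] * n for _ in range(n)]
    for j in range(n):
        up = list(z)
        up[j] += eps
        down = list(z)
        down[j] -= eps
        s_up, s_down = softmax(up), softmax(down)
        for i in range(n):
            jac[i][j] = (s_up[i] - s_down[i]) / (2 * eps)
    return jac

z = [2.0, 3.0, 5.0]
analytic = softmax_jacobian(z)
numeric = numeric_jacobian(z)
# Largest entrywise disagreement between the two Jacobians
max_diff = max(abs(analytic[i][j] - numeric[i][j])
               for i in range(3) for j in range(3))
print(max_diff)
```

The Jacobian needs nothing beyond the softmax outputs themselves, which is why the backward pass is cheap once the forward pass has been computed.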

Disadvantages of the exponential form

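The steadily increasing slope of the exponential curve is what pulls the output values apart, but it also brings a drawback: when a value z_i is very large, its exponential is also very large, and the computation may overflow.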

import numpy as np
scores = np.array([123, 456, 789])
# exp(789) overflows to inf, so the last entry becomes inf / inf = nan
softmax = np.exp(scores) / np.sum(np.exp(scores))
print(softmax)
# [ 0.  0. nan]

There is a standard remedy for this overflow: subtract the largest output value from every output value before exponentiating.
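Subtracting the maximum m does not change the result, because e^(z_i - m) / sum_j e^(z_j - m) equals e^(z_i) / sum_j e^(z_j): the common factor e^(-m) cancels. A minimal sketch of the trick in plain Python follows. The function name `stable_softmax` is an illustrative choice, not from the original post.

```python
import math

def stable_softmax(scores):
    """Softmax with the maximum subtracted before exponentiating."""
    m = max(scores)
    # Every exponent is <= 0, so math.exp cannot overflow here
    exps = [math.exp(s - m) for s in scores]
    # The sum is at least 1 because exp(0) = 1 is always one of the terms
    total = sum(exps)
    return [e / total for e in exps]

scores = [123.0, 456.0, 789.0]
probs = stable_softmax(scores)
print(probs)  # all entries finite; the last one dominates

small = stable_softmax([2.0, 3.0, 5.0])
print(small)  # matches the ordinary softmax of [2, 3, 5]
```

With the same scores that produced `nan` above, every entry is now finite. The two smaller probabilities are vanishingly small rather than undefined.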

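One point needs attention here. When softmax is used as the activation function of the output nodes, cross-entropy is generally used as the loss function. The softmax computation overflows easily when the output nodes produce large values, and the cross-entropy computation can run into numerical overflow as well. For numerical stability, TensorFlow provides a unified interface that implements softmax and the cross-entropy loss together and also handles the numerical instability. When using TensorFlow, it is generally recommended to use this unified interface rather than calling softmax and the cross-entropy loss separately.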

The unified functional interface provided by TensorFlow is:

import tensorflow as tf

print(tf.__version__)  # 2.0.0

tf.keras.losses.categorical_crossentropy(y_true, y_pred, from_logits=False)

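Here y_true is the one-hot encoded true labels and y_pred is the network's actual predicted values:

When from_logits is set to True, y_pred is the output before the softmax function has been applied;

When from_logits is set to False, y_pred is the output after the softmax function has been applied.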

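To keep the softmax computation numerically stable, from_logits is generally set to True. In that case tf.keras.losses.categorical_crossentropy computes softmax internally, so there is no need to add a softmax activation function to the output nodes.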

import tensorflow as tf
print(tf.__version__)
# Output values for 2 samples over 10 classes
z = tf.random.normal([2, 10])
# The true labels of the two samples are 1 and 3
y = tf.constant([1, 3])
# Build the one-hot encoding
y_true = tf.one_hot(y, depth=10)
# The output layer has no softmax activation, so set from_logits to True
loss1 = tf.keras.losses.categorical_crossentropy(y_true, z, from_logits=True)
loss1 = tf.reduce_mean(loss1)
print(loss1)
# tf.Tensor(2.6680193, shape=(), dtype=float32)  (value depends on the random z)
y_pred = tf.nn.softmax(z)
# The output has passed through softmax, so set from_logits to False
loss2 = tf.keras.losses.categorical_crossentropy(y_true, y_pred, from_logits=False)
loss2 = tf.reduce_mean(loss2)
print(loss2)
# tf.Tensor(2.668019, shape=(), dtype=float32)

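The two procedures above give almost the same result, but when abnormal values occur, TensorFlow applies some optimizations when from_logits is set to True. It is therefore recommended to use the unified interface with the from_logits parameter set to True.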
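One way to see why working directly from the raw outputs is more robust is the log-sum-exp identity: -log softmax(z)[k] = log(sum_j e^(z_j)) - z_k. With this identity, the probability is never formed and then passed through a logarithm. The plain-Python sketch below illustrates the idea under that assumption. It is not TensorFlow's actual implementation, and the function name is hypothetical.

```python
import math

def cross_entropy_from_logits(logits, label):
    """-log softmax(logits)[label], computed via log-sum-exp."""
    m = max(logits)
    # Shift by the maximum so that no exponent is positive
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum_exp - logits[label]

logits = [0.0, 1000.0]
label = 0

# Combined computation: stays finite
fused = cross_entropy_from_logits(logits, label)
print(fused)  # 1000.0

# Separate computation: softmax first, then the logarithm
m = max(logits)
exps = [math.exp(v - m) for v in logits]
probs = [e / sum(exps) for e in exps]  # probs[0] underflows to 0.0
try:
    separate = -math.log(probs[label])
except ValueError:
    # log(0) is undefined, so the separate path cannot produce a finite loss
    separate = float("inf")
print(separate)  # inf
```

Even a numerically stable softmax loses the information in this case, because the true class probability underflows to zero before the logarithm is taken. Computing the loss from the logits avoids that intermediate step.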
