Building Recurrent Neural Networks with TensorFlow

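Although I have been working with deep learning for quite a while and have spent a long time reading RNN-related material, when I suddenly wanted to implement something in TensorFlow I found I had no idea where to start. I looked up some resources, studied them, and am recording what I learned here.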

TensorFlow changes considerably between versions. From version 1.0 onward, a cell can be created with the following statements:
from tensorflow.contrib import rnn
cell_fun = rnn.GRUCell(rnn_hidden_size)

In TensorFlow, the GRUCell above is implemented as follows (the source code is available on GitHub):

class GRUCell(RNNCell):
    """Gated Recurrent Unit cell (cf. """

    def __init__(self, num_units, input_size=None, activation=tanh):
        if input_size is not None:
            logging.warn("%s: The input_size parameter is deprecated.", self)
        self._num_units = num_units
        self._activation = activation

    @property
    def state_size(self):
        return self._num_units

    @property
    def output_size(self):
        return self._num_units

    def __call__(self, inputs, state, scope=None):
        """Gated recurrent unit (GRU) with nunits cells."""
        with vs.variable_scope(scope or "gru_cell"):
            with vs.variable_scope("gates"):  # Reset gate and update gate.
                # We start with bias of 1.0 to not reset and not update.
                r, u = array_ops.split(
                    value=_linear(
                        [inputs, state], 2 * self._num_units, True, 1.0, scope=scope),
                    num_or_size_splits=2,
                    axis=1)
                r, u = sigmoid(r), sigmoid(u)
            with vs.variable_scope("candidate"):
                c = self._activation(_linear([inputs, r * state],
                                             self._num_units, True,
                                             scope=scope))
            new_h = u * state + (1 - u) * c
        return new_h, new_h
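
For reference, the computation in the source above can be written out as equations. This is my own transcription of the code; the symbols W_r, W_u, W_c and b_r, b_u, b_c are notation I am assuming for the weights and biases created inside _linear, x_t is inputs and h_{t-1} is state.

```latex
\begin{aligned}
r &= \sigma\left(W_r\,[x_t,\ h_{t-1}] + b_r\right) \\
u &= \sigma\left(W_u\,[x_t,\ h_{t-1}] + b_u\right) \\
c &= \tanh\left(W_c\,[x_t,\ r \odot h_{t-1}] + b_c\right) \\
h_t &= u \odot h_{t-1} + (1 - u) \odot c
\end{aligned}
```

Here [a, b] is concatenation along the feature axis and \odot is the element-wise product. In this formulation the update gate u weights the old state, so u close to 1 means "keep the previous state".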

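Note that this code contains a __call__ method. Defining it lets an instance of the class be used like a function: for example, if gru is an instance of the GRUCell class above, we can call it directly as gru(input, last_state).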
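
To make the __call__ mechanism concrete without TensorFlow, here is a minimal sketch. The Scaler class and its arithmetic are hypothetical and only mimic the (inputs, state) -> (output, new_state) shape of a cell call.

```python
class Scaler(object):
    """Hypothetical toy class whose instances can be called like functions."""

    def __init__(self, factor):
        self.factor = factor

    def __call__(self, value, state):
        # Same call shape as a cell: take (input, state), return (output, new_state).
        output = self.factor * value + state
        return output, output


scaler = Scaler(2)
# scaler(3, 1) is shorthand for scaler.__call__(3, 1)
output, new_state = scaler(3, 1)
print(output, new_state)
```

Python looks up __call__ on the class whenever an instance is followed by parentheses, which is why a cell object can be used in place of a plain function.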

At first I did not know TensorFlow already provided this, so I wrote a GRU cell of my own. It is included here for reference only:

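# -*- coding: utf-8 -*-
# @Last Modified : 5/23/2017 1:56 PM
# @Author : summmersnow
# @Description:
import tensorflow as tf


class GRU(object):
    def __init__(self, name, input_len, hidden_len):
        self.name = name
        self.input_len = input_len
        self.hidden_len = hidden_len

    def define_param(self):
        # Call this once before build_net. The weights for the reset gate,
        # the update gate and the candidate state are packed side by side,
        # which is why every shape has 3 * hidden_len columns.
        self.w = tf.get_variable(self.name + "_w", [self.input_len, 3 * self.hidden_len])
        self.u = tf.get_variable(self.name + "_u", [self.hidden_len, 3 * self.hidden_len])
        self.b = tf.get_variable(self.name + "_b", [3 * self.hidden_len])

    def build_net(self, input_data, last_hidden):
        xw = tf.add(tf.matmul(input_data, self.w), self.b)
        hu = tf.matmul(last_hidden, self.u)
        # Split the packed results into the three parts along axis 1.
        xw1, xw2, xw3 = tf.split(xw, 3, 1)
        hu1, hu2, hu3 = tf.split(hu, 3, 1)
        r = tf.sigmoid(xw1 + hu1)  # reset gate
        z = tf.sigmoid(xw2 + hu2)  # update gate
        h1 = tf.tanh(xw3 + r * hu3)  # candidate state
        # Equivalent to z * h1 + (1 - z) * last_hidden
        h = (h1 - last_hidden) * z + last_hidden
        return h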
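
The arithmetic of build_net can be checked without TensorFlow. The following is a sketch in plain Python for a single example (no batch dimension); gru_step, matvec and the list-based weight layout are my own hypothetical names, chosen to mirror the packed 3 * hidden_len layout above.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(vec, mat):
    """Row vector (length n) times matrix (n rows, m columns)."""
    return [sum(v * row[j] for v, row in zip(vec, mat))
            for j in range(len(mat[0]))]


def gru_step(x, h, w, u, b):
    """One GRU step; w, u and b pack [reset | update | candidate] columns."""
    n = len(h)
    xw = [a + bias for a, bias in zip(matvec(x, w), b)]
    hu = matvec(h, u)
    r = [sigmoid(xw[i] + hu[i]) for i in range(n)]            # reset gate
    z = [sigmoid(xw[n + i] + hu[n + i]) for i in range(n)]    # update gate
    cand = [math.tanh(xw[2 * n + i] + r[i] * hu[2 * n + i])   # candidate state
            for i in range(n)]
    return [(cand[i] - h[i]) * z[i] + h[i] for i in range(n)]


# With all-zero weights both gates are 0.5 and the candidate is 0,
# so the new state is half of the old one.
zeros = [[0.0] * 6 for _ in range(2)]
new_h = gru_step([1.0, 2.0], [0.4, -0.8], zeros, zeros, [0.0] * 6)
print(new_h)
```

Note that here z weights the candidate state, whereas in TensorFlow's GRUCell the update gate u weights the old state. Both are valid GRU formulations; the roles of the gate and its complement are simply swapped.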

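# Define an LSTM cell; the variables it uses are declared automatically inside this call
lstm = tf.contrib.rnn.BasicLSTMCell(lstm_hidden_size)
# Initialize the LSTM state to all zeros; batch_size is the size of one batch
state = lstm.zero_state(batch_size, tf.float32)
# Define the loss
loss = 0.0
# num_steps is the maximum sequence length
for i in range(num_steps):
    # The LSTM variables are declared at the first time step;
    # every later time step must reuse those same variables
    if i > 0:
        tf.get_variable_scope().reuse_variables()
    # Each iteration processes one time step of the sequence. Passing the current
    # input (current_input) and the previous state (state) to the LSTM gives the
    # current output lstm_output and the updated state
    lstm_output, state = lstm(current_input, state)
    # Feed the current LSTM output through a fully connected layer to get the final output
    # (fully_connected and calc_loss stand for functions you define yourself)
    final_output = fully_connected(lstm_output)
    # Accumulate the loss of the current time step
    loss += calc_loss(final_output, expected_output)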
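
The pattern in this loop is that one cell is reused at every time step while the state is passed from step to step and the loss is summed. A minimal sketch of that pattern in plain Python follows; toy_cell, its 0.5 decay factor and the squared-error loss are hypothetical stand-ins, not part of TensorFlow.

```python
def toy_cell(current_input, state):
    """Hypothetical stand-in for an RNN cell: returns (output, new_state)."""
    new_state = 0.5 * state + current_input
    return new_state, new_state


def unrolled_loss(inputs, targets, initial_state=0.0):
    """Run the same cell over every time step and sum a squared-error loss."""
    state = initial_state
    loss = 0.0
    for current_input, expected_output in zip(inputs, targets):
        # The state returned by one step is the state fed to the next step.
        output, state = toy_cell(current_input, state)
        loss += (output - expected_output) ** 2
    return loss, state


loss, final_state = unrolled_loss([1.0, 1.0], [1.0, 1.5])
print(loss, final_state)
```

Because the same function (and therefore the same parameters) is applied at every step, the summed loss depends on a single set of weights, which is what reuse_variables() arranges in the TensorFlow version.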

To implement a deep (multi-layer) RNN in TensorFlow, use MultiRNNCell:

# Use the MultiRNNCell class to implement the forward pass of a deep recurrent network
# at each time step; number_of_layers is the number of layers.
# Each layer gets its own cell object so that the layers do not share one cell
stacked_lstm = tf.contrib.rnn.MultiRNNCell(
    [tf.contrib.rnn.BasicLSTMCell(lstm_hidden_size) for _ in range(number_of_layers)])
state = stacked_lstm.zero_state(batch_size, tf.float32)
loss = 0.0
for i in range(num_steps):
    if i > 0:
        tf.get_variable_scope().reuse_variables()
    stacked_lstm_output, state = stacked_lstm(current_input, state)
    final_output = fully_connected(stacked_lstm_output)
    loss += calc_loss(final_output, expected_output)
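
One detail worth checking when building the list of cells for MultiRNNCell is whether the list holds several cell objects or several references to one object. The sketch below uses a hypothetical Layer class in place of a real cell to show the difference between list multiplication and a list comprehension.

```python
class Layer(object):
    """Hypothetical stand-in for a cell object that owns its own weights."""

    def __init__(self, size):
        self.size = size


number_of_layers = 3

# List multiplication repeats a reference to the SAME object.
shared = [Layer(4)] * number_of_layers
# A comprehension calls the constructor once per layer.
separate = [Layer(4) for _ in range(number_of_layers)]

num_distinct_shared = len(set(id(layer) for layer in shared))
num_distinct_separate = len(set(id(layer) for layer in separate))
print(num_distinct_shared, num_distinct_separate)
```

As far as I know, TensorFlow releases after 1.0 complain when one cell object is reused across layers, so building the list with a comprehension is the safer habit.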

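# Define the LSTM structure, with one dropout-wrapped cell per layer
cells = []
for _ in range(number_of_layers):
    lstm = tf.contrib.rnn.BasicLSTMCell(lstm_hidden_size)
    # Assumed completion of the truncated original line: wrap the cell in
    # DropoutWrapper; the keep probability 0.5 is only an example value
    dropout_lstm = tf.contrib.rnn.DropoutWrapper(lstm, output_keep_prob=0.5)
    cells.append(dropout_lstm)
stacked_lstm = tf.contrib.rnn.MultiRNNCell(cells)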
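
To show what applying dropout to a cell's output means numerically, here is a sketch in plain Python. It assumes the usual "inverted dropout" scheme, in which each value is kept with probability keep_prob and the survivors are scaled by 1 / keep_prob; the function name, the sample values and the fixed seed are my own choices.

```python
import random


def dropout(values, keep_prob, rng):
    """Keep each value with probability keep_prob and rescale the survivors."""
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    return [v / keep_prob if rng.random() < keep_prob else 0.0
            for v in values]


rng = random.Random(0)  # fixed seed so repeated runs drop the same entries
lstm_output = [0.5, -1.0, 2.0, 0.25]
dropped = dropout(lstm_output, keep_prob=0.5, rng=rng)
print(dropped)
```

The rescaling keeps the expected value of each output unchanged, so the same network can be run without dropout at evaluation time by setting keep_prob to 1.0.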
