Speech Recognition Research Based on Deep Learning Methods (Part 3)

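A few days ago, with the help of my senior labmate Xue from Gongda (工大), to whom I am grateful, I built a BLSTM acoustic model for speech recognition. Because our lab has a confidentiality agreement, I can only share part of the code; I hope you will understand. The code is as follows: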

# -*- coding: utf-8 -*-
# author : zhangwei
# Written against the TensorFlow 1.x API (tf.placeholder, tf.Session).
import tensorflow as tf
import numpy as np

filename_01 = '/home/zhangwei/data/train_mfcc_800000.txt'
filename_02 = '/home/zhangwei/data/train_label_800000.txt'
filename_03 = '/home/zhangwei/data/test_mfcc.txt'
filename_04 = '/home/zhangwei/data/test_label.txt'
x_train = np.loadtxt(filename_01)
y_train = np.loadtxt(filename_02)
x_test = np.loadtxt(filename_03)
y_test = np.loadtxt(filename_04)

batch_size = 50
n_steps = 1          # one frame per sequence
n_inputs = 39        # MFCC feature dimension
n_epoch = 100
n_classes = 219
n_hidden_units = 128
lr = 0.01

x = tf.placeholder(dtype=tf.float32, shape=[batch_size, n_steps, n_inputs])
y = tf.placeholder(dtype=tf.float32, shape=[batch_size, n_classes])
keep_prob = tf.placeholder(tf.float32)

def get_cell():
    n_cell = tf.nn.rnn_cell.LSTMCell(num_units=n_hidden_units, activation=tf.nn.relu)
    # Apply dropout to the cell outputs only.
    return tf.nn.rnn_cell.DropoutWrapper(n_cell, input_keep_prob=1.0, output_keep_prob=keep_prob)

cell_fw = get_cell()
cell_bw = get_cell()
init_cell_fw = cell_fw.zero_state(batch_size=batch_size, dtype=tf.float32)
init_cell_bw = cell_bw.zero_state(batch_size=batch_size, dtype=tf.float32)
# output is a tuple: (forward outputs, backward outputs).
output, _ = tf.nn.bidirectional_dynamic_rnn(cell_fw=cell_fw,
                                            cell_bw=cell_bw,
                                            inputs=x,
                                            initial_state_fw=init_cell_fw,
                                            initial_state_bw=init_cell_bw)

# One weight matrix per direction: w[0] for forward, w[1] for backward.
w = tf.Variable(tf.truncated_normal([2, n_hidden_units, n_classes], stddev=0.01))
b = tf.Variable(tf.zeros([n_classes]))
output_fw = tf.reshape(output[0], shape=[-1, n_hidden_units])
output_bw = tf.reshape(output[1], shape=[-1, n_hidden_units])
logist = tf.matmul(output_fw, w[0]) + tf.matmul(output_bw, w[1]) + b
prediction = tf.nn.softmax(logits=logist)
# The loss op applies softmax internally, so it takes the raw logits.
loss_op = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logist, labels=y))
train_op = tf.train.AdamOptimizer(lr).minimize(loss_op)
correct_prediction = tf.equal(tf.argmax(prediction, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)
    for i in range(n_epoch):
        # The batch feeding, training and evaluation steps are withheld here;
        # they are expected to set loss, train_acc and test_acc.
        print('iter : ' + str(i) + ' ; loss : ' + str(loss) + ' ; train acc : ' + str(train_acc) + ' ; test acc : ' + str(test_acc))
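
A note on the loss: tf.nn.softmax_cross_entropy_with_logits applies softmax internally, so it should receive raw logits rather than probabilities. The small pure-Python sketch below is not part of the original code. The helpers softmax and cross_entropy_from_logits are illustrative names, and the three-class logits are made up. It shows what happens when softmax is applied twice.

```python
import math

def softmax(z):
    """Numerically stable softmax over a list of floats."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy_from_logits(logits, one_hot):
    """Softmax followed by cross-entropy, mimicking what the fused op does."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))

logits = [8.0, 0.0, 0.0]   # a confident, correct prediction for class 0
label = [1.0, 0.0, 0.0]

# Correct usage: feed the raw logits.
loss_correct = cross_entropy_from_logits(logits, label)
# Incorrect usage: feed probabilities, so softmax is applied twice.
loss_double = cross_entropy_from_logits(softmax(logits), label)

print(loss_correct, loss_double)
```

With three classes the doubly softmaxed loss cannot drop below log(1 + 2/e), about 0.55, however confident the model is. By the same argument the floor for 219 classes is log(1 + 218/e), roughly 4.4, which would make training look stuck.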

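The training loop body is not included above. As a rough sketch of how mini-batches could be cut to match the placeholder shapes, the following assumes that each row of the MFCC file is one 39-dimensional frame and each row of the label file is a one-hot vector of length 219. Neither is confirmed by the post. iterate_batches is a hypothetical helper, and the arrays here are random stand-in data rather than the lab's.

```python
import numpy as np

def iterate_batches(features, labels, batch_size, n_steps, n_inputs):
    """Yield (x, y) pairs with x shaped [batch_size, n_steps, n_inputs].

    Leftover rows that do not fill a whole batch are dropped, because the
    placeholders in the listing have a fixed batch dimension.
    """
    n_batches = len(features) // batch_size
    for i in range(n_batches):
        start = i * batch_size
        stop = start + batch_size
        x_batch = features[start:stop].reshape(batch_size, n_steps, n_inputs)
        y_batch = labels[start:stop]
        yield x_batch, y_batch

# Synthetic stand-in data: 120 frames of 39 values, one-hot labels over 219 classes.
rng = np.random.RandomState(0)
features = rng.randn(120, 39).astype(np.float32)
labels = np.eye(219, dtype=np.float32)[rng.randint(0, 219, size=120)]

batches = list(iterate_batches(features, labels, batch_size=50, n_steps=1, n_inputs=39))
print(len(batches), batches[0][0].shape, batches[0][1].shape)
```

Each x_batch, y_batch pair could then be passed through feed_dict to x and y, together with a value for keep_prob.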