Fitting sin(x) with an MLP and an RNN in PyTorch

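In theory, a network with a nonlinear activation function can approximate any function. So an MLP and an RNN should clearly both be able to fit sin(x).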

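I show the results first, then the code, and finally my process. If you can't be bothered to read everything, the first half is enough; the process notes are for anyone who is interested.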

Note: torch initializes the weights differently on every training run, so the results vary from run to run.
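If the run-to-run variation gets in the way of comparing results, the initialization can be pinned with a seed. This is a minimal sketch, assuming a single-process CPU run; `make_layer` is a helper name introduced here for illustration and is not part of the original code.

```python
import torch


def make_layer(seed):
    # Hypothetical helper: seed the global RNG, then build a layer,
    # so the initial weights depend only on the seed.
    torch.manual_seed(seed)
    return torch.nn.Linear(1, 16)


a = make_layer(0)
b = make_layer(0)
c = make_layer(1)

# Same seed gives identical initial weights; a different seed does not.
same_seed_equal = torch.equal(a.weight, b.weight)
diff_seed_equal = torch.equal(a.weight, c.weight)
print(same_seed_equal, diff_seed_equal)
```

Calling `torch.manual_seed` once at the top of the training script should have the same effect for both networks.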

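It looks like a lot at first glance, but it is really very simple. It defines two networks and trains each of them once; the rest is repetitive plotting code. The details are explained in the comments.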

import math

import matplotlib.pyplot as plt
import torch


class MLP(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.layer1 = torch.nn.Linear(1, 16)
        self.layer2 = torch.nn.Linear(16, 16)
        self.layer3 = torch.nn.Linear(16, 1)

    def forward(self, x):
        x = self.layer1(x)
        x = torch.nn.functional.relu(x)
        x = self.layer2(x)
        x = torch.nn.functional.relu(x)
        x = self.layer3(x)
        return x


# The RNN takes 3-D input, while the MLP only takes 2-D input.
class RecNN(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.rnn = torch.nn.LSTM(input_size=1, hidden_size=2, num_layers=1, batch_first=True)
        # The linear layer takes 2 input features to match hidden_size,
        # and outputs 1 feature so the final output matches the label dimension.
        self.linear = torch.nn.Linear(2, 1)

    def forward(self, x):
        # print("x shape: {}".format(x.shape))
        # x: [batch_size, seq_len, input_size]
        output, hn = self.rnn(x)
        # print("output shape: {}".format(output.shape))
        # output: [batch_size, seq_len, hidden_size], because batch_first=True
        x = output.reshape(-1, 2)
        # print("after change shape: {}".format(x.shape))
        x = self.linear(x)
        # print("after linear shape: {}".format(x.shape))
        return x


def plot_curve(mlp, rnn, input_x, x):
    # input_x is the x that is fed to the networks (shape [N, 1]).
    # x holds the same values as a 1-D tensor and is used for the plot axis.
    # Their contents are identical, only the shapes differ. Print the shapes to check.
    mlp.eval()
    rnn.eval()
    # No gradients are needed for evaluation, and matplotlib cannot
    # plot tensors that require grad.
    with torch.no_grad():
        mlp_y = mlp(input_x)
        rnn_y = rnn(input_x.unsqueeze(0))

    plt.figure(figsize=(6, 8))

    plt.subplot(211)
    plt.plot([i + 1 for i in range(epoch)], mlp_loss, label='mlp')
    plt.plot([i + 1 for i in range(epoch)], rnn_loss, label='rnn')
    plt.title('loss')
    plt.legend()

    plt.subplot(212)
    plt.plot(x, torch.sin(x), label="original", linewidth=3)
    plt.plot(x, mlp_y.squeeze(1).tolist(), label='mlp')
    plt.plot(x, rnn_y.squeeze(1).tolist(), label='rnn')
    plt.title('evaluation')
    plt.legend()

    plt.tight_layout()
    plt.show()


# All constants are pulled out here so they are easy to change.
epoch = 1000
rnn_lr = 0.01
mlp_lr = 0.001
left, right = -10, 10
pi = math.pi

if __name__ == '__main__':
    mlp = MLP()
    rnn = RecNN()

    # x and y are plain torch tensors holding the sample points and sin(x).
    x = torch.tensor([num * pi / 4 for num in range(left, right)])
    y = torch.sin(x)
    # input_x and labels are the inputs and targets used to train the networks.
    input_x = x.reshape(-1, 1)
    labels = y.reshape(-1, 1)

    # Train the MLP.
    mlp_optimizer = torch.optim.Adam(mlp.parameters(), lr=mlp_lr)
    mlp_loss = []
    # Use "_" as the loop variable so the global "epoch" is not overwritten.
    for _ in range(epoch):
        preds = mlp(input_x)
        loss = torch.nn.functional.mse_loss(preds, labels)
        mlp_optimizer.zero_grad()
        loss.backward()
        mlp_optimizer.step()
        mlp_loss.append(loss.item())

    # Train the RNN.
    rnn_optimizer = torch.optim.Adam(rnn.parameters(), lr=rnn_lr)
    rnn_loss = []
    for _ in range(epoch):
        preds = rnn(input_x.unsqueeze(0))
        # print(x.unsqueeze(0).shape)
        # print(preds.shape)
        # print(labels.shape)
        loss = torch.nn.functional.mse_loss(preds, labels)
        rnn_optimizer.zero_grad()
        loss.backward()
        rnn_optimizer.step()
        rnn_loss.append(loss.item())

    plot_curve(mlp, rnn, input_x, x)
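
The shape handling around the LSTM (the `unsqueeze(0)` before it and the `reshape(-1, 2)` after it) can be checked in isolation. The sketch below uses placeholder input values from `torch.linspace`, which is an assumption made only to get a tensor of 20 points; it traces the shapes and does no training.

```python
import torch

x = torch.linspace(-1.0, 1.0, 20)   # placeholder values, 1-D tensor of shape [20]
input_x = x.reshape(-1, 1)          # [20, 1]: 20 samples, 1 feature (what the MLP takes)
seq = input_x.unsqueeze(0)          # [1, 20, 1]: 1 batch, 20 time steps, 1 feature

lstm = torch.nn.LSTM(input_size=1, hidden_size=2, num_layers=1, batch_first=True)
output, (h_n, c_n) = lstm(seq)      # output holds the hidden state at every time step

flat = output.reshape(-1, 2)        # [20, 2]: one row per time step
preds = torch.nn.Linear(2, 1)(flat) # [20, 1]: same shape as the labels

print(seq.shape, output.shape, flat.shape, preds.shape)
```

With `batch_first=True` the batch dimension comes first, so the whole series is treated as one sequence of 20 steps rather than 20 independent samples.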

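Some people's code adds a DataLoader to load the dataset. Personally I don't think that is necessary for something this simple, although using a loader is arguably more conventional.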
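
For reference, wrapping the same 20 points in a loader could look like the sketch below. It is an illustration, not the code used in this post; the `batch_size` of 5 is an arbitrary choice.

```python
import math

import torch
from torch.utils.data import DataLoader, TensorDataset

x = torch.tensor([num * math.pi / 4 for num in range(-10, 10)])
input_x = x.reshape(-1, 1)
labels = torch.sin(x).reshape(-1, 1)

dataset = TensorDataset(input_x, labels)
loader = DataLoader(dataset, batch_size=5, shuffle=True)  # batch_size chosen arbitrarily

batch_shapes = []
for batch_x, batch_y in loader:
    # Each iteration yields one mini-batch of inputs and the matching labels.
    batch_shapes.append((tuple(batch_x.shape), tuple(batch_y.shape)))

print(len(loader), batch_shapes[0])
```

Shuffled mini-batches like these suit the MLP, which treats every point independently. For the RNN the points form one ordered sequence, so shuffling them would change what the network sees.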

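Why only 20 data points (from left to right there are just 20 samples of sin(x))? At first I sampled from around -128 to around 128, but the training results were so bad that I started to doubt myself. Even with only 20 points it takes 1000 epochs of training, so the time cost of a larger dataset is easy to imagine.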
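
To make concrete what those 20 points cover, the sketch below rebuilds the sample grid with the same `left` and `right` values and a spacing of pi / 4, then measures how many periods of sin(x) it spans. The variable names `num_points` and `periods` are introduced here for illustration.

```python
import math

import torch

left, right = -10, 10
x = torch.tensor([num * math.pi / 4 for num in range(left, right)])

num_points = x.numel()
x_min, x_max = x.min().item(), x.max().item()
# Width of the sampled interval, measured in periods of sin(x).
periods = (x_max - x_min) / (2 * math.pi)

print(num_points, x_min, x_max, periods)
```

The grid runs from -10 * pi / 4 to 9 * pi / 4, a little over two full periods. A much wider range means many more periods to reproduce, which may be part of why the larger experiment trained so poorly with networks this small.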

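Why is the RNN's learning rate 0.01 while the MLP's is 0.001? This was also tuned by looking at the loss plot: 0.001 did not suit this RNN, because training was too slow. To keep the number of epochs the same as for the MLP, I switched to a learning rate of 0.01. But why does the RNN's loss drop more slowly than the MLP's? That needs further discussion (most likely it is just because I'm not good enough at this).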
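
One way to compare learning rates more systematically is to record the loss history for each setting under the same seed. The sketch below is a rough illustration: `train_lstm` is a hypothetical helper, and the 200 steps and seed 0 are arbitrary choices, not the settings used for the plots in this post.

```python
import math

import torch


def train_lstm(lr, steps=200, seed=0):
    # Hypothetical helper: train a small LSTM plus linear head on sin(x)
    # and return the loss recorded at every step.
    torch.manual_seed(seed)
    lstm = torch.nn.LSTM(input_size=1, hidden_size=2, num_layers=1, batch_first=True)
    head = torch.nn.Linear(2, 1)
    params = list(lstm.parameters()) + list(head.parameters())
    optimizer = torch.optim.Adam(params, lr=lr)

    x = torch.tensor([num * math.pi / 4 for num in range(-10, 10)])
    seq = x.reshape(1, -1, 1)             # [batch, seq_len, input_size]
    labels = torch.sin(x).reshape(-1, 1)

    history = []
    for _ in range(steps):
        output, _state = lstm(seq)
        preds = head(output.reshape(-1, 2))
        loss = torch.nn.functional.mse_loss(preds, labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        history.append(loss.item())
    return history


slow = train_lstm(lr=0.001)
fast = train_lstm(lr=0.01)
print(slow[-1], fast[-1])
```

Plotting `slow` and `fast` on the same axes shows how quickly each setting brings the loss down from an identical starting point.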

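As for the loss function, why MSE loss? I picked it more or less at random. I also tried l1_loss and a few other losses, and the results were about the same. For a function fit this simple, the choice of loss function hardly matters.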
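
For reference, the difference between the two losses on a tiny hand-made example (the numbers below are made up for illustration and are not outputs of the networks above):

```python
import torch
import torch.nn.functional as F

preds = torch.tensor([[0.0], [0.5], [1.0]])
labels = torch.tensor([[0.0], [1.0], [-1.0]])

mse = F.mse_loss(preds, labels)  # mean of squared errors: (0 + 0.25 + 4) / 3
l1 = F.l1_loss(preds, labels)    # mean of absolute errors: (0 + 0.5 + 2) / 3
print(mse.item(), l1.item())
```

MSE weights the large error on the last point much more heavily than L1 does. Swapping one for the other in the training loop only requires changing the function name.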

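It is often claimed that RNN-family networks fit time-series data better than an MLP, so why does the RNN's loss drop more slowly than the MLP's in this experiment? On top of that, when the MLP and RNN fits are compared over many runs, the MLP turns out to be more stable and somewhat better. Why is that? This needs further investigation.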
