PyTorch Study Notes (2): gradient

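During backpropagation, PyTorch stores each Variable's gradient in the Variable object itself, so we can read it at any time through Variable.grad. When a Variable has just been created, its grad attribute is initialized to 0.0 (as the output below shows).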

import torch
from torch.autograd import Variable
# requires_grad=True is mandatory for any Variable whose gradient we need.
w1 = Variable(torch.Tensor([1.0,2.0,3.0]),requires_grad=True)
w2 = Variable(torch.Tensor([1.0,2.0,3.0]),requires_grad=True)
print(w1.grad)
print(w2.grad)

Variable containing:
0
0
0
[torch.FloatTensor of size 3]
Variable containing:
0
0
0
[torch.FloatTensor of size 3]
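The output above comes from an old PyTorch release. As a small sketch of how this looks with the tensor API of PyTorch 0.4 and later (an assumption about the reader's environment, not something the original notes cover), a freshly created tensor has no gradient buffer at all until the first backward call:

```python
import torch

# Modern tensor API: requires_grad is set directly on the tensor.
w = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# No backward pass has run yet, so there is no gradient buffer.
grad_before = w.grad
print(grad_before)

# After one backward pass the buffer exists and holds d(mean)/dw.
torch.mean(w).backward()
print(w.grad)
```

Code that reads w.grad before any backward call should therefore be prepared for a missing buffer rather than a zero tensor.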

The two code snippets below show that when d.backward() is used to compute a Variable's gradient, Variable.grad accumulates, that is: Variable.grad = Variable.grad + new_grad

d = torch.mean(w1)
d.backward()
w1.grad

Variable containing:
0.3333
0.3333
0.3333
[torch.FloatTensor of size 3]

d.backward()
w1.grad

Variable containing:
0.6667
0.6667
0.6667
[torch.FloatTensor of size 3]
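The accumulation shown above can be reproduced with the modern tensor API. This is a minimal sketch, assuming PyTorch 0.4 or later; the names first and second are introduced only for illustration:

```python
import torch

w = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# First backward pass: d(mean)/dw is 1/3 for every element.
torch.mean(w).backward()
first = w.grad.clone()

# Second backward pass: the new gradient is added to the stored one.
torch.mean(w).backward()
second = w.grad.clone()

print(first)   # every entry is 1/3
print(second)  # every entry is 2/3
```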

Since the gradient accumulates, how do we reset it to zero?

w1.grad.data.zero_()
w1.grad

Variable containing:
0
0
0
[torch.FloatTensor of size 3]

# Once we have the gradient, how do we update the weights?
learning_rate = 0.1
# w1.data -= learning_rate * w1.grad.data is equivalent to the line below
w1.data.sub_(learning_rate*w1.grad.data)
# w1.data returns the tensor that stores the weights
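For comparison, here is a sketch of the same manual update written with the modern tensor API, assuming PyTorch 0.4 or later. The quadratic loss is a hypothetical stand-in chosen so the gradient is easy to check by hand, and torch.no_grad() takes the place of going through .data:

```python
import torch

w = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
learning_rate = 0.1

# Hypothetical loss for illustration: sum(w^2), so d(loss)/dw = 2 * w.
loss = (w ** 2).sum()
loss.backward()

# Update the weights without recording the operation in the autograd graph.
with torch.no_grad():
    w -= learning_rate * w.grad

# Clear the accumulated gradient before the next backward pass.
w.grad.zero_()

print(w)       # weights after one step: w - 0.1 * 2 * w
print(w.grad)  # all zeros
```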

Why is the update done on the tensor here, rather than directly on the Variable?

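Variable is used mostly in the feed-forward pass, because the forward pass has to remember how the tensors are connected to one another so that backpropagation can be carried out correctly. A plain tensor does not record this path. Also, if the update were done with Variable operations, it would create a cyclic graph (this is a guess).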
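The guess above is not verified here, but the following sketch shows what a recent PyTorch release does when the update is performed with tracked operations (this behavior is an assumption about current releases, not about the version used in these notes). A tracked in-place update of a leaf tensor is rejected, and an out-of-place update yields a new non-leaf tensor that is attached to the graph:

```python
import torch

w = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
torch.mean(w).backward()

# A tracked in-place update of a leaf that requires grad raises an error.
try:
    w.sub_(0.1 * w.grad)
    tracked_update_failed = False
except RuntimeError as err:
    tracked_update_failed = True
    print("tracked in-place update rejected:", err)

# An out-of-place update is recorded by autograd: the result is a
# non-leaf tensor with a grad_fn, not an independent weight tensor.
w_new = w - 0.1 * w.grad
print(w_new.is_leaf, w_new.grad_fn is not None)
```

Either way, the weight update should happen outside gradient tracking, which is what going through .data achieves in the notes.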

If every parameter had to be updated with w1.data.sub_(learning_rate*w1.grad.data), that would be tedious. Fortunately, PyTorch provides the torch.optim package, which simplifies parameter updates.

import torch.optim as optim
# net, criterion, input, target and steps are assumed to be defined elsewhere
# create your optimizer
optimizer = optim.SGD(net.parameters(), lr = 0.01)
# in your training loop:
for i in range(steps):
    optimizer.zero_grad() # zero the gradient buffers; this is required
    output = net(input)
    loss = criterion(output, target)
    loss.backward()
    optimizer.step() # does the update

Note: torch.optim is only responsible for updating the parameters; it does not take part in computing the gradients.
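To make the training loop concrete, here is a self-contained sketch. The model (a single nn.Linear layer), the data (y = 2x), the learning rate and the step count are all hypothetical choices for illustration and do not come from the original notes:

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

# Hypothetical toy problem: fit y = 2x with one linear layer.
net = nn.Linear(1, 1)
criterion = nn.MSELoss()
optimizer = optim.SGD(net.parameters(), lr=0.1)

inputs = torch.tensor([[0.0], [1.0], [2.0], [3.0]])
targets = 2.0 * inputs

losses = []
for step in range(100):
    optimizer.zero_grad()              # clear gradients from the previous step
    output = net(inputs)               # forward pass
    loss = criterion(output, targets)
    loss.backward()                    # autograd fills each parameter's .grad
    optimizer.step()                   # the optimizer only applies the update
    losses.append(loss.item())

print(losses[0], losses[-1])
```

The division of labor is visible in the loop: loss.backward() computes the gradients, and optimizer.step() only consumes them.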

backward(gradient=None, retain_variables=False)

Parameters:

gradient (Tensor) – Gradient of the differentiated function w.r.t. the data. Required only if the data has more than one element.

z.backward(gradient=grads)

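How should the code above be interpreted?

$$\frac{\partial \mathrm{obj}}{\partial z}\frac{\partial z}{\partial w} = \mathrm{grads} * \frac{\partial z}{\partial w}$$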
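One way to read this: grads plays the role of ∂obj/∂z, and backward multiplies it into ∂z/∂w to obtain the gradient stored in w.grad. The sketch below checks this on an elementwise function; z = w ** 2 and the values in grads are hypothetical choices for illustration, and the modern tensor API is assumed:

```python
import torch

w = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# Non-scalar output, so backward() needs an explicit gradient argument.
# For this elementwise function, dz/dw = 2 * w.
z = w ** 2

# Stands in for d(obj)/dz, one value per element of z.
grads = torch.tensor([0.1, 1.0, 10.0])

z.backward(gradient=grads)

expected = grads * 2 * w.detach()  # grads * dz/dw
print(w.grad)
print(expected)
```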
