Machine Learning Notes 11: Stochastic Gradient Descent

2. Implementing stochastic gradient descent

3. SGD in scikit-learn

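4. Summary

Batch gradient descent computes the gradient of the loss over all m samples at every step.

Stochastic gradient descent instead picks only one sample at each step and descends along that sample's gradient.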
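The two gradients can be sketched as follows. This is a reconstruction from the loss `j` and the gradient `dj_sgd` used in the code of this post, assuming a mean-squared-error loss for linear regression, where X_b is the sample matrix with a leading column of ones, X_b^{(i)} is its i-th row, and m is the number of samples.

```latex
% Loss over the whole training set (matches the function j in the code)
J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \left( y^{(i)} - X_b^{(i)} \theta \right)^2

% Batch gradient: every sample contributes at every step
\nabla J(\theta) = \frac{2}{m} X_b^{T} \left( X_b \theta - y \right)

% Stochastic gradient: one randomly chosen sample i (matches dj_sgd in the code)
\nabla J_i(\theta) = 2 \left( X_b^{(i)} \right)^{T} \left( X_b^{(i)} \theta - y^{(i)} \right)
```

Averaging the single-sample gradient over all i gives back the batch gradient, which is why a random single sample points in the right direction on average.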

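Batch gradient descent is too expensive, because every step has to go through the whole training set. Stochastic gradient descent needs far less computation per step, so each iteration is much cheaper.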
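A minimal sketch of the cost difference, assuming the same mean-squared-error linear regression used in the code of this post. The names `dj_batch` and `dj_single` and the toy data are illustrative, not from the original. The batch gradient touches all m rows, the single-sample gradient touches one, and the mean of the single-sample gradients equals the batch gradient.

```python
import numpy as np

rng = np.random.default_rng(0)
m = 1000
x = 2 * rng.normal(size=m)
y = 3.0 * x + 4.0 + rng.normal(size=m)
# Design matrix with a leading column of ones for the intercept
x_b = np.hstack([np.ones((m, 1)), x.reshape(-1, 1)])
theta = np.zeros(2)


def dj_batch(theta, x_b, y):
    """Full-batch MSE gradient: uses all m rows, cost O(m) per step."""
    return x_b.T.dot(x_b.dot(theta) - y) * 2.0 / len(x_b)


def dj_single(theta, x_b_i, y_i):
    """Single-sample gradient: uses one row, cost independent of m."""
    return x_b_i * (x_b_i.dot(theta) - y_i) * 2.0


batch_grad = dj_batch(theta, x_b, y)
mean_single = np.mean(
    [dj_single(theta, x_b[i], y[i]) for i in range(m)], axis=0
)
# The average of the per-sample gradients is the batch gradient
same = np.allclose(batch_grad, mean_single)
print(same)
```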

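The step size η changes at every iteration, shrinking as the iteration count t grows. This borrows the idea of simulated annealing:

$\eta = \frac{a}{t + b}$

where a and b are hyperparameters (they correspond to t0 and t1 in the implementation).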
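The decaying step size can be sketched as below, assuming the form η = a / (t + b) that matches `learning_rate` in the code of this post. The default values a=5 and b=50 mirror its t0 and t1, and the sampled iteration counts are arbitrary.

```python
def learning_rate(t, a=5.0, b=50.0):
    """Step size at iteration t; a and b are hyperparameters."""
    return a / (t + b)


# The step starts at a / b and shrinks roughly like 1 / t
rates = [learning_rate(t) for t in (0, 50, 450, 4950)]
print(rates)
```

A larger b flattens the early part of the schedule, so the first few steps are not drastically bigger than the later ones.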

import numpy as np
import matplotlib.pyplot as plt

# np.random.seed(665)
m = 10000
x = 2 * np.random.normal(size=m)
x = x.reshape(-1, 1)
# Flatten x here so that y has shape (m,) instead of broadcasting to (m, m)
y = x.ravel() * 3. + 4. + np.random.normal(size=m)


def j(theta, x_b, y):
    '''
    loss function
    '''
    try:
        return np.sum((y - x_b.dot(theta)) ** 2) / len(x_b)
    except:
        return float('inf')


def dj_sgd(theta, x_b_i, y_i):
    # Gradient of the squared error on a single sample (x_b_i, y_i)
    return x_b_i.T.dot(x_b_i.dot(theta) - y_i) * 2.

def sgd(x_b, y, initial_theta, n_iters):
    t0 = 5
    t1 = 50

    def learning_rate(t):
        return t0 / (t + t1)

    # The loss does not necessarily decrease at every step,
    # so only the number of iterations is limited
    theta = initial_theta
    for cur_iter in range(n_iters):
        rand_i = np.random.randint(len(x_b))
        gradient = dj_sgd(theta, x_b[rand_i], y[rand_i])
        theta = theta - learning_rate(cur_iter) * gradient
    return theta

%%time
x_b = np.hstack([np.ones((len(x), 1)), x])
initial_theta = np.zeros(x_b.shape[1])
theta = sgd(x_b, y, initial_theta, n_iters=len(x_b) // 3)

Wall time: 24 ms

theta

array([3.96962099, 2.91398727])
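As a sanity check on this result, the sketch below reruns the same kind of experiment in one self-contained script and compares the SGD estimate with the least-squares solution from `np.linalg.lstsq`. The seeds, the compact `sgd` rewrite, and the use of `np.random.default_rng` are assumptions for reproducibility and are not part of the original notebook.

```python
import numpy as np


def sgd(x_b, y, initial_theta, n_iters, t0=5.0, t1=50.0, seed=0):
    """SGD for MSE linear regression with step size t0 / (t + t1)."""
    rng = np.random.default_rng(seed)
    theta = initial_theta.copy()
    for t in range(n_iters):
        i = rng.integers(len(x_b))  # pick one sample at random
        gradient = 2.0 * x_b[i] * (x_b[i].dot(theta) - y[i])
        theta = theta - t0 / (t + t1) * gradient
    return theta


rng = np.random.default_rng(665)
m = 10000
x = 2 * rng.normal(size=m)
y = 3.0 * x + 4.0 + rng.normal(size=m)  # true intercept 4, true slope 3
x_b = np.hstack([np.ones((m, 1)), x.reshape(-1, 1)])

# SGD only looks at a third as many samples as the data set contains
theta_sgd = sgd(x_b, y, np.zeros(2), n_iters=m // 3)
# Least-squares reference computed from all samples
theta_exact, *_ = np.linalg.lstsq(x_b, y, rcond=None)
print(theta_sgd, theta_exact)
```

Both estimates should land near the intercept 4 and slope 3 used to generate the data, with the SGD estimate being the noisier of the two.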

from sklearn.linear_model import SGDRegressor
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

# Note: load_boston was removed in scikit-learn 1.2, so this needs an older version
boston = datasets.load_boston()
x = boston.data
y = boston.target
# The target is capped at 50.0 (anything above was recorded as 50), so drop those samples
x = x[y < 50.0]
y = y[y < 50.0]

x_train, x_test, y_train, y_test = train_test_split(x, y, random_state=1)

standard = StandardScaler()
standard.fit(x_train)
x_train_standard = standard.transform(x_train)
# Reuse the scaler fitted on the training set; refitting it on x_test
# would scale the two sets differently
x_test_standard = standard.transform(x_test)

sgd_reg = SGDRegressor()
%time sgd_reg.fit(x_train_standard, y_train)
sgd_reg.score(x_test_standard, y_test)

Wall time: 3.99 ms
0.7775560898753987
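One way to keep the scaling step and the regressor together is a scikit-learn pipeline, sketched below. Because `load_boston` is no longer available in recent scikit-learn releases, this sketch uses synthetic data from `make_regression` as a stand-in. The sample count, feature count, noise level, and `random_state` values are assumptions chosen for illustration, so the score is not comparable to the one above.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import SGDRegressor
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data set with 13 features (illustrative only)
x, y = make_regression(n_samples=500, n_features=13, noise=10.0, random_state=1)
x_train, x_test, y_train, y_test = train_test_split(x, y, random_state=1)

# The pipeline fits the scaler on the training data only and reuses
# the same scaling when scoring on the test data
model = make_pipeline(StandardScaler(), SGDRegressor(random_state=1))
model.fit(x_train, y_train)
score = model.score(x_test, y_test)  # R^2 on the held-out set
print(score)
```

SGD is sensitive to feature scale, which is why the features are standardized before fitting.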
