Gradient descent implemented in Python with NumPy

Batch gradient descent (BGD): every parameter update uses the gradient computed over all samples at the current point.

Stochastic gradient descent (SGD): each parameter update uses the gradient of a single selected sample.

Mini-batch gradient descent (MBGD): combines the characteristics of BGD and SGD by selecting n samples from the original data for each parameter update.
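
The three definitions above can be written as update rules for a linear least-squares fit. This is a sketch under assumed notation that does not appear in the original text: m is the number of samples, \alpha is the step size, j is a randomly chosen sample index, and B is a randomly chosen batch of n sample indices.

```latex
\begin{align}
J(w) &= \frac{1}{2m} \sum_{i=1}^{m} \left( w^\top x^{(i)} - y^{(i)} \right)^2 \\
\text{BGD:}\quad w &\leftarrow w - \alpha \, \frac{1}{m} \sum_{i=1}^{m} \left( w^\top x^{(i)} - y^{(i)} \right) x^{(i)} \\
\text{SGD:}\quad w &\leftarrow w - \alpha \left( w^\top x^{(j)} - y^{(j)} \right) x^{(j)} \\
\text{MBGD:}\quad w &\leftarrow w - \alpha \, \frac{1}{n} \sum_{i \in B} \left( w^\top x^{(i)} - y^{(i)} \right) x^{(i)}, \qquad |B| = n
\end{align}
```

The only difference between the three rules is which samples contribute to the gradient estimate: all m of them, one, or a batch of n.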

The code below fits a linear model using BGD, SGD, and MBGD in turn.

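import time
import numpy as np
# 100 samples, 2 features
def get_data(sample_num=100):
    x1 = np.linspace(0, 9, sample_num)
    x2 = np.linspace(4, 13, sample_num)
    # Stack the two feature vectors as columns: shape (sample_num, 2)
    x = np.concatenate(([x1], [x2]), axis=0).T
    # Noise-free targets generated with the true weights [3, 4]
    y = np.dot(x, np.array([3, 4]).T)
    return x, y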
# bgd: every step uses the gradient averaged over all samples
def bgd(x, y, step_size=0.01, max_iter_count=10000):
    w = np.ones((x.shape[1],))
    x1 = x[:, 0]
    x2 = x[:, 1]
    loss = 10
    iter_count = 0
    while abs(loss) > 0.0001 and iter_count < max_iter_count:
        # Compute the residual once so both weights are updated from the same w
        err = w[0] * x1 + w[1] * x2 - y
        w[0] -= step_size * np.sum(err * x1) / x.shape[0]
        w[1] -= step_size * np.sum(err * x2) / x.shape[0]
        # Mean squared error; a signed sum of residuals can cancel out before convergence
        loss = np.mean((w[0] * x1 + w[1] * x2 - y) ** 2)
        iter_count += 1
        print("iter_count:%d the loss:%f" % (iter_count, loss))
    return w
# sgd: every step uses the gradient of one randomly chosen sample
def sgd(x, y, step_size=0.01, max_iter_count=10000):
    w = np.ones((x.shape[1],))
    x1 = x[:, 0]
    x2 = x[:, 1]
    loss = 10
    iter_count = 0
    while abs(loss) > 0.00001 and iter_count < max_iter_count:
        i = np.random.randint(x.shape[0])
        # Compute the residual once so both weights are updated from the same w
        err = w[0] * x1[i] + w[1] * x2[i] - y[i]
        w[0] -= step_size * err * x1[i]
        w[1] -= step_size * err * x2[i]
        # Mean squared error over the full data set
        loss = np.mean((w[0] * x1 + w[1] * x2 - y) ** 2)
        iter_count += 1
        print("iter_count:%d the loss:%f" % (iter_count, loss))
    return w
# mbgd: every step uses the gradient averaged over batch_size random samples
def msgd(x, y, batch_size, step_size=0.01, max_iter_count=10000):
    w = np.ones((x.shape[1],))
    x1 = x[:, 0]
    x2 = x[:, 1]
    loss = 10
    iter_count = 0
    while abs(loss) > 0.00001 and iter_count < max_iter_count:
        # Sample batch_size indices (with replacement)
        i = np.random.randint(x.shape[0], size=batch_size)
        # Compute the residual once so both weights are updated from the same w
        err = w[0] * x1[i] + w[1] * x2[i] - y[i]
        w[0] -= step_size * np.sum(err * x1[i]) / batch_size
        w[1] -= step_size * np.sum(err * x2[i]) / batch_size
        # Mean squared error over the full data set
        loss = np.mean((w[0] * x1 + w[1] * x2 - y) ** 2)
        iter_count += 1
        print("iter_count:%d the loss:%f" % (iter_count, loss))
    return w
if __name__ == '__main__':
    time1 = time.time()
    x, y = get_data()
    # print(bgd(x, y))
    # print(sgd(x, y))
    print(msgd(x, y, 10))
    time2 = time.time()
    print(time2 - time1)
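
The three functions above differ only in which rows feed the gradient, so they can be folded into one vectorized routine. The following is a minimal sketch, not the author's implementation: the names make_data and gradient_descent, the tol and seed parameters, and the use of mean squared error as the stopping criterion are all assumptions made for illustration.

```python
import numpy as np


def make_data(sample_num=100):
    """Hypothetical helper: same noise-free data as above, true weights [3, 4]."""
    x1 = np.linspace(0, 9, sample_num)
    x2 = np.linspace(4, 13, sample_num)
    x = np.stack([x1, x2], axis=1)  # shape (sample_num, 2)
    y = x @ np.array([3.0, 4.0])
    return x, y


def gradient_descent(x, y, batch_size=None, step_size=0.01,
                     tol=1e-8, max_iter_count=10000, seed=0):
    """Hypothetical unified routine.

    batch_size=None (or >= number of samples) behaves like BGD,
    batch_size=1 like SGD, anything in between like MBGD.
    Returns (weights, iterations used, final mean squared error).
    """
    rng = np.random.default_rng(seed)  # assumption: seeded for repeatability
    n = x.shape[0]
    if batch_size is None:
        batch_size = n
    w = np.ones(x.shape[1])
    loss = np.mean((x @ w - y) ** 2)
    iter_count = 0
    for iter_count in range(1, max_iter_count + 1):
        if batch_size >= n:
            idx = np.arange(n)  # full batch: use every row
        else:
            idx = rng.integers(n, size=batch_size)  # sample with replacement
        err = x[idx] @ w - y[idx]  # residuals on the chosen rows
        w -= step_size * (x[idx].T @ err) / len(idx)  # averaged gradient step
        loss = np.mean((x @ w - y) ** 2)  # MSE on the full data set
        if loss < tol:
            break
    return w, iter_count, loss


if __name__ == "__main__":
    x, y = make_data()
    for name, size in (("bgd", None), ("mbgd", 10), ("sgd", 1)):
        w, iters, loss = gradient_descent(x, y, batch_size=size)
        print(name, w, iters, loss)
```

Writing the gradient as a matrix product updates all weights from the same residual vector and works unchanged for any number of features.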

