Python Implementation of the Gradient Descent Algorithm


import sys

# training data set
# each element in x represents (x0, x1, x2)
x = [(1, 0., 3), (1, 1., 3), (1, 2., 3), (1, 3., 2), (1, 4., 4)]

# y[i] is the output of y = theta0 * x[0] + theta1 * x[1] + theta2 * x[2]
y = [95.364, 97.217205, 75.195834, 60.105519, 49.342380]

# convergence threshold on the change in the cost function
epsilon = 0.0001

# learning rate
alpha = 0.01

diff = [0, 0]
max_itor = 1000
error1 = 0
error0 = 0
cnt = 0
m = len(x)

# init the parameters to zero
theta0 = 0
theta1 = 0
theta2 = 0

while True:
    cnt = cnt + 1
    # safety cap so the loop cannot run forever
    if cnt > max_itor:
        break

    # update the parameters one training example at a time (stochastic updates)
    for i in range(m):
        diff[0] = y[i] - (theta0 + theta1 * x[i][1] + theta2 * x[i][2])
        theta0 = theta0 + alpha * diff[0] * x[i][0]
        theta1 = theta1 + alpha * diff[0] * x[i][1]
        theta2 = theta2 + alpha * diff[0] * x[i][2]

    # calculate the cost function (error1 must be reset before accumulating)
    error1 = 0
    for lp in range(len(x)):
        error1 += (y[lp] - (theta0 + theta1 * x[lp][1] + theta2 * x[lp][2])) ** 2 / 2

    if abs(error1 - error0) < epsilon:
        break
    else:
        error0 = error1

    print('theta0 : %f, theta1 : %f, theta2 : %f, error1 : %f' % (theta0, theta1, theta2, error1))

print('done: theta0 : %f, theta1 : %f, theta2 : %f' % (theta0, theta1, theta2))

The snippet above uses stochastic gradient descent. It originally failed to converge because of: 1) the problems pointed out by the first commenter, namely that error1 was never reset to zero and the cost loop used the wrong index; and 2) unsuitable values for the step size alpha and the tolerance epsilon. The choice of alpha and epsilon seems critical: with my original input data I could not get convergence no matter how I tuned these two parameters, but after switching to a different data set (the training data in the code above) it produced a result very quickly.
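To make that sensitivity concrete, here is a minimal sketch that reruns the same stochastic loop for several learning rates and reports whether each converged within a fixed iteration budget. The run_sgd helper and the alpha grid are my own additions for illustration, not part of the original post:

def run_sgd(x, y, alpha, epsilon=0.0001, max_itor=1000):
    # same update rule as the snippet above, wrapped in a function
    theta0 = theta1 = theta2 = 0.0
    error0 = 0.0
    error1 = 0.0
    for cnt in range(max_itor):
        for i in range(len(x)):
            d = y[i] - (theta0 + theta1 * x[i][1] + theta2 * x[i][2])
            theta0 += alpha * d * x[i][0]
            theta1 += alpha * d * x[i][1]
            theta2 += alpha * d * x[i][2]
        error1 = sum((y[j] - (theta0 + theta1 * x[j][1] + theta2 * x[j][2])) ** 2 / 2
                     for j in range(len(x)))
        if abs(error1 - error0) < epsilon:
            return cnt + 1, error1      # converged after cnt + 1 passes
        error0 = error1
    return None, error1                 # did not converge within the budget

x = [(1, 0., 3), (1, 1., 3), (1, 2., 3), (1, 3., 2), (1, 4., 4)]
y = [95.364, 97.217205, 75.195834, 60.105519, 49.342380]
for alpha in (0.001, 0.01, 0.1):
    iters, err = run_sgd(x, y, alpha)
    print('alpha=%g -> iterations: %s, cost: %g' % (alpha, iters, err))

Too small an alpha converges slowly or stalls against epsilon; too large an alpha makes the updates overshoot and the cost diverge, which matches the tuning trouble described above.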

The snippet below is essentially adapted from a code segment posted on a blog by a programmer abroad; I have forgotten the address, so my thanks to the author here. This code uses batch gradient descent, and its convergence criterion is also different:

import sys

# training data set: (x, y) pairs
data1 = [(0.000000, 95.364693),
         (1.000000, 97.217205),
         (2.000000, 75.195834),
         (3.000000, 60.105519),
         (4.000000, 49.342380),
         (5.000000, 37.400286),
         (6.000000, 51.057128),
         (7.000000, 25.500619),
         (8.000000, 5.259608),
         (9.000000, 0.639151),
         (10.000000, -9.409936),
         (11.000000, -4.383926),
         (12.000000, -22.858197),
         (13.000000, -37.758333),
         (14.000000, -45.606221)]

data2 = [(2104., 400.),
         (1600., 330.),
         (2400., 369.),
         (1416., 232.),
         (3000., 540.)]

def create_hypothesis(theta1, theta0):
    return lambda x: theta1 * x + theta0

def linear_regression(data, learning_rate=0.001, variance=0.00001):
    """Takes a set of data points in the form [(1, 1), (2, 2), ...] and outputs (intercept, slope)."""
    # init the parameters
    theta0_guess = 1.
    theta1_guess = 1.
    theta0_last = 100.
    theta1_last = 100.
    m = len(data)

    # iterate until neither parameter changes by more than `variance`
    while (abs(theta1_guess - theta1_last) > variance or
           abs(theta0_guess - theta0_last) > variance):
        theta1_last = theta1_guess
        theta0_last = theta0_guess
        hypothesis = create_hypothesis(theta1_guess, theta0_guess)
        # batch updates: each step sums the error over the whole training set
        theta0_guess = theta0_guess - learning_rate * (1. / m) * sum(
            [hypothesis(point[0]) - point[1] for point in data])
        theta1_guess = theta1_guess - learning_rate * (1. / m) * sum(
            [(hypothesis(point[0]) - point[1]) * point[0] for point in data])

    return (theta0_guess, theta1_guess)

points = [(float(x), float(y)) for (x, y) in data1]
res = linear_regression(points)
print(res)
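As a quick sanity check (my addition, not part of the original post), the result can be compared against a closed-form least-squares fit such as numpy.polyfit; if the descent has converged, the two should agree closely. This sketch assumes it is appended to the script above, so `points` and `res` are already defined:

import numpy as np

xs = [p[0] for p in points]
ys = [p[1] for p in points]

# closed-form least-squares fit; polyfit returns coefficients highest power first
slope, intercept = np.polyfit(xs, ys, 1)
print('polyfit:          intercept=%f, slope=%f' % (intercept, slope))
print('gradient descent: intercept=%f, slope=%f' % (res[0], res[1]))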
