Comparing the performance of gradient descent and the normal equation solution

First, let's review the basics of gradient descent.
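As a quick refresher, the idea can be sketched on a function of one variable: start somewhere, repeatedly step against the derivative, and stop once the steps become negligible. The function, starting point, and learning rate below are illustrative assumptions, and the stopping rule (step size below epsilon) is simpler than the loss-change rule used by fit_gd later in this post.

```python
def gradient_descent_1d(df, x0, eta=0.1, n_iters=1000, epsilon=1e-8):
    """Minimise a 1-D function given its derivative df, starting from x0."""
    x = x0
    for _ in range(n_iters):
        step = eta * df(x)      # move against the derivative
        x = x - step
        if abs(step) < epsilon:  # steps have become negligible
            break
    return x


# Hypothetical example: f(x) = (x - 2.5)**2 - 1, so f'(x) = 2 * (x - 2.5)
x_min = gradient_descent_1d(lambda x: 2 * (x - 2.5), x0=0.0)
print(x_min)  # close to the true minimiser 2.5
```

With too large a learning rate eta the iterates overshoot and can diverge, which is why the multivariate version later in the post caps the number of iterations.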

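Now let's write a linear regression model modeled on the one in sklearn, to be solved with the normal equation. The fit method is not implemented yet.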

import numpy as np
from .metrics import r2_score


class LinearRegression:

    def __init__(self):
        """Initialize the linear regression model."""
        self.coef_ = None
        self.intercept_ = None
        self._theta = None

    def predict(self, x_predict):
        """Given a dataset x_predict to predict on, return the vector of predictions for x_predict."""
        assert self.intercept_ is not None and self.coef_ is not None, \
            "must fit before predict!"
        assert x_predict.shape[1] == len(self.coef_), \
            "the feature number of x_predict must be equal to x_train"
        # Prepend a column of ones so that theta[0] acts as the intercept.
        x_b = np.hstack([np.ones((len(x_predict), 1)), x_predict])
        return x_b.dot(self._theta)

    def score(self, x_test, y_test):
        """Compute the model's accuracy (R^2) on the test set x_test, y_test."""
        y_predict = self.predict(x_test)
        return r2_score(y_test, y_predict)

    def __repr__(self):
        return "LinearRegression()"
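The r2_score helper is imported from a local metrics module that this post does not show. A minimal sketch of what such a helper might look like, assuming the standard definition R^2 = 1 - MSE / Var(y), is given here. The body is an assumption for illustration, not the author's actual code.

```python
import numpy as np


def r2_score(y_true, y_predict):
    """R^2 = 1 - MSE(y_true, y_predict) / Var(y_true)."""
    y_true = np.asarray(y_true, dtype=float)
    y_predict = np.asarray(y_predict, dtype=float)
    mse = np.mean((y_true - y_predict) ** 2)
    return 1.0 - mse / np.var(y_true)


perfect = r2_score([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])   # predictions match exactly
baseline = r2_score([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])  # always predicts the mean
print(perfect, baseline)
```

A model that always predicts the mean of y scores 0 under this definition, and a perfect model scores 1, so the value reads as the share of variance explained.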

Since linear regression is used for this test, the normal equation solution for linear regression is as follows:
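For reference, in the notation of the code in this post, where X_b is the training matrix with a column of ones prepended and y is the target vector, the standard closed-form least-squares solution reads as below. It matches what fit_normal computes and assumes that X_b^T X_b is invertible.

```latex
\theta = \left( X_b^{T} X_b \right)^{-1} X_b^{T} y
```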

I found this formula online; for the derivation, see this fellow's blog (quoted here without asking permission, ha).

The fit method for the normal equation solution is implemented as follows:

    def fit_normal(self, x_train, y_train):
        """Train the linear regression model on the training set x_train, y_train."""
        assert x_train.shape[0] == y_train.shape[0], \
            "the size of x_train must be equal to the size of y_train"
        # Prepend a column of ones so that theta[0] acts as the intercept.
        x_b = np.hstack([np.ones((len(x_train), 1)), x_train])
        # Normal equation: theta = (X_b^T X_b)^-1 X_b^T y
        self._theta = np.linalg.inv(x_b.T.dot(x_b)).dot(x_b.T).dot(y_train)
        self.intercept_ = self._theta[0]
        self.coef_ = self._theta[1:]
        return self
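Forming the explicit inverse can be numerically fragile, and X_b^T X_b is singular whenever there are more columns than rows, as with the 1000 x 5000 test data used later in this post. As an alternative sketch, not the author's method, the same least-squares problem can be handed to np.linalg.lstsq. The function name fit_normal_lstsq and the toy data are assumptions for illustration.

```python
import numpy as np


def fit_normal_lstsq(x_train, y_train):
    """Return theta (intercept first) minimising ||x_b @ theta - y_train||^2."""
    x_b = np.hstack([np.ones((len(x_train), 1)), x_train])
    # lstsq returns (solution, residuals, rank, singular values)
    theta, _residuals, _rank, _sv = np.linalg.lstsq(x_b, y_train, rcond=None)
    return theta


rng = np.random.default_rng(0)
x = rng.normal(size=(200, 3))
y = x @ np.array([1.0, -2.0, 0.5]) + 4.0  # noiseless, so theta is recoverable
theta = fit_normal_lstsq(x, y)
print(theta)  # intercept first, then the three coefficients
```

When the system is underdetermined, lstsq returns the minimum-norm solution instead of failing, which is one reason it is often preferred over an explicit inverse.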

To use gradient descent, we first need the loss function and then have to compute its gradient.

Below is the loss function for multivariate linear regression.

The loss of linear regression
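Written out to match the j and dj helpers inside fit_gd in this post, the loss is the mean squared error over the m training samples, and its gradient follows by differentiating with respect to theta. Here X_b^{(i)} denotes the i-th row of X_b; the notation is an assumption chosen to agree with the code.

```latex
J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \left( y^{(i)} - X_b^{(i)} \theta \right)^{2}
\qquad
\nabla J(\theta) = \frac{2}{m} \, X_b^{T} \left( X_b \theta - y \right)
```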

Next, fit_gd, which implements gradient descent using the formula in the figure above, looks like this:

    def fit_gd(self, x_train, y_train, eta=0.01, n_iters=1e4):
        """Train the linear regression model on x_train, y_train using gradient descent."""
        assert x_train.shape[0] == y_train.shape[0], \
            "the size of x_train must be equal to the size of y_train"

        def j(theta, x_b, y):
            # Mean squared error; treat a failed computation as infinite loss.
            try:
                return np.sum((y - x_b.dot(theta)) ** 2) / len(y)
            except Exception:
                return float('inf')

        def dj(theta, x_b, y):
            # Gradient of the mean squared error with respect to theta.
            return x_b.T.dot(x_b.dot(theta) - y) * 2. / len(y)

        def gradient_descent(x_b, y, initial_theta, eta, n_iters=1e4, epsilon=1e-8):
            theta = initial_theta
            cur_iter = 0
            while cur_iter < n_iters:
                gradient = dj(theta, x_b, y)
                last_theta = theta
                theta = theta - eta * gradient
                # Stop once the loss no longer changes noticeably.
                if abs(j(theta, x_b, y) - j(last_theta, x_b, y)) < epsilon:
                    break
                cur_iter += 1
            return theta

        x_b = np.hstack([np.ones((len(x_train), 1)), x_train])
        initial_theta = np.zeros(x_b.shape[1])
        self._theta = gradient_descent(x_b, y_train, initial_theta, eta, n_iters)
        self.intercept_ = self._theta[0]
        self.coef_ = self._theta[1:]
        return self
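A common sanity check for a hand-derived gradient such as dj is to compare it against a finite-difference approximation of j. The sketch below copies j and dj out of fit_gd as standalone functions. The helper dj_numeric, the step size h, and the random test data are assumptions added for illustration.

```python
import numpy as np


def j(theta, x_b, y):
    return np.sum((y - x_b.dot(theta)) ** 2) / len(y)


def dj(theta, x_b, y):
    return x_b.T.dot(x_b.dot(theta) - y) * 2. / len(y)


def dj_numeric(theta, x_b, y, h=1e-6):
    """Central-difference approximation of the gradient of j."""
    grad = np.empty_like(theta)
    for i in range(len(theta)):
        plus, minus = theta.copy(), theta.copy()
        plus[i] += h
        minus[i] -= h
        grad[i] = (j(plus, x_b, y) - j(minus, x_b, y)) / (2 * h)
    return grad


rng = np.random.default_rng(42)
x_b = np.hstack([np.ones((50, 1)), rng.normal(size=(50, 3))])
y = rng.normal(size=50)
theta = rng.normal(size=4)
max_diff = np.max(np.abs(dj(theta, x_b, y) - dj_numeric(theta, x_b, y)))
print(max_diff)  # should be tiny if the analytic gradient is right
```

The numeric version needs two loss evaluations per parameter, so it is only practical as a check on small inputs, not as a replacement for dj during training.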

All of the methods above are packaged in a single file, LinearRegression.py.

Now let's start testing.

Prepare some random data with 5000 dimensions (features) and 1000 samples.

m = 1000
n = 5000
big_x = np.random.normal(size=(m,n))
true_theta = np.random.uniform(0.0, 100.0, size=n+1)
big_y = big_x.dot(true_theta[1:]) + true_theta[0] + np.random.normal(0, 10.0, size=m)
print(big_x.shape, big_y.shape)
%run LinearRegression.py

Output:

(1000, 5000) (1000,)

Train linear regression with the normal equation solution

big_reg1 = LinearRegression()
%time big_reg1.fit_normal(big_x, big_y)
# assumes a held-out split x_test, y_test, which this post does not define
big_reg1.score(x_test, y_test)

Output:

CPU times: user 26.8 s, sys: 921 ms, total: 27.7 s
Wall time: 11.1 s
LinearRegression()

Train with gradient descent

big_reg2 = LinearRegression()
%time big_reg2.fit_gd(big_x, big_y)

Output:

CPU times: user 7.44 s, sys: 97.2 ms, total: 7.54 s
Wall time: 4.15 s
LinearRegression()
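The %time magic only works inside IPython or Jupyter. Outside a notebook, a similar comparison can be sketched as a plain script with time.perf_counter. Everything below is an illustrative assumption, not the benchmark from this post: the helper timed, the standalone fit_normal and fit_gd functions, the use of np.linalg.solve in place of an explicit inverse, and the much smaller problem size.

```python
import time
import numpy as np


def fit_normal(x_b, y):
    # Closed form; np.linalg.solve avoids forming the explicit inverse.
    return np.linalg.solve(x_b.T.dot(x_b), x_b.T.dot(y))


def fit_gd(x_b, y, eta=0.01, n_iters=10_000, epsilon=1e-8):
    def j(theta):
        return np.sum((y - x_b.dot(theta)) ** 2) / len(y)

    theta = np.zeros(x_b.shape[1])
    last = j(theta)
    for _ in range(n_iters):
        gradient = x_b.T.dot(x_b.dot(theta) - y) * 2.0 / len(y)
        theta = theta - eta * gradient
        cur = j(theta)
        if abs(cur - last) < epsilon:  # loss has stopped changing
            break
        last = cur
    return theta


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


rng = np.random.default_rng(0)
m, n = 2000, 50  # hypothetical sizes with m > n, so X_b^T X_b is invertible
x = rng.normal(size=(m, n))
true_theta = rng.uniform(0.0, 100.0, size=n + 1)
y = x.dot(true_theta[1:]) + true_theta[0] + rng.normal(0, 10.0, size=m)
x_b = np.hstack([np.ones((m, 1)), x])

theta_normal, t_normal = timed(fit_normal, x_b, y)
theta_gd, t_gd = timed(fit_gd, x_b, y)
print(f"normal equation: {t_normal:.4f} s, gradient descent: {t_gd:.4f} s")
print("max coefficient gap:", np.max(np.abs(theta_normal - theta_gd)))
```

Which method wins depends on the machine, the number of features, and how many iterations gradient descent needs, so the printed timings are best read as one data point. Printing the coefficient gap confirms that both runs arrived at essentially the same model before the times are compared.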

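Conclusion: in the runs above, gradient descent took 4.15 s of wall time, against 11.1 s for the normal equation solution.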
