Log-Odds (Logistic Regression) Model for Binary Classification

2021-08-07 04:24:05

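The code below is the program for Exercise 3.3 of Zhou Zhihua's Machine Learning (a log-odds model, i.e. logistic regression, for a binary classification problem).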
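For reference, the objective that the program builds symbolically and the Newton update it iterates can be written as follows. This is a sketch in the book's usual notation, which is assumed here rather than taken from the code: \hat{x}_i = (x_i; 1) corresponds to a column of attrset, y_i to an entry of flagset, and \beta = (w; b) to the symbols (x1, x2, x3).

```latex
\ell(\beta) = \sum_{i=1}^{m} \left( -y_i \beta^{\top} \hat{x}_i
              + \ln\left( 1 + e^{\beta^{\top} \hat{x}_i} \right) \right)

\beta^{t+1} = \beta^{t}
  - \left( \frac{\partial^{2} \ell(\beta)}{\partial \beta \, \partial \beta^{\top}} \right)^{-1}
    \frac{\partial \ell(\beta)}{\partial \beta}
```

Minimizing \ell is equivalent to maximizing the log-likelihood of the training labels, and because \ell is convex in \beta, Newton's method is a natural choice of solver.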

# Zhou Zhihua, Machine Learning, Exercise 3.3: log-odds (logistic) classification
# Import libraries and the self-written function
from functionsmyself import newton
import sympy as sp
import numpy as np
import matplotlib.pyplot as plt

# Store the training set; after the transpose each column is one sample: density, sugar, constant 1
attrset = np.matrix([[0.697,0.460,1],[0.774,0.376,1],[0.634,0.264,1],[0.608,0.318,1],
                     [0.556,0.215,1],[0.403,0.237,1],[0.481,0.149,1],[0.437,0.211,1],
                     [0.666,0.091,1],[0.243,0.267,1],[0.245,0.057,1],[0.343,0.099,1],
                     [0.639,0.161,1],[0.657,0.198,1],[0.360,0.370,1],[0.593,0.042,1],[0.719,0.103,1]]).T
# Labels: the first 8 samples are positive (1), the remaining 9 are negative (0)
flagset = np.matrix(np.concatenate((np.ones(8),np.zeros(9)))).T
numsam = flagset.shape[0]

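# Build the objective function of log-odds regression
x1,x2,x3,y = sp.symbols('x1 x2 x3 y')
# beta holds the symbolic parameters; y is reset to a symbolic zero and accumulates the objective
beta,y = np.matrix([[x1],[x2],[x3]]),0*x1
for m in range(numsam):
    mid = np.dot(beta.T,attrset[:,m])
    y = y - flagset[m,0]*mid[0,0] + sp.log(1+sp.exp(mid[0,0]))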

# Minimize the objective function of log-odds regression
fucarray = np.matrix([[x1],[x2],[x3],[y]])
errset = 1e-14
timesset = 1e2
# Random starting point, one value per parameter
xcurr = np.matrix([[np.random.random() for m in range(1)] for n in range(fucarray.shape[0]-1)])
betacal = newton(fucarray,errset,timesset,xcurr)

# Inspect the accuracy of the learned model
plt.close('all')
plt.figure(1)
indexgood,indexbad = [],[]
for m in range(numsam):
    # Loop body missing in the source; reconstructed (assumption): split samples by their true label
    if flagset[m,0] == 1:
        indexgood.append(m)
    else:
        indexbad.append(m)
plt.scatter(np.array(attrset[0,indexgood]).reshape(len(indexgood),order='C'),np.array(attrset[1,indexgood]).reshape(len(indexgood),order='C'),marker='o',color='k',label='esgood')
plt.scatter(np.array(attrset[0,indexbad]).reshape(len(indexbad),order='C'),np.array(attrset[1,indexbad]).reshape(len(indexbad),order='C'),marker='o',color='r',label='esbad')
plt.xlabel('density')
plt.ylabel('sugar')
plt.legend(loc='upper left')
plt.title('exercise set')
plt.figure(2)
indexgood,indexbad = [],[]
for m in range(numsam):
    # Loop body missing in the source; reconstructed (assumption): split samples by the predicted
    # label, i.e. by the sign of beta^T x (equivalent to a predicted probability above 0.5)
    if np.dot(betacal.T,attrset[:,m])[0,0] > 0:
        indexgood.append(m)
    else:
        indexbad.append(m)
plt.scatter(np.array(attrset[0,indexgood]).reshape(len(indexgood),order='C'),np.array(attrset[1,indexgood]).reshape(len(indexgood),order='C'),marker='o',color='k',label='esgood')
plt.scatter(np.array(attrset[0,indexbad]).reshape(len(indexbad),order='C'),np.array(attrset[1,indexbad]).reshape(len(indexbad),order='C'),marker='o',color='r',label='esbad')
plt.xlabel('density')
plt.ylabel('sugar')
plt.legend(loc='upper left')
plt.title('es result')
plt.show()

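# Newton's method function (the newton imported above from functionsmyself)
def newton(fucarray,errset,timesset,xcurr):
    # fucarray: (numx+1)*1 symbolic matrix holding the independent variables followed by the
    #           dependent variable; the last element is the dependent variable and numx is
    #           the number of independent variables
    # errset: tolerance on the norm of the function's gradient
    # timesset: maximum number of Newton iterations
    # xcurr: starting point of Newton's method, a numx*1 matrix
    import sympy
    import numpy
    numx = fucarray.shape[0]-1
    # Symbolic gradient and Hessian of the dependent variable
    diff1 = numpy.matrix([[sympy.diff(fucarray[numx,0],fucarray[n,0],1)] for n in range(numx)])
    diff2 = numpy.matrix([[sympy.diff(diff1[n,0],fucarray[m,0],1) for m in range(numx)] for n in range(numx)])
    # Numeric gradient and Hessian evaluated at the current point
    numdiff1 = numpy.matrix([[0.0 for m in range(1)] for n in range(numx)])
    numdiff2 = numpy.matrix([[0.0 for m in range(numx)] for n in range(numx)])
    times = 0
    while True:
        for n in range(numx):
            numdiff1[n,0] = diff1[n,0].subs([(fucarray[nn,0],xcurr[nn,0]) for nn in range(numx)])
            for m in range(numx):
                numdiff2[n,m] = diff2[n,m].subs([(fucarray[nn,0],xcurr[nn,0]) for nn in range(numx)])
        # The comparison was garbled in the source ("norm(numdiff1) timesset:");
        # reconstructed from the parameter descriptions above
        if numpy.linalg.norm(numdiff1) < errset or times > timesset:
            break
        times = times + 1
        xcurr = xcurr - numpy.dot(numdiff2.I,numdiff1)
    print('times = ',times)
    print('xcurr = ',xcurr)
    return xcurr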
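The program above differentiates the objective symbolically with sympy. As a cross-check, the same Newton iteration can be sketched with NumPy alone, using the closed-form gradient X^T(p - y) and Hessian X^T diag(p(1 - p)) X of the logistic objective. This is a minimal sketch, not the author's method: the helper names (loss, gradient, hessian, fit_newton), the zero starting point, and the 1e-8 tolerance are assumptions introduced for illustration.

```python
import numpy as np

# Same 17 samples as attrset above: density, sugar (the constant 1 is appended below).
X = np.array([[0.697, 0.460], [0.774, 0.376], [0.634, 0.264], [0.608, 0.318],
              [0.556, 0.215], [0.403, 0.237], [0.481, 0.149], [0.437, 0.211],
              [0.666, 0.091], [0.243, 0.267], [0.245, 0.057], [0.343, 0.099],
              [0.639, 0.161], [0.657, 0.198], [0.360, 0.370], [0.593, 0.042],
              [0.719, 0.103]])
y = np.concatenate((np.ones(8), np.zeros(9)))      # first 8 positive, last 9 negative
Xh = np.hstack([X, np.ones((X.shape[0], 1))])      # one sample per row, bias column last


def loss(beta, Xh, y):
    """Objective: sum over samples of -y*z + ln(1 + exp(z)), with z = beta^T x."""
    z = Xh @ beta
    return float(np.sum(-y * z + np.logaddexp(0.0, z)))


def gradient(beta, Xh, y):
    """Closed-form gradient X^T (p - y), where p is the predicted probability."""
    p = 1.0 / (1.0 + np.exp(-(Xh @ beta)))
    return Xh.T @ (p - y)


def hessian(beta, Xh):
    """Closed-form Hessian X^T diag(p (1 - p)) X."""
    p = 1.0 / (1.0 + np.exp(-(Xh @ beta)))
    return Xh.T @ (Xh * (p * (1.0 - p))[:, None])


def fit_newton(Xh, y, tol=1e-8, max_iter=100):
    """Plain Newton iteration; stops when the gradient norm drops below tol."""
    beta = np.zeros(Xh.shape[1])   # fixed start (assumption; the original starts at random)
    for n_iter in range(max_iter):
        g = gradient(beta, Xh, y)
        if np.linalg.norm(g) < tol:
            break
        # Solve H * step = g instead of forming the inverse explicitly
        beta = beta - np.linalg.solve(hessian(beta, Xh), g)
    return beta, n_iter


beta_hat, n_iter = fit_newton(Xh, y)
pred = (Xh @ beta_hat > 0).astype(float)           # predict positive when beta^T x > 0
accuracy = float(np.mean(pred == y))
print('beta =', beta_hat, 'iterations =', n_iter, 'training accuracy =', accuracy)
```

Evaluating the derivatives numerically avoids the repeated symbolic substitution in the sympy version, and np.linalg.solve replaces the explicit matrix inverse. If both versions converge, their parameter vectors should agree up to the tolerance used.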
