YOLOv3 training: GPU vs. CPU speed comparison

2021-10-08 16:03:30

Configuration:

GPU: GTX 1650

CPU: i7-9750H

Image shape: (720, 1160, 3)

Here is a speed comparison:

batchsize = 8

loading weights into state dict...

device: cpu

finished!

epoch:1/25

iter:0/225 || total loss: 8247.7607 || 20.8434s/step

epoch:1/25

iter:1/225 || total loss: 7623.9658 || 9.8416s/step

epoch:1/25

iter:2/225 || total loss: 7024.4570 || 10.1843s/step

epoch:1/25

iter:3/225 || total loss: 6414.3638 || 9.9309s/step

epoch:1/25

iter:4/225 || total loss: 5865.8257 || 10.0282s/step

epoch:1/25

iter:5/225 || total loss: 5380.3311 || 10.8697s/step

batchsize = 8
loading weights into state dict...

device: cuda

finished!

epoch:1/25

iter:0/225 || total loss: 7440.8184 || 14.0086s/step

epoch:1/25

iter:1/225 || total loss: 6932.9785 || 0.4398s/step

epoch:1/25

iter:2/225 || total loss: 6362.1045 || 0.4192s/step

epoch:1/25

iter:3/225 || total loss: 5775.0967 || 0.4355s/step

epoch:1/25

iter:4/225 || total loss: 5259.7769 || 0.4385s/step

epoch:1/25

iter:5/225 || total loss: 4812.0654 || 0.4408s/step

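The first iteration takes much longer than the rest because the data has to be loaded and transferred first, and that one-off cost is considerable.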
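
To separate that one-off cost from the steady-state speed, the step times can be pulled out of the log and the first iteration treated on its own. The following is only a minimal sketch: the log text is copied from the batchsize = 8, device: cuda run above, while the names `LOG`, `STEP_RE` and `step_times` are made up for illustration and are not part of the training script.

```python
import re
from statistics import mean

# Step lines copied from the "batchsize = 8, device: cuda" log above.
LOG = """\
iter:0/225 || total loss: 7440.8184 || 14.0086s/step
iter:1/225 || total loss: 6932.9785 || 0.4398s/step
iter:2/225 || total loss: 6362.1045 || 0.4192s/step
iter:3/225 || total loss: 5775.0967 || 0.4355s/step
iter:4/225 || total loss: 5259.7769 || 0.4385s/step
iter:5/225 || total loss: 4812.0654 || 0.4408s/step
"""

# Hypothetical helper pattern: captures the number in front of "s/step".
STEP_RE = re.compile(r"([\d.]+)s/step")


def step_times(log_text):
    """Return the per-step times in seconds, in log order."""
    return [float(m.group(1)) for m in STEP_RE.finditer(log_text)]


times = step_times(LOG)
first, steady = times[0], times[1:]  # iteration 0 carries the one-off cost

print(f"first step:        {first:.4f} s")
print(f"steady-state mean: {mean(steady):.4f} s")
print(f"first step is {first / mean(steady):.1f}x the steady-state time")
```

Leaving iteration 0 out of the average keeps the comparison focused on the per-step compute cost rather than on start-up overhead.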

batchsize = 16

loading weights into state dict...

device: cpu

finished!

epoch:1/25

iter:0/112 || total loss: 7751.8955 || 29.5190s/step

epoch:1/25

iter:1/112 || total loss: 7198.8145 || 18.0329s/step

epoch:1/25

iter:2/112 || total loss: 6630.2368 || 17.9401s/step

epoch:1/25

iter:3/112 || total loss: 6030.8477 || 18.0231s/step

epoch:1/25

iter:4/112 || total loss: 5493.2349 || 17.8391s/step

epoch:1/25

iter:5/112 || total loss: 5020.0029 || 17.8281s/step

batchsize = 16
loading weights into state dict...

device: cuda

finished!

epoch:1/25

iter:0/112 || total loss: 8825.2871 || 15.5444s/step

epoch:1/25

iter:1/112 || total loss: 8157.2827 || 0.6872s/step

epoch:1/25

iter:2/112 || total loss: 7525.5874 || 0.6932s/step

epoch:1/25

iter:3/112 || total loss: 6867.6533 || 0.6735s/step

epoch:1/25

iter:4/112 || total loss: 6265.2026 || 0.6701s/step

epoch:1/25

iter:5/112 || total loss: 5735.0771 || 0.6900s/step

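My graphics card only lets me raise the batch size to 16 and no higher, so I will use these four runs for a simple comparison. Averaging iterations 1 to 5 of each log above (iteration 0 is left out because of the loading cost) gives roughly 10.17 s/step on the CPU versus 0.43 s/step on the GPU at batch size 8, and 17.93 s/step versus 0.68 s/step at batch size 16.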

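From this I infer that the GPU's computational advantage becomes more pronounced as the batch size grows. Others have already confirmed this with more extensive experiments.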
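
The inference can be checked against the logged numbers. The sketch below takes the step times of iterations 1 to 5 from the four logs above and computes the CPU/GPU ratio per batch size. The dictionary `STEP_TIMES` and the function `speedup` are illustrative names, and with only two batch sizes and five steps each, the result is an indication rather than a proof.

```python
from statistics import mean

# Step times in s/step, iterations 1 to 5 of each log above.
# Iteration 0 is excluded because it includes the one-off loading cost.
STEP_TIMES = {
    8: {
        "cpu": [9.8416, 10.1843, 9.9309, 10.0282, 10.8697],
        "cuda": [0.4398, 0.4192, 0.4355, 0.4385, 0.4408],
    },
    16: {
        "cpu": [18.0329, 17.9401, 18.0231, 17.8391, 17.8281],
        "cuda": [0.6872, 0.6932, 0.6735, 0.6701, 0.6900],
    },
}


def speedup(batch_size):
    """Mean CPU step time divided by mean GPU step time."""
    t = STEP_TIMES[batch_size]
    return mean(t["cpu"]) / mean(t["cuda"])


for bs in sorted(STEP_TIMES):
    t = STEP_TIMES[bs]
    print(
        f"batchsize={bs:2d}  cpu={mean(t['cpu']):.2f}s  "
        f"cuda={mean(t['cuda']):.2f}s  speedup={speedup(bs):.1f}x"
    )
```

On these numbers the ratio rises from about 23x at batch size 8 to about 26x at batch size 16, which is in line with the inference.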

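I trained on 2,000 images with the GTX 1650 for 25 epochs, which took about 50 minutes in total, an average of 120 s per epoch. As a rough estimate, the same training on the i7-9750H CPU would take about 21.9 h. This is where the importance of a GPU really shows.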
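
The arithmetic behind these figures can be reproduced as follows. The per-epoch time follows directly from 50 minutes over 25 epochs. For the CPU estimate, the sketch assumes the GPU total was scaled by the CPU/GPU step-time ratio at batch size 16; that assumption is mine, but it does reproduce the 21.9 h figure. All variable names are illustrative.

```python
from statistics import mean

# Steady-state step times (s/step) at batchsize = 16, iterations 1 to 5 above.
cpu_steps = [18.0329, 17.9401, 18.0231, 17.8391, 17.8281]
gpu_steps = [0.6872, 0.6932, 0.6735, 0.6701, 0.6900]

gpu_total_min = 50  # measured GPU training time, from the text
epochs = 25

# Average GPU time per epoch in seconds.
sec_per_epoch = gpu_total_min * 60 / epochs

# Assumption: CPU training time scales with the per-step slowdown.
ratio = mean(cpu_steps) / mean(gpu_steps)
cpu_total_hours = gpu_total_min * ratio / 60

print(f"GPU: {sec_per_epoch:.0f} s per epoch")
print(f"CPU/GPU step-time ratio: {ratio:.1f}")
print(f"Estimated CPU training time: {cpu_total_hours:.1f} h")
```

The estimate ignores anything that does not scale with the step time, such as data loading between epochs, so it should be read as an order-of-magnitude figure.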
