(Deep Learning) Why Is the GPU Slower Than the CPU?

Because GPUs excel at matrix operations, they are widely used in deep learning, especially in computer vision.

A few days ago, after a lot of effort, I got the GPU build of TensorFlow 2.0 installed on my computer, and I couldn't wait to see how fast the GPU was.

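I copied a piece of code straight from the TensorFlow website: the simplest possible neural network for recognizing the MNIST handwritten digit dataset. I then ran it once on the GPU and once on the CPU, and the result surprised me. I had heard that a GPU is usually several times faster than a CPU, but in my test the GPU was actually much slower than the CPU!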

After asking around and searching online, I came to this conclusion: the model is too small for the GPU's advantage to show.

First, here are the CPU and GPU in my computer:

Hardware    Model
CPU         6th-generation Intel Core i5-6200U
GPU         NVIDIA GeForce 940M

Here is the code. Feel free to run it yourself (results may differ on different hardware).

# TensorFlow and tf.keras
import tensorflow as tf
# Helper libraries
import numpy as np
import matplotlib.pyplot as plt
from time import time

# Load MNIST and scale the pixel values to [0, 1]
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Run on the CPU
starttime1 = time()
with tf.device('/cpu:0'):
    model = tf.keras.models.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation='softmax')
    ])
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    model.fit(x_train, y_train, epochs=10)
    model.evaluate(x_test, y_test)
t1 = time() - starttime1

# Run on the GPU
starttime2 = time()
with tf.device('/gpu:0'):
    model = tf.keras.models.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation='softmax')
    ])
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    model.fit(x_train, y_train, epochs=10)
    model.evaluate(x_test, y_test)
t2 = time() - starttime2

# Print the run times
print('CPU time:', t1)
print('GPU time:', t2)
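The start/stop timing pattern used for each device can be wrapped in a small helper so both runs are measured the same way. This is only a minimal sketch using the standard library; the name `timed` is hypothetical and not part of the original code.

```python
from time import perf_counter


def timed(fn, *args, **kwargs):
    """Call fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = perf_counter()
    result = fn(*args, **kwargs)
    elapsed = perf_counter() - start
    return result, elapsed


# Stand-in workload so the sketch runs without TensorFlow installed.
total, seconds = timed(sum, range(1000))
print(total, seconds)
```

In the benchmark, the body of each `with tf.device(...)` block could be moved into a function and passed to `timed` in place of `sum`.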

Test results:

Analysis of the results

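Roughly speaking, the GPU is slower than the CPU here for the following reason:

Data transfer carries a large overhead, and the GPU handles it more slowly than the CPU, since the data has to be copied from host memory to the GPU. Meanwhile the GPU's strength, matrix computation, has little chance to show in a small neural network.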
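The trade-off described above can be sketched with a toy cost model: each device pays a fixed overhead per step plus a compute time that shrinks with its throughput. All numbers below are hypothetical, chosen only to show the shape of the trade-off; they are not measurements of the i5-6200U or the 940M.

```python
# Toy cost model: time = fixed overhead + work / throughput.
# Every constant here is a made-up illustration, not a measurement.
CPU = {"fixed_overhead": 0.0001, "throughput": 1e9}
GPU = {"fixed_overhead": 0.0010, "throughput": 1e10}


def step_time(work, fixed_overhead, throughput):
    """Estimated seconds for one step with `work` operations."""
    return fixed_overhead + work / throughput


def faster_device(work):
    """Return 'cpu' or 'gpu', whichever the toy model says is quicker."""
    return "gpu" if step_time(work, **GPU) < step_time(work, **CPU) else "cpu"


def break_even_work():
    """Amount of work at which both devices take the same time."""
    overhead_gap = GPU["fixed_overhead"] - CPU["fixed_overhead"]
    per_op_gap = 1 / CPU["throughput"] - 1 / GPU["throughput"]
    return overhead_gap / per_op_gap


for work in (1e5, 1e8):
    print(work, faster_device(work))
print("break-even work:", break_even_work())
```

Below the break-even point the fixed overhead dominates and the CPU wins; above it the GPU's higher throughput pays off.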

# Same benchmark with a larger model: two hidden layers of 1000 units each
# TensorFlow and tf.keras
import tensorflow as tf
# Helper libraries
import numpy as np
import matplotlib.pyplot as plt
from time import time

# Load MNIST and scale the pixel values to [0, 1]
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Run on the CPU
starttime1 = time()
with tf.device('/cpu:0'):
    model = tf.keras.models.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(1000, activation='relu'),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(1000, activation='relu'),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation='softmax')
    ])
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    model.fit(x_train, y_train, epochs=10)
    model.evaluate(x_test, y_test)
t1 = time() - starttime1

# Run on the GPU
starttime2 = time()
with tf.device('/gpu:0'):
    model = tf.keras.models.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(1000, activation='relu'),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(1000, activation='relu'),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation='softmax')
    ])
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    model.fit(x_train, y_train, epochs=10)
    model.evaluate(x_test, y_test)
t2 = time() - starttime2

# Print the run times
print('CPU time:', t1)
print('GPU time:', t2)
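To put a number on how much larger the second model is than the first, the trainable parameters of each can be counted by hand. A fully connected layer with n_in inputs and n_out outputs has n_in * n_out weights plus n_out biases; Flatten and Dropout add none. The helper names below are hypothetical.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out


def mlp_params(layer_sizes):
    """Total parameters of a stack of Dense layers with the given widths."""
    return sum(dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:]))


# 28 x 28 input images flatten to 784 values.
small = mlp_params([784, 128, 10])         # first model
large = mlp_params([784, 1000, 1000, 10])  # second model

print(small)          # 101770
print(large)          # 1796010
print(large / small)  # roughly 17.6
```

The second model therefore has about 17.6 times as many parameters, which gives each training step far more matrix arithmetic for the GPU to work on.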

Test results

That's all. I hope this helps!
