CUDA Notes (2): Running Your First CUDA Program

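Now that the environment is set up, you are probably eager to start programming...

Hold on. Let's first look at the official sample code to get a rough idea of how it is compiled.

1. Start with a simple program, asyncapi (under 0_******), which reports the GPU model and compares CPU and GPU timings.

The sample has only one .cu source file, so let's look at that file first.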

// includes, system
#include <stdio.h>

// includes cuda runtime
#include <cuda_runtime.h>

// includes, project
#include <helper_cuda.h>
#include <helper_functions.h> // helper utility functions

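// A function with the __global__ qualifier in front is called a kernel function
__global__ void increment_kernel(int *g_data, int inc_value)
{
    // each thread adds inc_value to one element
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    g_data[idx] = g_data[idx] + inc_value;
}

bool correct_output(int *data, const int n, const int x)
{
    // every element must equal x
    for (int i = 0; i < n; i++)
        if (data[i] != x)
            return false;

    return true;
}

// main function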

int main(int argc, char *argv[])
{
    // ... (device selection, memory allocation, and the asynchronous
    //      copy and kernel launch are omitted from this excerpt)

    checkCudaErrors(cudaEventElapsedTime(&gpu_time, start, stop));

    // print the cpu and gpu times
    printf("time spent executing by the gpu: %.2f\n", gpu_time);
    printf("time spent by cpu in cuda calls: %.2f\n", sdkGetTimerValue(&timer));
    printf("cpu executed %lu iterations while waiting for gpu to finish\n", counter);

    // check the output for correctness
    bool bFinalResults = correct_output(a, n, value);

    // release resources
    checkCudaErrors(cudaEventDestroy(start));
    checkCudaErrors(cudaEventDestroy(stop));
    checkCudaErrors(cudaFreeHost(a));
    checkCudaErrors(cudaFree(d_a));

    exit(bFinalResults ? EXIT_SUCCESS : EXIT_FAILURE);
}

2. Build with the Makefile: run make on the command line in the sample's directory. This produces a .o file, which is the intermediate object file, and an executable.

3. Run the executable with $ ./<executable>. The output looks like this:

jlurobot@jlurobot:~/nvidia_cuda-8.0_samples/0_******/asyncapi$ ./asyncapi

[./asyncapi] - starting...

gpu device 0: "geforce gtx titan x" with compute capability 5.2

cuda device [geforce gtx titan x]

time spent executing by the gpu: 41.77

time spent by cpu in cuda calls: 0.04

cpu executed 97962 iterations while waiting for gpu to finish
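The three timing lines in this output follow from one pattern: the host queues work on the GPU, keeps running while the GPU is busy, and uses CUDA events to measure the GPU side (cudaEventElapsedTime reports milliseconds). The following is a minimal self-contained sketch of that pattern, not the sample's actual source. The CHECK macro, the buffer size, the block size, and the increment value are assumptions made for illustration, and it avoids the samples' helper headers so it can be built on its own with nvcc.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <cuda_runtime.h>

// Hypothetical error-check macro, standing in for the samples' checkCudaErrors helper.
#define CHECK(call)                                                     \
    do {                                                                \
        cudaError_t err_ = (call);                                      \
        if (err_ != cudaSuccess) {                                      \
            fprintf(stderr, "CUDA error: %s (%s:%d)\n",                 \
                    cudaGetErrorString(err_), __FILE__, __LINE__);      \
            exit(EXIT_FAILURE);                                         \
        }                                                               \
    } while (0)

// Each thread adds inc_value to one element.
__global__ void increment_kernel(int *g_data, int inc_value)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    g_data[idx] = g_data[idx] + inc_value;
}

// Host-side check: every element must equal x.
bool correct_output(const int *data, int n, int x)
{
    for (int i = 0; i < n; i++)
        if (data[i] != x) return false;
    return true;
}

int main()
{
    const int n = 1 << 20;               // assumed size, a multiple of the block size
    const int nbytes = n * sizeof(int);
    const int value = 26;                // assumed increment

    int *a = nullptr;                    // pinned host memory, so the copies can overlap with host work
    CHECK(cudaMallocHost((void **)&a, nbytes));
    memset(a, 0, nbytes);

    int *d_a = nullptr;
    CHECK(cudaMalloc((void **)&d_a, nbytes));

    cudaEvent_t start, stop;
    CHECK(cudaEventCreate(&start));
    CHECK(cudaEventCreate(&stop));

    const int threads = 256;             // assumed block size
    const int blocks = n / threads;

    // Queue copy-in, kernel, and copy-out on the default stream.
    CHECK(cudaEventRecord(start, 0));
    CHECK(cudaMemcpyAsync(d_a, a, nbytes, cudaMemcpyHostToDevice, 0));
    increment_kernel<<<blocks, threads>>>(d_a, value);
    CHECK(cudaGetLastError());
    CHECK(cudaMemcpyAsync(a, d_a, nbytes, cudaMemcpyDeviceToHost, 0));
    CHECK(cudaEventRecord(stop, 0));

    // The CPU keeps counting until the GPU has reached the stop event.
    unsigned long counter = 0;
    while (cudaEventQuery(stop) == cudaErrorNotReady)
        counter++;

    float gpu_time = 0.0f;               // milliseconds
    CHECK(cudaEventElapsedTime(&gpu_time, start, stop));
    printf("GPU time: %.2f ms, CPU iterations while waiting: %lu\n", gpu_time, counter);

    bool ok = correct_output(a, n, value);

    CHECK(cudaEventDestroy(start));
    CHECK(cudaEventDestroy(stop));
    CHECK(cudaFreeHost(a));
    CHECK(cudaFree(d_a));
    return ok ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

Assuming the file is saved as async_sketch.cu (a hypothetical name), it can be built with nvcc async_sketch.cu -o async_sketch. The iteration count depends on the machine, so expect numbers different from the ones shown above.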

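4. With that, the first program has run successfully.

(Note: this post does not analyze the sample in detail. The APIs and details involved will be discussed in later posts.)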
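As a small preview of the "GPU model" part of step 1, the device name and compute capability can be read through the CUDA runtime API. The sketch below is an assumption-laden illustration rather than the sample's own device-selection code: format_device_line is a hypothetical helper introduced here, and the program simply lists every device it finds.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: formats a one-line summary for a device.
int format_device_line(char *buf, size_t size, int dev,
                       const char *name, int major, int minor)
{
    return snprintf(buf, size,
                    "GPU device %d: \"%s\" with compute capability %d.%d",
                    dev, name, major, minor);
}

int main()
{
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        err = cudaGetDeviceProperties(&prop, dev);
        if (err != cudaSuccess) {
            fprintf(stderr, "cudaGetDeviceProperties failed: %s\n", cudaGetErrorString(err));
            return 1;
        }
        // prop.major and prop.minor together form the compute capability, e.g. 5.2
        char line[512];
        format_device_line(line, sizeof(line), dev, prop.name, prop.major, prop.minor);
        printf("%s\n", line);
    }
    return 0;
}
```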
