怎么用GPU编写Hello World_it知识_综合知识

函数声明

在GPU编程中，有三种函数的声明：

	Executed on	Only callable from
__global__ void KernelFunc()	device	host
__device__ float DeviceFunc()	device	device
__host__ float HostFunt()	host	host

这里的host端就是指CPU，device端就是指GPU；使用__global__声明的核函数是在CPU端调用，在GPU里执行；__device__声明的函数调用和执行都在GPU中；__host__声明的函数调用和执行都在CPU端。

并行优化定理

在讲GPU并行计算之前，我们先讲一下使用GPU后能提高性能的理论值，即Amdahld定理，也就是相对串行程序而言，并行程序的加速率。

假设程序中可并行代码的比例为p，并行处理器数目是n，程序并行化后的加速率为：

GPU Hello World

Hello World程序是我们学习任何编程语言时，第一个要完成的，虽然cuda c并不是一门新的语言，但我们还是从Hello World开始Cuda编程。

#include <stdio.h>
#include "cuda_runtime.h"
#include "device_launch_parameters.h"

__global__ void hello_world(void)
{
    printf("GPU: Hello world! Thread id : %d\n", threadIdx.x);
}

int main(){
    printf("CPU: Hello world!\n");
    hello_world <<<1, 10>>>();
    // cudaDeviceReset must be called before exiting in order for profiling and
    // tracing tools such as Nsight and Visual Profiler to show plete traces.
    cudaDeviceReset();
    return 0;
}

程序中的具体语法我们后面会讲到，这里只要记住<<<1, 10>>>是调用了10个线程即可，执行上面的程序，会打印出10个GPU的Hello World，这个就是SIMD，即单指令多线程，多个线程执行相同的指令，就像程序中的这个10个线程同时执行打印Hello Wolrd的这个指令一样。

• Linux Ecdsa密钥长度选择有何依据

在Linux

• Linux Khook在内核监控中的应用如何

Linux

• Linux Gsoap是否支持异步通信

GSOAP是

• Linux Coremail如何提升用户体验

提升Linu

• Linux Ecdsa算法有哪些局限性

ECDSA