CUDA GPU Computing

by Ashok Regar

Here at Imaginea Labs, we are exploring the true power of GPU computing by analysing how a kernel is launched, how thousands of thread blocks run in parallel, how threads access global and shared memory and how to optimize those accesses, how threads perform atomic operations when needed, and how to tune threads to run faster and more efficiently.

In this post, we talk about the CUDA architecture and the experiments we ran on a use case with a brute-force approach to test and explore the limits of GPU computation. We had modest success in getting the best out of the GPU, and ran into some intriguing situations and results along the way.
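
To make the ideas named above concrete (a kernel launch, many thread blocks, global versus shared memory, and an atomic operation), here is a minimal sketch. It is not the use case from our experiments; the kernel name `sumKernel`, the block size of 256, and the input of all ones are assumptions chosen purely for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel (illustration only): each block sums its slice of the
// input in shared memory, then one thread per block adds the block's partial
// sum to a single global total with an atomic operation.
// Assumes it is launched with exactly 256 threads per block.
__global__ void sumKernel(const int *in, int n, int *total) {
    __shared__ int partial[256];               // one slot per thread in the block
    int tid = threadIdx.x;
    int i = blockIdx.x * blockDim.x + tid;

    partial[tid] = (i < n) ? in[i] : 0;        // global memory -> shared memory
    __syncthreads();

    // Tree reduction within the block (block size must be a power of two).
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (tid < stride) partial[tid] += partial[tid + stride];
        __syncthreads();
    }

    // Blocks run in no guaranteed order, so the global update must be atomic.
    if (tid == 0) atomicAdd(total, partial[0]);
}

int main() {
    const int n = 1 << 20;                     // about one million elements
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;   // 4096 thread blocks

    int *in = nullptr, *total = nullptr;
    cudaMallocManaged(&in, n * sizeof(int));   // unified memory keeps the sketch short
    cudaMallocManaged(&total, sizeof(int));
    for (int k = 0; k < n; ++k) in[k] = 1;
    *total = 0;

    sumKernel<<<blocks, threads>>>(in, n, total);     // kernel launch
    cudaError_t err = cudaDeviceSynchronize();        // wait for the GPU to finish
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    printf("sum = %d (expected %d)\n", *total, n);
    int ok = (*total == n);
    cudaFree(in);
    cudaFree(total);
    return ok ? 0 : 1;
}
```

Assuming a CUDA-capable GPU and the CUDA toolkit, this should build with `nvcc` (for example `nvcc sum.cu -o sum`, where the file name is arbitrary). Reducing in shared memory first means each block issues one atomic update instead of 256, which is the kind of trade-off between memory spaces and atomics that the rest of this post looks at.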