GPU - Throughput

PMPP Chapter 3: Scalable Parallel Execution

CUDA Thread Organization: All CUDA threads in a grid execute the same kernel function; they rely on coordinates to distinguish

07 Aug 2025

4 min read

Data Parallelism: When modern software applications run slowly, the problem is usually that there is too much data to process.

05 Aug 2025

5 min read

Traditionally (before 2003), microprocessors were based on a single central processing unit (CPU), but due to energy consumption and heat dissipation

04 Aug 2025

4 min read

Programming Massively Parallel Processors is an excellent book on GPU programming. Here is the collection of my notes and related

03 Aug 2025

2 min read

Modern AI and HPC workloads often require different types of accelerators depending on the specific use case. Running a heterogeneous

01 Aug 2025

4 min read