A collection of CUDA programs demonstrating GPU computing techniques, primarily focused on matrix multiplication algorithms.

Files:

- cuda_device_info.cu - CUDA device information and capabilities
- matrix_multiplication_basic.cu - Basic GPU matrix multiplication
- matrix_multiplication_benchmark.cu - Performance benchmarking with different block sizes
- matrix_multiplication_optimized.cu - Optimized matrix multiplication with timing
- matrix_multiplication_performance.cu - Performance-focused implementation
- tiled_matrix_multiplication.cu - Tiled algorithm using shared memory
- tiled_matrix_multiplication_advanced.cu - Advanced tiled implementation

Compile any program with nvcc, substituting the file name for program_name:

    nvcc -o program_name program_name.cu

Features:

- GPU vs CPU performance comparison
- Multiple matrix sizes (100x100 to 1500x1500)
- Different thread block configurations
- Shared memory optimization
- Memory transfer timing
- Result validation
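
To make the features above concrete, here is a minimal sketch of a tiled, shared-memory matrix multiplication with event-based timing and a CPU check. It is not taken from the files in this collection: the kernel name `tiledMatMul`, the helpers `cpuMatMul` and `maxAbsDiff`, the tile width of 16, and the 1e-3 tolerance are all assumptions chosen for illustration.

```cuda
#include <chrono>
#include <cmath>
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

#define TILE 16  // assumed tile width; the programs listed above may use other sizes

// Abort with a message if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                                   \
    do {                                                                   \
        cudaError_t err_ = (call);                                         \
        if (err_ != cudaSuccess) {                                         \
            fprintf(stderr, "CUDA error: %s (%s:%d)\n",                    \
                    cudaGetErrorString(err_), __FILE__, __LINE__);         \
            exit(EXIT_FAILURE);                                            \
        }                                                                  \
    } while (0)

// C = A * B for square n x n row-major matrices; each block computes one tile of C.
__global__ void tiledMatMul(const float* A, const float* B, float* C, int n) {
    __shared__ float As[TILE][TILE];
    __shared__ float Bs[TILE][TILE];
    int row = blockIdx.y * TILE + threadIdx.y;
    int col = blockIdx.x * TILE + threadIdx.x;
    float acc = 0.0f;
    for (int t = 0; t < (n + TILE - 1) / TILE; ++t) {
        int aCol = t * TILE + threadIdx.x;
        int bRow = t * TILE + threadIdx.y;
        // Zero-pad out-of-range elements so n need not be a multiple of TILE.
        As[threadIdx.y][threadIdx.x] = (row < n && aCol < n) ? A[row * n + aCol] : 0.0f;
        Bs[threadIdx.y][threadIdx.x] = (bRow < n && col < n) ? B[bRow * n + col] : 0.0f;
        __syncthreads();  // tile fully loaded before anyone reads it
        for (int k = 0; k < TILE; ++k) acc += As[threadIdx.y][k] * Bs[k][threadIdx.x];
        __syncthreads();  // everyone done reading before the next load overwrites it
    }
    if (row < n && col < n) C[row * n + col] = acc;
}

// Straightforward CPU reference used for validation and for the CPU timing.
void cpuMatMul(const std::vector<float>& A, const std::vector<float>& B,
               std::vector<float>& C, int n) {
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j) {
            float sum = 0.0f;
            for (int k = 0; k < n; ++k) sum += A[i * n + k] * B[k * n + j];
            C[i * n + j] = sum;
        }
}

// Largest element-wise absolute difference between two equally sized vectors.
float maxAbsDiff(const std::vector<float>& x, const std::vector<float>& y) {
    float m = 0.0f;
    for (size_t i = 0; i < x.size(); ++i) m = std::fmax(m, std::fabs(x[i] - y[i]));
    return m;
}

int main() {
    const int n = 100;  // smallest matrix size mentioned in the feature list
    const size_t bytes = (size_t)n * n * sizeof(float);
    std::vector<float> hA(n * n), hB(n * n), hC(n * n), ref(n * n);
    for (int i = 0; i < n * n; ++i) { hA[i] = (float)(i % 7); hB[i] = (float)(i % 5); }

    float *dA, *dB, *dC;
    CUDA_CHECK(cudaMalloc(&dA, bytes));
    CUDA_CHECK(cudaMalloc(&dB, bytes));
    CUDA_CHECK(cudaMalloc(&dC, bytes));
    cudaEvent_t ev[4];
    for (auto& e : ev) CUDA_CHECK(cudaEventCreate(&e));

    CUDA_CHECK(cudaEventRecord(ev[0]));
    CUDA_CHECK(cudaMemcpy(dA, hA.data(), bytes, cudaMemcpyHostToDevice));
    CUDA_CHECK(cudaMemcpy(dB, hB.data(), bytes, cudaMemcpyHostToDevice));
    CUDA_CHECK(cudaEventRecord(ev[1]));
    dim3 block(TILE, TILE), grid((n + TILE - 1) / TILE, (n + TILE - 1) / TILE);
    tiledMatMul<<<grid, block>>>(dA, dB, dC, n);
    CUDA_CHECK(cudaGetLastError());
    CUDA_CHECK(cudaEventRecord(ev[2]));
    CUDA_CHECK(cudaMemcpy(hC.data(), dC, bytes, cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaEventRecord(ev[3]));
    CUDA_CHECK(cudaEventSynchronize(ev[3]));

    float h2d = 0, kern = 0, d2h = 0;
    CUDA_CHECK(cudaEventElapsedTime(&h2d, ev[0], ev[1]));
    CUDA_CHECK(cudaEventElapsedTime(&kern, ev[1], ev[2]));
    CUDA_CHECK(cudaEventElapsedTime(&d2h, ev[2], ev[3]));

    auto t0 = std::chrono::steady_clock::now();
    cpuMatMul(hA, hB, ref, n);
    double cpuMs = std::chrono::duration<double, std::milli>(
                       std::chrono::steady_clock::now() - t0).count();

    bool ok = maxAbsDiff(hC, ref) < 1e-3f;  // assumed tolerance
    printf("H2D %.3f ms, kernel %.3f ms, D2H %.3f ms, CPU %.3f ms\n", h2d, kern, d2h, cpuMs);
    printf("validation: %s\n", ok ? "PASSED" : "FAILED");

    for (auto& e : ev) cudaEventDestroy(e);
    cudaFree(dA); cudaFree(dB); cudaFree(dC);
    return ok ? 0 : 1;
}
```

Saved under a hypothetical name such as tiled_sketch.cu, this should build with the same command pattern: `nvcc -o tiled_sketch tiled_sketch.cu`. The first kernel launch carries one-time startup overhead, so a fairer GPU vs CPU comparison would likely add a warm-up launch and average several runs across the larger matrix sizes.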