PMPP SS2009 - Exercise 1
1) Implement a matrix multiplication (or improve your solution from exercise 0) that
- allows arbitrary matrix dimensions (i.e. multiply an n x m matrix with an m x l matrix, where n, m, l \in N),
- makes efficient use of shared memory and
- uses coalesced global memory access.
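One common way to support dimensions that are not multiples of the tile size is to round the number of tiles up and to guard every access at the matrix border. The following host-side C++ sketch only mirrors that index arithmetic on the CPU; it is not a CUDA kernel and does not use shared memory. The names TILE, ceil_div and matmul_tiled, as well as the tile edge of 16, are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed tile edge; on the GPU a thread block would typically cover one TILE x TILE tile.
constexpr std::size_t TILE = 16;

// Number of tiles needed to cover `len` elements (rounds up).
constexpr std::size_t ceil_div(std::size_t len, std::size_t tile) {
    return (len + tile - 1) / tile;
}

// C (n x l) = A (n x m) * B (m x l), all matrices stored row-major.
std::vector<float> matmul_tiled(const std::vector<float>& A,
                                const std::vector<float>& B,
                                std::size_t n, std::size_t m, std::size_t l) {
    assert(A.size() == n * m && B.size() == m * l);
    std::vector<float> C(n * l, 0.0f);
    for (std::size_t bi = 0; bi < ceil_div(n, TILE); ++bi)          // tile row of C
        for (std::size_t bj = 0; bj < ceil_div(l, TILE); ++bj)      // tile column of C
            for (std::size_t bk = 0; bk < ceil_div(m, TILE); ++bk) {  // tile along the shared dimension
                // Clamping the tile ends plays the role of the border guard
                // that each thread would need for partially filled tiles.
                const std::size_t i_end = std::min(n, (bi + 1) * TILE);
                const std::size_t j_end = std::min(l, (bj + 1) * TILE);
                const std::size_t k_end = std::min(m, (bk + 1) * TILE);
                for (std::size_t i = bi * TILE; i < i_end; ++i)
                    for (std::size_t j = bj * TILE; j < j_end; ++j) {
                        float acc = 0.0f;
                        for (std::size_t k = bk * TILE; k < k_end; ++k)
                            acc += A[i * m + k] * B[k * l + j];
                        C[i * l + j] += acc;  // accumulate the partial product of this tile
                    }
            }
    return C;
}
```

In a GPU version the two outer loops would correspond to the grid of thread blocks, so ceil_div(n, TILE) and ceil_div(l, TILE) would give the grid dimensions.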
2) Implement an additional version of the matrix multiplication utilizing the CUBLAS library that
- allows arbitrary matrix dimensions (i.e. multiply an n x m matrix with an m x l matrix, where n, m, l \in N) and
- makes use of the appropriate methods given by the library.
Basically, the task is to write a wrapper function for the matrix multiplication provided by the library. The CUBLAS library supports vector-vector, vector-matrix and matrix-matrix operations. You can find further details in the CUBLAS documentation.
Hint: Keep in mind that CUBLAS uses column-major storage, whereas C arrays are conventionally laid out row-major.
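The consequence of the hint can be sketched without the library: a row-major buffer read as column-major is the transpose of the matrix, so a column-major routine can produce the row-major product C = A * B by computing C^T = B^T * A^T, i.e. by swapping the operands and the dimensions. In the C++ sketch below, gemm_colmajor is a hypothetical stand-in written for illustration, not the CUBLAS API; the actual function name and parameter list have to be taken from the CUBLAS documentation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a column-major GEMM (NOT the CUBLAS API):
// C (M x N) = A (M x K) * B (K x N), element (r, c) stored at [r + c * ld].
void gemm_colmajor(std::size_t M, std::size_t N, std::size_t K,
                   const float* A, std::size_t lda,
                   const float* B, std::size_t ldb,
                   float* C, std::size_t ldc) {
    for (std::size_t c = 0; c < N; ++c)
        for (std::size_t r = 0; r < M; ++r) {
            float acc = 0.0f;
            for (std::size_t k = 0; k < K; ++k)
                acc += A[r + k * lda] * B[k + c * ldb];
            C[r + c * ldc] = acc;
        }
}

// Row-major wrapper: C (n x l) = A (n x m) * B (m x l).
std::vector<float> matmul_rowmajor_via_colmajor(const std::vector<float>& A,
                                                const std::vector<float>& B,
                                                std::size_t n, std::size_t m,
                                                std::size_t l) {
    assert(A.size() == n * m && B.size() == m * l);
    std::vector<float> C(n * l);
    // Seen column-major, the buffers hold B^T (l x m), A^T (m x n) and C^T (l x n).
    // Computing C^T = B^T * A^T therefore fills C in row-major order.
    gemm_colmajor(l, n, m, B.data(), l, A.data(), m, C.data(), l);
    return C;
}
```

The wrapper needs no explicit transposition or copy; only the order of the operands and the leading dimensions change.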
3) Compare your results from 1) in terms of both correctness and execution time to
- the CPU implementation,
- your solution for exercise 0 and
- the CUBLAS implementation.
Please include your results from 3) in your submission. A simple text file with your results and your thoughts on correctness and execution time is sufficient.
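A comparison of this kind could be set up with two small host-side helpers, sketched below in C++. Floating-point results from the CPU and the GPU generally differ in the last digits because the summation order differs, so the sketch compares with a tolerance-friendly error measure instead of exact equality. The helper names max_rel_error and time_ms and the choice of error measure are assumptions, not part of the exercise.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cmath>
#include <cstddef>
#include <vector>

// Largest deviation of `test` from the reference `ref`.
// Relative for entries with |ref| > 1, absolute otherwise, which avoids
// dividing by values close to zero.
float max_rel_error(const std::vector<float>& ref, const std::vector<float>& test) {
    assert(ref.size() == test.size());
    float worst = 0.0f;
    for (std::size_t i = 0; i < ref.size(); ++i) {
        const float denom = std::max(std::fabs(ref[i]), 1.0f);
        worst = std::max(worst, std::fabs(ref[i] - test[i]) / denom);
    }
    return worst;
}

// Wall-clock time of one call to `f` in milliseconds.
// Kernel launches are asynchronous, so for GPU code `f` would have to wait
// for the device to finish before returning (or CUDA events could be used).
template <typename F>
double time_ms(F&& f) {
    const auto t0 = std::chrono::steady_clock::now();
    f();
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}
```

Averaging the time over several runs and excluding the first call usually gives more stable numbers, since the first call may include one-time initialization.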
This exercise is due by May 12th, 2 pm (send your solution to thorsten.franzel@gris.informatik.tu-darmstadt.de). You can obtain one bonus point if, and only if, you have solved all items of this exercise.
Note: A similar matrix multiplication is already part of the SDK example projects. While it is certainly reasonable to look at it for inspiration, you should implement your solution yourself. The techniques used here are an integral part of programming with CUDA, and you will need this knowledge for the final projects at the latest.
