I needed to record the execution time of matrix multiplication to show the performance of streaming in my paper, so I went back to the GeForce 8800 GT machine and ran the application. The results looked like this:
GeForce 8800 GT
MatDim: 16
Total with non stream: 0.11
Total with 8 streams: 47001.85
How could it take that long?! So I ran the same application on a Tesla C870, and the results looked like this:
Tesla C870
MatDim: 16
Total with non stream: 0.05
Total with 8 streams: 0.16
There must be something wrong with my 8800, but what is it? :(
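For reference, here is a minimal sketch of how a non-stream total and an 8-stream total could be measured with CUDA events, with every runtime call checked so that a failing call shows up as an error message rather than as a strange number. It is not the application measured above: the kernel `matMulRows`, the helper `run_demo`, the `CHECK` macro, the test data, and the way the rows are split across the 8 streams are all assumptions made for illustration.

```cuda
#include <cmath>
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                    \
    do {                                                               \
        cudaError_t err_ = (call);                                     \
        if (err_ != cudaSuccess) {                                     \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,         \
                    cudaGetErrorString(err_));                         \
            exit(EXIT_FAILURE);                                        \
        }                                                              \
    } while (0)

const int N = 16;                 // MatDim
const int NSTREAMS = 8;
const int ROWS = N / NSTREAMS;    // rows of C handled by each stream

// Hypothetical kernel: one block per row, one thread per column of C = A * B.
__global__ void matMulRows(const float *A, const float *B, float *C,
                           int n, int rowOffset)
{
    int row = rowOffset + blockIdx.x;
    int col = threadIdx.x;
    float sum = 0.0f;
    for (int k = 0; k < n; ++k)
        sum += A[row * n + k] * B[k * n + col];
    C[row * n + col] = sum;
}

// Times both variants (milliseconds) and returns the number of mismatching elements.
int run_demo(float *msPlain, float *msStreamed)
{
    const size_t bytes = N * N * sizeof(float);
    const size_t chunk = ROWS * N * sizeof(float);
    float *hA, *hB, *hC1, *hC2, *dA, *dB, *dC;
    // Pinned host memory, so that cudaMemcpyAsync can actually run asynchronously.
    CHECK(cudaMallocHost((void **)&hA, bytes));
    CHECK(cudaMallocHost((void **)&hB, bytes));
    CHECK(cudaMallocHost((void **)&hC1, bytes));
    CHECK(cudaMallocHost((void **)&hC2, bytes));
    CHECK(cudaMalloc((void **)&dA, bytes));
    CHECK(cudaMalloc((void **)&dB, bytes));
    CHECK(cudaMalloc((void **)&dC, bytes));
    for (int i = 0; i < N * N; ++i) {   // arbitrary test data
        hA[i] = (float)(i % 7);
        hB[i] = (float)(i % 5);
    }
    CHECK(cudaMemcpy(dB, hB, bytes, cudaMemcpyHostToDevice));

    cudaStream_t streams[NSTREAMS];
    for (int i = 0; i < NSTREAMS; ++i) CHECK(cudaStreamCreate(&streams[i]));
    cudaEvent_t start, stop;
    CHECK(cudaEventCreate(&start));
    CHECK(cudaEventCreate(&stop));

    // Non-stream variant: one copy in, one kernel, one copy out.
    CHECK(cudaEventRecord(start, 0));
    CHECK(cudaMemcpy(dA, hA, bytes, cudaMemcpyHostToDevice));
    matMulRows<<<N, N>>>(dA, dB, dC, N, 0);
    CHECK(cudaGetLastError());          // catches launch failures
    CHECK(cudaMemcpy(hC1, dC, bytes, cudaMemcpyDeviceToHost));
    CHECK(cudaEventRecord(stop, 0));
    CHECK(cudaEventSynchronize(stop));
    CHECK(cudaEventElapsedTime(msPlain, start, stop));

    // Streamed variant: each stream copies, computes, and returns its own rows.
    CHECK(cudaMemset(dC, 0, bytes));    // so stale results cannot hide a failure
    CHECK(cudaEventRecord(start, 0));
    for (int i = 0; i < NSTREAMS; ++i) {
        int off = i * ROWS * N;
        CHECK(cudaMemcpyAsync(dA + off, hA + off, chunk,
                              cudaMemcpyHostToDevice, streams[i]));
        matMulRows<<<ROWS, N, 0, streams[i]>>>(dA, dB, dC, N, i * ROWS);
        CHECK(cudaGetLastError());
        CHECK(cudaMemcpyAsync(hC2 + off, dC + off, chunk,
                              cudaMemcpyDeviceToHost, streams[i]));
    }
    for (int i = 0; i < NSTREAMS; ++i) CHECK(cudaStreamSynchronize(streams[i]));
    CHECK(cudaEventRecord(stop, 0));
    CHECK(cudaEventSynchronize(stop));
    CHECK(cudaEventElapsedTime(msStreamed, start, stop));

    int mismatches = 0;
    for (int i = 0; i < N * N; ++i)
        if (fabsf(hC1[i] - hC2[i]) > 1e-5f) ++mismatches;

    for (int i = 0; i < NSTREAMS; ++i) CHECK(cudaStreamDestroy(streams[i]));
    CHECK(cudaEventDestroy(start));
    CHECK(cudaEventDestroy(stop));
    CHECK(cudaFree(dA)); CHECK(cudaFree(dB)); CHECK(cudaFree(dC));
    CHECK(cudaFreeHost(hA)); CHECK(cudaFreeHost(hB));
    CHECK(cudaFreeHost(hC1)); CHECK(cudaFreeHost(hC2));
    return mismatches;
}

int main()
{
    float msPlain = 0.0f, msStreamed = 0.0f;
    int mismatches = run_demo(&msPlain, &msStreamed);
    printf("Total with non stream: %.2f ms\n", msPlain);
    printf("Total with %d streams: %.2f ms\n", NSTREAMS, msStreamed);
    printf("Mismatching elements: %d\n", mismatches);
    return mismatches == 0 ? 0 : 1;
}
```

Running the same binary on both cards and comparing both the timings and the mismatch count would at least show whether the 8800 GT is slow but correct, or whether one of the calls is failing there.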