I asked Paulius, who works for NVIDIA, about what I found from running simpleStreams on the GeForce 8800 and the Tesla C870. He told me that overlapping memory copies with kernel execution is supported only on GPUs with compute capability 1.1 or higher, and the Tesla C870 is compute capability 1.0. Got it!
Information about compute capability is available in the Programming Guide. The following is excerpted from Version 2.0:
The compute capability of a device is defined by a major revision number and a minor revision number.
Devices with the same major revision number are of the same core architecture. The devices listed in Appendix A are all of compute capability 1.x (Their major revision number is 1).
The minor revision number corresponds to an incremental improvement to the core architecture, possibly including new features. The technical specifications of the various compute capabilities are given in Appendix A.
Appendix A, Technical Specifications, lists the specifications for each revision of compute capability.
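To check this on a given machine instead of looking it up in the appendix, the runtime can be asked directly. Below is a minimal sketch, assuming a recent CUDA toolkit where `cudaDeviceGetAttribute` is available (the toolkit from the time of this post exposed the same information through `cudaGetDeviceProperties`). The helper `atLeast11` is a hypothetical name of mine, not part of CUDA; it only encodes the 1.1 threshold Paulius mentioned.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper (not a CUDA API): true when the compute capability
// major.minor is at least 1.1, the threshold mentioned in this post.
static bool atLeast11(int major, int minor) {
    return major > 1 || (major == 1 && minor >= 1);
}

int main() {
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) {
        std::printf("No CUDA device found\n");
        return 0;
    }
    for (int dev = 0; dev < count; ++dev) {
        int major = 0, minor = 0, overlap = 0;
        // Major and minor revision numbers of the compute capability.
        cudaDeviceGetAttribute(&major, cudaDevAttrComputeCapabilityMajor, dev);
        cudaDeviceGetAttribute(&minor, cudaDevAttrComputeCapabilityMinor, dev);
        // Non-zero when the device can copy memory while a kernel runs.
        cudaDeviceGetAttribute(&overlap, cudaDevAttrGpuOverlap, dev);
        std::printf("Device %d: compute capability %d.%d, "
                    "meets 1.1 threshold: %s, overlap attribute: %d\n",
                    dev, major, minor,
                    atLeast11(major, minor) ? "yes" : "no", overlap);
    }
    return 0;
}
```

The overlap attribute reported by the runtime is the value to trust; the version comparison is only there to show how the major and minor numbers relate to the rule of thumb above.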
In today's meeting, Dr. Box told us that we are going to get a new GPU from AMD, so we should start looking at its stream computing resources: http://ati.amd.com/technology/streamcomputing/resources.html I will have to work with Clayton and Himanshu to set up the system.
Data scheduling on GPGPU (CUDA) is the keyword for my research. :) In the future it might be extended to checkpointing, which is related to Mon's research.