Thursday, October 01, 2009

8800 GT vs GTX 295

Himanshu has replaced the old Tesla card with an NVIDIA GeForce GTX 295, so I have been playing with it.

The GTX 295 has 2 devices. Running deviceQuery reports device 0 and device 1, each with the following properties:

CUDA Capability Major revision number: 1
CUDA Capability Minor revision number: 3
Total amount of global memory: 939261952 bytes
Number of multiprocessors: 30
Number of cores: 240
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 16384 bytes
Total number of registers available per block: 16384
Warp size: 32
Maximum number of threads per block: 512
Maximum sizes of each dimension of a block: 512 x 512 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 262144 bytes
Texture alignment: 256 bytes
Clock rate: 1.24 GHz
Concurrent copy and execution: Yes
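
A couple of these raw numbers are easier to read after a little arithmetic. The snippet below is only a quick sketch in Python (my choice for the calculation, not part of the CUDA SDK); the three input values are copied from the deviceQuery output above and the variable names are made up for illustration.

```python
# Values copied from the deviceQuery output above (per device).
global_mem_bytes = 939261952
multiprocessors = 30
cores = 240

# Convert bytes to MiB (1 MiB = 2**20 bytes).
global_mem_mib = global_mem_bytes / 2**20

# Cores per multiprocessor, derived from the two reported counts.
cores_per_mp = cores // multiprocessors

print(f"global memory per device: {global_mem_mib:.2f} MiB")
print(f"cores per multiprocessor: {cores_per_mp}")
```

So each of the two devices sees a bit under 896 MiB of global memory, and the 240 cores work out to 8 per multiprocessor.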


Output of the simpleStreams sample code on both cards:

Device name : GeForce GTX 295
CUDA Capable SM 1.3 hardware with 30 multi-processors
scale_factor = 1.0000
array_size = 16777216
memcopy: 19.75
kernel: 19.42
non-streamed: 39.09 (39.17 expected)
4 streams: 20.88

Device name : GeForce 8800 GT
CUDA Capable SM 1.1 hardware with 14 multi-processors
scale_factor = 1.0000
array_size = 16777216
memcopy: 41.44
kernel: 39.59
non-streamed: 80.92 (81.03 expected)
4 streams: 43.94
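
To compare the two runs, here is a small Python sketch that recomputes the "expected" non-streamed time as memcopy + kernel and works out how much the 4-stream version gains. The timings are copied from the output above; the dictionary layout and the helper name `summarize` are just my own illustration, and the units are whatever simpleStreams reports.

```python
# Timings copied from the simpleStreams output above.
results = {
    "GeForce GTX 295": {"memcopy": 19.75, "kernel": 19.42,
                        "non_streamed": 39.09, "streams4": 20.88},
    "GeForce 8800 GT": {"memcopy": 41.44, "kernel": 39.59,
                        "non_streamed": 80.92, "streams4": 43.94},
}

def summarize(t):
    """Return (expected non-streamed time, speedup from using 4 streams)."""
    expected = t["memcopy"] + t["kernel"]        # copy and kernel run back to back
    speedup = t["non_streamed"] / t["streams4"]  # gain from overlapping them
    return round(expected, 2), round(speedup, 2)

for name, t in results.items():
    expected, speedup = summarize(t)
    print(f"{name}: expected non-streamed = {expected}, 4-stream speedup = {speedup}x")

# How much faster the GTX 295 device is than the 8800 GT in the streamed case.
card_ratio = round(results["GeForce 8800 GT"]["streams4"]
                   / results["GeForce GTX 295"]["streams4"], 2)
print(f"8800 GT / GTX 295 (4 streams) = {card_ratio}x")
```

The sums reproduce the "expected" figures printed by the sample (39.17 and 81.03). On both cards the 4-stream run takes roughly the time of the memcopy alone, and a single GTX 295 device is about twice as fast as the 8800 GT on this test.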



