NVIDIA Updates CUDA Toolkit to Version 2.3
NVIDIA today updated its CUDA Toolkit and SDK for GPU computing to version 2.3, introducing many significant new features. According to NVIDIA, the new toolkit marks a major upgrade in the ability to squeeze the best possible performance out of its CUDA-enabled GPUs.
In addition to performance improvements and broader support for the CUDA-GDB hardware debugger, CUDA Toolkit 2.3 features:
Support for double-precision transforms and improved performance for single-precision transforms in the CUFFT Library
CUDA-GDB debugger support for all supported Linux distros, with the hardware debugger and the CUDA Visual Profiler now included in the CUDA Toolkit installer
Individual enumeration of each GPU in an SLI group, so that compute applications can take advantage of multi-GPU performance even when SLI is enabled for graphics
32-bit application compilation support even in the 64-bit versions of the CUDA Toolkit
Support for fp16 <-> fp32 conversion, which lets data be stored in fp16 format while computation is done in fp32 format. This reduces memory space and bandwidth consumption, and fp16 suits applications that require a higher numerical range than 16-bit integers but less precision than fp32
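To make the fp16 storage idea concrete, here is a minimal host-side sketch of the round trip between the two formats. It is not NVIDIA's implementation: the function names float_to_half and half_to_float are hypothetical, and the conversion is deliberately simplified (it truncates the mantissa instead of rounding, and treats values too small for a normal fp16 number as zero).

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: pack an fp32 value into 16 bits of fp16 storage.
// Simplifications (assumptions of this sketch): the mantissa is truncated,
// results below the normal fp16 range become signed zero, and results above
// it become infinity.
std::uint16_t float_to_half(float value) {
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);

    const std::uint32_t sign = (bits >> 16) & 0x8000u;
    const std::uint32_t exp32 = (bits >> 23) & 0xFFu;
    const std::uint32_t mant32 = bits & 0x7FFFFFu;

    if (exp32 == 0xFFu) {
        // Infinity or NaN: keep the class, mark NaN with one mantissa bit.
        return static_cast<std::uint16_t>(sign | 0x7C00u | (mant32 ? 0x200u : 0u));
    }

    // Re-bias the exponent from fp32 (bias 127) to fp16 (bias 15).
    const std::int32_t exp16 = static_cast<std::int32_t>(exp32) - 127 + 15;
    if (exp16 >= 31) {
        return static_cast<std::uint16_t>(sign | 0x7C00u);  // overflow to infinity
    }
    if (exp16 <= 0) {
        return static_cast<std::uint16_t>(sign);  // underflow to signed zero
    }
    return static_cast<std::uint16_t>(
        sign | (static_cast<std::uint32_t>(exp16) << 10) | (mant32 >> 13));
}

// Hypothetical helper: expand 16 bits of fp16 storage back to fp32 so that
// arithmetic can be done at single precision.
float half_to_float(std::uint16_t half) {
    const std::uint32_t sign = (half & 0x8000u) << 16;
    const std::uint32_t exp16 = (half >> 10) & 0x1Fu;
    const std::uint32_t mant16 = half & 0x3FFu;

    std::uint32_t bits;
    if (exp16 == 0) {
        bits = sign;  // zero; fp16 subnormals are treated as zero in this sketch
    } else if (exp16 == 31) {
        bits = sign | 0x7F800000u | (mant16 << 13);  // infinity or NaN
    } else {
        bits = sign | ((exp16 - 15 + 127) << 23) | (mant16 << 13);
    }

    float value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}
```

Each stored element occupies two bytes instead of four, which is where the memory and bandwidth savings come from. A value such as 40000.0f, which does not fit in a signed 16-bit integer, survives the round trip, while values with more than about three significant decimal digits lose precision.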
The CUDA SDK also gains new features, including:
An all-new pitchLinearTexture code sample that shows how to texture efficiently from pitch linear memory
A new PTXJIT code sample that shows how to use cuModuleLoadDataEx() to load PTX source from memory instead of a file
Illustrations of how to use the NVCUVID library to decode MPEG-2, VC-1, and H.264 content and pass frames to OpenGL or Direct3D for display
Better guidance on properly aligning CUDA kernel function parameters so that the same code works on both 32-bit and 64-bit systems
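The parameter alignment point can be sketched with plain host code. The example below is an illustration rather than code from the SDK: the kernel signature (a pointer, an int and a double), the names align_up, ParamLayout and layout_params, and the assumption that a pointer parameter occupies 4 bytes on a 32-bit build and 8 bytes on a 64-bit build are all hypothetical. It shows why parameter offsets are better computed from each parameter's size and alignment than hard-coded.

```cpp
#include <cstddef>

// Round offset up to the next multiple of alignment (alignment must be a
// power of two).
std::size_t align_up(std::size_t offset, std::size_t alignment) {
    return (offset + alignment - 1) & ~(alignment - 1);
}

// Byte offsets for a hypothetical kernel taking (float* data, int count,
// double scale).
struct ParamLayout {
    std::size_t data_offset;
    std::size_t count_offset;
    std::size_t scale_offset;
    std::size_t total_size;
};

// Compute the layout for a given pointer size. The int and double sizes are
// fixed at 4 and 8 bytes here, which is an assumption of this sketch.
ParamLayout layout_params(std::size_t pointer_size) {
    const std::size_t int_size = 4;
    const std::size_t double_size = 8;

    ParamLayout layout{};
    std::size_t offset = 0;

    offset = align_up(offset, pointer_size);
    layout.data_offset = offset;
    offset += pointer_size;

    offset = align_up(offset, int_size);
    layout.count_offset = offset;
    offset += int_size;

    offset = align_up(offset, double_size);
    layout.scale_offset = offset;
    offset += double_size;

    layout.total_size = offset;
    return layout;
}
```

Under these assumptions, layout_params(4) places count at byte 4 and scale at byte 8, while layout_params(8) places them at bytes 8 and 16. Code that hard-codes the first set of offsets would break on a 64-bit build, whereas code that aligns each parameter as it goes works on both.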
The Visual Profiler has also received many improvements, the most significant of which include:
Complete reporting of all memory transfer API calls
Support for profiling multiple contexts per GPU
Synchronised clocks for the requested start time on the CPU and the start/end times on the GPU for all kernel launches and memory transfers
Global memory load and store efficiency metrics for GPUs with compute capability 1.2 and higher
In addition to the above new features, NVIDIA has also packaged the CUDA Driver for Mac OS separately from the CUDA Toolkit. If you are a developer, then you can head over to http://forums.nvidia.com/index.php?showtopic=102548 to download the latest CUDA Toolkit, SDK and drivers.