Anyone who's done any CUDA or OpenCL programming has dealt with the problem of moving data. First you move it from disk into main memory, then from main memory into a host-side staging buffer, and then from that buffer into GPU memory, where you finally do some work. Then you reverse the whole process. It's time consuming, and it often becomes the primary bottleneck in GPGPU codes. A new partnership between InfiniBand vendor Mellanox and NVIDIA aims to reduce some of this overhead by granting the GPU direct high-speed access to I/O.
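
In CUDA terms, that staging chain looks roughly like the sketch below; the file names, buffer size, and the process kernel are hypothetical stand-ins, but every hop is an explicit copy the CPU has to initiate and drive.

```cuda
#include <cuda_runtime.h>
#include <stdio.h>
#include <stdlib.h>

// Hypothetical stand-in for whatever the GPU actually computes.
__global__ void process(float *data, size_t n) {
    size_t i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

int main(void) {
    const size_t n = 1 << 20;               // 1M floats, arbitrary size
    const size_t bytes = n * sizeof(float);

    // Hop 1: disk -> main memory ("input.dat" is a made-up file name).
    float *host_buf = (float *)malloc(bytes);
    FILE *in = fopen("input.dat", "rb");
    if (in) { fread(host_buf, 1, bytes, in); fclose(in); }

    // Hop 2: host memory -> GPU memory, driven by the CPU.
    float *dev_buf;
    cudaMalloc(&dev_buf, bytes);
    cudaMemcpy(dev_buf, host_buf, bytes, cudaMemcpyHostToDevice);

    // Only now does the GPU get to do some work.
    int blocks = (int)((n + 255) / 256);
    process<<<blocks, 256>>>(dev_buf, n);

    // Then the whole path runs in reverse: GPU -> host -> disk.
    cudaMemcpy(host_buf, dev_buf, bytes, cudaMemcpyDeviceToHost);
    FILE *out = fopen("output.dat", "wb");
    if (out) { fwrite(host_buf, 1, bytes, out); fclose(out); }

    cudaFree(dev_buf);
    free(host_buf);
    return 0;
}
```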

The system architecture of a GPU-CPU server requires the CPU to initiate and manage memory transfers between the GPU and the InfiniBand network. The new software solution will enable Tesla GPUs to transfer data to pinned system memory that a Mellanox InfiniBand solution is able to read and transmit over the network. The result is increased overall system performance and efficiency.
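
To make that mechanism concrete, here is a minimal sketch of the host-side piece as described above: a page-locked ("pinned") system buffer that the GPU copies its results into, which is the same region an InfiniBand driver could register and transmit without another host-to-host copy. The buffer size and the mention of ibv_reg_mr are illustrative assumptions, not the actual NVIDIA/Mellanox software interface.

```cuda
#include <cuda_runtime.h>
#include <stdio.h>

int main(void) {
    const size_t n = 1 << 20;               // arbitrary size for illustration
    const size_t bytes = n * sizeof(float);

    // Pinned (page-locked) system memory. The idea is that both the CUDA
    // driver and the InfiniBand HCA can DMA against this one region, so the
    // CPU no longer has to shuttle data between two separate host buffers.
    float *pinned;
    cudaHostAlloc((void **)&pinned, bytes, cudaHostAllocPortable);

    // Device buffer holding some computed result (zeroed here as a stand-in).
    float *dev;
    cudaMalloc(&dev, bytes);
    cudaMemset(dev, 0, bytes);

    // The GPU writes its result straight into the shared pinned buffer...
    cudaMemcpyAsync(pinned, dev, bytes, cudaMemcpyDeviceToHost, 0);
    cudaStreamSynchronize(0);

    // ...and from here the InfiniBand stack (e.g. a region registered with
    // ibv_reg_mr) could read and send it over the network, per the quote above.
    printf("first element: %f\n", pinned[0]);

    cudaFree(dev);
    cudaFreeHost(pinned);
    return 0;
}
```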

via HPCwire: NVIDIA, Mellanox Increase Cluster Performance.