I’ve just received an interesting rumor from a reputable source about some of the goings-on inside NVIDIA. It seems that, for lack of a market, they will be trimming their drivers to support only 8 GPUs per system image instead of 16.

If you’re not aware, NVIDIA’s drivers right now allow you to run 16 GPUs on a single computer simultaneously. The only real reason you’d want to do this is for massive GPGPU work, and in the HPC arena there are some interesting uses. Companies like ScaleMP let you cluster multiple computers into a “single system image”, effectively turning a small cluster into one giant SMP machine with all of its resources pooled together. Companies like SGI, with their UltraViolet offering, do it as well, but in hardware, allowing you to take a giant rack of equipment and run it as one giant computer or as lots of individual computers, depending on the application. Right now, users can slap 16 GPUs in an SGI UltraViolet machine and access all 16 of them from a single OS install, reaping the benefits of simple pthreads and shared memory (rather than something more complex like MPI). In fact, several of the larger labs have been pushing NVIDIA to raise the limit so that they can add more GPUs to their machines. Now, NVIDIA is going the other direction.
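To make the “simple pthreads and shared memory” point a bit more concrete, here is a minimal sketch of the one-thread-per-GPU pattern in Python. Everything in it is an assumption for illustration: `NUM_GPUS`, `run_on_device`, and `process_shared` are hypothetical names, and the per-device work is a plain CPU stand-in rather than a real GPU call. What it shows is the programming model: every worker writes straight into the same buffer, with no message passing between nodes.

```python
import threading

NUM_GPUS = 16  # assumption: one worker thread per GPU in the single system image


def run_on_device(device_id, data, start, stop):
    # Hypothetical stand-in for work dispatched to GPU `device_id`.
    # Here it simply squares its slice of the shared buffer in place.
    for i in range(start, stop):
        data[i] = data[i] * data[i]


def process_shared(data, num_devices=NUM_GPUS):
    # Split the shared buffer into contiguous slices, one per device.
    chunk = (len(data) + num_devices - 1) // num_devices
    threads = []
    for dev in range(num_devices):
        start = min(dev * chunk, len(data))
        stop = min(start + chunk, len(data))
        t = threading.Thread(target=run_on_device, args=(dev, data, start, stop))
        threads.append(t)
        t.start()
    # Wait for every worker; the results are already in the shared buffer.
    for t in threads:
        t.join()
    return data


result = process_shared(list(range(64)))
```

In a real code each thread would select its own GPU and launch kernels there; the slices never overlap, so no locking is needed. With MPI, by contrast, each slice would have to be explicitly sent to and gathered from a separate process.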

This is a tricky subject. Yes, the market for 16+ GPUs in a single system image is small, but it’s a market with very deep pockets, and one that is at the forefront of what you can do with GPGPU computing. While they’ll probably save money on reduced driver development, they’re going to burn a bit of their credibility with the HPC community if this really happens.

Also, they might wind up alienating a few of their vendor partners. Folks like NextIO are preparing big PCI Express expansion chassis that will allow up to 24 GPUs to be connected to a single PC. I’m sure they were hoping the driver limit would go up rather than down.

I’ve got a call out to NVIDIA for more details. I’ll update when I hear more.

Update 8:14pm: Lots of information from multiple sources.

First off, it seems I was mistaken in some of my comments above. Currently, only 8 GPUs are supported on most platforms, but vendors have been promising 12 and eventually 16 in the near future. So the “reduction” to 8 is really just a freeze at the current capabilities.

Then, an official response from NVIDIA (which isn’t really surprising given the above correction):

NVIDIA has never offered support for 16 GPUs within a single image, so nothing has been ‘trimmed’.

As NVIDIA has demonstrated in the past, we pay close attention to the needs of our customers. If we see significant market demand for this functionality, then we will re-evaluate the need to support it.

So, it’s disappointing to see things stopping at 8 GPUs per image, although the number of such systems in use is pretty small. Hopefully, though, more systems will come online (whether super-workstations, UltraViolets, or PCI expansion chassis) and they’ll re-evaluate this.