The Thomas Jefferson National Accelerator Facility, home of the Continuous Electron Beam Accelerator Facility, just upgraded its main data-crunching supercomputer with 480 GPUs, making it approximately one million times faster.
Researchers at the lab took the same approach, only they applied it to nuclear physics.
They turbocharged 266 central processing units, the part of a computer that functions like a brain, with 480 graphics processing units. As a result, the system absorbs information 1 million times faster than a standard computer.
The system helps researchers better predict and analyze experiments performed at the lab’s Continuous Electron Beam Accelerator Facility. The beam, like the Large Hadron Collider in Switzerland, smashes atoms together to decipher quarks, dark matter and other particles that make up the basic building blocks of matter.
The work was funded by $5 million from the American Recovery and Reinvestment Act (ARRA, which always annoys me a bit), and was done with NVIDIA graphics cards.
However, I must report that I don't entirely trust the numbers in the report. They talk about 480 graphics units, but the cards in the system are GTX 480s (from the photo above), so someone may have confused the model number with a count. Also, the "1 million times faster" figure seems to be a comparison against a single desktop, which is a relatively useless benchmark. More useful would be a comparison of the system's performance before and after the GPUs were installed.
via Video game parts help build supercomputer at Jefferson Lab – WTKR.
@Chip Watson: Awesome, thanks for providing the details!
Yes, these are GTX 480s, which sustain >250 GFlops on our numerical kernel (inverting a very large matrix of complex numbers). Typical jobs use 4 GPUs in one box (sustaining over 1 teraflops), and some jobs use multiple boxes connected with 40 Gbps InfiniBand. Yes, the comparison is to a not-so-impressive desktop and thus slightly hyperbolic.

For technical types: the algorithm actually uses half of single precision plus data compression to squeeze as much performance as possible out of the memory bandwidth (wasting flops, which are abundant). A quad-GPU server outperforms a cluster of 50 modern dual-Xeon servers (the GPU host in this case) for a mere 50% increase in cost.

While most of our heavy lifting can run on the gaming cards (we have a quick check for correctness), 20% of our cards are C2050s (which are slower for our main kernel), for calculations in which ECC is essential or where double precision is essential.
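The trade-off Chip describes, spending cheap flops in reduced precision to save scarce memory bandwidth, is commonly realized as mixed-precision iterative refinement. Here is a minimal sketch of that general idea in Python/NumPy; it is my illustration of the technique, not Jefferson Lab's actual kernel. Since NumPy has no half-precision solver, `complex64` stands in for the "cheap" precision, and the function name `mixed_precision_solve` is my own invention:

```python
import numpy as np

def mixed_precision_solve(A, b, iters=5):
    """Solve A x = b: cheap low-precision solves, double-precision residuals.

    The expensive inner solve runs on a complex64 copy of A (halving the
    memory traffic), while the residual b - A x is accumulated in complex128
    so the final answer recovers full double-precision accuracy.
    """
    A32 = A.astype(np.complex64)   # low-precision copy: the fast path
    x = np.zeros_like(b)
    for _ in range(iters):
        r = b - A @ x              # residual in full (double) precision
        # cheap correction step in reduced precision
        dx = np.linalg.solve(A32, r.astype(np.complex64))
        x = x + dx.astype(np.complex128)
    return x

# Demo on a random, well-conditioned complex system
rng = np.random.default_rng(0)
n = 64
A = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
     + n * np.eye(n))             # diagonal shift keeps it well-conditioned
b = rng.standard_normal(n) + 1j * rng.standard_normal(n)
x = mixed_precision_solve(A, b)
print(np.linalg.norm(b - A @ x))  # residual near double-precision accuracy
```

On a GPU the payoff is larger than this toy suggests: the matrix is streamed from device memory at half the byte count, which is exactly the bandwidth saving Chip describes.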