Nvidia has been in the news lately with the release of their GeForce GTX 480 and 470 graphics cards. One of the things that we have said about these cards is that they seem to be designed more for general-purpose computing on graphics processing units (GPGPU) than for gaming. Sure, they perform well at gaming, but they rock at GPGPU.
Nvidia has posted an article on using GPGPU on their graphics cards to speed up weather forecasting. Imagine what they could have done with the latest GeForce GTX 480 cards. Now if they could only make accurate weather forecasts.
A research group led by Professor Takayuki Aoki of the Tokyo Institute of Technology has succeeded in 100% utilization of GPUs in the next-generation weather forecasting model, codenamed ASUCA, currently being developed by the Japan Meteorological Agency. ASUCA has a similar feature set to WRF, but because it is fully GPU-optimized, ASUCA runs 80 times faster than weather models running on CPUs alone or on CPU/GPU combinations. In short, it is the fastest solution available today.
Thanks to NVIDIA Tesla and CUDA parallel processing architecture, ASUCA simulates a 6-hour event (with 2 km mesh size in a 3164x3028x48 grid) in 70 minutes on 120 GPUs, a calculation that would have taken 5600 minutes using CPUs.
via nTersect Blog – Tokyo Tech Weather Forecasting Model Gets 80X Perf Boost Through GPU Acceleration.
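As a quick sanity check on the figures quoted above, the arithmetic can be sketched in a few lines of Python. This is only a back-of-the-envelope calculation using the numbers from the quote (5600 CPU minutes, 70 GPU minutes, 120 GPUs, a 3164x3028x48 grid). The variable names are ours, and the even split of grid cells across GPUs is an assumption for illustration, not a description of how ASUCA actually decomposes its domain.

```python
# Figures taken from the quoted NVIDIA blog post.
cpu_minutes = 5600
gpu_minutes = 70
num_gpus = 120
grid = (3164, 3028, 48)

# Ratio of the quoted CPU run time to the quoted GPU run time.
speedup = cpu_minutes / gpu_minutes

# Total number of cells in the simulation grid.
total_cells = grid[0] * grid[1] * grid[2]

# Assumption: cells are divided evenly across GPUs (illustrative only).
cells_per_gpu = total_cells / num_gpus

print(f"speedup: {speedup:.0f}x")
print(f"grid cells: {total_cells:,}")
print(f"cells per GPU (even split): {cells_per_gpu:,.0f}")
```

The ratio matches the 80X in the headline. Note that the quote does not say how many CPUs the 5600-minute figure refers to, so this is a comparison of wall-clock times rather than a per-chip comparison.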
@ Charlie Barkin
Charlie of S|A’s predictions have been uncannily accurate, given that NVIDIA has not been forthcoming about Fermi development and specs. Most unbiased observers realised something was not right with Fermi after woodscrewgate at GTC. In fact NVIDIA has been spinning one line to the public whilst telling its trusted partners another, and then delivering something else entirely. As Fermi’s specs have been scaled back in order to bring a functioning product to market, Charlie’s reports have reflected these changes.
Charlie of SemiAccurate predicted that Fermi would be a failure for NVIDIA and he was right. NVIDIA originally planned for Fermi to be a 512SP part. Instead, they had to drastically scale back their ambitions because of the poor yields they got from TSMC, in part because of the large die size and poor design of Fermi, which lacks via redundancy. NVIDIA failed to heed TSMC’s design guidelines for its 40nm process, and has paid the price for its arrogance. In contrast, AMD did heed TSMC’s guidance, has received economic yields of functioning dies, and so brought 40nm products to market six months earlier than NVIDIA.
Re: Floorsweeping.
You presumably mean chip performance testing and binning. All manufacturers do this. But the problem for NVIDIA is that when you have next to no functioning dies from a wafer, no matter how much you bin, a fraction of next-to-nothing is still almost nil. NVIDIA’s yields of functioning dies, never mind 480SP parts, are so abysmal that it makes no difference. To get any functioning dies in sufficient quantity to bring a product to market, they have had to up the voltages, power and clock frequencies to the silly values we have today, hence the abysmal power consumption and heat output of Fermi parts. S|A sources indicate that yields of functioning Fermi parts are in single figures. This makes the product uneconomic to manufacture. Since all the parts that are being brought to market are from the risk wafers, sources indicate that there are unlikely to be any production wafers ordered.
OpenCL/GPGPU use:
Sources suggest that things will soon change on this front wrt AMD.
@ Nick
Nick, when has SemiAccurate ever been correct? Charlie @ SemiAccurate ‘predicted’ that the highest performing GTX 480 would have 448SP. He was wrong, the highest performing GTX 480 has 480SP. And frankly, there’s nothing wrong with floorsweeping 6.25% of a chip if it means better yields and lower prices for consumers. NVIDIA, ATI/AMD, and Intel all practice this technique.
Also, NVIDIA has embraced OpenCL much more than ATI both with drivers and hardware. Even so, CUDA has features which simply aren’t available in OpenCL, so some types of computing problems aren’t easily portable to OpenCL without a performance hit.
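For readers following the SP counts in this thread, the 6.25% figure can be reproduced with a small sketch. It assumes the 512SP full design mentioned in the earlier comment as the baseline; the function and constant names are hypothetical and exist only for this illustration.

```python
def disabled_fraction(full_sp: int, shipped_sp: int) -> float:
    """Fraction of stream processors disabled relative to the full design."""
    return (full_sp - shipped_sp) / full_sp


# Baseline: the originally planned 512SP part, as stated in the thread.
FULL_FERMI_SP = 512

# Compare the shipped 480SP part with the 448SP figure cited above.
for shipped in (480, 448):
    pct = disabled_fraction(FULL_FERMI_SP, shipped) * 100
    print(f"{shipped}SP part: {pct:.2f}% of SPs disabled")
```

A 480SP part has 32 of 512 SPs disabled, which is where the 6.25% comes from; a 448SP part would have twice that fraction disabled.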
“Imagine what they could have done with the latest GeForce GTX 480 cards.”
Here, “Imagine” and “Could” are very much the operative words.
If the reports on SemiAccurate are correct (and they have been so far) then Fermi GTX 480 cards will only be available in homeopathic quantities, and 512SP Fermi Tesla cards will be like hen’s teeth, if they ever go into production. It now looks as if Tokyo Tech, like many others, has fallen for NVIDIA’s PR BS and so has hitched its wagon to the wrong horse. This illustrates the danger of relying on one brand of hardware and a closed-source language that can only be used to program that brand’s hardware.
Expect Tokyo Tech to port ASUCA to OpenCL/CAL/IL on more performant and available ATI GPUs as soon as they realise there will be next-to-no Fermi GPU parts coming from NVIDIA.