More details on NVIDIA’s GT300
It seems the GT300 news was actually under embargo until the GTC keynote was finished (guess BSN decided to ignore it, oh well), and now that it’s over the information is pouring out. One great resource is an article from John West over at InsideHPC touting the new HPC-centric features of the design.
The new design gives each Streaming Multiprocessor a dedicated 64KB L1 cache (GPU cores are organized hierarchically into “Streaming Multiprocessors,” or SMs: 32 cores form an SM, and there are 16 SMs on a board, for 512 cores in total), plus a 768KB L2 cache shared among all SMs. NVIDIA calls this the “Parallel DataCache” hierarchy. Sumit Gupta, senior manager in the Tesla GPU Computing group, says the feature matters not only for sparse matrix and physics calculations (the latter relevant to gaming) but also for traditional graphics workloads like ray tracing. Application engineers porting code from CPUs should now find a much more familiar programming environment.
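To see why a cache hierarchy helps sparse matrix work in particular, consider a standard CSR sparse matrix-vector multiply kernel (a minimal sketch, not code from the article; the kernel and parameter names are hypothetical). The gather loads through `col[j]` hit essentially random locations in the input vector, which a hardware L1/L2 hierarchy can absorb, whereas earlier cache-less GPUs had to service them from DRAM or route them through the texture path by hand.

```cuda
// Sparse matrix-vector multiply y = A*x, with A in CSR format.
// One thread per matrix row -- the simplest CSR SpMV mapping.
__global__ void spmv_csr(int n_rows,
                         const int   *row_ptr,  // n_rows + 1 entries
                         const int   *col,      // column index of each nonzero
                         const float *val,      // value of each nonzero
                         const float *x,        // dense input vector
                         float       *y)        // dense output vector
{
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < n_rows) {
        float sum = 0.0f;
        // The loads x[col[j]] are data-dependent gathers with no
        // regular stride; this is the access pattern the on-chip
        // L1/L2 caches are meant to absorb.
        for (int j = row_ptr[row]; j < row_ptr[row + 1]; ++j)
            sum += val[j] * x[col[j]];
        y[row] = sum;
    }
}
```

Nothing in the kernel changes to exploit the new hardware: the same straightforward CUDA C that performed poorly on cache-less parts simply benefits from reuse in L1 and L2, which is the “familiar programming environment” point above.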