A new whitepaper from Intel brings in statistics and stories from Luxology, Luxion, and Modo on the power of CPUs for ray tracing, and how CPU-only solutions can smoke any GPU on the market.
“Modern GPUs offer a brute force solution to ray tracing, but the memory available to GPUs is relatively limited compared to the system memory available to 64-bit CPUs such as Intel Core i7 and Xeon processors. That means that GPUs typically can’t handle the huge scene files required in full-scale production rendering, which may involve tens of millions of polygons and hundreds of high-resolution texture maps. And CPUs offer greater flexibility in terms of shading complexity and plug-in shaders, which may or may not have been ported to run on a GPU.”
These are the same arguments I’ve been hearing for the last year or so. And I have to admit they’re right, if a bit short-sighted. It’s my belief that most of the arguments they use are going to fall apart soon.
- They always talk about the power of Moore’s law in CPUs. Well, that same law applies to GPUs too; they’re going to get faster just like CPUs will. Even more so, most likely, as GPU makers not only optimize individual cores but add more cores at a rate that outpaces CPUs.
- They always talk about memory limitations. There was a time when CPUs had rather restrictive memory limitations (remember the fabled “640K is enough for anyone” comment?). GPUs will continue to grow in memory. In fact, Sandy Bridge and Fusion offer the first step toward eliminating the distinction between GPU and CPU memory.
- They always talk about the limited instruction set. This one isn’t likely to change, and will always be a hindrance to GPU computing. However, newer algorithms arrive at a steady pace showing that you don’t really need the complex branching mechanisms of CPUs, since the GPU has enough horsepower to just compute both sides of a condition and drop the unnecessary one.
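The “compute both sides and drop the unnecessary one” trick in that last bullet can be sketched in plain C. This is a hypothetical illustration (the threshold and the two arithmetic operations are invented for the example); real GPU hardware does the equivalent with per-lane predication:

```c
#include <assert.h>
#include <stdint.h>

/* Instead of branching per element (which causes divergence on
 * SIMD/GPU hardware), compute BOTH sides of the condition for every
 * element and keep the needed result with a bitmask. */
static void select_branchless(const int32_t *x, int32_t *out, int n) {
    for (int i = 0; i < n; i++) {
        int32_t if_true  = x[i] * 2;    /* "then" side, always computed */
        int32_t if_false = x[i] + 100;  /* "else" side, always computed */
        /* mask is all-ones when x[i] > 10, all-zeros otherwise */
        int32_t mask = -(int32_t)(x[i] > 10);
        out[i] = (if_true & mask) | (if_false & ~mask);
    }
}
```

Both sides cost arithmetic on every element, but no element ever takes a different control path, which is exactly the trade-off the bullet describes.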
In fact, I think within the next 5 years we may see the distinction between CPU and GPU disappear almost entirely, as they both wind up on the same die (similar to how Processor and Math Co-Processor eventually merged several years ago).
It’s a good whitepaper though, full of concrete numbers on unsuccessful attempts to GPU-ize code and on the benefits achieved from some of Intel’s newest CPU-optimization technology.
Check it out and see what you think.
> a pure FLOPS/Watt or FLOPS/$ basis,
There is no such thing as “pure FLOPS” outside of marketing hype. Any performance benchmark number ALWAYS reflects the _specific_ algorithm used to obtain it; the “numbers” are meaningful only in the context of the algorithms applied. For example: a simultaneous MUL_ADD applied across a wide array of data is one such SIMD-friendly algorithm, but once the code path depends on an intermediate result in the data array, the performance of a SIMD machine drops to its scalar speed, effectively slowing down proportionally to the width of the SIMD unit.
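The commenter’s contrast can be made concrete with a minimal C sketch (function names are my own): the first loop’s iterations are independent, so a vectorizing compiler can map the multiply-add across SIMD lanes; in the second, the branch depends on an intermediate result (the running accumulator), so iterations cannot be spread across lanes and execution is effectively scalar.

```c
#include <assert.h>
#include <stddef.h>

/* SIMD-friendly: each iteration is independent, so a compiler can
 * map the multiply-add across vector lanes. */
static void mul_add(const float *a, const float *b, float *c, size_t n) {
    for (size_t i = 0; i < n; i++)
        c[i] = a[i] * b[i] + c[i];
}

/* SIMD-hostile: the branch depends on a running intermediate result
 * (the accumulator), creating a loop-carried dependence. */
static float conditional_sum(const float *a, size_t n) {
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++) {
        if (acc < 100.0f)   /* code path depends on intermediate result */
            acc += a[i] * 2.0f;
        else
            acc += a[i];
    }
    return acc;
}
```

A peak-FLOPS figure quoted from a loop like `mul_add` says nothing about how the same hardware fares on `conditional_sum`, which is the point being made.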
@ Stefan
If MIMD were able to outperform SIMD on a pure FLOPS/Watt or FLOPS/$ basis, it would be a marketing issue; but as it stands now, GPUs are able to outperform CPUs in tasks that are highly parallel (and for graphics applications, that’s fairly common).
The major limitation of the GPU is its SIMD nature; it’s remarkable how rarely such a basic and fundamental handicap is pointed out. The GPU is only good for data parallelism – a single instruction applied across a wide array of numbers; the “thousands of threads” in a GPU are total marketing BS. Massive MIMD is the future, and the multi-core CPU race leads there…
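A rough C sketch of the distinction being drawn (the specific operations are invented for illustration): data parallelism applies one operation across a whole array, while MIMD-style task parallelism runs independent, *different* instruction streams concurrently, shown here with POSIX threads.

```c
#include <assert.h>
#include <pthread.h>

/* Data parallelism (the SIMD model): one operation, many data. */
static void scale_all(int *v, int n, int k) {
    for (int i = 0; i < n; i++)
        v[i] *= k;
}

/* Task parallelism (the MIMD model): each thread executes its own
 * instruction stream at the same time. */
static void *count_up(void *p) {
    int *x = p;
    for (int i = 0; i < 1000; i++)
        (*x)++;
    return NULL;
}

static void *square(void *p) {
    int *x = p;
    *x = *x * *x;
    return NULL;
}

static void run_mimd(int *a, int *b) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, count_up, a);  /* one instruction stream */
    pthread_create(&t2, NULL, square, b);    /* a different one, concurrently */
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
}
```

A SIMD machine excels at `scale_all`; `run_mimd` needs independent control flow per worker, which is what the comment means by massive MIMD.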
It’s neat. I can’t find reporting on this article anywhere else on the web. So Larrabee is still secretly trying to churn out graphics instead of focusing on HPC?