While at GTC, I managed to slip off the grid for a few hours and head to the other camp, the realm where GPUs do not reign supreme and companies dedicate themselves to squeezing every clock cycle of performance from their algorithms.  In a nondescript office building, I met with Fovia’s CEO Kenneth Fineman and President & CTO George Buyanovsky for a demonstration of their ‘High Definition Volume Rendering®’ product, and I have to admit I’m impressed.

The product is essentially an SDK, or library, for integrating high-speed, high-quality volume rendering into other applications, and as such they’ve already got an impressive customer list: biomedical imaging companies like GE and Pfizer, dental imaging companies like 3M and iDent, and classic standbys of this kind of technology like NASA and the US military.  Running entirely on the CPU, their test bed application was demonstrated for me on an 8-core system (with Hyper-Threading enabled for 16 logical cores), showing various biomedical datasets on a 1920×1080 display, nearly fullscreen.  The visuals were beautiful, running anywhere from 8 to 30 fps depending on the number of concurrently running clients and the rendering complexity.

Fovia was founded by Ken and George, both former employees of a high-end computer graphics card manufacturer, back in 2003.  Dissatisfied with the prevailing “put the graphics in hardware” designs from their employer and the likes of NVIDIA and ATI, they dedicated themselves to demonstrating that the same work could be done on the CPU, and done faster and better.  The result is the HDVR® product they now license.  They make use of the most modern instruction sets for high-speed vector computation and of parallelism across cores.  Once liberated from the restrictive instruction sets of most current GPU designs, they were able to create vastly more complex visualizations using adaptive ray sampling, adaptive step sizes, and many other optimizations that are not easily implemented in GPU algorithms.

Read more about Fovia & HDVR® after the break.
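
Fovia hasn’t published the details of their algorithm, but to give a flavor of what “adaptive step sizes” means in a CPU raycaster, here is a rough sketch of the general technique.  The thresholds, names, and the toy spherical dataset below are mine, not theirs:

```cpp
// Adaptive step-size ray marching, the rough idea behind "adaptive sampling":
// take large steps through empty or uniform regions, small steps where the
// data changes quickly, and stop early once the ray is effectively opaque.
// Illustrative sketch only -- not Fovia's implementation.
#include <algorithm>
#include <cmath>
#include <cstdio>

// Synthetic stand-in for a real dataset: density falls off from a unit sphere.
static float sampleVolume(float x, float y, float z) {
    float r = std::sqrt(x * x + y * y + z * z);
    return std::max(0.0f, 1.0f - r);          // 1 at the center, 0 outside r = 1
}

static float marchRay(float ox, float oy, float oz,   // ray origin
                      float dx, float dy, float dz,   // unit direction
                      float tMax) {
    float t = 0.0f, opacity = 0.0f;
    float step = 0.25f;                                // coarse initial step
    const float minStep = 0.01f, maxStep = 0.25f;
    float prev = sampleVolume(ox, oy, oz);

    while (t < tMax && opacity < 0.99f) {              // early ray termination
        t += step;
        float cur = sampleVolume(ox + t * dx, oy + t * dy, oz + t * dz);
        float change = std::fabs(cur - prev);
        if (change > 0.05f)       step = std::max(minStep, step * 0.5f);  // refine
        else if (change < 0.005f) step = std::min(maxStep, step * 2.0f);  // coarsen
        opacity += (1.0f - opacity) * cur * step;      // front-to-back accumulation
        prev = cur;
    }
    return opacity;
}

int main() {
    // One ray shot straight through the synthetic sphere.
    std::printf("accumulated opacity: %.3f\n", marchRay(-2, 0, 0, 1, 0, 0, 4.0f));
}
```

On a GPU of that era, branching per ray like this tended to wreck performance; on a CPU it is just an if-statement, which is the heart of Fovia’s argument.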


During the course of the 90-minute demonstration, I saw several datasets of varying sizes loaded and visualized interactively.  The largest was a full 2,300-slice CT scan.  They state the HDVR product can interactively render up to a 4K cube given sufficient memory, but the 12 GB of RAM in the demo system would max out around a 1.8K cube, which is still well beyond most scanners.  The transfer functions (of which they can support eight simultaneously) can easily be adjusted interactively to define colors and opacity, as well as faux lighting models.  Their transfer function support made it trivial to visualize both bone and muscle at the same time, with lighting on the well-defined bones and no lighting on the rough muscle.
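
To make the transfer-function idea concrete, here is a toy sketch of how ranges of raw scan values might be mapped to color, opacity, and a per-material lighting flag, so that lit bone and unlit muscle can be shown from the same CT volume at once.  The value ranges, struct names, and numbers are hypothetical; this is not Fovia’s API:

```cpp
// Toy transfer function: map raw scan values to color, opacity, and a
// per-material lighting toggle. Ranges and numbers are invented for the example.
#include <cstdint>
#include <vector>

struct Material {
    uint16_t lo, hi;      // range of raw CT values this entry applies to
    float r, g, b, a;     // color and opacity assigned to that range
    bool lit;             // apply the lighting model to this material?
};

struct Sample { float r, g, b, a; bool lit; };

// Classify one voxel value against an ordered list of materials.
static Sample classify(uint16_t value, const std::vector<Material>& tf) {
    for (const Material& m : tf)
        if (value >= m.lo && value <= m.hi)
            return {m.r, m.g, m.b, m.a, m.lit};
    return {0, 0, 0, 0, false};               // outside every range: invisible
}

int main() {
    // Two-entry transfer function: semi-transparent unlit "muscle", opaque lit "bone".
    std::vector<Material> tf = {
        {200,  800, 0.80f, 0.30f, 0.30f, 0.15f, false},  // soft tissue, no lighting
        {900, 3000, 0.95f, 0.92f, 0.85f, 1.00f, true},   // bone, with lighting
    };
    Sample s = classify(1200, tf);            // a voxel falling in the "bone" range
    (void)s;
}
```

Adjusting the transfer function interactively just means editing that table and re-rendering, which is why the demo could flip between bone-only and bone-plus-muscle views so quickly.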

As the demo went on, they showed off a client-server model they support, in which a single Windows server connected to two MacBook Air laptops over a simple 802.11n wireless network with full interactivity.  (They told me they support any combination of Windows, Linux, and Mac on both the client and server side.)  Honestly, you would never know it was being rendered remotely, as both laptops were just as interactive as the local demonstrations.  Each of the two laptops loaded a different dataset and interacted with it independently; however, they stated that the same system could also be used for collaborative work on a single dataset.
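
For the curious, here is a bare-bones illustration of why this kind of thin-client setup works over plain 802.11n: the client only ever sends tiny interaction messages, and the server sends back finished 2D frames, so bandwidth needs are closer to streaming video than to shipping the volume around.  The message layouts and names below are my own invention, not Fovia’s wire format, and the network is faked with an in-memory stream to keep the sketch self-contained:

```cpp
// Conceptual sketch of a thin-client remote rendering round trip.
#include <cstdint>
#include <sstream>
#include <vector>

struct CameraUpdate {           // client -> server: a few dozen bytes per interaction
    float yaw, pitch, zoom;
};

struct FramePacket {            // server -> client: one rendered image
    uint32_t width, height;
    std::vector<uint8_t> rgb;   // in practice this would be compressed
};

// "Server side": render a frame for the requested camera (stub renderer).
static FramePacket renderForCamera(const CameraUpdate& cam) {
    FramePacket f{320, 240, {}};
    f.rgb.assign(size_t(f.width) * f.height * 3,
                 uint8_t(int(cam.zoom * 10) % 256));   // placeholder pixel data
    return f;
}

int main() {
    // Simulate the round trip through an in-memory "wire" instead of a socket.
    std::stringstream wire;
    CameraUpdate cam{0.3f, -0.1f, 2.5f};
    wire.write(reinterpret_cast<const char*>(&cam), sizeof cam);     // client sends

    CameraUpdate received{};
    wire.read(reinterpret_cast<char*>(&received), sizeof received);  // server reads
    FramePacket frame = renderForCamera(received);                   // server renders
    (void)frame;  // client would decode and display this frame
}
```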

Toward the end, they showed some of the other features they’ve integrated, such as support for rendering polygonal geometry.  This could hold special interest for visualization scientists who are currently running into problems with the large geometric models produced by the classic isosurface algorithms on large data.  With Fovia’s raycasting system, the resulting framerate depends on frame size (and ray detail), not on geometry size, so you can load some truly huge models and interact with them with ease.  Also, thanks to their algorithm, you can render semitransparent geometry, or mix polygonal geometry with volumetric data, with no performance penalty, unlike most GPU solutions, which require depth-sorting or depth-peeling algorithms.  Currently, most large-scale visualization systems (like VisIt, ParaView, and EnSight) try to do this with polygonal rendering systems like Mesa plus frame compositing, but raycasting solutions are far simpler to parallelize (each ray can be run independently) and typically produce higher-quality visuals anyway.  Fovia has a leg up on the competition in this regard; however, they do require that the entire model fit in RAM on a single machine, which is a deal-killer for the extreme-scale visualization coming out of DoD and DoE systems.  People working in more typical environments, though, might find it incredibly useful.
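
Here is a small sketch of the front-to-back compositing that makes the transparency claim work.  Samples are consumed in the order the ray encounters them, so a translucent polygon hit simply slots in among the volume samples at its distance, and the per-pixel cost tracks image size and sample count rather than triangle count.  Again, this illustrates the general technique, not Fovia’s code:

```cpp
// Front-to-back compositing along one ray: volume steps and surface hits are
// blended in encounter order, so no global depth sort or depth peeling is needed.
#include <vector>

struct RaySample { float t; float r, g, b, a; };  // distance along ray + color/opacity

static void compositeFrontToBack(const std::vector<RaySample>& samples,
                                 float& r, float& g, float& b, float& a) {
    r = g = b = a = 0.0f;
    for (const RaySample& s : samples) {
        float w = (1.0f - a) * s.a;     // weight left over after what is in front
        r += w * s.r;  g += w * s.g;  b += w * s.b;  a += w;
        if (a > 0.99f) break;           // early termination: pixel is opaque
    }
}

int main() {
    // Samples arrive already ordered along the ray as it marches forward,
    // so a semi-transparent polygon hit just slots in at its distance t.
    std::vector<RaySample> ray = {
        {1.0f, 0.9f, 0.2f, 0.2f, 0.10f},   // volume step
        {1.4f, 0.2f, 0.2f, 0.9f, 0.50f},   // semi-transparent polygon hit
        {1.8f, 0.9f, 0.2f, 0.2f, 0.10f},   // volume step behind the surface
    };
    float r, g, b, a;
    compositeFrontToBack(ray, r, g, b, a);
    (void)r; (void)g; (void)b; (void)a;
}
```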

The demo is impressive, and the technology is already integrated into various medical imaging machines.  It seems a perfect fit, as embedded computers are more common and easier to work with than embedded graphics chips, and the performance makes near-immediate visualization of the captured scans possible.  However, there are a few points that I feel compelled to make:

  • Fovia is very proud (and rightfully so) of the speed and detail of their solution.  While GPU solutions can probably match them on speed, those solutions typically fall short on detail.  It’s easy to load a dataset as a 3D texture in video memory and render it in hardware, but the result is not going to look as good as the Fovia solution: a GPU could probably win on raw speed, yet lose on the resulting detail.  GPU solutions are improving, however, and in fact a few people were demonstrating such technology at GTC last week.  See, for example, the poster from Harvard’s School of Engineering & Applied Sciences that claims interactive visualization of a 92 GB EM dataset across one to eight Tesla nodes.
  • Fovia (correctly) claims that GPUs cannot hold datasets of this size.  I have to give them this one: you’re not going to hold a 4K cube in GPU memory as-is (4K cubed is roughly 68 gigavoxels; see the sizing sketch after this list).  However, where do you get a 4K cube?  Most imaging systems work at significantly smaller scales, and Fovia admits that their ‘large’ datasets are either non-medical or come from post-processed and stitched scans.  That said, more systems are supporting “4D scans”, i.e. time-varying 3D scans, which simply multiply the data size, so spatially small datasets can still occupy large amounts of disk space and memory.  In addition, if you maintain any ‘multidimensional’ data (such as a gradient volume or another derived dataset), you wind up doubling or tripling your dataset size.
  • Fovia leans heavily on the claim that adaptive sampling can’t be done effectively on the GPU.  This one is iffy, at best.  No doubt, programming the GPU is hard, and such adaptive methods are that much harder; however, it has been done before using different techniques.  At GTC, a few people discussed adaptive methods using ray bundling and multipass techniques, but they are still very early in development and their full implications aren’t yet understood.
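
To put some rough numbers behind the sizes in that list, here is a quick back-of-the-envelope calculation, assuming 16-bit samples (typical for CT raw data); time steps or derived volumes like gradients multiply these figures directly:

```cpp
// Back-of-the-envelope voxel counts and raw sizes for the cube sizes discussed above.
#include <cstdint>
#include <cstdio>

int main() {
    const uint64_t bytesPerVoxel = 2;                    // 16-bit raw samples (assumed)
    const uint64_t sizes[] = {512, 1024, 1843, 4096};    // cube edge length in voxels
    for (uint64_t n : sizes) {
        uint64_t voxels = n * n * n;
        double gb = double(voxels * bytesPerVoxel) / 1e9;
        std::printf("%4llu^3 cube: %6.2f gigavoxels, %7.1f GB raw\n",
                    (unsigned long long)n, voxels / 1e9, gb);
    }
    // 4096^3 is ~68.7 gigavoxels (~137 GB at 16 bits), far beyond any single GPU today,
    // while a ~1.8K cube (~12.5 GB) is roughly what fits in the demo box's 12 GB of RAM.
    return 0;
}
```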

But even with that, Fovia has a few huge advantages:

  • Perhaps the biggest is cost.  Sure, Harvard demonstrated a system that could deliver visualization of similar quality and performance on the GPU, but it took eight Teslas to do it.  Fovia does it with two ordinary CPUs.
  • GPU clusters are rare in the field and bring maintenance and upkeep headaches.  Airflow problems, hardware failures, and power consumption are all issues when dealing with GPU clusters of any significant size.
  • Computational clusters are everywhere and can be used for a wide variety of purposes.  Take that big accounting cluster that’s only used one day a week, or one week a month, for billing, and use it for big HDVR visualizations the rest of the time.
  • Also, while I was unable to test or verify this, they claim the system scales almost purely linearly: double your CPU core count, double your performance (a rough sketch of why that decomposition works so well appears below).  This is great, but their solution is currently limited to what you can pack into a single box.
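
For a sense of why near-linear scaling is plausible, consider how naturally a raycaster decomposes across cores: every ray is independent, so threads never contend for the same pixel and there is essentially no synchronization until the frame is done.  The sketch below is my own illustration of that decomposition, not Fovia’s scheduler:

```cpp
// Embarrassingly parallel image decomposition: each thread owns an interleaved
// set of rows, so doubling the thread count roughly halves the render time
// until memory bandwidth becomes the bottleneck.
#include <cstdio>
#include <thread>
#include <vector>

static float shadePixel(int x, int y) {              // stand-in for casting one ray
    return float((x * 31 + y * 17) % 255) / 255.0f;
}

int main() {
    const int width = 1920, height = 1080;
    std::vector<float> image(size_t(width) * height);
    unsigned cores = std::max(1u, std::thread::hardware_concurrency());

    std::vector<std::thread> pool;
    for (unsigned c = 0; c < cores; ++c) {
        pool.emplace_back([&, c] {
            // No locking needed: no two threads ever touch the same pixel.
            for (int y = int(c); y < height; y += int(cores))
                for (int x = 0; x < width; ++x)
                    image[size_t(y) * width + x] = shadePixel(x, y);
        });
    }
    for (std::thread& t : pool) t.join();
    std::printf("rendered %dx%d on %u threads\n", width, height, cores);
}
```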

And finally, the most important point from a business perspective: it’s available today.  The Harvard system is a great proof of concept, but it’s still in the research stages.  Several of the other GPU solutions out there are similar: conceptually sound, but still “in the lab” and not ready for production.  Fovia’s solution is available, and actively in use by multiple customers in a variety of settings today.

Fovia is a company to keep an eye on.  Their CPU-only solution is impressive, and it will be interesting to see how they respond to increasing market pressure to improve their system as more GPU-accelerated competitors come online.  GPU algorithms and development tools will improve (Parallel Nsight will be a huge help there), and GPU memory sizes and performance will improve, but of course so will CPUs.  The impressive parallelism they can demonstrate right now gives them an early start on their competition, and I can’t wait to see where they go with it to maintain their edge.

If you want to know more about Fovia: