Last time we talked about Jack Pien, he was trying out AMD’s OpenCL CPU drivers on Intel processors, with surprising results. He’s back with another great test, pitting OpenCL against OpenMP on 2D convolution, a mainstay of image and signal processing.

In this post, I will compare OpenCL with OpenMP performance for the same convolution configuration. I did make one minor tweak in both the CPU reference implementation and the CL kernel code. AMD’s convolution sample has an outer loop that walks in x-steps and an inner loop that walks in y-steps. For untiled images stored in row-major order, this is a big no-no for cache locality. The inner loop should iterate in x-steps.
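To make the loop-order point concrete, here is a minimal sketch of a row-major 2D convolution with the x loop innermost and the rows split across OpenMP threads. It is not AMD's sample or Jack's code: the function name `convolve2d_rowmajor`, its parameters, and the choice to compute only the "valid" region (no border padding, no mask flip) are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper for illustration, not taken from AMD's sample.
 * Convolves a row-major image with a row-major mask and writes only the
 * "valid" region, so the output is (width - mask_w + 1) by
 * (height - mask_h + 1). The mask is applied without flipping. */
void convolve2d_rowmajor(const float *in, float *out,
                         int width, int height,
                         const float *mask, int mask_w, int mask_h)
{
    assert(mask_w > 0 && mask_h > 0);
    assert(mask_w <= width && mask_h <= height);

    int out_w = width - mask_w + 1;
    int out_h = height - mask_h + 1;

    /* Output rows are independent, so they can be divided among threads.
     * Without OpenMP enabled the pragma is ignored and the loop runs serially. */
    #pragma omp parallel for
    for (int y = 0; y < out_h; ++y) {
        /* x is the inner image loop: neighbouring iterations touch
         * neighbouring addresses in both the input and the output. */
        for (int x = 0; x < out_w; ++x) {
            float sum = 0.0f;
            for (int my = 0; my < mask_h; ++my) {
                for (int mx = 0; mx < mask_w; ++mx) {
                    sum += in[(size_t)(y + my) * width + (x + mx)]
                         * mask[my * mask_w + mx];
                }
            }
            out[(size_t)y * out_w + x] = sum;
        }
    }
}
```

Swapping the `y` and `x` loops would give the same output, but each inner step would then jump a whole row ahead in memory, which is the access pattern the tweak avoids. With GCC or Clang, OpenMP is typically enabled with `-fopenmp`.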

via Jack Pien » Blog Archive » AMD’s x86 OpenCL versus OpenMP.