Image gradient with Sobel operator, OpenCV 3.X and CUDA

This prototype tests different implementations of the image gradient using the Sobel operator, C++, CUDA and OpenCV 3.X.

An image gradient is a directional change in the intensity or color in an image. Image gradients may be used to extract information from images. In graphics software for digital image editing, the term gradient or color gradient is used for a gradual blend of color which can be considered as an even gradation from low to high values, as used from white to black in the images to the right. Another name for this is color progression.

Two types of gradients, with blue arrows to indicate the direction of the gradient. Dark areas indicate higher values

The Sobel operator, sometimes called the Sobel–Feldman operator or Sobel filter, is used in image processing and computer vision, particularly within edge detection algorithms where it creates an image emphasising edges. Technically, it is a discrete differentiation operator, computing an approximation of the gradient of the image intensity function. At each point in the image, the result of the Sobel–Feldman operator is either the corresponding gradient vector or the norm of this vector. The Sobel–Feldman operator is based on convolving the image with a small, separable, and integer-valued filter in the horizontal and vertical directions and is therefore relatively inexpensive in terms of computations. On the other hand, the gradient approximation that it produces is relatively crude, in particular for high-frequency variations in the image.

The operator uses two 3×3 kernels which are convolved with the original image to calculate approximations of the derivatives – one for horizontal changes, and one for vertical. If we define A as the source image, and Gx and Gy are two images which at each point contain the horizontal and vertical derivative approximations respectively, the computations are as follows:

 \mathbf{G}_x = \begin{bmatrix} -1 & 0 & +1 \\ -2 & 0 & +2 \\ -1 & 0 & +1 \end{bmatrix} * \mathbf{A} \quad \mbox{and} \quad \mathbf{G}_y = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ +1 & +2 & +1 \end{bmatrix} * \mathbf{A}

where * here denotes the 2-dimensional signal processing convolution operation.
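As a rough illustration of the formulas above, the following plain C++ sketch applies the two 3×3 kernels to a small grayscale image. It is not the code used in this prototype: the `Image` struct, the `apply3x3` helper and the replicate-border policy are assumptions made for the example. The kernel is applied without flipping (correlation, as OpenCV's filters do), so that Gx is positive where the intensity increases to the right.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical row-major grayscale image stored as floats.
struct Image {
    int width;
    int height;
    std::vector<float> data;

    // Read a pixel; out-of-range coordinates are clamped so that border
    // pixels are replicated (an assumed border policy).
    float at(int x, int y) const {
        x = x < 0 ? 0 : (x >= width ? width - 1 : x);
        y = y < 0 ? 0 : (y >= height ? height - 1 : y);
        return data[static_cast<std::size_t>(y) * width + x];
    }
};

// The two Sobel kernels from the equations above.
const float kSobelX[3][3] = {{-1.f, 0.f, 1.f}, {-2.f, 0.f, 2.f}, {-1.f, 0.f, 1.f}};
const float kSobelY[3][3] = {{-1.f, -2.f, -1.f}, {0.f, 0.f, 0.f}, {1.f, 2.f, 1.f}};

// Apply a 3x3 kernel at every pixel without flipping it (correlation).
Image apply3x3(const Image& src, const float (&k)[3][3]) {
    assert(src.data.size() == static_cast<std::size_t>(src.width) * src.height);
    Image dst{src.width, src.height, std::vector<float>(src.data.size(), 0.0f)};
    for (int y = 0; y < src.height; ++y) {
        for (int x = 0; x < src.width; ++x) {
            float sum = 0.0f;
            for (int j = -1; j <= 1; ++j)
                for (int i = -1; i <= 1; ++i)
                    sum += k[j + 1][i + 1] * src.at(x + i, y + j);
            dst.data[static_cast<std::size_t>(y) * src.width + x] = sum;
        }
    }
    return dst;
}
```

For an image whose left half is 0 and right half is 10, `apply3x3(img, kSobelX)` responds strongly along the vertical edge, while `apply3x3(img, kSobelY)` stays at 0 there.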

Since the Sobel kernels can be decomposed as the product of an averaging kernel and a differentiation kernel, they compute the gradient with smoothing. For example, the kernel used for \mathbf{G}_x can be written as

 \begin{bmatrix} -1 & 0 & +1 \\ -2 & 0 & +2 \\ -1 & 0 & +1 \end{bmatrix} = \begin{bmatrix} 1\\ 2\\ 1 \end{bmatrix} \begin{bmatrix} -1 & 0 & +1 \end{bmatrix}
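The decomposition can be checked numerically. The sketch below builds a 3×3 kernel as the outer product of a column vector and a row vector; the names `Kernel3x3` and `outerProduct` are invented for this example. In practice, separability means a 3×3 filter can be run as two 1D passes (6 multiplications per pixel instead of 9).

```cpp
#include <array>

using Kernel3x3 = std::array<std::array<int, 3>, 3>;

// Outer product of a column vector and a row vector:
// result[r][c] = column[r] * row[c].
Kernel3x3 outerProduct(const std::array<int, 3>& column,
                       const std::array<int, 3>& row) {
    Kernel3x3 k{};
    for (int r = 0; r < 3; ++r)
        for (int c = 0; c < 3; ++c)
            k[r][c] = column[r] * row[c];
    return k;
}
```

Calling `outerProduct({1, 2, 1}, {-1, 0, 1})` reproduces the horizontal kernel, and swapping the roles, `outerProduct({-1, 0, 1}, {1, 2, 1})`, reproduces the vertical one.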

The x-coordinate is defined here as increasing in the “right”-direction, and the y-coordinate is defined as increasing in the “down”-direction. At each point in the image, the resulting gradient approximations can be combined to give the gradient magnitude, using:

\mathbf{G} = \sqrt{ {\mathbf{G}_x}^2 + {\mathbf{G}_y}^2 }

Using this information, we can also calculate the gradient’s direction:

\mathbf{\Theta} = \operatorname{atan2}\left({ \mathbf{G}_y , \mathbf{G}_x }\right)

where, for example, Θ is 0 for a vertical edge which is lighter on the right side.
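A minimal sketch of how the two derivative values at one pixel can be combined into magnitude and direction, using only the C++ standard library. The `Gradient` struct and the `combine` function are hypothetical names for this example; the direction is returned in radians.

```cpp
#include <cmath>

// Hypothetical result type: gradient magnitude and direction (radians).
struct Gradient {
    float magnitude;
    float direction;
};

// Combine the horizontal and vertical derivative approximations of one pixel.
Gradient combine(float gx, float gy) {
    // std::hypot computes sqrt(gx^2 + gy^2) while avoiding intermediate overflow.
    return {std::hypot(gx, gy), std::atan2(gy, gx)};
}
```

With gx > 0 and gy = 0 (a vertical edge that is lighter on the right), the direction is 0, matching the convention stated above.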

Gradient results of different filters

Three different methods are compared to each other in this prototype:

  • OpenCV 3.x CPU-based method cv::Sobel from the imgproc module.
  • OpenCV 3.x GPU-based method cv::cuda::createSobelFilter from the cudafilters module.
  • A custom method, sobelFilterCuda(), implemented in CUDA to run in parallel on the GPU.

OpenCV CPU implementation:

In this first test, the CPU version from OpenCV cv::Sobel is tested.

OpenCV GPU implementation:

In this second test, the GPU version from OpenCV, cv::cuda::createSobelFilter, is tested.

Own CUDA implementation:

In this third test, the custom CUDA implementation is tested.

Execution times of the three implementations on simple_room-wallpaper-4096×3072.jpg (2.3 MB, 4096×3072 pixels):

It can be seen that the custom CUDA implementation is as fast as the OpenCV GPU version, while the OpenCV CPU version is at least 3 times slower.

The CUDA version implements several optimizations: shared memory for the input image combined with 16×16 tiling windows, coalesced memory access, and constant memory for the Gaussian and Sobel kernels so that they load quickly from the constant cache.
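The kernel source is not reproduced here, so the following is only a sketch of the arithmetic that a 16×16 tiling scheme typically involves: how many tiles cover the image, and how large the shared-memory tile becomes once a halo of `radius` pixels is added on each side so that border threads can read their neighbours. The `TileGeometry` struct, the `tileGeometry` function and the single-channel float pixel type are assumptions for this example.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical description of a tiled launch for a filter with a given radius.
struct TileGeometry {
    int blocksX;              // tiles needed to cover the image width
    int blocksY;              // tiles needed to cover the image height
    int sharedSide;           // tile side including the halo, in pixels
    std::size_t sharedBytes;  // bytes for one single-channel float tile
};

TileGeometry tileGeometry(int width, int height, int tile, int radius) {
    assert(width > 0 && height > 0 && tile > 0 && radius >= 0);
    const int side = tile + 2 * radius;  // halo on both sides
    return {(width + tile - 1) / tile,   // round up so partial tiles are covered
            (height + tile - 1) / tile,
            side,
            static_cast<std::size_t>(side) * side * sizeof(float)};
}
```

For the 4096×3072 test image and a 3×3 Sobel kernel (radius 1), this gives 256×192 tiles, each backed by an 18×18 shared tile.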

All three versions execute exactly the same steps with the same parameters: 1) Gaussian blurring of the color image using a radius of 3 and a delta of 1; 2) grayscale conversion; 3) computing the x and y gradients and merging them into the final output image.
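To make the first two steps more concrete, here is a small sketch of a normalized 1D Gaussian kernel and a per-pixel grayscale conversion. It assumes that the "delta" above refers to the standard deviation (sigma) of the Gaussian, which the text does not confirm. The function names are invented for this example, and the grayscale weights are the standard BT.601 luma weights, which OpenCV's BGR-to-gray conversion also uses.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Normalized 1D Gaussian weights for offsets -radius..+radius.
// Assumption: sigma plays the role of the "delta" parameter in the text.
std::vector<float> gaussianKernel1D(int radius, float sigma) {
    assert(radius >= 0 && sigma > 0.0f);
    std::vector<float> weights(2 * radius + 1);
    float sum = 0.0f;
    for (int i = -radius; i <= radius; ++i) {
        const float w = std::exp(-static_cast<float>(i * i) / (2.0f * sigma * sigma));
        weights[i + radius] = w;
        sum += w;
    }
    for (float& w : weights) w /= sum;  // make the weights sum to 1
    return weights;
}

// Grayscale value of one BGR pixel using BT.601 luma weights.
float toGray(float b, float g, float r) {
    return 0.114f * b + 0.587f * g + 0.299f * r;
}
```

`gaussianKernel1D(3, 1.0f)` yields seven symmetric weights peaking at the center. Because the Gaussian is separable, the same weights can be applied first along rows and then along columns.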


Profiling of the OpenCV GPU and custom CUDA implementations using the Nsight profiler:

The profiler also shows that the custom implementation uses around half of the total processing power.


Original image and generated gradient images:



Sobel filter OpenCV CPU


Scharr filter OpenCV CPU


Sobel filter OpenCV GPU


Sobel filter CUDA


The CUDA code with some optimizations is as fast as the built-in OpenCV GPU version, while the OpenCV CPU version is at least 3 times slower.


