May 16, 2023 by Klaus

Edge Detection for Image Processing

Many image processing tasks require edge detection at some point in the pipeline. Edge detection means identifying and highlighting the boundaries or transitions between regions of an image, based on differences in brightness or intensity.

Why is edge detection necessary?
In our case, we need edge detection to find the edges of documents so that our Document Scanner SDK can detect a document in real time and scan it.

What is edge detection in image processing?
In document detection, edge detection algorithms are used to locate the boundaries of documents within a larger image, enabling efficient cropping, perspective correction and image extraction.

In this article, we discuss the most commonly used approaches for document edge detection, along with their strengths and weaknesses, and give our recommendation for the best document detection technique.

We also show some code examples so that you can quickly try it out yourself. The examples use OpenCV, a widely used open source computer vision library, with C++ as the programming language.

Check out our Docutain SDK
Integrate high-quality document scanning, text recognition and data extraction into your apps. If you would like to learn more about the Docutain SDK, contact us anytime via SDK@Docutain.com.

Sobel Edge Detector

Sobel edge detection is one of the most widely used techniques for edge detection. The Sobel operator detects edges marked by sudden changes in pixel intensity. Other algorithms, such as Canny, use Sobel as part of their own edge detection.

With Sobel you have three options: edges enhanced in the X direction, edges enhanced in the Y direction, or edges enhanced in both directions, which is what we want.
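To make the X and Y directions more concrete, here is a minimal sketch in plain C++ without OpenCV. The helper name `sobelMagnitude` is hypothetical and only serves as an illustration. It applies the two standard 3x3 Sobel kernels to a small grayscale image and combines both directions as |Gx| + |Gy|, which is one common way to get a single edge strength per pixel.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical helper (not an OpenCV function): applies the 3x3 Sobel kernels
// to a row-major grayscale image and returns |Gx| + |Gy| for every pixel.
// Border pixels are left at 0 to keep the sketch short.
std::vector<int> sobelMagnitude(const std::vector<int>& img, int width, int height) {
    assert(width > 0 && height > 0);
    assert(img.size() == static_cast<std::size_t>(width) * height);
    std::vector<int> out(img.size(), 0);
    auto at = [&](int x, int y) { return img[y * width + x]; };
    for (int y = 1; y < height - 1; ++y) {
        for (int x = 1; x < width - 1; ++x) {
            // Derivative in X: right column minus left column (responds to vertical edges)
            int gx = (at(x + 1, y - 1) + 2 * at(x + 1, y) + at(x + 1, y + 1))
                   - (at(x - 1, y - 1) + 2 * at(x - 1, y) + at(x - 1, y + 1));
            // Derivative in Y: bottom row minus top row (responds to horizontal edges)
            int gy = (at(x - 1, y + 1) + 2 * at(x, y + 1) + at(x + 1, y + 1))
                   - (at(x - 1, y - 1) + 2 * at(x, y - 1) + at(x + 1, y - 1));
            out[y * width + x] = std::abs(gx) + std::abs(gy);
        }
    }
    return out;
}
```

For an image with a vertical brightness step, only the X derivative contributes, while a completely flat image produces 0 everywhere.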

The input image needs to be a grayscale image, so we use cvtColor() to convert the input image into a single-channel grayscale image. Because Sobel is based on derivatives, it is highly sensitive to noise. Therefore, we apply GaussianBlur() to reduce the noise first.
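As a rough illustration of why blurring helps, the following sketch smooths an image with the common 3x3 kernel [1 2 1; 2 4 2; 1 2 1] / 16. The helper `blur3x3` is hypothetical and written in plain C++. It is not OpenCV's implementation, and the border handling is simplified.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not an OpenCV function): smooths a row-major grayscale
// image with the 3x3 kernel [1 2 1; 2 4 2; 1 2 1] / 16.
// Border pixels are copied unchanged to keep the sketch short.
std::vector<int> blur3x3(const std::vector<int>& img, int width, int height) {
    assert(width > 0 && height > 0);
    assert(img.size() == static_cast<std::size_t>(width) * height);
    static const int kernel[3][3] = {{1, 2, 1}, {2, 4, 2}, {1, 2, 1}};
    std::vector<int> out(img);
    for (int y = 1; y < height - 1; ++y) {
        for (int x = 1; x < width - 1; ++x) {
            int sum = 0;
            for (int ky = -1; ky <= 1; ++ky) {
                for (int kx = -1; kx <= 1; ++kx) {
                    sum += kernel[ky + 1][kx + 1] * img[(y + ky) * width + (x + kx)];
                }
            }
            out[y * width + x] = sum / 16;  // the kernel weights add up to 16
        }
    }
    return out;
}
```

A single noisy pixel of value 160 on a black background is reduced to 40 by this kernel, so the derivative that Sobel computes across it becomes much smaller. A uniform area stays unchanged.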

Let’s see some code and the result:

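#include <opencv2/opencv.hpp>
using namespace cv;

int main() {
    // Load the input image
    Mat image = imread("C:\\Users\\MarvinFrankenfeld\\Downloads\\361.jpg");
    if (image.empty()) return 1; // stop if the file could not be read

    // Convert to single-channel grayscale
    Mat grayImage;
    cvtColor(image, grayImage, COLOR_BGR2GRAY);

    // Reduce noise
    GaussianBlur(grayImage, grayImage, Size(3, 3), 0);

    // Sobel edge detection with dx = 1, dy = 1 and a 5x5 kernel.
    // CV_16S keeps negative derivative values; convertScaleAbs converts back to 8-bit.
    Mat sobel;
    Sobel(grayImage, sobel, CV_16S, 1, 1, 5);
    convertScaleAbs(sobel, sobel);

    // Display the input image and the Sobel edge image
    namedWindow("Input", WINDOW_NORMAL | WINDOW_FREERATIO);
    imshow("Input", image);
    namedWindow("Sobel", WINDOW_NORMAL | WINDOW_FREERATIO);
    imshow("Sobel", sobel);
    waitKey(0); // keep the windows open until a key is pressed
    return 0;
}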

Left side is the input image, right side is the Sobel edge output:

Input image on the left and Sobel edge detection output on the right side

As you can see, we get the edge image as output. However, even with Gaussian blur applied, the edge image still contains a lot of noise. Relying solely on Sobel to get the edges of the object we want, in our case the document, therefore does not seem to be a great idea.

Canny Edge Detector

The Canny edge detector is an edge detection operator that uses a multi-stage algorithm to detect a wide range of edges in images. It was developed by John F. Canny in 1986. Canny also produced a computational theory of edge detection explaining why the technique works.
Wikipedia

The Canny edge detection algorithm consists of 5 steps:

Noise reduction
Gradient calculation
Non-maximum suppression
Double threshold
Edge Tracking by hysteresis thresholding

To find the gradient, Canny uses the Sobel operator, so you can see Canny edge detection as an improvement on Sobel edge detection.

Let’s start with a simple code sample and see the results.

The Canny edge detector needs two threshold values. Any edge with an intensity gradient above the upper threshold is definitely an edge. Edges with values below the lower threshold are discarded. Edges between the two thresholds are kept only if they are connected to an edge above the upper threshold.
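The interplay of the two thresholds can be sketched in plain C++. The function `hysteresis` below is a hypothetical, simplified illustration of the double threshold and edge tracking steps. It assumes the gradient magnitudes are already computed and is not how OpenCV implements Canny internally.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: double threshold plus edge tracking by hysteresis.
// mag holds one gradient magnitude per pixel in row-major order.
// Pixels above `high` are strong edges. Pixels of at least `low` are kept
// only if they are 8-connected to a strong edge, directly or through other kept pixels.
std::vector<bool> hysteresis(const std::vector<int>& mag, int width, int height,
                             int low, int high) {
    assert(width > 0 && height > 0 && low <= high);
    assert(mag.size() == static_cast<std::size_t>(width) * height);
    std::vector<bool> edge(mag.size(), false);
    std::vector<std::pair<int, int>> stack;

    // Seed the search with all strong edge pixels
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            if (mag[y * width + x] > high) {
                edge[y * width + x] = true;
                stack.push_back(std::make_pair(x, y));
            }
        }
    }

    // Grow from the strong pixels into neighbouring weak pixels
    while (!stack.empty()) {
        const int x = stack.back().first;
        const int y = stack.back().second;
        stack.pop_back();
        for (int dy = -1; dy <= 1; ++dy) {
            for (int dx = -1; dx <= 1; ++dx) {
                const int nx = x + dx;
                const int ny = y + dy;
                if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
                const int idx = ny * width + nx;
                if (!edge[idx] && mag[idx] >= low) {
                    edge[idx] = true;
                    stack.push_back(std::make_pair(nx, ny));
                }
            }
        }
    }
    return edge;
}
```

With the thresholds 66 and 133 and a single row of magnitudes {150, 100, 100, 20, 100}, the first three pixels are kept. The last pixel lies above the lower threshold but is dropped because the 20 in front of it cuts it off from the strong edge.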

#include <opencv2/opencv.hpp>
using namespace cv;

int main() {
    // Load the input image
    Mat image = imread("C:\\Users\\MarvinFrankenfeld\\Downloads\\Testimage.jpg");
    if (image.empty()) return 1; // stop if the file could not be read

    // Convert to single-channel grayscale
    Mat grayImage;
    cvtColor(image, grayImage, COLOR_BGR2GRAY);

    // Reduce noise
    GaussianBlur(grayImage, grayImage, Size(5, 5), 0);

    // Run the Canny edge detection algorithm (lower threshold 66, upper threshold 133)
    Mat cannyEdges;
    Canny(grayImage, cannyEdges, 66, 133);

    // Display the input image and the Canny edge image
    namedWindow("Input", WINDOW_NORMAL | WINDOW_FREERATIO);
    imshow("Input", image);
    namedWindow("CannyEdges", WINDOW_NORMAL | WINDOW_FREERATIO);
    imshow("CannyEdges", cannyEdges);
    waitKey(0); // keep the windows open until a key is pressed
    return 0;
}

When we check the output, we can see that the result is pretty good.

Left side is the input image, right side is the Canny edge output:

Input image on the left and Canny edge detection output on the right side

Let’s try another image and see if the result is just as good.

Canny edge detection shows weak result due to noise of the background on the right side

As you can see, the results are not as good because there is a lot of noise. The cause is the wooden floor, which has much more structure than the background in the first input image.

One way of reducing the noise is to increase the Gaussian blur. If we set the kernel to Size(19,19) instead of Size(5,5), we get the following output.

Canny edge detection after increasing Gaussian Blur loses document’s edge on the right side

Now the noise of the wooden floor is eliminated but we also lose parts of the document’s edges. This is rather bad as the main goal is to get exactly the edges of the document.

Another way of improving the edge detection would be to adjust the lower and upper thresholds.

In a real-world scenario, you don’t know the input images in advance, so you don’t know which Gaussian blur value removes the noise best. Likewise, the lower and upper thresholds you have defined might work well for some images but deliver poor results for others.

There are many approaches for deriving suitable upper and lower thresholds from the input image. This might be good enough for some use cases, but if you are looking for a solid, high-quality edge detection algorithm, this is not the best way to go.
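One widely used heuristic of this kind derives both thresholds from the median brightness of the image. The sketch below is only an illustration of that idea in plain C++. The names `CannyThresholds` and `thresholdsFromMedian` as well as the default factor of 0.33 are assumptions for this example and are not part of OpenCV.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical result type for this sketch
struct CannyThresholds {
    int lower;
    int upper;
};

// Hypothetical helper: places the thresholds at (1 - sigma) and (1 + sigma)
// times the median pixel value, clamped to the 8-bit range [0, 255].
// The vector is taken by value because nth_element reorders it.
CannyThresholds thresholdsFromMedian(std::vector<unsigned char> pixels, double sigma = 0.33) {
    assert(!pixels.empty());
    // For an even number of pixels this picks the upper of the two middle values
    std::vector<unsigned char>::iterator mid = pixels.begin() + pixels.size() / 2;
    std::nth_element(pixels.begin(), mid, pixels.end());
    const double median = static_cast<double>(*mid);
    CannyThresholds result;
    result.lower = static_cast<int>(std::lround(std::max(0.0, (1.0 - sigma) * median)));
    result.upper = static_cast<int>(std::lround(std::min(255.0, (1.0 + sigma) * median)));
    return result;
}
```

The two returned values could then be passed to Canny() as the lower and upper threshold. As described above, such a heuristic adapts to the overall brightness of an image, but it still cannot tell a document edge from a structured background.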

TensorFlow Edge Detection

TensorFlow is a popular and powerful machine learning framework that runs on almost all relevant platforms. You can use it for many image processing tasks, e.g. image classification or image segmentation. We can also leverage the power of machine learning to solve our edge detection problem. The idea behind this is simple:

Define a set of input images. Every input image has a ground truth image that shows the ideal edge image. Train a machine learning model with TensorFlow on these image pairs so that it learns which parts of an image are edges. If you provide enough input images, you get a model that can accurately predict the edges of new input images.

A sample input image (left) and its associated ground truth (right):
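To give an idea of what "learning from the ground truth" means, training typically minimizes a loss that compares the predicted edge map with the ground truth pixel by pixel. The following plain C++ sketch shows one common choice, binary cross-entropy. It is an assumption for illustration only: the function `binaryCrossEntropy` is hypothetical, it is not TensorFlow code, and the loss actually used for a given model may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: mean binary cross-entropy between a predicted edge map
// (probability of "edge" per pixel, in [0, 1]) and the ground truth
// (1 = edge pixel, 0 = background pixel).
double binaryCrossEntropy(const std::vector<double>& predicted,
                          const std::vector<int>& groundTruth) {
    assert(!predicted.empty());
    assert(predicted.size() == groundTruth.size());
    const double eps = 1e-7;  // keeps log() away from 0
    double total = 0.0;
    for (std::size_t i = 0; i < predicted.size(); ++i) {
        const double p = std::min(std::max(predicted[i], eps), 1.0 - eps);
        total += (groundTruth[i] == 1) ? -std::log(p) : -std::log(1.0 - p);
    }
    return total / static_cast<double>(predicted.size());
}
```

A prediction that agrees with the ground truth gets a small loss, a prediction that marks the wrong pixels gets a large one. During training, the model's weights are adjusted step by step to reduce this value over all image pairs.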

Example of receipt on the left and corresponding TensorFlow Edge Detection Ground Truth on the right side

Once you have trained a good TensorFlow model, you are able to get outputs like this:

Example of receipt on the left and TensorFlow Edge Detection Simple Output with trained model on the right side

Now let’s compare a few images using the 3 different kinds of edge detection in order to find the best edge detection algorithm. Top left is the input image, top right is Sobel edge detection, bottom left is Canny edge detection and bottom right is TensorFlow edge detection.

Edge Detection Algorithms Comparison (top left input image, top right Sobel edge detection, bottom left Canny edge detection, bottom right TensorFlow edge detection)

Sobel vs. Canny? We can see that both the Sobel edge detector and the Canny edge detector have trouble with noise and hardly detect the document edges. The trained TensorFlow model, in contrast, detects the correct edges in all cases, even if the input images contain parts of other documents, have lots of shadows, or suffer from low contrast (white document on light background).

By now it should be clear that our recommendation for stable, high-quality, reliable and fast edge detection is to train a TensorFlow model and leverage the power of machine learning.

The question is: How exactly can you do that?

The answer is: Unless you are a machine learning expert with hundreds of thousands of sample images, enough time to wait for the model to be trained, and very powerful hardware with plenty of GPU power for fast training, you can’t.

However, in case you need the edge detection in order to build a document scanner, we have the perfect solution for you: our Docutain SDK.

Our SDK uses the same technology as the document scanner in our successful document management app Docutain, which is used by millions of users on a daily basis. It has been improved continuously over the past years. So if you need a best-in-class document scanner with quick and reliable document detection, choose our Docutain SDK.

Check out our Docutain SDK
Integrate high-quality document scanning, text recognition and data extraction into your apps. If you would like to learn more about the Image Processing SDK, have a look at our Developer Documentation, Samples, or contact us anytime via SDK@Docutain.com.