May 16, 2023 by Klaus
Edge Detection for Image Processing
Many image processing tasks require edge detection at some point in the processing pipeline.
Edge detection means identifying and highlighting the boundaries or transitions between different regions in an image, based on changes in brightness or intensity.
Why is edge detection necessary?
In our case, we apply edge detection to find the edges of documents so that our Document Scanner SDK can detect the document in real time and scan it.
What is edge detection in image processing?
In document detection, edge detection algorithms are used to locate the boundaries of documents within a larger image, enabling efficient cropping, perspective correction and image extraction.
In this article we will cover the most commonly used approaches to document edge detection, their strengths and
weaknesses, and our recommendation for the best document detection technique.
We will also show some code examples so that you can quickly try it out yourself. The code examples use
OpenCV, a widely used open source computer vision library, with C++ as the programming language.
Check out our Docutain SDK
Integrate high-quality document scanning, text recognition and data extraction into your apps. If you would like to learn more about the Docutain SDK, contact us anytime via SDK@Docutain.com.
Sobel Edge Detector
Sobel edge detection is one of the most widely used techniques for edge detection. The Sobel operator detects edges
marked by sudden changes in pixel intensity. Other algorithms, such as Canny, use Sobel as
part of their own edge detection.
With Sobel you have three options: edges enhanced in the X-direction, edges enhanced in the Y-direction, or
edges enhanced in both directions, which is what we want.
The input needs to be a grayscale image, so we use cvtColor() to convert the input image into a single-channel
grayscale image. Because Sobel is based on derivatives, it is highly sensitive to noise. Therefore, we apply
a GaussianBlur first to reduce the noise.
Let’s see some code and the result:
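#include <opencv2/opencv.hpp>
using namespace cv;

int main() {
    // Load the input image and stop if the file could not be read
    Mat image = imread("C:\\Users\\MarvinFrankenfeld\\Downloads\\361.jpg");
    if (image.empty()) return 1;
    // Convert to single-channel grayscale
    Mat grayImage;
    cvtColor(image, grayImage, COLOR_BGR2GRAY);
    // Reduce noise
    GaussianBlur(grayImage, grayImage, Size(3, 3), 0);
    // Sobel edge detection: derivative order 1 in both x and y, kernel size 5.
    // CV_16S keeps the negative derivative values that an 8-bit output would clip.
    Mat sobel;
    Sobel(grayImage, sobel, CV_16S, 1, 1, 5);
    // Take absolute values and convert back to 8 bits for display
    convertScaleAbs(sobel, sobel);
    // Display the input image and the Sobel edge image
    namedWindow("Input", WINDOW_NORMAL | WINDOW_FREERATIO);
    imshow("Input", image);
    namedWindow("Sobel", WINDOW_NORMAL | WINDOW_FREERATIO);
    imshow("Sobel", sobel);
    // Keep the windows open until a key is pressed
    waitKey(0);
    return 0;
}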
Left side is the input image, right side is the Sobel edge output:
As you can see, we get the edge image as output. However, even with Gaussian blur applied to remove noise, the edge image still contains a lot of noise.
Relying solely on Sobel to get the edges of the object we want, in our case the document, is therefore not a great idea.
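To make the X and Y directions more concrete, here is a minimal sketch of what a Sobel filter computes, written in plain C++ without OpenCV. It assumes the standard 3x3 Sobel kernels and combines the two responses as |Gx| + |Gy|, a common approximation of the gradient magnitude. The `Gray` type and the `sobelMagnitude` function are hypothetical names used for illustration only, and the result is not meant to match the output of the OpenCV call above pixel for pixel.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

// A grayscale image stored as rows of intensities (hypothetical helper type).
using Gray = std::vector<std::vector<int>>;

// Approximates the gradient magnitude as |Gx| + |Gy| using the 3x3 Sobel kernels.
// Border pixels are left at 0 to keep the sketch short.
Gray sobelMagnitude(const Gray& img) {
    assert(!img.empty() && !img[0].empty());
    const int kx[3][3] = {{-1, 0, 1}, {-2, 0, 2}, {-1, 0, 1}};  // change along x
    const int ky[3][3] = {{-1, -2, -1}, {0, 0, 0}, {1, 2, 1}};  // change along y
    const int rows = static_cast<int>(img.size());
    const int cols = static_cast<int>(img[0].size());
    Gray out(rows, std::vector<int>(cols, 0));
    for (int y = 1; y + 1 < rows; ++y) {
        for (int x = 1; x + 1 < cols; ++x) {
            int gx = 0;
            int gy = 0;
            for (int j = -1; j <= 1; ++j) {
                for (int i = -1; i <= 1; ++i) {
                    gx += kx[j + 1][i + 1] * img[y + j][x + i];
                    gy += ky[j + 1][i + 1] * img[y + j][x + i];
                }
            }
            out[y][x] = std::abs(gx) + std::abs(gy);
        }
    }
    return out;
}
```

On an image whose left half is 0 and whose right half is 100, the interior pixels next to the step get a response of 400, while a uniform image gives 0 everywhere. Any small intensity variation, such as sensor noise or surface texture, also produces a nonzero response, which is why the Sobel output looks so noisy.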
Canny Edge Detector
The Canny edge detection algorithm consists of 5 steps:
- Noise reduction
- Gradient calculation
- Non-maximum suppression
- Double threshold
- Edge tracking by hysteresis thresholding
To compute the gradient, Canny uses the Sobel operator, so you can think of Canny edge detection as
an improvement on Sobel edge detection.
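The last two steps of the list, double threshold and edge tracking by hysteresis, can be sketched in plain C++ as follows. This is a simplified illustration and not OpenCV's implementation: it assumes the gradient magnitudes have already been computed and thinned, and the `Grid` type and the `hysteresis` function are hypothetical names. Pixels above the upper threshold are kept right away, and pixels above the lower threshold are kept only if they are connected to such a pixel.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Gradient magnitudes or edge flags stored row by row (hypothetical helper type).
using Grid = std::vector<std::vector<int>>;

// Double threshold plus hysteresis. Returns 255 for kept edge pixels, 0 otherwise.
Grid hysteresis(const Grid& mag, int low, int high) {
    assert(low <= high);
    const int rows = static_cast<int>(mag.size());
    const int cols = rows > 0 ? static_cast<int>(mag[0].size()) : 0;
    Grid edges(rows, std::vector<int>(cols, 0));
    std::vector<std::pair<int, int>> stack;

    // Strong pixels (above the upper threshold) are edges right away.
    for (int y = 0; y < rows; ++y) {
        for (int x = 0; x < cols; ++x) {
            if (mag[y][x] > high) {
                edges[y][x] = 255;
                stack.push_back(std::make_pair(y, x));
            }
        }
    }

    // Grow from strong pixels into 8-connected weak pixels (above the lower threshold).
    while (!stack.empty()) {
        const int y = stack.back().first;
        const int x = stack.back().second;
        stack.pop_back();
        for (int dy = -1; dy <= 1; ++dy) {
            for (int dx = -1; dx <= 1; ++dx) {
                const int ny = y + dy;
                const int nx = x + dx;
                if (ny < 0 || ny >= rows || nx < 0 || nx >= cols) continue;
                if (edges[ny][nx] == 0 && mag[ny][nx] > low) {
                    edges[ny][nx] = 255;
                    stack.push_back(std::make_pair(ny, nx));
                }
            }
        }
    }
    return edges;
}
```

With thresholds 66 and 133, a row of magnitudes 0, 100, 150, 90, 0, 100 keeps the three connected pixels around the 150 and drops the isolated 100 at the end, because it is not connected to any strong pixel.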
The Canny edge detector needs two threshold values. Any edge with an intensity gradient above the upper threshold
is definitely an edge. Edges with values below the lower threshold are discarded. Values in between are kept only if they are connected to a definite edge.
Let’s start with a simple code sample and see the results.
#include <opencv2/opencv.hpp>
using namespace cv;

int main() {
    // Load the input image and stop if the file could not be read
    Mat image = imread("C:\\Users\\MarvinFrankenfeld\\Downloads\\Testimage.jpg");
    if (image.empty()) return 1;
    // Convert to single-channel grayscale
    Mat grayImage;
    cvtColor(image, grayImage, COLOR_BGR2GRAY);
    // Reduce noise
    GaussianBlur(grayImage, grayImage, Size(5, 5), 0);
    // Run the Canny edge detection algorithm with lower threshold 66 and upper threshold 133
    Mat cannyEdges;
    Canny(grayImage, cannyEdges, 66, 133);
    // Display the input image and the Canny edge image
    namedWindow("Input", WINDOW_NORMAL | WINDOW_FREERATIO);
    imshow("Input", image);
    namedWindow("CannyEdges", WINDOW_NORMAL | WINDOW_FREERATIO);
    imshow("CannyEdges", cannyEdges);
    // Keep the windows open until a key is pressed
    waitKey(0);
    return 0;
}
When we check the output, we can see that the result is pretty good.
Left side is the input image, right side is the Canny edge output:
Let’s try another image and see if the result is just as good.
As you can see, the results are not as good because there is a lot of noise. This is caused by the wooden floor, which has
much more texture than the background of the first input image.
One way of reducing the noise is to increase the Gaussian blur. If we set the kernel to Size(19,19) instead of
Size(5,5), we get the following output.
Now the noise of the wooden floor is eliminated, but we also lose parts of the document’s edges. This is a problem, as
the main goal is to get exactly the edges of the document.
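To get a feeling for how much stronger a 19x19 blur is than a 5x5 blur, the following sketch computes the standard deviation that, according to the OpenCV documentation of getGaussianKernel, is derived from the kernel size when sigma is passed as 0, and builds normalized 1D Gaussian weights from it. The function names are hypothetical, and the weights are only an approximation: OpenCV may use different precomputed coefficients for small kernels.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Sigma derived from an odd kernel size, following the formula documented
// for OpenCV's getGaussianKernel when a non-positive sigma is passed.
double sigmaFromKernelSize(int ksize) {
    assert(ksize > 0 && ksize % 2 == 1);
    return 0.3 * ((ksize - 1) * 0.5 - 1.0) + 0.8;
}

// Normalized 1D Gaussian weights for an odd kernel size (weights sum to 1).
std::vector<double> gaussianKernel1D(int ksize, double sigma) {
    assert(ksize > 0 && ksize % 2 == 1 && sigma > 0.0);
    std::vector<double> weights(ksize);
    const int half = ksize / 2;
    double sum = 0.0;
    for (int i = 0; i < ksize; ++i) {
        const double d = i - half;  // distance from the kernel center
        weights[i] = std::exp(-(d * d) / (2.0 * sigma * sigma));
        sum += weights[i];
    }
    for (double& w : weights) w /= sum;
    return weights;
}
```

Under this formula, Size(5,5) corresponds to a sigma of about 1.1 and Size(19,19) to about 3.2. Each output pixel is then averaged over a much wider neighborhood, which removes the floor texture but also smears thin document edges.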
Another way of improving the edge detection would be to adjust the lower and upper thresholds.
In a real-world scenario, you don’t know the input images in advance, so you don’t know which Gaussian blur setting will remove the noise best.
Likewise, the lower and upper thresholds you have chosen might work well for some images but deliver poor results for others.
There are many approaches for deriving the upper and lower thresholds from
the input image. This might be good enough for some use cases, but if you are looking for a solid, high-quality edge
detection algorithm, this is not the best way to go.
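One simple heuristic of this kind, shown here only as an illustration, derives both thresholds from the median intensity of the grayscale image. The function name `medianCannyThresholds` and the default spread of 0.33 are assumptions made for this sketch, and the pixel values are expected as a flat list of 8-bit intensities.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Returns {lower, upper} Canny thresholds placed around the median intensity.
// "spread" controls how far the thresholds sit from the median (assumed default 0.33).
std::pair<int, int> medianCannyThresholds(std::vector<int> pixels, double spread = 0.33) {
    assert(!pixels.empty());
    // Partially sort the copy so that the middle element is the median
    // (the upper middle element for an even number of pixels).
    auto mid = pixels.begin() + static_cast<std::ptrdiff_t>(pixels.size() / 2);
    std::nth_element(pixels.begin(), mid, pixels.end());
    const double median = *mid;
    // Clamp to the 8-bit range and truncate to int.
    const int lower = static_cast<int>(std::max(0.0, (1.0 - spread) * median));
    const int upper = static_cast<int>(std::min(255.0, (1.0 + spread) * median));
    return std::make_pair(lower, upper);
}
```

The returned pair could then be passed as the two threshold arguments to Canny(). A heuristic like this adapts to overall brightness, but it knows nothing about what is document and what is background, so textured surfaces such as the wooden floor can still produce unwanted edges.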
TensorFlow Edge Detection
TensorFlow is a very popular and powerful
machine learning framework that you can use on almost all relevant platforms.
You can use it for many image processing tasks, e.g. image classification or image segmentation. We can also
leverage the power of machine learning to solve our edge detection problem. The idea behind this is simple:
Define a set of input images. Every input image has a ground truth image that shows the ideal edge image. Train a
machine learning model with TensorFlow on these image pairs so that it learns which parts of an image
are edges. If you provide enough training images, you get a model that can accurately predict the edges of new input
images.
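How a model learns from such image pairs depends on the training setup, which this article does not go into. As a rough illustration, the sketch below assumes a per-pixel binary cross-entropy loss, a common choice for this kind of task: the model outputs an edge probability for every pixel, and the loss measures how far these probabilities are from the ground truth mask. The `edgeLoss` function is a hypothetical helper in plain C++ and not part of the TensorFlow API.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Mean binary cross-entropy between predicted edge probabilities (0..1)
// and a ground truth mask (1 = edge pixel, 0 = background).
// The choice of loss is an assumption made for this sketch.
double edgeLoss(const std::vector<double>& predicted, const std::vector<int>& truth) {
    assert(!predicted.empty() && predicted.size() == truth.size());
    const double eps = 1e-7;  // keeps the logarithm away from log(0)
    double total = 0.0;
    for (std::size_t i = 0; i < predicted.size(); ++i) {
        const double p = std::min(1.0 - eps, std::max(eps, predicted[i]));
        total += (truth[i] == 1) ? -std::log(p) : -std::log(1.0 - p);
    }
    return total / static_cast<double>(predicted.size());
}
```

During training, the model parameters are adjusted step by step so that a loss like this gets smaller across all training pairs. A prediction that marks the true edge pixels with high probability yields a lower loss than one that marks the background instead.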
A sample of an input image (left) and its associated ground truth (right):
Once you have trained a good TensorFlow model, you are able to get outputs like this:
Now let’s compare a few images using the three edge detection approaches in order to find the best edge detection algorithm. Top left is the input image,
top right is Sobel edge detection, bottom left is Canny edge detection and bottom right is TensorFlow edge detection.
What we can see is that both the Sobel and the Canny edge detector have trouble with noise and hardly
detect the document’s edges, whereas the trained TensorFlow model detects the correct edges in all cases, even if the input
images contain parts of other documents, have lots of shadows, or suffer from low contrast (white document on a light
background).
By now it should be clear that our recommendation for stable, high-quality, reliable and fast edge
detection is to train a TensorFlow model and leverage the power of machine learning.
The question is: How exactly can you do that?
The answer is: Unless you are a machine learning expert with hundreds of thousands of sample images, enough
time to wait for the model to finish training, and very powerful hardware with a lot of GPU power for fast
training, you can’t.
However, if you need edge detection in order to build a document scanner, we have the perfect solution
for you: our Docutain SDK.
Our SDK uses the same technology as the document scanner in our successful document management app Docutain, which
is used by millions of users on a daily basis. It has been improved continuously over the past years.
So if you need a best-in-class document scanner with quick and reliable document detection, choose our Docutain SDK.