An important pre-processing step in any OCR tool or algorithm is to deskew the scanned document first. Take a look at the sample scanned image below; it's tilted by a small angle.

In this article I will explain a method to deskew these types of documents using their horizontal projections. The final outcome of deskewing will look like this:

The basic steps are:

- Convert your image to grayscale
- Apply a Sobel filter
- Find the horizontal projection array of the image at rotation angles between -10 and 10 degrees
- Find the median of each projection array and pick the angle with the highest median
- Rotate the image by the identified angle.

##### Convert your image to grayscale

```python
from skimage.io import imread
import matplotlib.pyplot as plt
from skimage.color import rgb2gray

# Load the sample scan and convert it to grayscale (float values in [0, 1])
img = rgb2gray(imread("https://cdn.instructables.com/F92/1G4I/IJX58MRB/F921G4IIJX58MRB.LARGE.jpg?auto=webp&width=350"))
plt.grid()
plt.imshow(img, cmap="gray")
```

##### Apply Sobel filter

```python
from skimage.filters import sobel
from skimage.util import invert

# Invert the edge map so that edges are dark on a light background
sobel_image = invert(sobel(img))
plt.imshow(sobel_image)
```

I am using the Sobel filter just to extract the bare bones (edges) of the text, which creates this image. It suppresses flat background regions and gradual shading, so the projections depend mostly on where the text strokes are.
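To make the effect of the filter concrete, here is a minimal NumPy-only sketch of a Sobel gradient magnitude on a tiny synthetic page. The helper `sobel_magnitude` and the 9x9 test page are assumptions made for illustration; `skimage.filters.sobel` applies its own scaling and border handling, so its values will not match these exactly.

```python
import numpy as np

def sobel_magnitude(image):
    """Gradient magnitude from the 3x3 Sobel kernels (interior pixels only)."""
    img = np.asarray(image, dtype=float)
    # Shifted views of the 3x3 neighbourhood around each interior pixel
    tl, tc, tr = img[:-2, :-2], img[:-2, 1:-1], img[:-2, 2:]
    ml, mr = img[1:-1, :-2], img[1:-1, 2:]
    bl, bc, br = img[2:, :-2], img[2:, 1:-1], img[2:, 2:]
    gx = (tr + 2 * mr + br) - (tl + 2 * ml + bl)  # change from left to right
    gy = (bl + 2 * bc + br) - (tl + 2 * tc + tr)  # change from top to bottom
    return np.hypot(gx, gy)

# White page (1.0) with one dark bar standing in for a line of text
page = np.ones((9, 9))
page[4:6, 2:7] = 0.0

edges = sobel_magnitude(page)
print(edges[0, 0])  # response in a flat white corner
print(edges[3, 3])  # response on the top border of the dark bar
```

Flat regions give a zero response and only the borders of the bar stand out, which is why the filtered image keeps the outline of the text and little else.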

##### Find the horizontal projection array of the image at rotation angles between -10 and 10 degrees

```python
import numpy as np

def horizontal_projections(sobel_image):
    # Sum the pixel values along every row (one value per image row)
    return np.sum(sobel_image, axis=1)
```

As you can see in the projections above, -7 degrees is the angle with the largest number of square-like signals. Since those squares correspond to larger white regions (the gaps between the lines of text), the median will be highest there. Let's see if this is true by plotting the same data on a box plot.

You can clearly see that the median is highest at -7 degrees, which takes us to the final steps.
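The reasoning behind the median can be checked on a toy example. The sketch below is an assumption-laden illustration, not the article's data: it builds a synthetic page with 6 dark rows every 14 rows, then skews it crudely by rolling each block of 50 columns one row further down (a rough small-angle stand-in for rotation), and compares the two projection profiles.

```python
import numpy as np

rows, cols = 140, 400
page = np.ones((rows, cols))              # white page
for start in range(0, rows, 14):          # hypothetical layout: 6 dark rows every 14
    page[start:start + 6, :] = 0.0

# Crude skew: column j is rolled down by j // 50 rows (wraps around the page)
skewed = np.stack([np.roll(page[:, j], j // 50) for j in range(cols)], axis=1)

aligned_profile = page.sum(axis=1)        # horizontal projection, aligned page
skewed_profile = skewed.sum(axis=1)       # horizontal projection, skewed page

print("distinct values, aligned:", np.unique(aligned_profile).size)
print("distinct values, skewed: ", np.unique(skewed_profile).size)
print("median, aligned:", np.median(aligned_profile))
print("median, skewed: ", np.median(skewed_profile))
```

The aligned profile is a square wave that only takes two values, and more than half of its rows are fully white, so its median sits at the maximum. Once the page is skewed, most rows cross at least part of a text line, the profile smears into intermediate values, and the median drops.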

##### Find the median of each projection array and pick the angle with the highest median.

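```python
import numpy as np
import skimage.transform

predicted_angle = 0
highest_hp = 0
# Try every whole-degree angle from -10 to 10 inclusive
for angle in range(-10, 11):
    # Rotate the edge image; cval=1 fills the exposed corners with white
    hp = horizontal_projections(skimage.transform.rotate(sobel_image, angle, cval=1))
    median_hp = np.median(hp)
    print(angle, median_hp)
    if highest_hp < median_hp:
        predicted_angle = angle
        highest_hp = median_hp
```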
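The same search can be wrapped in a small function that takes the rotation routine as a parameter, which makes it easy to try on synthetic data. This is only a sketch: `estimate_skew_angle` and `shear` are hypothetical names, and `shear` is a NumPy-only, small-angle stand-in for a real rotation (it shifts columns vertically instead of rotating), used here so the example runs without scikit-image.

```python
import math
import numpy as np

def shear(image, angle_deg, fill=1.0):
    """Stand-in for rotation: shift column j vertically by round(j * tan(angle))
    pixels and fill the exposed pixels with white."""
    rows, cols = image.shape
    out = np.full_like(image, fill)
    slope = math.tan(math.radians(angle_deg))
    for j in range(cols):
        s = int(round(j * slope))
        if s >= 0:
            out[s:, j] = image[:rows - s, j]   # shift the column down
        else:
            out[:rows + s, j] = image[-s:, j]  # shift the column up
    return out

def estimate_skew_angle(image, rotate, angles=range(-10, 11)):
    """Return the candidate angle whose horizontal projection has the highest median."""
    best_angle, best_median = None, -np.inf
    for angle in angles:
        median = np.median(np.sum(rotate(image, angle), axis=1))
        if median > best_median:
            best_angle, best_median = angle, median
    return best_angle

# Synthetic page: 6 dark rows ("text") every 14 rows on a white background
page = np.ones((200, 400))
for start in range(4, 190, 14):
    page[start:start + 6, :] = 0.0

skewed = shear(page, 7)                    # simulate a skewed scan
print(estimate_skew_angle(skewed, shear))  # angle that best undoes the shear
```

With a real scan, the `rotate` argument could be something like `lambda im, a: skimage.transform.rotate(im, a, cval=1)`. On pages where the gaps between lines are narrow or the image is small, several neighbouring angles can share the same median, so the result is best treated as an estimate.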

##### Rotate the image by the identified angle.

```python
import skimage.transform

# Show the original and the deskewed image side by side
fig, ax = plt.subplots(ncols=2, figsize=(20, 10))
ax[0].set_title('original image grayscale')
ax[0].imshow(img, cmap="gray")
ax[0].grid(color='r', linestyle='-', markevery=1)
ax[1].set_title('original image rotated by angle ' + str(predicted_angle))
ax[1].imshow(skimage.transform.rotate(img, predicted_angle, cval=1), cmap="gray")
ax[1].grid(False)  # no grid on the deskewed image
```

You can find the entire process in this gist here.
