{"id":1422,"date":"2020-08-17T13:36:24","date_gmt":"2020-08-17T13:36:24","guid":{"rendered":"https:\/\/muthu.co\/?p=1422"},"modified":"2021-05-24T02:25:03","modified_gmt":"2021-05-24T02:25:03","slug":"pre-processing-for-detecting-text-in-images","status":"publish","type":"post","link":"http:\/\/write.muthu.co\/pre-processing-for-detecting-text-in-images\/","title":{"rendered":"Pre processing for Detecting text in images"},"content":{"rendered":"\n

One of the important steps in OCR is thresholding, which separates the text regions (foreground) from the background. If you apply a thresholding algorithm like Otsu or Sauvola directly, you might end up with a lot of noise, and some of your text regions may even get classified as background. Take a look at the examples below, where I applied the Sauvola algorithm directly to grayscale images without any pre-processing.<\/p>\n\n\n\n

\"\"<\/figure>\n\n\n\n

<\/p>\n\n\n\n

\"\"<\/figure>\n\n\n\n

As you can see in the outputs, there is a lot of noise in the images, and a few characters in the binary image are vague. Passing such an image directly into an OCR engine like Tesseract may not yield the best results.<\/p>\n\n\n\n
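The effect of thresholding without pre-processing can be reproduced on a small synthetic image. The sketch below is only an illustration and not the code behind the figures above: the page size, illumination gradient, noise level, and the dark bar standing in for text are all assumptions. It counts how many background pixels each method wrongly marks as text.

```python
import numpy as np
from skimage.filters import threshold_otsu, threshold_sauvola

# Synthetic grayscale page (all values are assumptions for illustration):
# bright background with an illumination gradient, a dark bar standing in
# for text, and Gaussian noise on top.
rng = np.random.default_rng(0)
height, width = 120, 200
gradient = np.linspace(0.9, 0.5, width)      # lighting falls off left to right
page = np.tile(gradient, (height, 1))
page[50:70, 20:180] -= 0.35                  # dark 'text' stroke
page = np.clip(page + rng.normal(0.0, 0.05, page.shape), 0.0, 1.0)

# Global threshold: a single cut-off value for the whole image.
otsu_mask = page > threshold_otsu(page)

# Local threshold: one cut-off per pixel, computed from a 19x19 neighbourhood.
sauvola_mask = page > threshold_sauvola(page, window_size=19)

# In both masks True means background, so ~mask marks pixels treated as text.
# Count the pixels outside the stroke that were wrongly marked as text.
background = np.ones(page.shape, dtype=bool)
background[50:70, 20:180] = False
otsu_noise = np.count_nonzero(~otsu_mask & background)
sauvola_noise = np.count_nonzero(~sauvola_mask & background)
print('background pixels marked as text: Otsu', otsu_noise, 'Sauvola', sauvola_noise)
```

The counts depend on the noise level and window size chosen here, so treat them as a way to compare settings rather than as a benchmark.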

To overcome the noise and the loss of text regions, I usually adjust the gamma and remove noise before thresholding. The right algorithms for contrast enhancement and noise removal depend on the image type. If the image is of low resolution, noise removal may wipe out some of your regions of interest, so it's important to use noise removal only when the image is of high resolution.<\/p>\n\n\n\n
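The resolution rule above could be expressed as a small guard around the denoising step. This is a minimal sketch under stated assumptions, not part of the original pipeline: the `preprocess` helper and the `MIN_SIDE_FOR_DENOISE` cut-off of 1000 pixels are hypothetical, and the input is assumed to be a float grayscale image in the range [0, 1].

```python
import numpy as np
from skimage.exposure import adjust_gamma
from skimage.restoration import denoise_tv_chambolle

# Hypothetical cut-off in pixels; tune it for your own images.
MIN_SIDE_FOR_DENOISE = 1000

def preprocess(gray, gamma=1.2, min_side=MIN_SIDE_FOR_DENOISE):
    """Gamma-correct a float grayscale image, denoising only if it is large enough.

    Returns the processed image and a flag telling whether denoising ran.
    """
    out = adjust_gamma(gray, gamma)
    # Skip denoising on small images, where it may erase thin text strokes.
    denoised = min(gray.shape[:2]) >= min_side
    if denoised:
        out = denoise_tv_chambolle(out, weight=0.1)
    return out, denoised

# Usage on a small random image: its shorter side is below the cut-off,
# so only the gamma correction is applied.
rng = np.random.default_rng(0)
small = rng.random((64, 64))
result, was_denoised = preprocess(small)
print(result.shape, was_denoised)
```

Returning the flag alongside the image makes it easy to log which inputs skipped denoising when you process a batch.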

Here is my pre-processing code. <\/p>\n\n\n\n

from skimage.color import rgb2gray\nimport matplotlib.pyplot as plt\nfrom skimage.io import imread\nfrom skimage.filters import threshold_sauvola\nfrom skimage.exposure import adjust_gamma\nfrom skimage.restoration import denoise_tv_chambolle\n\n# Load the colour image\ncimage = imread('https:\/\/muthu.co\/wp-content\/uploads\/2020\/08\/image9.jpg')\n\n# Gamma > 1 darkens the image, which makes the text strokes stand out\ngamma_corrected = adjust_gamma(cimage, 1.2)\n\n# Total variation denoising; on scikit-image older than 0.19 pass multichannel=True instead of channel_axis\nnoise_removed = denoise_tv_chambolle(gamma_corrected, channel_axis=-1)\n\n# Threshold the pre-processed image, not the raw input\ngry_img = rgb2gray(noise_removed)\nth = threshold_sauvola(gry_img, 19)\nbimage = gry_img > th\n\n# Show the original and the binarized result side by side\nfig, ax = plt.subplots(ncols=2, figsize=(20,20))\nax[0].imshow(cimage)\nax[0].axis(\"off\")\nax[1].imshow(bimage, cmap=\"gray\")\nax[1].axis(\"off\")\nplt.show()<\/code><\/pre>\n\n\n\n

The outputs for the sample images above are shown below:<\/p>\n\n\n\n

\"\"<\/figure>\n\n\n\n
\"\"<\/figure>\n\n\n\n

What I am basically doing is enhancing the contrast first to darken the text regions and then removing noise using the total variation denoising<\/a> method.<\/p>\n","protected":false},"excerpt":{"rendered":"

One of the important steps in OCR is the thresholding process. It helps us in separating the text regions (foreground) from the background. If you apply a thresholding algorithm like OTSU or Sauvola, you might end up with a lot of noise. Some of your text regions may even get categorized as background. Take a […]<\/p>\n","protected":false},"author":1,"featured_media":1423,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[38],"tags":[47,42,43],"_links":{"self":[{"href":"http:\/\/write.muthu.co\/wp-json\/wp\/v2\/posts\/1422"}],"collection":[{"href":"http:\/\/write.muthu.co\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/write.muthu.co\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/write.muthu.co\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/write.muthu.co\/wp-json\/wp\/v2\/comments?post=1422"}],"version-history":[{"count":2,"href":"http:\/\/write.muthu.co\/wp-json\/wp\/v2\/posts\/1422\/revisions"}],"predecessor-version":[{"id":1841,"href":"http:\/\/write.muthu.co\/wp-json\/wp\/v2\/posts\/1422\/revisions\/1841"}],"wp:featuredmedia":[{"embeddable":true,"href":"http:\/\/write.muthu.co\/wp-json\/wp\/v2\/media\/1423"}],"wp:attachment":[{"href":"http:\/\/write.muthu.co\/wp-json\/wp\/v2\/media?parent=1422"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/write.muthu.co\/wp-json\/wp\/v2\/categories?post=1422"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/write.muthu.co\/wp-json\/wp\/v2\/tags?post=1422"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}