{"id":963,"date":"2019-04-20T09:44:28","date_gmt":"2019-04-20T09:44:28","guid":{"rendered":"http:\/\/muthu.co\/?p=963"},"modified":"2021-05-24T02:55:23","modified_gmt":"2021-05-24T02:55:23","slug":"segmenting-lines-in-handwritten-documents-using-a-path-planning-algorithm","status":"publish","type":"post","link":"http:\/\/write.muthu.co\/segmenting-lines-in-handwritten-documents-using-a-path-planning-algorithm\/","title":{"rendered":"Segmenting lines in handwritten documents using A* Path planning algorithm"},"content":{"rendered":"\n<p>In this article, I will explain a widely used method for segmenting handwritten documents into individual lines. Below is a sample output from my algorithm.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><a href=\"https:\/\/muthu.co\/wp-content\/uploads\/2019\/04\/output.png\"><img loading=\"lazy\" decoding=\"async\" width=\"1156\" height=\"351\" src=\"https:\/\/muthu.co\/wp-content\/uploads\/2019\/04\/output.png\" alt=\"machine vision segmentation\" class=\"wp-image-964\" srcset=\"http:\/\/write.muthu.co\/wp-content\/uploads\/2019\/04\/output.png 1156w, http:\/\/write.muthu.co\/wp-content\/uploads\/2019\/04\/output-300x91.png 300w, http:\/\/write.muthu.co\/wp-content\/uploads\/2019\/04\/output-768x233.png 768w, http:\/\/write.muthu.co\/wp-content\/uploads\/2019\/04\/output-1024x311.png 1024w, http:\/\/write.muthu.co\/wp-content\/uploads\/2019\/04\/output-700x213.png 700w\" sizes=\"auto, (max-width: 1156px) 100vw, 1156px\" \/><\/a><\/figure><\/div>\n\n\n\n<p>The below flowchart outlines the different steps involved in the segmentation process.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><a href=\"https:\/\/muthu.co\/wp-content\/uploads\/2019\/04\/astarslgorithm.png\"><img loading=\"lazy\" decoding=\"async\" width=\"286\" height=\"686\" src=\"https:\/\/muthu.co\/wp-content\/uploads\/2019\/04\/astarslgorithm.png\" alt=\"line segmentation\" class=\"wp-image-965\" 
srcset=\"http:\/\/write.muthu.co\/wp-content\/uploads\/2019\/04\/astarslgorithm.png 286w, http:\/\/write.muthu.co\/wp-content\/uploads\/2019\/04\/astarslgorithm-125x300.png 125w\" sizes=\"auto, (max-width: 286px) 100vw, 286px\" \/><\/a><\/figure><\/div>\n\n\n\n<p>The method described here works only on non-skewed documents. To de-skew a document, you can refer to <a href=\"http:\/\/muthu.co\/deskewing-scanned-documents-using-horizontal-projections\/\">my last article<\/a>.<\/p>\n\n\n\n<p>Let&#8217;s go through the stages of line segmentation step by step. You can also look at <a href=\"https:\/\/github.com\/muthuspark\/line-segmentation-handwritten-doc\/blob\/master\/A*%20Path%20Planning%20Line%20Segmentation%20Algorithm.ipynb\" target=\"_blank\" rel=\"noopener noreferrer\">my fully working notebook<\/a> if you would rather learn through code. Reading this article should help you understand the thought process, which will help you build on my code to get a much more accurate line segmentation algorithm.<\/p>\n\n\n\n<p>To walk through the algorithm steps, let&#8217;s consider the sample image below:<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><a href=\"https:\/\/muthu.co\/wp-content\/uploads\/2019\/04\/handwritten1.jpg\"><img loading=\"lazy\" decoding=\"async\" width=\"640\" height=\"399\" src=\"https:\/\/muthu.co\/wp-content\/uploads\/2019\/04\/handwritten1.jpg\" alt=\"\" class=\"wp-image-966\" srcset=\"http:\/\/write.muthu.co\/wp-content\/uploads\/2019\/04\/handwritten1.jpg 640w, http:\/\/write.muthu.co\/wp-content\/uploads\/2019\/04\/handwritten1-300x187.jpg 300w\" sizes=\"auto, (max-width: 640px) 100vw, 640px\" \/><\/a><\/figure><\/div>\n\n\n\n<p><strong>Step 1: Convert the image to 2D grayscale.<\/strong><\/p>\n\n\n\n<p>The first step in almost all document pre-processing is to convert the image to grayscale so that we can work on a two-dimensional array.<\/p>\n\n\n\n<pre 
class=\"wp-block-code EnlighterJSRAW has-small-font-size\"><code>from skimage.io import imread\nfrom skimage.color import rgb2gray\nimport matplotlib.pyplot as plt\n\nimg = rgb2gray(imread(\"handwritten1.jpg\"))\nplt.figure(figsize=(10,10))\nplt.axis(\"off\")\nplt.imshow(img, cmap=\"gray\")\nplt.show()<\/code><\/pre>\n\n\n\n<p><strong>Step 2: Apply an Edge detection algorithm to find the bare bones that make up the image.<\/strong><\/p>\n\n\n\n<p>I am using the&nbsp;<a href=\"https:\/\/en.wikipedia.org\/wiki\/Sobel_operator\">Sobel filter<\/a> to find edges in my sample image. You can find a detailed explanation of this filter in my <a href=\"https:\/\/muthu.co\/sobel-feldman-operator-or-sobel-filter\/\">previous article here.&nbsp;<\/a><\/p>\n\n\n\n<pre class=\"wp-block-code EnlighterJSRAW has-small-font-size\"><code>from skimage.filters import sobel\nsobel_image = sobel(img)\nplt.figure(figsize=(20,20))\nplt.axis(\"off\")\nplt.imshow(sobel_image, cmap=\"gray\")\nplt.show()<\/code><\/pre>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><a href=\"https:\/\/muthu.co\/wp-content\/uploads\/2019\/04\/download.png\"><img loading=\"lazy\" decoding=\"async\" width=\"1156\" height=\"730\" src=\"https:\/\/muthu.co\/wp-content\/uploads\/2019\/04\/download.png\" alt=\"\" class=\"wp-image-967\" srcset=\"http:\/\/write.muthu.co\/wp-content\/uploads\/2019\/04\/download.png 1156w, http:\/\/write.muthu.co\/wp-content\/uploads\/2019\/04\/download-300x189.png 300w, http:\/\/write.muthu.co\/wp-content\/uploads\/2019\/04\/download-768x485.png 768w, http:\/\/write.muthu.co\/wp-content\/uploads\/2019\/04\/download-1024x647.png 1024w, http:\/\/write.muthu.co\/wp-content\/uploads\/2019\/04\/download-700x442.png 700w\" sizes=\"auto, (max-width: 1156px) 100vw, 1156px\" \/><\/a><\/figure><\/div>\n\n\n\n<p><strong>Step 3:&nbsp;Find the Horizontal Projection Profile<\/strong><\/p>\n\n\n\n<p>One of the common ways of finding the line-height of a document is by analyzing its 
horizontal projection profile. The horizontal projection profile (HPP) is the array of row sums of a 2D image. Rows that contain text show up as peaks, while the blank regions between lines have a lower row sum. The low regions between the peaks give us an idea of where the segmentation between two lines can be done.<\/p>\n\n\n\n<pre class=\"wp-block-code EnlighterJSRAW has-small-font-size\"><code>from skimage.filters import sobel\nimport numpy as np\n\ndef horizontal_projections(sobel_image):\n    # sum the edge intensities of every row (one value per image row)\n    sum_of_rows = &#091;]\n    for row in range(sobel_image.shape&#091;0]):\n        sum_of_rows.append(np.sum(sobel_image&#091;row,:]))\n    \n    return sum_of_rows\n\nsobel_image = sobel(img)\nhpp = horizontal_projections(sobel_image)\nplt.plot(hpp)\nplt.show()<\/code><\/pre>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><a href=\"https:\/\/muthu.co\/wp-content\/uploads\/2019\/04\/Snip20190420_30.png\"><img loading=\"lazy\" decoding=\"async\" width=\"714\" height=\"270\" src=\"https:\/\/muthu.co\/wp-content\/uploads\/2019\/04\/Snip20190420_30.png\" alt=\"\" class=\"wp-image-968\" srcset=\"http:\/\/write.muthu.co\/wp-content\/uploads\/2019\/04\/Snip20190420_30.png 714w, http:\/\/write.muthu.co\/wp-content\/uploads\/2019\/04\/Snip20190420_30-300x113.png 300w, http:\/\/write.muthu.co\/wp-content\/uploads\/2019\/04\/Snip20190420_30-700x265.png 700w\" sizes=\"auto, (max-width: 714px) 100vw, 714px\" \/><\/a><\/figure><\/div>\n\n\n\n<p><strong>Step 4:&nbsp;Detect peaks in the HPP and separate the potential line segment regions from the text<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code EnlighterJSRAW has-small-font-size\"><code>def find_peak_regions(hpp, divider=2):\n    # rows whose sum falls below the threshold are candidate gaps between lines\n    threshold = (np.max(hpp)-np.min(hpp))\/divider\n    peaks = &#091;]\n    for i, hppv in enumerate(hpp):\n        if hppv &lt; threshold:\n            peaks.append(&#091;i, hppv])\n    return peaks<\/code><\/pre>\n\n\n\n<p>The &#8220;divider&#8221; parameter in the above 
method defaults to 2, which means I threshold at the midpoint between the highest and lowest values in the HPP. This parameter is very important because it completely changes the final segments. Finding the best divider value is a problem in itself, which I will work on in further research. For simplicity&#8217;s sake, I use the midpoint here. If you try other sample images and fail to get good results, you can adjust this parameter to get better segments.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><a href=\"https:\/\/muthu.co\/wp-content\/uploads\/2019\/04\/download-1.png\"><img loading=\"lazy\" decoding=\"async\" width=\"1156\" height=\"730\" src=\"https:\/\/muthu.co\/wp-content\/uploads\/2019\/04\/download-1.png\" alt=\"line segmenated\" class=\"wp-image-969\" srcset=\"http:\/\/write.muthu.co\/wp-content\/uploads\/2019\/04\/download-1.png 1156w, http:\/\/write.muthu.co\/wp-content\/uploads\/2019\/04\/download-1-300x189.png 300w, http:\/\/write.muthu.co\/wp-content\/uploads\/2019\/04\/download-1-768x485.png 768w, http:\/\/write.muthu.co\/wp-content\/uploads\/2019\/04\/download-1-1024x647.png 1024w, http:\/\/write.muthu.co\/wp-content\/uploads\/2019\/04\/download-1-700x442.png 700w\" sizes=\"auto, (max-width: 1156px) 100vw, 1156px\" \/><\/a><\/figure><\/div>\n\n\n\n<p><strong>Step 5:&nbsp;Identify the regions where the upper line&#8217;s text is connected to the lower line and make a cut in the middle<\/strong><\/p>\n\n\n\n<p>Take a look at the portion of another sample image below, where the white text of the upper line is connected to the lower line.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><a href=\"https:\/\/muthu.co\/wp-content\/uploads\/2019\/04\/Snip20190420_33.png\"><img loading=\"lazy\" decoding=\"async\" width=\"455\" height=\"301\" src=\"https:\/\/muthu.co\/wp-content\/uploads\/2019\/04\/Snip20190420_33.png\" alt=\"connected regions\" class=\"wp-image-970\" 
srcset=\"http:\/\/write.muthu.co\/wp-content\/uploads\/2019\/04\/Snip20190420_33.png 455w, http:\/\/write.muthu.co\/wp-content\/uploads\/2019\/04\/Snip20190420_33-300x198.png 300w\" sizes=\"auto, (max-width: 455px) 100vw, 455px\" \/><\/a><\/figure><\/div>\n\n\n\n<p>This creates a problem when segmenting the upper and lower lines. In this step, I try to identify these regions and put a cut between them. This is another area that could be improved significantly with better algorithms, to avoid losing information about the letters that get cut. In my algorithm, I call them the roadblocked regions. In the method, I slide a window of width 20 across the region and check whether a path can pass through to the other end. If it can, I ignore the window; otherwise, I store the blocked window.<\/p>\n\n\n\n<pre class=\"wp-block-code EnlighterJSRAW has-small-font-size\"><code>def get_road_block_regions(nmap):\n    # path_exists() is not shown in this article; see the linked notebook\n    road_blocks = &#091;]\n    needtobreak = False\n    \n    for col in range(nmap.shape&#091;1]):\n        # window of up to 20 columns starting at col, clipped at the right edge\n        start = col\n        end = col+20\n        if end &gt; nmap.shape&#091;1]-1:\n            end = nmap.shape&#091;1]-1\n            needtobreak = True\n\n        if path_exists(nmap&#091;:, start:end]) == False:\n            road_blocks.append(col)\n\n        if needtobreak == True:\n            break\n            \n    return road_blocks<\/code><\/pre>\n\n\n\n<p><strong>Step 6:&nbsp;Run the A* path planning along the segmentation region and record the paths<\/strong><\/p>\n\n\n\n<p>The <a href=\"https:\/\/en.wikipedia.org\/wiki\/A*_search_algorithm\">A* search algorithm<\/a> is a fast pathfinding algorithm for finding the shortest path between two points on a coordinate space. It was invented by researchers working on <a href=\"https:\/\/en.wikipedia.org\/wiki\/Shakey_the_robot\">Shakey the Robot&#8217;s<\/a> path planning. 
Take a look at the GIF below, which shows how it reaches the end point while avoiding obstacles.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><a href=\"https:\/\/muthu.co\/wp-content\/uploads\/2019\/04\/Weighted_A_star_with_eps_5.gif\"><img loading=\"lazy\" decoding=\"async\" width=\"210\" height=\"210\" src=\"https:\/\/muthu.co\/wp-content\/uploads\/2019\/04\/Weighted_A_star_with_eps_5.gif\" alt=\"\" class=\"wp-image-971\"\/><\/a><\/figure><\/div>\n\n\n\n<p>To get a deeper understanding of the A* algorithm, I recommend this YouTube video.<\/p>\n\n\n<p><iframe loading=\"lazy\" title=\"A* Pathfinding (E01: algorithm explanation)\" width=\"500\" height=\"281\" src=\"https:\/\/www.youtube.com\/embed\/-L-WgKMFuhE?feature=oembed\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen><\/iframe><\/p>\n\n\n\n<pre class=\"wp-block-code EnlighterJSRAW has-small-font-size\"><code># A* path planning algorithm\nfrom heapq import heappush, heappop\n\ndef heuristic(a, b):\n    # squared Euclidean distance between points a and b\n    return (b&#091;0] - a&#091;0]) ** 2 + (b&#091;1] - a&#091;1]) ** 2\n\ndef astar(array, start, goal):\n\n    # 8-connected neighborhood: horizontal, vertical and diagonal moves\n    neighbors = &#091;(0,1),(0,-1),(1,0),(-1,0),(1,1),(1,-1),(-1,1),(-1,-1)]\n    close_set = set()\n    came_from = {}\n    gscore = {start:0}\n    fscore = {start:heuristic(start, goal)}\n    oheap = &#091;]\n\n    heappush(oheap, (fscore&#091;start], start))\n    \n    while oheap:\n\n        current = heappop(oheap)&#091;1]\n\n        if current == goal:\n            # walk back from the goal; the start point itself is not included\n            data = &#091;]\n            while current in came_from:\n                data.append(current)\n                current = came_from&#091;current]\n            return data\n\n        close_set.add(current)\n        for i, j in neighbors:\n            neighbor = current&#091;0] + i, current&#091;1] + j            \n            tentative_g_score = gscore&#091;current] + 
heuristic(current, neighbor)\n            if 0 &lt;= neighbor&#091;0] &lt; array.shape&#091;0]:\n                if 0 &lt;= neighbor&#091;1] &lt; array.shape&#091;1]:                \n                    if array&#091;neighbor&#091;0]]&#091;neighbor&#091;1]] == 1:\n                        continue\n                else:\n                    # array bound y walls\n                    continue\n            else:\n                # array bound x walls\n                continue\n                \n            if neighbor in close_set and tentative_g_score &gt;= gscore.get(neighbor, 0):\n                continue\n                \n            if  tentative_g_score &lt; gscore.get(neighbor, 0) or neighbor not in &#091;i&#091;1]for i in oheap]:\n                came_from&#091;neighbor] = current\n                gscore&#091;neighbor] = tentative_g_score\n                fscore&#091;neighbor] = tentative_g_score + heuristic(neighbor, goal)\n                heappush(oheap, (fscore&#091;neighbor], neighbor))\n                \n    return &#091;]<\/code><\/pre>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"1153\" height=\"80\" src=\"https:\/\/muthu.co\/wp-content\/uploads\/2019\/04\/download-2.png\" alt=\"\" class=\"wp-image-972\" srcset=\"http:\/\/write.muthu.co\/wp-content\/uploads\/2019\/04\/download-2.png 1153w, http:\/\/write.muthu.co\/wp-content\/uploads\/2019\/04\/download-2-300x21.png 300w, http:\/\/write.muthu.co\/wp-content\/uploads\/2019\/04\/download-2-768x53.png 768w, http:\/\/write.muthu.co\/wp-content\/uploads\/2019\/04\/download-2-1024x71.png 1024w, http:\/\/write.muthu.co\/wp-content\/uploads\/2019\/04\/download-2-700x49.png 700w\" sizes=\"auto, (max-width: 1153px) 100vw, 1153px\" \/><\/figure><\/div>\n\n\n\n<p>The blue line in the image above is the path the algorithm finds between the start and end points.&nbsp;<a 
href=\"https:\/\/muthu.co\/wp-content\/uploads\/2019\/04\/download-2.png\"><br><\/a><\/p>\n\n\n\n<p><strong>Step 7:&nbsp;Plot the segmentation lines on the image.<\/strong><\/p>\n\n\n\n<p>Once you have found all the paths separating the lines, you can plot them on the image or use them to extract the lines.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><a href=\"https:\/\/muthu.co\/wp-content\/uploads\/2019\/04\/output.png\"><img loading=\"lazy\" decoding=\"async\" width=\"1156\" height=\"351\" src=\"https:\/\/muthu.co\/wp-content\/uploads\/2019\/04\/output.png\" alt=\"machine vision segmentation\" class=\"wp-image-964\" srcset=\"http:\/\/write.muthu.co\/wp-content\/uploads\/2019\/04\/output.png 1156w, http:\/\/write.muthu.co\/wp-content\/uploads\/2019\/04\/output-300x91.png 300w, http:\/\/write.muthu.co\/wp-content\/uploads\/2019\/04\/output-768x233.png 768w, http:\/\/write.muthu.co\/wp-content\/uploads\/2019\/04\/output-1024x311.png 1024w, http:\/\/write.muthu.co\/wp-content\/uploads\/2019\/04\/output-700x213.png 700w\" sizes=\"auto, (max-width: 1156px) 100vw, 1156px\" \/><\/a><\/figure><\/div>\n\n\n\n<p>Here is the latest Python notebook.<br><iframe style=\"width: 100%; height: 900px; border: none;\" src=\"https:\/\/nbviewer.jupyter.org\/github\/muthuspark\/line-segmentation-handwritten-doc\/blob\/master\/A%2A%20Path%20Planning%20Line%20Segmentation%20Algorithm.ipynb\"><\/iframe><\/p>\n","protected":false},"excerpt":{"rendered":"<p>In this article, I will explain a widely used method for segmenting handwritten documents into individual lines. Below is a sample output from my algorithm. The below flowchart outlines the different steps involved in the segmentation process. The explained method will only work with non-skewed documents. 
To de-skew the document, you can refer to my [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":966,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[24,38],"tags":[46,47],"class_list":["post-963","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-artificial-intelligence","category-computer-vision","tag-artificial-intelligence","tag-computer-vision"],"_links":{"self":[{"href":"http:\/\/write.muthu.co\/wp-json\/wp\/v2\/posts\/963","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/write.muthu.co\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/write.muthu.co\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/write.muthu.co\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/write.muthu.co\/wp-json\/wp\/v2\/comments?post=963"}],"version-history":[{"count":3,"href":"http:\/\/write.muthu.co\/wp-json\/wp\/v2\/posts\/963\/revisions"}],"predecessor-version":[{"id":1872,"href":"http:\/\/write.muthu.co\/wp-json\/wp\/v2\/posts\/963\/revisions\/1872"}],"wp:featuredmedia":[{"embeddable":true,"href":"http:\/\/write.muthu.co\/wp-json\/wp\/v2\/media\/966"}],"wp:attachment":[{"href":"http:\/\/write.muthu.co\/wp-json\/wp\/v2\/media?parent=963"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/write.muthu.co\/wp-json\/wp\/v2\/categories?post=963"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/write.muthu.co\/wp-json\/wp\/v2\/tags?post=963"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}