{"id":963,"date":"2019-04-20T09:44:28","date_gmt":"2019-04-20T09:44:28","guid":{"rendered":"http:\/\/muthu.co\/?p=963"},"modified":"2021-05-24T02:55:23","modified_gmt":"2021-05-24T02:55:23","slug":"segmenting-lines-in-handwritten-documents-using-a-path-planning-algorithm","status":"publish","type":"post","link":"http:\/\/write.muthu.co\/segmenting-lines-in-handwritten-documents-using-a-path-planning-algorithm\/","title":{"rendered":"Segmenting lines in handwritten documents using A* Path planning algorithm"},"content":{"rendered":"\n

In this article, I will explain a widely used method for segmenting handwritten documents into individual lines. Below is a sample output from my algorithm.<\/p>\n\n\n\n

\"machine<\/a><\/figure><\/div>\n\n\n\n

The below flowchart outlines the different steps involved in the segmentation process.<\/p>\n\n\n\n

\"line<\/a><\/figure><\/div>\n\n\n\n

The explained method will only work with non-skewed documents. To de-skew the document, you can refer to my last article.<\/a><\/p>\n\n\n\n

Now, coming to the process, let’s go step by step and understand the various stages of line segmentation. You can also see my fully working notebook<\/a> here if you would like to learn it through code. If you read this article, you will be able to understand the thought process which will help you build upon my code for a much accurate line segmentation algorithm.<\/p>\n\n\n\n

To walk through the algorithm steps, let’s consider the below sample image :<\/p>\n\n\n\n

\"\"<\/a><\/figure><\/div>\n\n\n\n

Step  1: Convert the image to 2D grayscale. <\/strong><\/p>\n\n\n\n

Default first step for almost all document pre-processing is to convert it first to grayscale so we can work on a 2 Dimensional image.<\/p>\n\n\n\n

from skimage.io import imread\nfrom skimage.color import rgb2gray\nimport matplotlib.pyplot as plt\n\nimg = rgb2gray(imread(\"handwritten1.jpg\"))\nplt.figure(figsize=(10,10))\nplt.axis(\"off\")\nplt.imshow(img, cmap=\"gray\")\nplt.show()<\/code><\/pre>\n\n\n\n

Step 2: Apply an Edge detection algorithm to find the bare bones that make up the image.<\/strong><\/p>\n\n\n\n

I am using the Sobel filter<\/a> to find edges in my sample image. You can find a detailed explanation of this filter in my previous article here. <\/a><\/p>\n\n\n\n

from skimage.filters import sobel\nsobel_image = sobel(img)\nplt.figure(figsize=(20,20))\nplt.axis(\"off\")\nplt.imshow(sobel_image, cmap=\"gray\")\nplt.show()<\/code><\/pre>\n\n\n\n
\"\"<\/a><\/figure><\/div>\n\n\n\n

Step 3: Find the Horizontal Projection Profile<\/strong><\/p>\n\n\n\n

One of the common ways of finding the line-height of a document is by analyzing its Horizontal projection profile. Horizontal projection profile (HPP) is the array of the sum of rows of a 2D image. Where there are texts we see more peaks and the more white regions have a lower row sum. These peaks give us an idea of where the segmentation between two lines can be done.<\/p>\n\n\n\n

from skimage.filters import sobel\nimport numpy as np\n\ndef horizontal_projections(sobel_image):\n    #threshold the image.\n    sum_of_rows = []\n    for row in range(sobel_image.shape[0]-1):\n        sum_of_rows.append(np.sum(sobel_image[row,:]))\n    \n    return sum_of_rows\n\nsobel_image = sobel(img)\nhpp = horizontal_projections(sobel_image)\nplt.plot(hpp)\nplt.show()<\/code><\/pre>\n\n\n\n
\"\"<\/a><\/figure><\/div>\n\n\n\n

Step 4: Detect peaks in the HPP, divide the potential line segment regions from text<\/strong><\/p>\n\n\n\n

def find_peak_regions(hpp, divider=2):\n    threshold = (np.max(hpp)-np.min(hpp))\/divider\n    peaks = []\n    peaks_index = []\n    for i, hppv in enumerate(hpp):\n        if hppv < threshold:\n            peaks.append([i, hppv])\n    return peaks<\/code><\/pre>\n\n\n\n

The “divider” parameter in the above method defaults to 2, which means I will be thresholding my regions in the middle of higher and lower peaks in the HPP. This parameter is very important as this one completely changes my final segments. Finding the best divider parameter is an algorithm by itself which I will work on in my further research. Now for simplicity sake, I will be using the midpoint. If you are trying out with other sample images and fail to get good results, you can modify this parameter for better segments.<\/p>\n\n\n\n

\"line<\/a><\/figure><\/div>\n\n\n\n

Step 5: Identify the regions where upper line text is connected to the lower line and make a cut in the middle<\/strong><\/p>\n\n\n\n

Take a look at the below portion of another sample image, where the white upper line is connected to the lower line.<\/p>\n\n\n\n

\"connected<\/a><\/figure><\/div>\n\n\n\n

This creates a problem while segmenting the upper and lowers lines. In this step, I try to identify these regions and put a cut in between them. This is another area which can be improved significantly with better algorithms to avoid losing information about the cut letters. In my algorithm, I call them the roadblocked regions. In the method I run a window of size 20 and see if I can pass through to the other end, if yes, then I ignore it else I store the blocked window.<\/p>\n\n\n\n

def get_road_block_regions(nmap):\n    road_blocks = []\n    needtobreak = False\n    \n    for col in range(nmap.shape[1]):\n        start = col\n        end = col+20\n        if end > nmap.shape[1]-1:\n            end = nmap.shape[1]-1\n            needtobreak = True\n\n        if path_exists(nmap[:, start:end]) == False:\n            road_blocks.append(col)\n\n        if needtobreak == True:\n            break\n            \n    return road_blocks<\/code><\/pre>\n\n\n\n

Step 6: Run the A* path planning along the segmentation region and record the paths<\/strong><\/p>\n\n\n\n

A* search algorithm<\/a> is a fast pathfinding algorithm to find the shortest distance between two points on a coordinate space invented by researchers working on Shakey the Robot’s<\/a> path planning. Take a look at the below gif showing how it proceeds to reach the end point avoiding blocks.<\/p>\n\n\n\n

\"\"<\/a><\/figure><\/div>\n\n\n\n

 To get a deeper understanding of A* star algorithm, I recommend this youtube video.<\/p>\n\n\n