In response to the growing usage of our Windows application and the need to enhance its performance, our development team embarked on a project to refactor our legacy C# codebase. A key focus of this effort was to streamline the application’s memory usage to ensure optimal performance and scalability. Recognizing that inefficient memory management can […]
Category: Artificial Intelligence
Beam search
Beam search is a heuristic search algorithm, a variant of breadth-first search that explores only a limited set of promising paths or solutions in a search space instead of all possible paths, since exploring every path is often computationally expensive. It is used in the field of artificial intelligence, particularly in […]
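As a rough illustration of the idea in this excerpt, here is a minimal Python sketch of beam search on a toy problem. The function names (`beam_search`, `expand`, `score`) and the toy problem itself are assumptions made for this example and are not taken from the post: states are tuples of digits, each step appends 1, 2, or 3, and the score is the sum of the digits.

```python
from typing import Callable, List, Tuple

State = Tuple[int, ...]


def beam_search(
    start: State,
    expand: Callable[[State], List[State]],
    score: Callable[[State], float],
    beam_width: int,
    depth: int,
) -> List[State]:
    """Keep only the `beam_width` best states at each level of the search."""
    beam = [start]
    for _ in range(depth):
        # Expand every state in the current beam (one breadth-first level).
        candidates = [child for state in beam for child in expand(state)]
        if not candidates:
            break
        # Prune: keep the highest-scoring candidates, discard the rest.
        candidates.sort(key=score, reverse=True)
        beam = candidates[:beam_width]
    return beam


# Hypothetical toy problem: append one of the digits 1, 2, 3 at each step.
def expand(state: State) -> List[State]:
    return [state + (digit,) for digit in (1, 2, 3)]


def score(state: State) -> float:
    return float(sum(state))


if __name__ == "__main__":
    print(beam_search((), expand, score, beam_width=2, depth=3))
```

With `beam_width=1` this degenerates into greedy search, and with an unbounded width it becomes plain breadth-first search; the beam width is the knob that trades memory and time against the risk of pruning the best path.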
Ant Colony Optimization (ACO)
Ant Colony Optimization (ACO) is a metaheuristic optimization algorithm inspired by the foraging behavior of ants. Ants are social insects that communicate with each other using pheromones, which are chemicals that they leave on trails. When an ant finds a good source of food, it will lay down a trail of pheromones on the way […]
Birthday Problem and Monte Carlo Simulation
If you get on a full plane that carries 100 or more passengers, there is a more than 99% chance that at least two of the passengers share a birthday. If you get into a full bus with a capacity of 50 passengers, there is about a 97% chance that at least two of them share a […]
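The classic birthday problem asks for the probability that at least two people in a group share a birthday. The sketch below compares the exact value with a Monte Carlo estimate. It assumes 365 equally likely birthdays and ignores leap years; the function names and the default trial count are choices made for this example, not taken from the post.

```python
import random


def exact_shared_birthday(n: int, days: int = 365) -> float:
    """Exact probability that at least two of n people share a birthday."""
    p_all_distinct = 1.0
    for k in range(n):
        p_all_distinct *= (days - k) / days
    return 1.0 - p_all_distinct


def simulate_shared_birthday(
    n: int, trials: int = 20000, days: int = 365, seed: int = 0
) -> float:
    """Monte Carlo estimate of the same probability."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    hits = 0
    for _ in range(trials):
        birthdays = [rng.randrange(days) for _ in range(n)]
        # A repeated birthday means the set is smaller than the group.
        if len(set(birthdays)) < n:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    for group_size in (23, 50, 100):
        print(
            group_size,
            round(exact_shared_birthday(group_size), 4),
            round(simulate_shared_birthday(group_size), 4),
        )
```

The estimate tightens as `trials` grows, since the standard error of the simulated proportion shrinks with the square root of the number of trials.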
Understanding Correlations and Correlation Matrix
Correlation is a measure of how two or more variables are related to one another, also referred to as linear dependence. Everyday examples include: an increase in demand for a product increases its price (the demand curve), traffic on roads varies with the time of day, the amount of rain correlates with grass fires, […]
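To make the notion of linear dependence concrete, here is a small standard-library sketch of the Pearson correlation coefficient and a correlation matrix built from it. The choice of Pearson's coefficient, the function names, and the sample numbers are assumptions for illustration only; the data is made up and does not come from the post.

```python
import math
from typing import Dict, List


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns: Dict[str, List[float]]) -> Dict[str, Dict[str, float]]:
    """Pairwise Pearson correlations between all named columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


if __name__ == "__main__":
    # Made-up illustrative numbers, not real measurements.
    data = {
        "rain_mm": [0.0, 2.0, 5.0, 9.0, 14.0],
        "grass_fires": [12.0, 10.0, 7.0, 4.0, 1.0],
        "traffic": [30.0, 28.0, 35.0, 31.0, 33.0],
    }
    for name, row in correlation_matrix(data).items():
        print(name, {other: round(value, 2) for other, value in row.items()})
```

The matrix is symmetric with ones on the diagonal, so in practice only the upper or lower triangle needs to be inspected.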
Computing the discrete Fréchet distance using dynamic programming
Definition: The Fréchet distance is usually explained using the analogy of a man walking his dog. A man is walking a dog on a leash: the man can move on one curve, the dog on the other; both may vary their speed, but backtracking is not allowed. What is the length of the shortest leash […]
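A minimal sketch of the dynamic-programming approach named in the title, assuming the common Eiter-Mannila coupling recurrence and Euclidean distance between points. The function name and the example curves are choices made for this illustration and may differ from the post's own implementation.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def discrete_frechet(p: List[Point], q: List[Point]) -> float:
    """Discrete Fréchet distance between two polylines given as point lists."""
    if not p or not q:
        raise ValueError("both curves need at least one point")
    n, m = len(p), len(q)
    # ca[i][j] = shortest leash that lets the man reach p[i] and the dog q[j].
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(p[i], q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)  # only the dog has advanced
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)  # only the man has advanced
            else:
                # Either walker (or both) stepped forward; no backtracking.
                best_previous = min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1])
                ca[i][j] = max(best_previous, d)
    return ca[n - 1][m - 1]


if __name__ == "__main__":
    man = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    dog = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
    print(discrete_frechet(man, dog))
```

The table has one cell per pair of points, so the running time and memory are both proportional to the product of the two curve lengths.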
All Tesseract OCR options
This is for my reference, and it might come in handy for others too.
All Tesseract options
CLI Examples
Command Example | Notes
tesseract sample_images/image2.jpg stdout | To print the output to standard output
tesseract sample_images/image2.jpg sample_images/output | By default the output will be named outbase.txt.
tesseract sample_images/image2.jpg sample_images/output -l eng | -l is for language. English is default […]
Segmenting lines in handwritten documents using A* Path planning algorithm
In this article, I will explain a widely used method for segmenting handwritten documents into individual lines. Below is a sample output from my algorithm. The flowchart below outlines the steps involved in the segmentation process. The method explained here works only with non-skewed documents. To de-skew the document, you can refer to my […]
Deskewing scanned documents using horizontal projections
An important pre-processing step in any OCR tool or algorithm is to deskew the scanned document. Take a look at the sample scanned image below; it's tilted by a small angle. In this article I will explain a method to deskew these types of documents using their horizontal projections. The final outcome of deskewing […]
Mathematics of Principal component analysis
Principal component analysis is a method used to reduce the number of dimensions in a dataset without losing much information. It’s used in many fields such as face recognition and image compression and is a common technique for finding patterns in data and also in the visualization of higher-dimensional data. PCA is all about geometrically […]