Optical Character Recognition system for handwritten math expressions
Optical Character Recognition (OCR) is one of the earliest computer vision tasks to be addressed, yet few solutions exist for specific domains such as parsing mathematical formulas. This project solves the problem in a simple, educational way, providing an introduction to the Computer Vision (CV) field with room to extend the base solution. The resulting program first segments the input image into characters and then recognizes each character with a Convolutional Neural Network (CNN).
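To make the segmentation step more concrete, here is a minimal sketch of one simple way to split an image into characters: a vertical projection that cuts at blank columns. This is an illustrative assumption, not necessarily the method used in this project; the names segment_columns and crop are hypothetical, and the image is assumed to be already binarized (1 = ink, 0 = background).

```python
def segment_columns(image):
    """Split a binary image (rows of 0/1, 1 = ink) into column spans.

    Returns a list of (start, end) column indices, with end exclusive.
    Illustrative only: assumes characters are separated by blank columns.
    """
    if not image:
        return []
    width = len(image[0])
    # A column counts as "inked" if any pixel in it is set.
    inked = [any(row[x] for row in image) for x in range(width)]

    spans = []
    start = None
    for x, has_ink in enumerate(inked):
        if has_ink and start is None:
            start = x  # a character begins here
        elif not has_ink and start is not None:
            spans.append((start, x))  # a blank column closes the character
            start = None
    if start is not None:
        spans.append((start, width))  # character touches the right edge
    return spans


def crop(image, span):
    """Cut the columns of one span out of every row."""
    start, end = span
    return [row[start:end] for row in image]


# Toy 3x7 image containing two "characters" separated by blank columns.
toy = [
    [1, 1, 0, 0, 1, 0, 0],
    [1, 0, 0, 0, 1, 1, 0],
    [1, 1, 0, 0, 1, 0, 0],
]
spans = segment_columns(toy)
characters = [crop(toy, s) for s in spans]
```

In a pipeline like the one described above, each crop would then be resized and passed to the CNN for classification. Column projection fails when symbols overlap horizontally or are stacked vertically (fractions, exponents), so connected-component or contour-based segmentation is a common alternative for handwritten math.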
First, prepare the data by running the following commands from the project’s root folder:
cd data
unzip emnist.zip
unzip crohme.zip
Then run the Main.ipynb notebook.
In the root folder, you can see multiple *.ipynb and *.py files. Below is a detailed description of each of them:
data directory
input folder and saves the segmented characters into segmented folder.
model directory.

For project details, please check out the supporting report.