Project author: mnansary

Project description:
Handwritten Bangla Symbol Recognition With DenseNet
Primary language: Jupyter Notebook
Project URL: git://github.com/mnansary/pyHOCR.git
Created: 2019-03-24T19:04:43Z
Project community: https://github.com/mnansary/pyHOCR

License: MIT License



Handwritten Bangla Symbol Recognition with DenseNet

  1. Version: 0.0.3
  2. Author: Md. Nazmuddoha Ansary
  3. Python: 3.6.8





Symbol List

  1. 'অ','আ','ই','ঈ','উ','ঊ',
  2. 'ঋ','এ','ঐ','ও','ঔ',
  3. 'ক','খ','গ','ঘ','ঙ',
  4. 'চ','ছ','জ','ঝ','ঞ',
  5. 'ট','ঠ','ড','ঢ','ণ',
  6. 'ত','থ','দ','ধ','ন',
  7. 'প','ফ','ব','ভ','ম',
  8. 'য','র','ল',
  9. 'শ','ষ','স','হ',
  10. 'ড়','ঢ়','য়',
  11. 'ৎ','ং','ঃ','ঁ'
  12. 'ঁ'
  • ‘ঁ’ (chandrabindu) is a combining mark and may not render on its own
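The symbol list above can serve as a label table that maps a predicted class index to its character. The following is a minimal sketch, assuming that class indices follow the order in which the symbols are listed (the label order actually used in training is not stated here) and that ‘ঁ’ is counted once, which gives 50 entries. The names `SYMBOLS` and `decode` are illustrative and not taken from the repository.

```python
# Label table built from the symbol list above.
# ASSUMPTION: class index i corresponds to the i-th symbol in listed order.
SYMBOLS = [
    'অ', 'আ', 'ই', 'ঈ', 'উ', 'ঊ',
    'ঋ', 'এ', 'ঐ', 'ও', 'ঔ',
    'ক', 'খ', 'গ', 'ঘ', 'ঙ',
    'চ', 'ছ', 'জ', 'ঝ', 'ঞ',
    'ট', 'ঠ', 'ড', 'ঢ', 'ণ',
    'ত', 'থ', 'দ', 'ধ', 'ন',
    'প', 'ফ', 'ব', 'ভ', 'ম',
    'য', 'র', 'ল',
    'শ', 'ষ', 'স', 'হ',
    'ড়', 'ঢ়', 'য়',
    'ৎ', 'ং', 'ঃ', 'ঁ',
]


def decode(scores):
    """Return the symbol whose score is highest.

    `scores` is one row of per-class scores, e.g. a softmax output
    converted to a plain list.
    """
    if len(scores) != len(SYMBOLS):
        raise ValueError("expected %d scores, got %d" % (len(SYMBOLS), len(scores)))
    best = max(range(len(scores)), key=scores.__getitem__)
    return SYMBOLS[best]


# Usage: a fake score vector whose maximum sits at index 11.
scores = [0.0] * len(SYMBOLS)
scores[11] = 1.0
print(decode(scores))
```

If the trained model uses a different label order, only the order of `SYMBOLS` needs to change.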

DenseNet

The model is based on the original paper: Densely Connected Convolutional Networks

Authors: Gao Huang; Zhuang Liu; Laurens van der Maaten; Kilian Q. Weinberger

The paper introduces dense blocks within the traditional convolutional neural network architecture. Inside a dense block, each layer receives the feature maps of all preceding layers as input.

The composite layers can also contain bottleneck layers.

Compared to well-established CNN models (such as FractalNet or ResNet), DenseNet has:

  1. Fewer feature maps per layer
  2. Less of an information bottleneck
  3. Better handling of the vanishing gradient problem
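The dense connectivity described above can be illustrated with a small NumPy sketch. It is not the model implemented in this repository: the composite layer is reduced to a ReLU followed by a 1x1 convolution with random weights, and batch normalization, the 3x3 convolution, and training are omitted. The function names and the growth rate of 12 are assumptions chosen for illustration. The point is only to show how each layer consumes the concatenation of all earlier feature maps and adds a fixed number of new ones.

```python
import numpy as np


def composite_layer(x, weights):
    """ReLU followed by a 1x1 convolution (a per-pixel linear map over channels).

    x has shape (H, W, C_in); weights has shape (C_in, k); result is (H, W, k).
    Simplification: batch normalization and the 3x3 convolution are left out.
    """
    return np.maximum(x, 0.0) @ weights


def dense_block(x, num_layers, growth_rate, rng):
    """Apply `num_layers` composite layers with dense connectivity."""
    features = x
    for _ in range(num_layers):
        c_in = features.shape[-1]
        # Random weights stand in for learned ones in this sketch.
        weights = rng.standard_normal((c_in, growth_rate)) * 0.1
        new_maps = composite_layer(features, weights)
        # Dense connectivity: concatenate new maps onto everything seen so far
        # (ResNet, by contrast, sums the input and the layer output).
        features = np.concatenate([features, new_maps], axis=-1)
    return features


rng = np.random.RandomState(0)
x = rng.standard_normal((32, 32, 16))
out = dense_block(x, num_layers=4, growth_rate=12, rng=rng)
print(out.shape)  # (32, 32, 64): 16 input channels + 4 layers * 12 new maps
```

Because earlier feature maps are carried forward unchanged, each layer only needs to add a few new maps, and gradients have a short path back to every earlier layer.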

Database:

CMATERdb

CMATERdb 3.1.2: Handwritten Bangla basic-character database

Data Sample

Established Results

From: Alom et al., 2018

Version and Requirements

  1. Keras==2.2.5
  2. numpy==1.16.4
  3. tensorflow==1.13.1
  • pip3 install -r requirements.txt

    Colab and TPU (Tensor Processing Unit)

    TPUs have recently been added to the Google Colab portfolio, making it even more attractive for quick-and-dirty machine learning projects when your own local processing units are just not fast enough. While the Tesla K80 available in Google Colab delivers a respectable 1.87 TFlops and has 12 GB RAM, the TPUv2 available from within Google Colab comes with a whopping 180 TFlops, give or take. It also comes with 64 GB High Bandwidth Memory (HBM).
    More info: @jannik.zuern/using-a-tpu-in-google-colab-54257328d7da
    For this model, the approximate training time per epoch is 24 s.

    Test-data prediction accuracy [F1 accuracy]: 98.57%
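The reported figure is labeled "F1 accuracy" without a definition, so it is not clear from the text whether it is plain accuracy or an F1 score. The sketch below shows how both metrics are commonly computed from true and predicted class labels. The function names and the toy labels are illustrative only and are not the evaluation code or data of this project.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example with three classes and one mistake (a true 1 predicted as 2).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2]
print("accuracy: %.4f" % accuracy(y_true, y_pred))
print("macro F1: %.4f" % macro_f1(y_true, y_pred))
```

On a balanced test set the two numbers are usually close, but they are not interchangeable, so it helps to state which one is reported.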

    Flask App Deployment

    The saved model is deployed with Flask (python-flask).
    The deployment is very simple and could be optimized further.

    Segmentation (incomplete)

    The final goal of the segmentation script is to separate:

  1. Words From Lines
  2. Symbols From Words
    To separate them, connected components are mapped against the pixel distribution after “skeletonization”, and an optimal rotation is found that both corrects skew and aids separation.
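The connected-component step can be sketched in plain Python as follows. This is not the repository's segmentation script: skeletonization and rotation are omitted, and grouping components into words by a fixed horizontal gap (`max_gap`) is a simplifying assumption that stands in for the pixel-distribution mapping described above. The input is assumed to be a single text line given as rows of 0/1 pixels, and all function names are illustrative.

```python
from collections import deque


def connected_components(image):
    """Return bounding boxes (top, left, bottom, right) of the 8-connected
    foreground regions of a binary image, sorted from left to right."""
    height, width = len(image), len(image[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for r in range(height):
        for c in range(width):
            if not image[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and image[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return sorted(boxes, key=lambda box: box[1])


def group_into_words(boxes, max_gap):
    """Merge left-to-right boxes separated by at most `max_gap` empty columns."""
    words = []
    for box in boxes:
        if words and box[1] - words[-1][3] - 1 <= max_gap:
            t, l, b, r = words[-1]
            words[-1] = (min(t, box[0]), l, max(b, box[2]), max(r, box[3]))
        else:
            words.append(box)
    return words


# Toy line: two nearby components (one word) and a distant third (another word).
image = [
    [1, 1, 0, 1, 0, 0, 0, 0, 1, 1],
    [1, 0, 0, 1, 0, 0, 0, 0, 0, 1],
]
boxes = connected_components(image)
words = group_into_words(boxes, max_gap=1)
print(boxes)  # [(0, 0, 1, 1), (0, 3, 1, 3), (0, 8, 1, 9)]
print(words)  # [(0, 0, 1, 3), (0, 8, 1, 9)]
```

A fixed gap threshold is fragile for handwriting, where spacing varies between writers, which is one reason a method based on the pixel distribution may be preferable.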

    Example Image:

    Connected Components:

    Segmented Words Example:




    NOTE: The words “মনেরে” and “ভাল-মন্দ” are rotated to an optimal position with respect to a straight line, the “মাত্রা” (matra) as it is called in “বাংলা” (Bangla). The word “যাহাই” is left as it is, because its existing skew happens, by chance, to already be the optimal rotation for separation.

Implemented DenseNet Model Architecture

The implemented model architecture can be found at /info/model.png

The image is large, so it may take some time to load.