Project author: arnavdutta

Project description:
Detect the tables in a form and extract the tables as well as the cells of the tables.
Language: Python
Repository: git://github.com/arnavdutta/Table-Detection-Extraction.git
Created: 2019-08-03T19:42:30Z
Project community: https://github.com/arnavdutta/Table-Detection-Extraction

License: MIT License


Table Detection & Extraction From The Forms


Functionality:

  • Detects all the tables in a form page.
  • Creates bounding boxes around them.
  • Segments them out and extracts the cells of the tables.

Steps:

  1. Convert the image to grayscale.
  2. Apply binary thresholding.
  3. Get all the vertical lines using a vertical kernel built with cv2.getStructuringElement.
  4. Similarly, get all the horizontal lines using a horizontal kernel built with cv2.getStructuringElement.
  5. Combine the horizontal and vertical lines using cv2.addWeighted.
  6. Apply morphological transformations such as cv2.erode to get crisp lines and better results.
  7. Find the contours and extract the rectangles (table cells).

Prerequisites

  1. Python v3.6
  2. OpenCV v3.4 (import cv2)
  3. NumPy v1.16 (import numpy as np)
  4. os module from the Python standard library (import os)