In this second article of our Table Extraction series, we will see how to apply deep learning approaches to detect tables and decode them on document images.

You can read the first article of the Table Extraction Series here
Part 1: Challenges

Introduction

Deep learning can be used for both table detection and table decoding.

Table detection refers to the task of localizing a table on an image or document. The algorithm used for table detection should return the coordinates of the table, usually denoted by 2 points: upper left and lower right.

The figure below shows an example of tables detected on an image with blue bounding boxes around the tables.

detected tables on a document

Figure 1: detected tables (source [3])

Table decoding refers to the task of localizing rows and columns of a table. Just like table detection, a table decoding algorithm needs to return the coordinates of each row and column. The figure below shows rows and columns detection. Cell detection is then found by the intersection of a row and a column.