In this third article of the Table Extraction series, we’ll see how the GdPicture.NET engine goes beyond OCR and Deep Learning methods thanks to the Layout Understanding approach.

You can read the first two articles of the Table Extraction series here:
Part 1: Challenges
Part 2: Deep Learning Approaches

The Layout Understanding approach

A performant table extraction system must detect, recognize, and understand a document, ideally without a predefined model.

We’ve seen that OCR can do a good job with clean and simple tables on high-quality documents.

We’ve also mentioned that machine learning and deep learning models rely on templates. While they provide excellent results, they are heavy on resources.

Our approach for the GdPicture.NET Table Extraction component is to add a layer of understanding to the document that doesn’t need any predefined training model. That’s what we call the unsupervised Layout Understanding approach.

Layout Understanding combines layout analysis, OCR, key-value pair extraction, natural language processing (NLP), and named-entity recognition (NER). The goal is to recover what we call the creator’s intent.

Indeed, understanding the purpose of a document enables better decision-making for extraction, qualification, and conversion.
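To make this concrete, here is a minimal, purely illustrative sketch of such a pipeline. Everything in it is an assumption for illustration: the OCR/layout output format, the colon-splitting key-value extraction, and the regex-based entity tagging standing in for real NER are all toy stand-ins, not the GdPicture.NET implementation.

```python
import re

# Hypothetical output of layout analysis + OCR: each block is
# (text, bounding box), with the box given as (x, y, width, height).
blocks = [
    ("INVOICE", (40, 20, 120, 18)),
    ("Invoice No: 2023-0042", (40, 60, 200, 12)),
    ("Date: 2023-05-17", (40, 80, 160, 12)),
]

def extract_key_value_pairs(text_blocks):
    """Toy key-value pair extraction: split each line on the first colon."""
    pairs = {}
    for text, _box in text_blocks:
        if ":" in text:
            key, value = text.split(":", 1)
            pairs[key.strip()] = value.strip()
    return pairs

def tag_entities(pairs):
    """Stand-in for NER: tag values that look like ISO dates; call the rest IDs."""
    tagged = {}
    for key, value in pairs.items():
        if re.fullmatch(r"\d{4}-\d{2}-\d{2}", value):
            tagged[key] = ("DATE", value)
        else:
            tagged[key] = ("ID", value)
    return tagged

def infer_intent(text_blocks):
    """Guess the creator's intent from prominent text near the top of the page."""
    headers = [text for text, (_x, y, _w, _h) in text_blocks if y < 50]
    return "invoice" if any("INVOICE" in h.upper() for h in headers) else "unknown"

pairs = extract_key_value_pairs(blocks)      # key-value pair extraction
entities = tag_entities(pairs)               # entity recognition
intent = infer_intent(blocks)                # creator intent
```

Knowing the document is an invoice is what then lets an extraction engine decide, for example, that a detected table is a line-item table and should be qualified and converted accordingly.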

GdPicture.NET component for table extraction

Overview

The GdPicture.NET Table Extraction component is built on three internal engines:

  • Table detection
  • Table decoding
  • Table conversion

The table detection and decoding engines have been available in the SDK since version 14.2.

The table conversion engine currently provides a set of low-level APIs that can be used to export the extracted content.
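The three-stage flow can be sketched on a toy example. This is not the GdPicture.NET API: the word-box input format and the alignment-based heuristics below are assumptions for illustration only; a real detector also uses ruling lines, whitespace analysis, and the layout context described above.

```python
# Hypothetical OCR word boxes: (text, x, y), already grouped into words.
words = [
    ("Item", 0, 0),  ("Qty", 100, 0),  ("Price", 200, 0),
    ("Pen",  0, 20), ("3",   100, 20), ("1.50",  200, 20),
    ("Pad",  0, 40), ("2",   100, 40), ("4.00",  200, 40),
]

def detect(word_boxes):
    """Detection: treat the region as a table if words repeat a few x positions,
    i.e. they line up in columns."""
    xs = {x for _, x, _ in word_boxes}
    return 1 < len(xs) < len(word_boxes)

def decode(word_boxes):
    """Decoding: cluster words into rows (shared y) and columns (shared x),
    producing a logical cell grid."""
    xs = sorted({x for _, x, _ in word_boxes})
    ys = sorted({y for _, _, y in word_boxes})
    grid = [["" for _ in xs] for _ in ys]
    for text, x, y in word_boxes:
        grid[ys.index(y)][xs.index(x)] = text
    return grid

def convert(grid):
    """Conversion: export the decoded grid, here as CSV text."""
    return "\n".join(",".join(row) for row in grid)

csv_text = convert(decode(words)) if detect(words) else ""
```

The same decoded grid could just as well be exported to a spreadsheet format; CSV is used here only because it keeps the sketch dependency-free.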

In the next GdPicture.NET release, we’ll offer the possibility to export tables to Excel spreadsheets with various options while keeping the original document’s style (text & cell colors, font size & type, etc.).