In today’s article, we will show why a strong document layout analysis system is crucial for document understanding and Intelligent Document Processing solutions.

In our previous blog article, we described our new Key-Value Pair extractor.

To give you better insight into how it works, we will talk a bit more about the technology essential for key-value pair extraction: document layout analysis. Together with a strong OCR engine, document layout analysis is the first level of any document understanding process.

Definitions

Document Layout Analysis

Document layout analysis (DLA) is the identification (or detection) and categorization (or decoding) of the regions of interest in a document image.

DLA involves both a geometric analysis of the document (tables, pictures, equations, and barcodes) and a logical layout analysis (paragraphs, lines, words, characters).
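To make the idea of "detected and categorized regions" more concrete, here is a minimal Python sketch of how the output of a layout analysis step could be represented. The names `RegionType`, `Region`, and `find_regions` are hypothetical and chosen for illustration only; they do not describe the internals of any particular product.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Tuple


class RegionType(Enum):
    # Categories taken from the definition above
    TABLE = "table"
    PICTURE = "picture"
    EQUATION = "equation"
    BARCODE = "barcode"
    PARAGRAPH = "paragraph"
    LINE = "line"
    WORD = "word"
    CHARACTER = "character"


@dataclass
class Region:
    """One detected region: where it is (bbox) and what it is (kind)."""
    bbox: Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels
    kind: RegionType
    children: List["Region"] = field(default_factory=list)


def find_regions(root: Region, kind: RegionType) -> List[Region]:
    """Collect every region of the given kind, walking the tree depth-first."""
    found = [root] if root.kind is kind else []
    for child in root.children:
        found.extend(find_regions(child, kind))
    return found


# Hypothetical example: a paragraph made of two text lines
paragraph = Region(
    bbox=(10, 10, 200, 60),
    kind=RegionType.PARAGRAPH,
    children=[
        Region(bbox=(10, 10, 200, 30), kind=RegionType.LINE),
        Region(bbox=(10, 40, 180, 60), kind=RegionType.LINE),
    ],
)
lines = find_regions(paragraph, RegionType.LINE)
```

A tree like this is one possible design choice: it keeps the position of each region (detection) next to its label (categorization) and lets paragraphs contain lines, lines contain words, and so on.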

DLA and OCR

An OCR solution is a complex system that combines several engines, each of which comes into play at a different stage of the process.

A standard OCR process includes:

  1. Preprocessing (the cleanup phase);
  2. Thresholding (segmentation);
  3. Layout analysis;
  4. Recognition;
  5. Post-processing.

All OCR processes include these steps, with more or less success. This is why you obtain different results for the same document when testing solutions from different vendors.
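The five stages listed above can be sketched as a chain of small functions. This is a deliberately simplified toy example, not a real OCR engine: the image is a plain list of grayscale rows, the threshold is a fixed cutoff, layout analysis is reduced to a horizontal projection profile that finds bands of rows containing ink, and the `recognize` step is a placeholder that only labels each band. All function names are assumptions made for this illustration.

```python
from typing import List, Tuple

Image = List[List[int]]  # grayscale pixels, 0 (black) to 255 (white)


def preprocess(img: Image) -> Image:
    """1. Cleanup phase: here we only clamp pixel values to the 0-255 range."""
    return [[min(255, max(0, p)) for p in row] for row in img]


def threshold(img: Image, cutoff: int = 128) -> Image:
    """2. Thresholding: 1 marks ink (dark pixel), 0 marks background."""
    return [[1 if p < cutoff else 0 for p in row] for row in img]


def layout_analysis(binary: Image) -> List[Tuple[int, int]]:
    """3. Layout analysis: find text-line bands as (first_row, last_row)."""
    bands = []
    start = None
    for y, row in enumerate(binary):
        has_ink = any(row)
        if has_ink and start is None:
            start = y  # a new band of inked rows begins
        elif not has_ink and start is not None:
            bands.append((start, y - 1))  # the band ended on the previous row
            start = None
    if start is not None:
        bands.append((start, len(binary) - 1))
    return bands


def recognize(binary: Image, bands: List[Tuple[int, int]]) -> List[str]:
    """4. Recognition: placeholder; a real engine would classify glyphs here."""
    return [f"<line rows {top}-{bottom}>" for top, bottom in bands]


def postprocess(lines: List[str]) -> str:
    """5. Post-processing: here we only trim and join the recognized lines."""
    return "\n".join(line.strip() for line in lines)


def run_ocr(img: Image) -> str:
    binary = threshold(preprocess(img))
    bands = layout_analysis(binary)
    return postprocess(recognize(binary, bands))


# Tiny hypothetical page: two inked bands separated by a blank row
page = [
    [255, 255, 255],
    [255, 0, 255],
    [0, 0, 255],
    [255, 255, 255],
    [255, 300, -5],  # out-of-range values are clamped during preprocessing
]
result = run_ocr(page)
```

Even in this toy version, the quality of each stage depends on the one before it: a different cutoff in `threshold` changes which rows `layout_analysis` sees as ink, which in turn changes what reaches the recognizer.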