Earlier this month, we released a new OCR engine to read ID documents, and today we’re going to tell you more about it.
We’re very happy with the results:
our engine decodes MRZ characters in any image in less than 100 ms, even if the quality is low and the image is skewed.
What is MRZ?
MRZ stands for Machine Readable Zone.
It’s a format that lets machines read ID documents such as passports, ID cards, and visas without anyone typing anything. These documents all use a specific font called OCR-B and a fixed number of lines and characters per line.
The MRZ data is found at the bottom of the ID document.
The International Civil Aviation Organization (ICAO) maintains the MRZ specifications in Document 9303.
Since not all IDs are created equal, there is more than one MRZ format: each use case requires a specific amount of information and its own adaptations. Current MRZ formats are:
- TD1 (ID card, passport card)
- TD2 (ID card, other official travel documents)
- TD3 (passport)
- MRV-A (visa)
- MRV-B (visa)
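Each of these formats has a fixed shape in Doc 9303: TD1 is three lines of 30 characters, TD2 and MRV-B are two lines of 36, and TD3 and MRV-A are two lines of 44, with visas starting with the letter V. As a minimal sketch, the format of an already-recognized MRZ string could be guessed from that shape alone. The class and method names below are hypothetical and are not part of GdPicture.NET.

```csharp
using System;
using System.Linq;

public static class MrzFormatGuess
{
    // Hypothetical helper: guesses the ICAO format from the line count,
    // the line length, and the first character of the MRZ text.
    // Returns "unknown" when the shape matches none of the five formats.
    public static string Guess(string mrz)
    {
        string[] lines = mrz
            .Split(new[] { '\r', '\n' })
            .Select(l => l.Trim())
            .Where(l => l.Length > 0)
            .ToArray();

        // All lines of a valid MRZ have the same length.
        if (lines.Length == 0 || lines.Any(l => l.Length != lines[0].Length))
            return "unknown";

        bool isVisa = lines[0][0] == 'V'; // visa MRZs start with 'V'
        int length = lines[0].Length;

        if (lines.Length == 3 && length == 30) return "TD1";
        if (lines.Length == 2 && length == 36) return isVisa ? "MRV-B" : "TD2";
        if (lines.Length == 2 && length == 44) return isVisa ? "MRV-A" : "TD3";
        return "unknown";
    }
}
```

A shape check like this only narrows down the candidates; country-specific layouts can share the same dimensions and still need their own handling.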
There are also other specific formats depending on the country. For instance, French and Portuguese ID cards don’t follow these standards.
What sets MRZ apart is that, in addition to storing your data in a machine-friendly layout, it adds checksum validation.
Key fields such as the document number, date of birth, and expiry date are each followed by a check digit that lets the machine verify that it read the field correctly. Of course, checksums are not infallible, but they significantly improve accuracy.
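To make the checksum concrete, here is a minimal sketch of the standard check digit defined in ICAO Doc 9303: each character is converted to a number (digits keep their value, A to Z map to 10 to 35, the filler < counts as 0), multiplied by the repeating weights 7, 3, 1, and the sum is reduced modulo 10. The class and method names are hypothetical, and this is not how GdPicture.NET implements it internally.

```csharp
using System;

public static class MrzCheckDigit
{
    // Doc 9303 character values: digits keep their value,
    // A-Z map to 10-35, and the filler '<' counts as 0.
    private static int ValueOf(char c)
    {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'A' && c <= 'Z') return c - 'A' + 10;
        if (c == '<') return 0;
        throw new ArgumentException($"'{c}' is not a valid MRZ character.");
    }

    // Weighted sum with the repeating weights 7, 3, 1, reduced modulo 10.
    public static int Compute(string field)
    {
        int[] weights = { 7, 3, 1 };
        int sum = 0;
        for (int i = 0; i < field.Length; i++)
            sum += ValueOf(field[i]) * weights[i % 3];
        return sum % 10;
    }

    // True when the printed check digit matches the computed one.
    public static bool IsValid(string field, char checkDigit) =>
        checkDigit >= '0' && checkDigit <= '9' && Compute(field) == checkDigit - '0';
}
```

With the specimen passport from Doc 9303, `MrzCheckDigit.Compute("L898902C3")` gives 6 and `MrzCheckDigit.Compute("740812")` gives 2, which match the check digits printed after the document number and the date of birth. A single misread character usually changes the result, which is why a failed check is a good signal to reject or re-scan the field.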
French ID cards also validate first names and last names. The French government decided that the validity of the names was as important as the integrity of the numbers and, in doing so, departed from the global standard.
Portuguese ID cards also deviate from the standard MRZ format and use a different algorithm to validate the integrity of their ID number.
How to use MRZ with GdPicture.NET?
Now that you know the theory behind MRZ, let’s dive into how to use it with GdPicture.NET.
Let’s assume you already have a working sample able to scan a document (if not, look at this documentation).
To switch from our general-purpose OCR to MRZ, you only need to pass the special context as a parameter of the RunOCR method:
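// Run the OCR with the MRZ special context instead of the general-purpose engine.
string resultId = gdPictureOcr.RunOCR(OCRSpecialContext.MRZ);
// Read the recognized MRZ text back from the OCR result.
string mrzRead = gdPictureOcr.GetOCRResultText(resultId, false);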
The call returns the OCR result, and mrzRead now contains the MRZ text.
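Once you have the MRZ text, the individual fields sit at fixed positions. As a rough sketch, the second line of a TD3 (passport) MRZ could be sliced as shown below, following the field positions in Doc 9303. The class is hypothetical and not part of GdPicture.NET, and it assumes the two MRZ lines arrive separated by a line break, which you should confirm against the string you actually get back.

```csharp
using System;

public sealed class Td3SecondLine
{
    public string DocumentNumber { get; }
    public string Nationality { get; }
    public string BirthDate { get; }   // YYMMDD, as printed
    public string Sex { get; }
    public string ExpiryDate { get; }  // YYMMDD, as printed

    private Td3SecondLine(string line)
    {
        // Zero-based offsets; the check digits at offsets 9, 19, and 27 are skipped.
        DocumentNumber = line.Substring(0, 9).TrimEnd('<');
        Nationality = line.Substring(10, 3).TrimEnd('<');
        BirthDate = line.Substring(13, 6);
        Sex = line.Substring(20, 1);
        ExpiryDate = line.Substring(21, 6);
    }

    // Assumption: mrzText holds the two TD3 lines separated by a line break.
    public static Td3SecondLine Parse(string mrzText)
    {
        string[] lines = mrzText.Split(
            new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);
        if (lines.Length != 2 || lines[1].Trim().Length != 44)
            throw new FormatException("Expected a TD3 MRZ: two lines of 44 characters.");
        return new Td3SecondLine(lines[1].Trim());
    }
}
```

Calling `Td3SecondLine.Parse(mrzRead)` on a passport MRZ would then expose the document number, nationality, and dates as separate properties. The sketch deliberately skips check digit validation and the name line to stay short.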