We’re happy to introduce our new PDF/A Converter, available in our SDK with GdPicture.NET version 14.1.22. Today we’re going to explain how it works and why it is a feature that will become a requirement (if it isn’t one already) for most organizations.
Why do you need a PDF/A converter?
PDF/A is the ISO standard for long-term archiving of electronic documents.
It is designed to protect the information and signatures contained in the document, and to include everything required to open, read, and access the document’s data in the future.
PDFs converted to PDF/A are fully searchable and keep a reasonable size thanks to the many compression techniques supported.
Laws and regulations worldwide keep evolving, and they require data to be protected and recorded digitally for anywhere from a few years to, in certain cases, forever. That is why many public and governmental organizations like the Library of Congress now recognize, recommend, and favor PDF/A for archiving electronic documents.
You will find more information about the format on the PDF Association website.
How does it work?
So back to our PDF to PDF/A converter.
Our native PDF to PDF/A converter parses the source document and compares its structure against the requirements of the target conformance level.
Where possible, the document is made to conform to the specification by direct modification alone: adding, editing, or removing required document structure elements, embedding fonts if possible, and using other techniques. If such a conversion by direct modification is not possible, the PDF to PDF/A conversion engine falls back to secondary conversion options, which are Vectorization and Rasterization.
Both of these options render the document content into a completely new document. Vectorization produces vector-based graphic elements where applicable (for example, for fonts and paths) and combines them with image resources, while Rasterization renders the document content using the raster (pixel-based) approach.
It is important to note that both Vectorization and Rasterization result in the loss of fonts and text information, because the text is converted into shapes or raster images. The text information can be recovered later using OCR features.
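To make the fallback logic easier to picture, here is a minimal Python sketch of the decision flow described above. It is not the GdPicture.NET API: the names `Method`, `ConversionResult`, `convert_to_pdfa`, and the `allow_vectorization` / `allow_rasterization` switches are all hypothetical, and the sketch assumes that Vectorization is tried before Rasterization. Each strategy is a plain callable, so the ordering can be tried out with stand-ins instead of a real conversion engine.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, Optional


class Method(Enum):
    """Conversion approaches named in the article (identifiers are hypothetical)."""
    DIRECT = "direct modification"
    VECTORIZATION = "vectorization"
    RASTERIZATION = "rasterization"


@dataclass
class ConversionResult:
    method: Method
    text_preserved: bool  # False means OCR is needed to recover the text


# A strategy takes a source path and returns True if it produced a
# conforming document. Real strategies would call a conversion engine.
Strategy = Callable[[str], bool]


def convert_to_pdfa(
    source: str,
    strategies: Dict[Method, Strategy],
    allow_vectorization: bool = True,
    allow_rasterization: bool = True,
) -> Optional[ConversionResult]:
    """Try direct modification first, then the permitted fallbacks.

    Assumption: vectorization is attempted before rasterization.
    """
    order = [Method.DIRECT]
    if allow_vectorization:
        order.append(Method.VECTORIZATION)
    if allow_rasterization:
        order.append(Method.RASTERIZATION)

    for method in order:
        if strategies[method](source):
            # Only direct modification keeps fonts and text intact.
            return ConversionResult(method, text_preserved=(method is Method.DIRECT))
    return None  # no permitted approach succeeded


# Usage with stand-in strategies: direct modification fails,
# so the sketch falls back to vectorization.
stubs = {
    Method.DIRECT: lambda path: False,
    Method.VECTORIZATION: lambda path: True,
    Method.RASTERIZATION: lambda path: True,
}
result = convert_to_pdfa("input.pdf", stubs)
print(result.method.value, result.text_preserved)  # vectorization False
```

Tracking `text_preserved` in the result is one way a caller could decide whether an OCR pass is needed afterwards, and disabling both fallbacks would ensure that a document is never silently converted with text loss.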