What are PDF/A documents and why do we need them?

More and more often we hear the recommendation to use PDF/A rather than “normal” PDF, especially when it comes to archiving. But what is a PDF/A document? How does it differ from a normal PDF, and when should you consider using it?

In this post we are going to see what exactly the PDF/A format is and why it is becoming so popular.

The Need for a Universal Format for Documents

For me, the best way to understand what we mean by a universal format, and why we need one, is to look at the different media formats we’ve seen over the last decades and the frustration all of us have experienced at some point in our lives. How many of you have had to retire your good old vinyl records because you no longer have your beloved record player? Do you still keep your tape player because you don’t want to lose your old radio recordings? Or even worse, were you one of the victims of Sony’s Betamax?

That’s what we mean by a universal format: whatever the content is, it can still be used in 20 or 30 years’ time, regardless of the technology we’ll be using then. When it comes to documents, this is extremely important for archiving, legal and compliance scenarios, as you can imagine.

Many organizations, private or public, need to guarantee that contracts, agreements and any other documents can be opened in the future, regardless of the underlying technology we’ll be using. That’s exactly why the PDF/A format was born. Let’s see how PDF/A can help us with that, but first, a little bit of history.

A Little Bit of History

The PDF format (Portable Document Format) was introduced by Adobe back in 1992 with the idea of building a file format to present and exchange documents reliably, independent of software, hardware or operating system. So, as we can see, the concept of a universal document was there right from the beginning. The format was controlled by Adobe until 2008, when it became an open ISO standard (ISO 32000-1).

As an open standard, PDF can be displayed and generated not only with Adobe tools but with many other software tools.

A PDF document can contain text, images, embedded graphics, metadata and more. It can even be digitally signed.

What does PDF/A do that PDF doesn't?

So, considering all of the above, it looks like PDF is already a universal format. Why, then, do we need PDF/A?
Yes, it’s true that a PDF can be generated and displayed by any software that supports the format, not only Adobe’s… NOW. But what about in 20 years’ time? Can we guarantee that a system 20 years from now will be able to render a PDF generated today? This is where PDF/A comes into the equation. I don’t want to take you through the whole PDF/A specification, but let’s look at some examples of what PDF/A requires, to understand more precisely what the format is about:

  • PDF/A requires fonts to be embedded in the document. In other words, it’s not enough to name the font used by the text; the font data itself must be included so that the text can be rendered. Just imagine a document using text in Lato 16pt. What if this font is no longer available in 20 years’ time?
  • PDF/A requires device-independent color, typically through ICC profiles. The International Color Consortium (ICC) is the organization that, in a nutshell, defines “universal colors” or, better said, specifies color profiles that are independent of the device. This way PDF/A ensures that colors can be reproduced faithfully in the future.
  • PDF/A does not allow encrypted or password-protected files. This makes sure that the document will never depend on a password or key to be opened.
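  • Dependencies on external content are not allowed: everything needed to display the document must live inside the file. What if the PDF includes a statement like “this contract is signed according to the international law published at http://internationallaw.com” and in 20 years’ time that URL no longer exists? As far as I know, a plain hyperlink may still appear in a PDF/A file, but the document must not rely on it to be complete.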
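To make a few of these rules more tangible, here is a minimal Python sketch that scans the raw bytes of a PDF for some warning signs: an /Encrypt entry, font descriptors without an embedded font file (/FontFile, /FontFile2 or /FontFile3), and a missing /OutputIntents entry. The function name and the flag strings are my own invention for illustration. It is only a rough heuristic that assumes the PDF objects are stored uncompressed, and it is in no way a PDF/A validator.

```python
import re


def archival_red_flags(pdf_bytes: bytes) -> list[str]:
    """Return heuristic warnings for a PDF given as raw bytes.

    Hypothetical helper for illustration only: it looks at plain text
    markers and will miss anything stored in compressed object streams.
    """
    flags = []

    # An /Encrypt entry in the trailer means the file is encrypted.
    if re.search(rb"/Encrypt\b", pdf_bytes):
        flags.append("encrypted")

    # A font descriptor points to its embedded font program through
    # /FontFile, /FontFile2 or /FontFile3. Fewer font files than
    # descriptors suggests that some font is not embedded.
    descriptors = len(re.findall(rb"/Type\s*/FontDescriptor\b", pdf_bytes))
    font_files = len(re.findall(rb"/FontFile[23]?\b", pdf_bytes))
    if font_files < descriptors:
        flags.append("font not embedded")

    # Device-independent color is usually declared through an output
    # intent that carries an ICC profile.
    if not re.search(rb"/OutputIntents\b", pdf_bytes):
        flags.append("no output intent")

    return flags


# Tiny hand-written fragments, not real PDF files.
risky = b"<< /Type /FontDescriptor /FontName /Lato >> trailer << /Encrypt 9 0 R >>"
clean = b"<< /Type /FontDescriptor /FontFile2 7 0 R >> << /OutputIntents [8 0 R] >>"
print(archival_red_flags(risky))
print(archival_red_flags(clean))
```

For a real compliance check you would rely on a dedicated PDF/A validator; the point here is only to show that each rule corresponds to something concrete inside the file.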

In Summary...

So, as you can see, the PDF/A format makes sure that you will be able not only to open the document but also to render all of its content. In some ways it restricts features that regular PDF offers, but in exchange there’s confidence that the document can be opened, read and understood for many years. For that reason many PDF readers will show you a warning or pop-up notifying you that you’re reading a PDF/A document.

Lastly, it’s worth mentioning, as you might come across it, that PDF/A has been updated since its first version in 2005. The versions you will see most often are PDF/A-1, PDF/A-2 and PDF/A-3, and a fourth part, PDF/A-4, was published in 2020. The differences are out of the scope of this post, but if you want to know more, here is a link from the PDF Association:

ISO 19005 (PDF/A)

The primary purpose of ISO 19005 is to define a file format based on PDF, known as PDF/A, which provides a mechanism for representing electronic documents in a ...
