Guest column: Martin Bailey of Global Graphics says the recently approved PDF/A standard is useful for more than just archiving.
You may have seen some announcements recently about a new ISO standard called PDF/A. More exactly, it's ISO 19005-1:2005, "Document management - Electronic document file format for long-term preservation - Part 1: Use of PDF 1.4 (PDF/A-1)".
As I write, the standard is approved and currently wending its way through the corridors of ISO central secretariat towards publication; it may have arrived by the time you're reading this.
Look at that name: It's for "long-term preservation," and the documentation at AIIM (one of the two standards bodies providing secretariat functions for the development of the standard) describes it as "PDF for Archiving". The list of participants in the process that led to the development of PDF/A includes many people whose primary interest is in archiving, such as representatives from Harvard University Library, the National Library of Medicine and the National Archives and Records Administration.
So you'd be forgiven for thinking that the standard is intended just for archivists, for people concerned with maintaining digital documents for decades, if not centuries.
Don't get me wrong, many archivists are great people, and they have a necessary and often difficult job to do, but that's a bit of a niche market, isn't it? Why all the fuss? And why did Adobe see fit to include support for an early draft of the standard in Acrobat 7?
The simple answer is that PDF/A has value well outside the scope that the original archival project was designed to address. The audience that can make effective use of PDF/A is far larger than you might expect from the title of the standard.
Just about every government agency in America is currently struggling to find a way to deliver on the requirement that they accept documents electronically. The same is happening in corporate America, at least partly because those companies need to submit files to the government agencies. Most of the rest of the world seems to be a couple steps behind, but they're heading down the same path.
The problem is not usually around getting files in, although there are certainly plenty of issues there. It's figuring out what to do with the files when you've got them that trips people up. I'm sure you know the problems that can occur in an unregulated file submission environment. You tend to receive document formats that you've never heard of, and you need a huge variety of software (and expertise) to read even the common formats.
As a result many agencies, at least in the U.S., have decided to require that documents be submitted in some flavor of PDF. But again, as you probably know, that doesn't eliminate problems. Perhaps the two most common and obvious issues are around fonts and versions of PDF.
Files often come in that use a font but don't include it in the file. That can cause serious problems if font emulation or substitution leads to the wrong characters being shown (not uncommon for currency symbols, accented characters and others outside of 7-bit ASCII). The missing font could also be a barcode or some other symbolic typeface.
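As a rough illustration of which characters are most exposed when a font is missing, the sketch below lists the characters in a piece of text that fall outside 7-bit ASCII. The function name is hypothetical and the check is an assumption made for illustration only; it looks at plain text, not at the fonts or encodings inside a real PDF file.

```python
def chars_outside_ascii(text: str) -> list:
    """Return the distinct characters above code point 127, in first-seen order."""
    seen = []
    for ch in text:
        # 7-bit ASCII covers code points 0 through 127 only.
        if ord(ch) > 127 and ch not in seen:
            seen.append(ch)
    return seen
```

Run over the text of an invoice, for example, this would flag a euro sign or an accented letter, which are the kinds of characters that font substitution tends to get wrong.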
A file receiver will usually spend some considerable time ensuring that they have standardized versions of software installed across their offices. On the other hand, somebody creating files for submission may make them in a later version of PDF than the receiver can reliably process.
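The version mismatch can be caught at the door. A PDF file starts with a header of the form `%PDF-M.N`, so a receiver could compare that against the newest version its software handles. The sketch below is a minimal illustration under stated assumptions: the function names, the 1024-byte search window and the default ceiling of PDF 1.4 (the version PDF/A-1 is based on) are choices made here, and a file's catalog can declare a later version than its header, which this check ignores.

```python
import re

# Matches the "%PDF-M.N" marker found at the start of a PDF file.
_HEADER = re.compile(rb"%PDF-(\d)\.(\d)")

def header_version(data: bytes):
    """Return (major, minor) from the PDF header, or None if none is found."""
    # Assumption: tolerate a little leading junk by searching the first 1024 bytes.
    match = _HEADER.search(data[:1024])
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))

def receiver_can_process(data: bytes, max_supported=(1, 4)) -> bool:
    """True if the header version is no later than the receiver's ceiling."""
    version = header_version(data)
    return version is not None and version <= max_supported
```

In practice the bytes would come from something like `open(path, "rb").read(1024)`. A header check of this kind only screens out obvious mismatches; it says nothing about whether the file meets the rest of a subset specification.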
To bring order to the chaos, many agencies have been trying to pull together their own specifications designed as subsets of PDF. The staff at those agencies are obviously experts in the subject matter that the agency works on, but it's rare to find somebody who's also expert in the inner workings of PDF. As a result most of those subset specifications tend to have loopholes that would allow files with various undesirable features to enter the workflow.
This all sounds very familiar; it's very similar to the position in the print industry before the publication of the PDF/X standard for printing.
One aspect of the PDF/X standards is the neatness of the division between the "propeller head" technical PDF requirements and the issues that any competent print service provider or publisher understands very well. The technical stuff is hidden away in the standard and the recipient can say "send me a PDF/X", knowing that it covers all that stuff.
In the same way, the PDF/A standard allows flexibility for the differences in metadata handling and similar matters that are required to integrate into different organizations' systems, while strictly regulating those issues that have a direct impact on the visual appearance of the document.
So is PDF/A the knight in shining armor that all those agencies struggling with digital document submissions are desperate for? In my opinion, the answer is a qualified "yes".
It's qualified for two reasons:
• PDF/A only addresses those files that might be described as a digital representation of a paper document. A small, but growing, proportion of document submission is now more complex than that, including 3D engineering drawings and databases.
• There aren't yet any tools that create PDF/A files according to the standard as published. I applaud Adobe's bravery in including support for a draft version of PDF/A in Acrobat 7, and their honesty in clearly labeling that as draft. That makes Acrobat 7 very useful for prototyping a PDF/A workflow, but it's not yet ready for production work. Many more PDF/A-compliant creation and validation tools will be available over the next few months.
If we accept that PDF/A will become at least part of the solution for accepting digital documents and for in-house archiving, the biggest question remaining is when should files become PDF/A?
There are two obvious workflows from native application documents (such as a DOC file from Microsoft Word, or a PPT file from PowerPoint) to a PDF/A file. It can be done in one step: direct from application file to PDF/A. Or it can be done in two: from application file to baseline PDF, and then a later conversion to PDF/A. While that doesn't sound very important, it has implications for the whole of the document management strategy in an organization: When a document is shared internally, should that be as a Word file, as PDF or as PDF/A?
Sharing application files works fine if every worker has a copy of Microsoft Office,
and if all documents are created in one or more of the applications from that suite,
and if everyone is kept in lock-step on a version of Office,
and if each new version of Office can read files from older versions ...
You get the picture!
If files are shared as application files or baseline PDF, they may also suffer from the kinds of problems with missing fonts that were mentioned earlier. A later conversion to PDF/A will require that all fonts are available.
Pretty much all of the advantages of PDF/A for inter-company document submission and archiving also apply when sharing documents within an organization. The greatest benefit can be achieved if the PDF/A is made by the creator of the document. Early creation of PDF/A therefore makes a lot of sense.
Get the word out: PDF/A isn't just for archiving!
Martin Bailey is senior technical consultant for Global Graphics. He specifies and designs many aspects of Global Graphics' RIP and PDF products and represents the company on a number of industry bodies and standards committees.