Question:
I have been told that there is a new version of the PDF/A standard coming soon. What does this mean for me?
Answer:
The first important message is that the new part, PDF/A-2, will not replace or ‘fix’ the current one. PDF/A-1 will remain available as an independent, valid standard. All existing PDF/A-1 documents, and those that will be created in the future, are perfectly well suited for long-term archiving.
That said, why do we need a new part of the standard? The PDF format is constantly being enhanced and improved. The current version of PDF/A is based on the PDF specification 1.4. In the meantime, the PDF specification has reached version 1.7 and has even been published as an ISO standard itself (ISO 32000-1). Since PDF 1.4, a number of new features have been added to the PDF format, and some of these are also useful for long-term archiving. So PDF/A-2 will be based on the new PDF standard. The new features in PDF/A-2 cover document collections, metadata, image formats and transparency, among other things.
The most important aspect for LuraTech customers will be the new support for JPEG 2000 in PDF/A-2. Highly compressed PDF/A documents will then be possible with the same great visual quality and small file size that so far could only be achieved with standard PDF output. LuraTech is actively involved in the development of the PDF/A standard, so our customers will always be among the first to benefit from the new possibilities.
With the increasing spread of PDF/A as the ISO standard for long-term archiving, unfortunately a few misunderstandings have been popularized as well. After nearly four years, some DMS providers still seem intent on “riding out” the PDF wave. But as I see it, the saying applies that “you snooze, you lose!”
Myth #1: TIFF is secured against tampering, PDF and PDF/A are not
This assertion is clearly incorrect. No document format is inherently secured against alteration or inherently compliant with auditing requirements. A TIFF file can be modified with simple tools just like a PDF/A document or any other format. “Inalterability” of documents can only be achieved using a signature. If files must be archived in compliance with auditing requirements, a system or process is needed to ensure protection against changes.
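To illustrate why protection against changes comes from a cryptographic seal and not from the file format, here is a minimal Python sketch. It uses a keyed hash (HMAC) as a simplified stand-in for a real electronic signature; the key, the sample bytes and the function names are assumptions made up for this example, and a production archive would rely on a proper signature solution instead.

```python
import hashlib
import hmac


def seal(document: bytes, key: bytes) -> str:
    """Compute a keyed SHA-256 seal over the raw file bytes (any format)."""
    return hmac.new(key, document, hashlib.sha256).hexdigest()


def is_unaltered(document: bytes, key: bytes, stored_seal: str) -> bool:
    """Return True only if the bytes still match the seal stored at archive time."""
    return hmac.compare_digest(seal(document, key), stored_seal)


# Hypothetical data: the bytes could come from a TIFF, a PDF/A or any other file.
key = b"archive-secret-key"
original = b"example scan bytes"
stored = seal(original, key)

tampered = original + b"\x00"  # a single extra byte is enough to break the seal

print(is_unaltered(original, key, stored))  # True
print(is_unaltered(tampered, key, stored))  # False
```

The check works the same way for a TIFF and for a PDF/A file, which is the point of the paragraph above: the format itself does not prevent alteration, the surrounding process does.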
Myth #2: PDF is a standard from one provider, TIFF is a disclosed standard
Yes and no. TIFF is a de facto industry “standard”, but it has never been standardized by an international standards organization such as ISO or DIN. Both PDF itself (ISO 32000) and PDF/A (ISO 19005) are disclosed ISO standards and are thus not only de facto but also de jure standards.
Myth #3: PDF/A does not support signatures
On the contrary. PDF/A even permits embedded signatures, including qualified electronic signatures. The signature product simply has to apply the signature in a PDF/A-compliant manner, but some signature providers have not yet accomplished this with their products.
Myth #4: PDF/A does not support compression
Wrong. PDF/A permits all common compression methods, such as JBIG2 and JPEG. The exception is LZW, for which patents were still in force at the time of standardization. For reasons of timing, JPEG 2000 was not incorporated in the PDF/A-1 standard, but it will be covered in the new version (PDF/A-2).
Myth #5: PDF/A does not allow OCR for scanned documents
Wrong. OCR is of course possible in both PDF/A-1b and PDF/A-1a. A minor point, and perhaps the cause of the confusion, is the exception that the invisible font used for the OCR text layer does not have to be embedded.
Myth #6: PDF/A files are too large due to font embedding
Yes and no. It is true that fonts (except for OCR) must be embedded. Based on practical experience, this is only a problem in the specific application area of bulk outgoing mail. There, one can apply font reduction and subsetting, or pragmatically omit font embedding in a solution tailored to individual company needs. In the latter case the files are no longer PDF/A-compliant in a strict sense. However, apart from this deliberate exception, they retain all the advantages of PDF/A.
Myth #7: PDF/A does not support metadata
On the contrary. XMP particularly facilitates standardized metadata in PDF/A. Metadata can be managed in the surrounding systems as before. An advantage of PDF/A is that these data can also be embedded inseparably in the document.
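As a small illustration of embedded XMP metadata, the following Python sketch reads the PDF/A identification and the title from an XMP packet. The packet is a hand-written sample and the document title is invented; only the namespaces and the `pdfaid:part` / `pdfaid:conformance` properties are taken from the PDF/A and XMP conventions. Extracting the packet from an actual PDF file is not shown.

```python
import xml.etree.ElementTree as ET

# Hand-written sample packet (assumption); a real one is embedded in the PDF file.
XMP_SAMPLE = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about=""
        xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/"
        xmlns:dc="http://purl.org/dc/elements/1.1/">
      <pdfaid:part>1</pdfaid:part>
      <pdfaid:conformance>B</pdfaid:conformance>
      <dc:title>
        <rdf:Alt><rdf:li xml:lang="x-default">Sample invoice</rdf:li></rdf:Alt>
      </dc:title>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>"""

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "pdfaid": "http://www.aiim.org/pdfa/ns/id/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def read_pdfa_id(xmp: str):
    """Return (part, conformance) as written in element form in the packet."""
    root = ET.fromstring(xmp)
    part = root.find(".//pdfaid:part", NS)
    conformance = root.find(".//pdfaid:conformance", NS)
    if part is None or conformance is None:
        raise ValueError("no PDF/A identification found in element form")
    return part.text, conformance.text


def read_title(xmp: str):
    """Return the default-language title stored in the packet."""
    root = ET.fromstring(xmp)
    return root.find(".//dc:title/rdf:Alt/rdf:li", NS).text


print(read_pdfa_id(XMP_SAMPLE))  # ('1', 'B')
print(read_title(XMP_SAMPLE))    # Sample invoice
```

Note that XMP also allows simple properties to be written as XML attributes; this sketch only handles the element form used in the sample.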
Myth #8: PDF/A is not supported by DMS systems
Yes and no. Simply put, an ECM system which can handle PDF can also support PDF/A well. However, (unfortunately) there are still a number of DMS providers wedded to their outmoded TIFF viewers, and that can sometimes be a stumbling block in practice.
Myth #9: PDF/A is only supported by a small group of local German providers
Not at all! It is certainly true that PDF/A was first accepted in German-speaking countries – and that the PDF/A Competence Center originated in Germany. However, in the meantime many countries and industries recommend PDF/A or even require it by statute. Moreover, the PDF/A Competence Center now has over 100 members from about 20 countries!
Myth #10: PDF/A is expensive!
Yes and no. Of course the deployment of PDF/A tools requires an initial investment. Sometimes the ROI from highly compressed PDF/A files can be calculated within a few months even without an Excel spreadsheet, as for example with the Sparkasse savings banks. But that is perhaps more of an exception. The problem here is assessing the benefits: how much is it worth if unifying formats saves training time and expense as well as viewer license fees? Or if fewer migrations are necessary in the future? And last but not least, how do you place a value on a “good” archive thanks to standardized PDF/A files?
Thomas Zellmann is an executive board member of the PDF/A Competence Center
Question:
I have hundreds of boxes of documents that contain information I am required to store for at least ten years. I understand the best format to archive these documents is PDF/A? Will your PDF Compressor Enterprise output to PDF/A and make these documents full-text searchable?
Answer:
Yes, PDF/A is the best format for long-term archiving (defined by ISO 19005-1:2005). This standard offers assurance that archived documents will maintain their appearance and readability regardless of which applications and systems were used to create them. And yes, the PDF Compressor Enterprise has an integrated ABBYY FineReader OCR engine and with this tool you can create full-text searchable PDF/A documents in one pass. Additionally, the PDF Compressor applies award-winning mixed raster content (MRC) compression technology and therefore you will save on storage costs with smaller file sizes!
Question:
We are a scan service provider and we’d like to offer your PDF Compressor Enterprise to one of our customers. We’d like to calculate the time needed to complete their project before we finalize the deal. What information is needed in order to calculate how long it will take to compress and convert all of their documents to PDF/A with the PDF Compressor Enterprise?
Answer:
Thank you for your inquiry. The time it will take for you to process this job (or any job) depends on a number of factors. So that we can best estimate, can you provide us with more information regarding the scope of this project? Here are some important variables that we need to solve for before estimating the time it will take to complete this conversion project:
As soon as we have this information, we can best calculate the time it will take for you to complete this project. Additionally, this information will allow us to recommend a license model best suited for your project. For example, if you are working to meet a deadline we might suggest purchasing additional CPU-core licenses to complete your projects on time.
Click here to download a trial of PDF Compressor
For more information about our license models please click here
Question:
I am planning a scan project at the moment. I would like to scan, compress and convert 4 million pages in full color to PDF/A within the next four months. These pages are letter size and we are planning to scan at 150 dpi, in 24-bit color and without OCR. Which PDF Compressor Enterprise license model would be the best for this project type?
Answer:
Thank you for your inquiry. First, we recommend that you scan at 200-300 dpi for optimum image quality. Extreme document compression with the PDF Compressor Enterprise allows for scanning at higher resolutions without a concern for file size. So, let’s assume you’ll be scanning at 200 dpi.
There are two possible license models to meet your needs:
1) You could choose the Basic license model, which is ideal for project-based conversion. The standard Basic license is for 20,000 pages. So, in this case you would purchase one license and an additional cartridge for 4,000,000 pages. The greatest advantage of this option is that the Basic license will use all available CPU cores on the computer it is installed on. For example, if you install the software on a quad-core computer, you will benefit from all 4 CPU cores. If you process this project on a quad-core machine, you’ll be finished in one to two months. Additionally, support is included at no additional fee.
2) If you expect future projects to arise, you might want to consider the Server license model, which is an unlimited license per CPU core without page or time limitations. To process your 4 million pages in four months, we’d recommend purchasing a license for an additional CPU core to ensure you meet your deadline. By investing in annual maintenance and support you can reduce the cost of ongoing processing.
Above all, we recommend that you download and install a trial version of the PDF Compressor Enterprise so you can begin testing your documents within your environment to best judge compression rates and processing time.
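Once you have measured a realistic throughput with the trial, the deadline check is simple arithmetic. The Python sketch below shows one way to do it. The per-core throughput of 1,000 pages per hour, the 120-day window and round-the-clock processing are placeholder assumptions for illustration only, not measured or guaranteed PDF Compressor figures; replace them with your own test results.

```python
import math


def required_pages_per_hour(total_pages, days, hours_per_day):
    """Throughput the whole system must sustain to finish within the deadline."""
    return total_pages / (days * hours_per_day)


def days_needed(total_pages, pages_per_hour_per_core, cores, hours_per_day):
    """Days to finish, assuming throughput scales linearly with licensed cores."""
    pages_per_day = pages_per_hour_per_core * cores * hours_per_day
    return math.ceil(total_pages / pages_per_day)


TOTAL_PAGES = 4_000_000   # from the question above
DEADLINE_DAYS = 120       # assumption: four months of roughly 30 days
HOURS_PER_DAY = 24        # assumption: unattended processing around the clock
MEASURED_RATE = 1_000     # placeholder: pages per hour per core from your own trial

print(round(required_pages_per_hour(TOTAL_PAGES, DEADLINE_DAYS, HOURS_PER_DAY)))  # 1389
print(days_needed(TOTAL_PAGES, MEASURED_RATE, cores=4, hours_per_day=HOURS_PER_DAY))  # 42
```

If the number of days exceeds your deadline for the core count you plan to license, that is the signal to consider additional CPU-core licenses. Linear scaling across cores is itself an optimistic assumption, so leave some margin.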
Question:
I have a large number of images in a complex folder structure - one folder containing multiple subfolders for each customer. With the PDF Compressor is it possible to compress all images, even files in subfolders, and maintain the folder structure?
Answer:
Yes, in the PDF Compressor job input settings please enable the “Include subfolders” option. PDF Compressor will compress all images within your input folder, even files in subfolders, and duplicate the subfolder structure of the input folder in the output folder.
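For readers who want to picture what duplicating the subfolder structure means, here is a minimal Python sketch of the path mapping. It is not PDF Compressor's implementation; the function names, the sample paths and the assumption that each output keeps the input's base name with a `.pdf` extension are made up for illustration.

```python
from pathlib import Path, PurePosixPath


def mirror_path(source, input_root, output_root, suffix=".pdf"):
    """Map one input file to the same relative location under output_root."""
    relative = source.relative_to(input_root)   # e.g. customerA/2009/scan1.tif
    return (output_root / relative).with_suffix(suffix)


def plan_outputs(input_root, output_root):
    """Walk input_root recursively and pair every file with its mirrored output path."""
    input_root, output_root = Path(input_root), Path(output_root)
    return {
        src: mirror_path(src, input_root, output_root)
        for src in sorted(input_root.rglob("*"))
        if src.is_file()
    }


# Pure path example (no files are touched); the folder names are hypothetical.
example = mirror_path(
    PurePosixPath("/in/customerA/2009/scan1.tif"),
    PurePosixPath("/in"),
    PurePosixPath("/out"),
)
print(example)  # /out/customerA/2009/scan1.pdf
```

`plan_outputs` only computes the mapping; creating the output folders and running the actual conversion are left out on purpose.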
Question:
Is it possible to batch convert my Microsoft Office documents to PDF/A?
Answer:
Yes, now you can easily batch convert your born-digital documents, such as Microsoft Word, Excel, and PowerPoint files, to PDF/A for long-term archiving. With the Born Digital Module option available with our PDF Compressor Enterprise, document types supported by Microsoft Office, Outlook e-mails with attachments and digitally created PDFs can be converted to PDF/A in the same way that scanned documents can be converted to PDF/A. The Born Digital Module is available for all PDF Compressor Enterprise license models.
Question:
I want to use PDF Compressor to compress 100,000 single-page TIFF scans and apply optical character recognition (OCR). Do you think a PC with Windows XP™ and a 1.8 GHz Dual-Core CPU with 2 GB of RAM is sufficient? What do you recommend for optimum performance?
Answer:
PDF Compressor clearly benefits from a fast CPU, so preferably use a machine with more than 2 GHz. Additionally, it helps if few other applications run on that CPU at the same time.
On a multi-core computer PDF Compressor will only use the number of processor cores it is licensed for. For your Dual-Core system you can buy either a Single-Core or Dual-Core license. With a Dual-Core license two compression jobs can run in parallel, effectively almost doubling the throughput.
2 GB of main memory should suffice for your compression task, unless other applications occupy a lot of memory. We recommend 2 GB per licensed processor core for PDF Compressor.
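The sizing rules in this answer can be written down as a short Python sketch. The function names are invented for this example; the only inputs taken from the answer are the 2 GB per licensed core recommendation and the rule that only licensed cores are used.

```python
def usable_cores(physical_cores, licensed_cores):
    """Only licensed cores are used, and never more than the machine has."""
    return min(physical_cores, licensed_cores)


def recommended_memory_gb(licensed_cores, gb_per_core=2):
    """Recommended main memory: 2 GB per licensed processor core."""
    return licensed_cores * gb_per_core


# The Dual-Core PC from the question, with either license option.
for licensed in (1, 2):
    cores = usable_cores(physical_cores=2, licensed_cores=licensed)
    print(licensed, cores, recommended_memory_gb(licensed))
# 1 1 2
# 2 2 4
```

With a Single-Core license the existing 2 GB matches the recommendation; with a Dual-Core license the recommendation rises to 4 GB, so a memory upgrade would be worth considering.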
Moreover, as each project is unique, I suggest you download a trial of PDF Compressor and test its performance with your documents.
Question:
What is the difference between PDF and PDF/A?
Answer:
Good question. PDF/A is a special kind of PDF. In contrast to the common PDF, the PDF/A ISO standard is defined for long-term archiving. Essentially, PDF/A identifies a ‘profile’ for electronic documents that ensures the documents can be reproduced in years to come. It offers users a way of representing electronic documents in a manner that preserves their visual appearance over time, independent of the tools and systems used for creating, storing or rendering the files.