Shared Content Streams and Quick PDF Library

February 1, 2012

A lot of PDF tools expect all the pages of a PDF to have individual content streams.

But it’s technically possible for two or more pages to reference the exact same content stream, either as their entire content or as just one of several content stream parts.

10 0 obj
<<
/Type /Page
/Contents [ 11 0 R 12 0 R ]
>>
endobj

15 0 obj
<<
/Type /Page
/Contents [ 18 0 R 11 0 R ]
>>
endobj

In this example, the page defined by object 10 has two content stream parts, stored in objects 11 and 12. The page in object 15 also has two content stream parts, but the second part is the same object 11 used by the first page – this is a shared content stream.
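Spotting this kind of sharing is a matter of counting which pages reference each stream object. The following is only a sketch in Python over a toy model of the example above (a plain dictionary of page object numbers and their /Contents entries, not a real PDF parser); the name `find_shared_streams` is made up for illustration.

```python
from collections import defaultdict

# Toy model of the example above: page object number -> object numbers
# listed in that page's /Contents array. Not a real PDF parser.
pages = {
    10: [11, 12],
    15: [18, 11],
}


def find_shared_streams(pages):
    """Return {stream object: [page objects]} for streams used by more than one page."""
    users = defaultdict(list)
    for page_obj, contents in pages.items():
        # dict.fromkeys drops repeats within one page while keeping order
        for stream_obj in dict.fromkeys(contents):
            users[stream_obj].append(page_obj)
    return {s: p for s, p in users.items() if len(p) > 1}


print(find_shared_streams(pages))  # {11: [10, 15]}
```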

If a PDF is structured that way, any change made to the shared content stream through one page will also appear on every other page that uses it.

The PDF specification doesn’t forbid this, and Adobe Acrobat and Adobe Reader both seem happy with files like this. It’s quite a useful trick. In fact, Quick PDF Library’s ClonePages function uses this exact technique to let many pages share a single content stream without increasing the size of the file.

Some PDF software might not be able to read files structured like this, and most PDF tools will produce unpredictable results when extracting or deleting pages.

When deleting a page from a PDF, it makes sense to delete the content streams that describe the page and not just the page dictionary; otherwise the output PDF would carry unused data and waste space. Quick PDF Library’s DeletePages function does exactly that: it clears all the content stream parts (sets them to an empty string) and then deletes the page dictionary.

So if a page shares content streams with other pages and is then deleted with DeletePages, the other pages lose the shared content too.
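The side effect is easy to reproduce on a toy model. The sketch below is plain Python, not Quick PDF Library code: `delete_page` is a hypothetical stand-in that only mimics the two steps described above (empty every content stream part, then drop the page), and the stream data is invented.

```python
# Toy model, not a real PDF parser: object number -> stream data,
# and page object number -> object numbers in its /Contents array.
streams = {11: b"shared part", 12: b"page 10 part", 18: b"page 15 part"}
pages = {10: [11, 12], 15: [18, 11]}


def content_parts(page_obj):
    """Return the data of each content stream part the page references."""
    return [streams[obj] for obj in pages[page_obj]]


def delete_page(page_obj):
    """Hypothetical stand-in for the deletion steps described above."""
    for obj in pages[page_obj]:
        streams[obj] = b""   # clear every content stream part
    del pages[page_obj]      # then drop the page dictionary


before = content_parts(15)
delete_page(10)
after = content_parts(15)

print(before)  # [b'page 15 part', b'shared part']
print(after)   # [b'page 15 part', b'']
```

Page 15 was never touched directly, yet its second content stream part is now empty because it was the same object 11 that page 10 used.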

The RemoveSharedContentStreams function addresses this. It cycles through all the pages in the document, building up a list of the stream objects in each page’s /Contents array. If any shared content streams are found, they are left intact for one page and copies are made for every other page that uses the same content.
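The idea can be sketched in a few lines of Python on the same kind of toy model. This is an illustration of the approach as described, not the library's actual implementation: `unshare_content_streams` is a made-up name, and picking the next free object number with `max(...) + 1` is an assumption of the sketch.

```python
# Toy model, not a real PDF parser: object number -> stream data,
# and page object number -> object numbers in its /Contents array.
streams = {11: b"shared part", 12: b"page 10 part", 18: b"page 15 part"}
pages = {10: [11, 12], 15: [18, 11]}


def unshare_content_streams(pages, streams):
    """Give each page its own copy of any stream an earlier page already uses."""
    claimed = set()
    # Assumption for this sketch: any number above the highest one is free.
    next_obj = max(list(pages) + list(streams)) + 1
    for page_obj, contents in pages.items():
        for i, obj in enumerate(contents):
            if obj in claimed:
                streams[next_obj] = streams[obj]  # copy the stream data
                contents[i] = next_obj            # point this page at the copy
                next_obj += 1
            else:
                claimed.add(obj)                  # first user keeps the original


unshare_content_streams(pages, streams)
print(pages)  # {10: [11, 12], 15: [18, 19]}
```

After this pass the first page keeps object 11 and the second page references its own copy, so deleting either page no longer affects the other. The sketch visits every content stream part of every page once.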

This process might take a long time on PDFs with thousands of pages.

By Rowan | Posted in Quick PDF Library, Tips & Tutorials