PDF processing with OSS

Handout / Printing multiple pages on one page

Sometimes you just want to print several slides of a presentation on one page. Combining four slides on one page saves paper and money. Many printer drivers can do this for you, but what if you want to hand such a file to your audience? In that case you can use pdfnup from the PDFjam package, a nice and easy-to-use tool:

$ pdfnup --orient landscape --frame true --delta "3mm 3mm" --scale 0.91 --nup 2x2 slides.pdf

This will create a PDF with four slides per page, laid out like this:
+---------+---------+
|         |         |
|    1    |    2    |
|         |         |
+---------+---------+
|         |         |
|    3    |    4    |
|         |         |
+---------+---------+
Have a look at the options of pdfnup; there are plenty of things to tweak. For example, if you want slides two and three to swap places, use the option --column true.
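To see why --column true swaps slides two and three, here is a small sketch of the two fill orders. It only models the ordering described above and is not how pdfnup is implemented; the function name slot_of and its arguments are made up for this illustration.

```shell
#!/bin/sh
# slot_of N COLS ROWS MODE
# Prints "row col" (both 1-based) of slide N on its sheet.
# MODE is "row" (default fill order) or "column" (as with --column true).
# Hypothetical helper: it models the ordering only, not pdfnup internals.
slot_of() {
    n=$1 cols=$2 rows=$3 mode=$4
    i=$(( (n - 1) % (cols * rows) ))    # 0-based index on the sheet
    if [ "$mode" = "column" ]; then
        col=$(( i / rows ))             # fill top to bottom first
        row=$(( i % rows ))
    else
        row=$(( i / cols ))             # fill left to right first
        col=$(( i % cols ))
    fi
    echo "$(( row + 1 )) $(( col + 1 ))"
}

for n in 1 2 3 4; do
    echo "slide $n: row-wise $(slot_of "$n" 2 2 row), column-wise $(slot_of "$n" 2 2 column)"
done
```

For a 2x2 grid, slides one and four keep their slots in both modes, while slide two moves from the top right to the bottom left and slide three does the opposite.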

One more thing to mention: if you do not specify the option --orient landscape, the result will not be A4 or letter sized. Even if you explicitly specify the option --paper a4paper (which is the default), you will get a page of a different size. This is not a bug, this is a feature ;-) pdfnup, or more specifically LaTeX, tries to find the paper size that best fits all four pages. In most cases, though, you will want standard A4 or letter sized paper. To get that, use the option --orient landscape or --orient portrait.

Cut pages out of a PDF or join PDFs

Consider the following example: you have a PDF document with 123 pages, but you only need pages 23 to 42 and pages 97, 99 and 101. To create a new PDF document containing only those pages, you could use pdftk:

$ pdftk in.pdf cat 23-42 97 99 101 output out.pdf

There are numerous options. For example, to include all pages from 42 to the end of the document, use cat 42-end. To join several PDF documents, list them separated by whitespace and add the cat operation:

$ pdftk in1.pdf in2.pdf in3.pdf cat output out.pdf
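If you extract pages often, a tiny wrapper can save some typing. The following is a minimal sketch, not part of pdftk: the function name pdftk_extract is hypothetical, and it only prints the command line so that you can check it before running it.

```shell
#!/bin/sh
# pdftk_extract IN OUT RANGE...
# Prints the pdftk command line that copies the given page ranges
# from IN to OUT. Nothing is executed; pipe the output to sh to run it.
# Hypothetical helper; file names containing spaces are not handled.
pdftk_extract() {
    src=$1 dst=$2
    shift 2
    echo "pdftk $src cat $* output $dst"
}

pdftk_extract in.pdf out.pdf 23-42 97 99 101
pdftk_extract in.pdf tail.pdf 42-end
```

Printing instead of executing keeps the sketch harmless; once the output looks right, append "| sh" to actually run pdftk.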

I prefer pdftk over pdfjoin for cutting and joining documents, but pdfjoin can join them too:

$ pdfjoin in1.pdf in2.pdf in3.pdf

Nevertheless, both apps have their pros and cons. For example, with the option --fitpaper true, pdfjoin can fit all pages to the size of the first one. Have a look at both apps and decide which one suits you best.

Application details

flpsed

is a graphical tool to add text lines to a PostScript document (website). You cannot edit or remove existing elements, only add new ones. Strictly speaking, this is not a PDF tool, because you have to convert your PDF file into PostScript before editing. It may still be useful in some situations, which is why it is listed here.
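The conversion round trip could look like the following sketch. It assumes that pdftops (from poppler-utils or xpdf) and ps2pdf (from Ghostscript) are installed; neither tool is mentioned by flpsed itself here, and the name under which the annotated file is saved is purely hypothetical. The function only prints the steps and runs nothing.

```shell
#!/bin/sh
# flpsed_roundtrip FILE.pdf
# Prints the three steps of a PDF -> PostScript -> PDF round trip.
# Nothing is executed.
# Assumptions: pdftops (poppler-utils or xpdf) and ps2pdf (Ghostscript)
# are available, and the annotated file is saved from flpsed under the
# hypothetical name NAME-annotated.ps.
flpsed_roundtrip() {
    base=${1%.pdf}                      # strip the .pdf suffix
    echo "pdftops $1 ${base}.ps"
    echo "flpsed ${base}.ps"
    echo "ps2pdf ${base}-annotated.ps ${base}-annotated.pdf"
}

flpsed_roundtrip form.pdf
```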

PDFedit

provides a graphical user interface for editing and manipulating PDF documents in a WYSIWYG-like way (website). A library and a command-line interface are available as well. It is not perfect, but it is still an ongoing project. Tip: If you need to delinearize a PDF document, PDFedit may help you out.

PDFjam

contains three applications, which the official website describes like this:

pdfposter

does nearly the same job as its PostScript counterpart, poster (website): it scales a PDF document so that it can be printed across multiple pages.

PDF-Shuffler

is a graphical tool which makes it easy to perform the following tasks (website): All this is done via an easy-to-use graphical interface written in Python and GTK.

pdftk

is a powerful command-line tool which reminds me of the Unix philosophy ;-)
Below is a list of all features from the official website:

Xournal

is a graphical tool originally intended for note-taking, but it can also be used to comment on PDF files (website), for example by adding and highlighting text or by drawing with a simple pen. It is one of the faster tools and is written in GTK+.

Comments

If you know of a nice open source tool for processing PDF documents which is missing here, just drop me an e-mail: stefan@seekline.net