Advanced PDF page manipulation in Ruby

The following sample combines (imposes) pairs of pages from an input PDF side by side on new A3 pages:

PDFDoc in_doc = new PDFDoc(filename);

// Create a list of pages to import from one PDF document to another.
ArrayList import_list = new ArrayList();
for (PageIterator itr = in_doc.GetPageIterator(); itr.HasNext(); itr.Next())
{
	import_list.Add(itr.Current());
}

PDFDoc new_doc = new PDFDoc(); //  Create a new document
ElementBuilder builder = new ElementBuilder();
ElementWriter  writer  = new ElementWriter();

ArrayList imported_pages = new_doc.ImportPages(import_list);

// Paper dimensions for A3 format in points. Because one inch has
// 72 points, 11.69 inches * 72 is roughly 841.69 points (and 16.54 inches * 72 = 1190.88).
Rect media_box= new Rect(0, 0, 1190.88, 841.69);
double mid_point = media_box.Width()/2;

for (int i=0; i<imported_pages.Count; ++i)
{
	// Create a blank new A3 page and place on it two pages from the input document.
	Page new_page = new_doc.PageCreate(media_box);
	writer.Begin(new_page);

	// Place the first page on the left half of the sheet
	Page src_page = (Page)imported_pages[i];
	Element element = builder.CreateForm(src_page);

	double sc_x = mid_point / src_page.GetPageWidth();
	double sc_y = media_box.Height() / src_page.GetPageHeight();
	double scale = Math.Min(sc_x, sc_y);
	element.GetGState().SetTransform(scale, 0, 0, scale, 0, 0);
	writer.WritePlacedElement(element);

	// Place the second page (if there is one) on the right half of the sheet
	++i;
	if (i<imported_pages.Count)
	{
		src_page = (Page)imported_pages[i];
		element = builder.CreateForm(src_page);
		sc_x = mid_point / src_page.GetPageWidth();
		sc_y = media_box.Height() / src_page.GetPageHeight();
		scale = Math.Min(sc_x, sc_y);
		element.GetGState().SetTransform(scale, 0, 0, scale, mid_point, 0);
		writer.WritePlacedElement(element);
	}

	// Finish the sheet and append it to the new document
	writer.End();
	new_doc.PagePushBack(new_page);
}
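
The placement arithmetic used in the loop can be checked without PDFNet. The following plain-Ruby sketch is illustrative only: the `fit_scale` and `two_up_transforms` helpers are hypothetical names, and the US Letter source page (612 x 792 points) is an assumption, not something taken from the sample. It computes the uniform scale factor and the six matrix values that would be handed to SetTransform for the left and right halves of the sheet.

```ruby
# Plain-Ruby sketch of the 2-up placement arithmetic (no PDFNet calls).
# fit_scale and two_up_transforms are hypothetical helper names.

POINTS_PER_INCH = 72.0

# Uniform scale that fits a source page inside a slot without distortion.
def fit_scale(src_w, src_h, slot_w, slot_h)
  [slot_w / src_w, slot_h / src_h].min
end

# Returns the [a, b, c, d, h, v] matrix values for the left and right slots.
def two_up_transforms(sheet_w, sheet_h, src_w, src_h)
  mid_point = sheet_w / 2.0
  scale = fit_scale(src_w, src_h, mid_point, sheet_h)
  left  = [scale, 0, 0, scale, 0, 0]
  right = [scale, 0, 0, scale, mid_point, 0]
  [left, right]
end

# Sheet size as in the sample; the source page is assumed to be US Letter.
left, right = two_up_transforms(1190.88, 841.69,
                                8.5 * POINTS_PER_INCH, 11 * POINTS_PER_INCH)
puts "scale:          #{left[0].round(4)}"
puts "right x offset: #{right[4]}"
```

Taking the smaller of the two ratios keeps the aspect ratio of the source page and guarantees that it fits inside its half of the sheet in both directions.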


PDF imposition
Full code sample which illustrates how multiple pages can be combined/imposed using PDFNet. Page imposition can be used to arrange/order pages prior to printing or to assemble a 'master' page from several 'source' pages.

About advanced page manipulation

A Page can also be copied from one document to another (or replicated within an existing document) using the PDFDoc.PageInsert(where, pg), PDFDoc.PagePushFront(pg), PDFDoc.PagePushBack(pg) and PDFDoc.ImportPages(list) methods.

PagePushBack(page) appends the given Page at the end of the page sequence, whereas PagePushFront(page) inserts the Page at the front of the sequence. PageInsert(where, page) inserts the page in front of the page currently pointed to by the 'where' PageIterator.

// Append three copies of the page to the document.
doc.PagePushBack(page);
doc.PagePushBack(page);
doc.PagePushBack(page);

// Create a new page and insert it just before
// the second page
doc.PageInsert(doc.GetPageIterator(2), doc.PageCreate());

Note that it is possible to replicate a given page within a document by repeatedly adding the same page.
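
The ordering rules of the three insertion methods can be pictured with an ordinary Ruby Array standing in for the page sequence. This is a minimal sketch and not PDFNet code: the `page_push_back`, `page_push_front` and `page_insert` helpers are hypothetical stand-ins, and `where` is modeled as a 1-based page number.

```ruby
# Plain-Ruby model of a page sequence; an Array stands in for the document.
# The helper methods are hypothetical stand-ins, not the PDFNet API.

def page_push_back(seq, page)
  seq.push(page)
end

def page_push_front(seq, page)
  seq.unshift(page)
end

# 'where' is a 1-based page number; the new page lands in front of that page.
def page_insert(seq, where, page)
  seq.insert(where - 1, page)
end

pages = %w[p1 p2 p3]

page_push_back(pages, "p1")       # replicate the first page at the end
page_push_front(pages, "cover")   # becomes the new first page
page_insert(pages, 2, "blank")    # placed in front of the current page 2

puts pages.inspect
```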

The same methods can also be used to merge documents or copy pages from one document to another.

In a PDF document, every page object contains references to images, fonts, color spaces, and other objects required to render the page. In order to accurately copy a page from one document to another, these PageInsert / PagePushFront / PagePushBack methods must copy all referenced resources.

If you are copying several pages between two documents, it's better to use PDFDoc.ImportPages(page_list) because the resulting document will be much smaller and the copy operation will be faster.

ImportPages() is better than the other methods for copying multiple pages because it preserves resource sharing in the target document. This is illustrated in the following figures.

Copying pages between two documents using PageInsert/PagePushFront/PagePushBack.

In a PDF document, page resources (such as fonts, images, color spaces, or forms) can be shared across several pages. Sharing these resources reduces file size and speeds up page processing. In the figure above, all three pages of 'Document 1' share the same font and color space objects. 'Document 2' was created by direct page copy using the PageInsert, PagePushFront or PagePushBack methods. Note that each page now refers to its own separate instances of the resource objects.

On the other hand, the result of a page copy using ImportPages() is identical to the original document. Note that in 'Document 2', in the figure below, resource objects are shared across pages.

Copying pages between two documents using ImportPages().

Also note that, if pages are copied/replicated within the same document (not between two different documents), all methods behave the same and resources are always shared.
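
The difference between the two copy strategies can be sketched in plain Ruby by counting distinct resource objects. This is only a model under stated assumptions: `Resource`, `copy_page`, `import_pages` and `distinct_resources` are hypothetical names, and the lookup table used during the batch import illustrates the idea of preserved sharing, not how PDFNet implements it.

```ruby
# Plain-Ruby model of resource sharing; not PDFNet code.
# A page is a Hash whose :resources entry lists the objects it references.
Resource = Struct.new(:name)

font = Resource.new("Font F1")
cs   = Resource.new("ColorSpace CS1")
doc1 = Array.new(3) { { resources: [font, cs] } }

# Page-by-page copy: every call duplicates everything the page references.
def copy_page(page)
  { resources: page[:resources].map(&:dup) }
end

# Batch import: a table shared by the whole batch remembers what was copied.
def import_pages(pages)
  copied = {}.compare_by_identity
  pages.map do |page|
    { resources: page[:resources].map { |res| copied[res] ||= res.dup } }
  end
end

# Counts resource objects by identity, not by value.
def distinct_resources(pages)
  pages.flat_map { |page| page[:resources] }.uniq(&:object_id).size
end

per_page = doc1.map { |page| copy_page(page) }
imported = import_pages(doc1)

puts "original:     #{distinct_resources(doc1)} resource objects"
puts "page by page: #{distinct_resources(per_page)} resource objects"
puts "batch import: #{distinct_resources(imported)} resource objects"
```

In this model the page-by-page copy ends up with one font and one color space per page, while the batch import keeps a single shared pair, which is why the resulting file would be smaller.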

The following code copies pages individually:

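using (PDFDoc in_doc = new PDFDoc("in.pdf"))
using (PDFDoc new_doc = new PDFDoc())
{
    for (PageIterator itr = in_doc.GetPageIterator(); itr.HasNext(); itr.Next())
    {
        // Copy the page together with all resources it references
        new_doc.PagePushBack(itr.Current());
    }

    // save new_doc...
}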

But, as explained above, it's better to import multiple pages with PDFDoc.ImportPages().

ImportPages(page_list) creates a copy of pages given in the argument list, while preserving shared resources. Note that the pages in the returned list are ordered in the same way as pages in the argument list and that, although pages are copied, they are not inserted into the document's page sequence. Therefore, in order to be visible, imported or copied pages should be appended or inserted at a specific location within the document's page sequence. For example:

using (PDFDoc in_doc = new PDFDoc("in.pdf"))
using (PDFDoc new_doc = new PDFDoc())
{
    // Create a list of pages to copy.
    ArrayList copy_pages = new ArrayList();
    for (PageIterator itr = in_doc.GetPageIterator(); itr.HasNext(); itr.Next())
    {
        copy_pages.Add(itr.Current());
    }

    // Import all the pages in the 'copy_pages' list.
    ArrayList imported_pages = new_doc.ImportPages(copy_pages);

    // Note that pages in the 'imported_pages' list are not yet placed in the
    // document's page sequence. This is done in the following step:
    for (int i = 0; i != imported_pages.Count; ++i)
    {
        new_doc.PagePushBack((Page)imported_pages[i]);
    }

    // save new_doc...
}
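
The two-step nature of importing (copy first, then place in the page sequence) can be modeled in a few lines of plain Ruby. `ModelDoc` and its methods are hypothetical stand-ins and do not call PDFNet. The sketch shows that the page sequence stays empty after the import, and that the insertion step decides which pages appear and in what order.

```ruby
# Plain-Ruby model of "import, then insert"; ModelDoc is a hypothetical class.
class ModelDoc
  attr_reader :page_sequence

  def initialize
    @page_sequence = []
  end

  # Copies the pages, in order, but leaves the page sequence untouched.
  def import_pages(pages)
    pages.map(&:dup)
  end

  def page_push_back(page)
    @page_sequence.push(page)
  end
end

source_pages = ["page 1", "page 2", "page 3"]
new_doc = ModelDoc.new

imported = new_doc.import_pages(source_pages)
count_after_import = new_doc.page_sequence.size   # nothing is visible yet

# Placing the copies in reverse order, to show that order is chosen here.
imported.reverse_each { |page| new_doc.page_push_back(page) }

puts "after import: #{count_after_import} pages"
puts "after insert: #{new_doc.page_sequence.inspect}"
```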

Media box adjustments

The media box defines the boundaries of the physical medium on which the page is to be printed. It may include any extended area surrounding the finished page for bleed, printing marks, or other such purposes. It may also include areas close to the edges of the medium that cannot be marked because of physical limitations of the output device. Content falling outside this boundary can safely be discarded without affecting the visible output of the PDF document. A new value for a page's media box can be specified as follows:

page.SetMediaBox(Rect.CreateSDFRect(0, 0, 500, 600));

Shift page content

Page content can be translated horizontally and vertically by adjusting the media box. For example, the following code translates all page content 2 inches (2 inches * 72 units per inch = 144 units) to the left:

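Rect media_box = page.GetMediaBox();
// Shift the media box 2 inches (144 units) to the right, which moves
// the page content 2 inches to the left relative to the page
media_box.x1 += 144;
media_box.x2 += 144;
// Write the modified rectangle back to the page
media_box.Update();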
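
Why moving the media box to the right shifts the content to the left can be seen with a small plain-Ruby model. `Box` is a hypothetical stand-in for a rectangle, and the 612 x 792 page and the mark at x = 200 are assumed values chosen for illustration.

```ruby
# Plain-Ruby model of a media box shift; Box is a hypothetical stand-in.
Box = Struct.new(:x1, :y1, :x2, :y2) do
  def width
    x2 - x1
  end
end

UNITS_PER_INCH = 72

media_box = Box.new(0, 0, 612, 792)
shift = 2 * UNITS_PER_INCH             # 144 units

# A mark drawn at x = 200 in user space stays where it is ...
mark_x = 200
before = mark_x - media_box.x1         # distance from the left page edge

# ... while the visible window moves to the right.
media_box.x1 += shift
media_box.x2 += shift
after = mark_x - media_box.x1

puts "page width: #{media_box.width}"
puts "distance from left edge: #{before} -> #{after}"
```

The page width is unchanged because both edges move by the same amount; only the position of the content relative to the page edge changes.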
