Examples of how to convert PDF to SVG using command-line

Apryse's PDF2SVG is a command-line application that converts PDF files to SVG, the open-standard W3C recommendation for high-end graphics on the web. The conversion produces web-ready SVG documents. This section covers the basic use of PDF2SVG and explains the most commonly used options.

Basic Syntax

The basic command-line syntax is:

pdf2svg [options] file1 file2 folder1 file3 ...

See more options in Command-Line Summary for PDF2SVG

General Usage Examples

Example 1. The simplest command line: Convert PDF to SVG.

Notes:

  • The '-o' (or --output) parameter is used to specify the output folder. If this option is not specified, all converted SVG files will be stored in the current working folder.

pdf2svg -o outfolder in.pdf

Example 2. Convert PDF to compressed SVG without thumbnails or an XML summary.

Notes:

  • The '--noxmldoc' option disables generation of the XML Summary Document.
  • The '--nothumbs' option disables generation of thumbnails.
  • The '--svgz' option instructs PDF2SVG to compress SVG using GZIP compression.
  • The '--verb' option instructs PDF2SVG to output more feedback in the console window.

pdf2svg --output test_out/ex2 --svgz --nothumbs --noxmldoc --verb 3 in.pdf

Example 3. Convert a password protected file to SVG.

Notes:

  • The '-p' (or --pass) parameter is used to specify the password (i.e. 'secret') required to open the encrypted document.
  • The '--pages' option instructs PDF2SVG to convert only the first page.

pdf2svg -p secret -o ex3 --nothumbs --noxmldoc --pages 1 secret.pdf

Example 4. Convert all PDF documents in the given folders to stand-alone SVG.

Notes:

  • The '--box' parameter (set to 'media' in the command below) instructs PDF2SVG to use the media box for clipping instead of the crop box, which is the default.
  • The '--embedimages' option (or -i in the short form) instructs PDF2SVG to embed all images as inline resources. This option produces stand-alone SVG files (i.e. SVG files without external references).

pdf2svg -o OUT --embedimages --box media "My Folder1" "MyFolder2"
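
When commands like these are launched from a script rather than typed at a prompt, passing the arguments as a list avoids quoting problems with folder names that contain spaces. The following is a minimal Python sketch under that assumption. The helper name `build_pdf2svg_args` is hypothetical and not part of PDF2SVG, and it only uses options that appear in the examples above.

```python
import shlex


def build_pdf2svg_args(inputs, output_dir, embed_images=False, box=None):
    """Build an argument list for pdf2svg (hypothetical helper, not part of PDF2SVG)."""
    args = ["pdf2svg", "-o", output_dir]
    if embed_images:
        args.append("--embedimages")
    if box is not None:
        args += ["--box", box]
    # Inputs may be files or folders; each stays a single list item,
    # so names with spaces need no manual quoting.
    args += list(inputs)
    return args


if __name__ == "__main__":
    args = build_pdf2svg_args(["My Folder1", "MyFolder2"], "OUT",
                              embed_images=True, box="media")
    # shlex.join shows how the list would be quoted for a POSIX shell.
    print(shlex.join(args))
```

The resulting list can be handed directly to `subprocess.run`, which passes each item to the program as one argument.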

Batch Processing and the Use of Wildcards

PDF2SVG supports processing of multiple input documents in the same run. It is possible to specify multiple folders, and PDF2SVG will automatically process all PDF documents in them that match a given file extension. For example, the following command line will process all PDF documents in folders 'test1' and 'test2':

c:\>pdf2svg -o c:/output_folder c:/test1 c:/test2

Wildcard characters can also be used to process multiple input files.

For example, if a directory contains the following PDF documents:

C:\test1>dir
 Directory of C:\test1
 01/04/2007  03:35 PM  <DIR>  .
 01/04/2007  03:35 PM  <DIR>  ..
 05/21/2004  02:27 PM         A1.pdf
 05/03/2005  09:38 AM         A2.pdf
 05/20/2003  08:46 AM         B1.pdf
 05/15/2003  12:50 PM         B2.pdf

To process all PDF documents in this folder, you could specify:

pdf2svg -o c:/output_folder c:/test1/*.pdf

To process all PDF documents starting with 'A', you could specify:

pdf2svg -o c:/output_folder c:/test1/A*.pdf

Or to process all PDF documents ending with '1', you could specify:

pdf2svg -o c:/output_folder c:/test1/*1.pdf

You can use either of the two standard wildcards, the question mark (?) and the asterisk (*), to specify filename and path arguments on the command line.

The wildcards are expanded in the same manner as operating system commands. (See your operating system user's guide if you are unfamiliar with wildcards). Enclosing an argument in double quotation marks (" ") suppresses the wildcard expansion. Within quoted arguments, you can represent quotation marks literally by preceding the double-quotation-mark character with a backslash (\). If no matches are found for the wildcard argument, the argument is passed literally.
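
To preview which files a wildcard pattern would select, the matching can be approximated in Python. This is only a sketch: Python's `fnmatch` module follows Unix shell-style rules, which may differ in detail from how PDF2SVG or your operating system expands wildcards. The file names are taken from the directory listing above.

```python
import fnmatch

# File names from the directory listing above.
names = ["A1.pdf", "A2.pdf", "B1.pdf", "B2.pdf"]


def match(pattern):
    """Return the names that the wildcard pattern selects."""
    return fnmatch.filter(names, pattern)


if __name__ == "__main__":
    # '*' matches any run of characters, '?' matches exactly one character.
    for pattern in ("*.pdf", "A*.pdf", "*1.pdf", "?2.pdf"):
        print(pattern, "->", match(pattern))
```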

Exit Codes

To provide additional feedback, PDF2SVG returns exit codes after completing processing. The exit codes can be used to provide user feedback, for logging etc. This is particularly important for applications running in an unattended environment.

The following table lists possible exit codes and their description:

Exit Code   Description
----------- ------------------------------------
0           All files converted successfully.
1           Unspecified error.
2           Bad license key.
3           Failed to create output directory.
4           Failed to read the input document.
5           The PDF password is incorrect.
6           Conversion error.
7           Failed to connect to server.

All codes other than '0' indicate that there was an error during the conversion process.

To get detailed information on an error, set the --verb parameter to 2.

The following illustrates a sample Windows batch script that processes exit codes:

@echo off
rem Convert all PDF files in the 'data' folder.
pdf2svg data
rem 'if errorlevel N' is true for any exit code >= N, so test the highest code first.
if errorlevel 7 goto othererror
if errorlevel 6 goto converr
if errorlevel 5 goto passwd
if errorlevel 4 goto inputerr
if errorlevel 1 goto othererror
goto exit

:passwd
echo Document is protected. Need a valid password to open the document.
goto exit

:inputerr
echo Failed to read the input document.
goto exit

:converr
echo A file conversion error was encountered.
goto exit

:othererror
echo An error was encountered during processing.
goto exit

:exit
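
For unattended environments that are not driven by batch files, the same exit-code handling can be sketched in Python. The mapping below is copied from the exit code table above. The function names are hypothetical, and the sketch assumes `pdf2svg` is available on the PATH.

```python
import subprocess

# Exit codes as listed in the table above.
EXIT_CODES = {
    0: "All files converted successfully.",
    1: "Unspecified error.",
    2: "Bad license key.",
    3: "Failed to create output directory.",
    4: "Failed to read the input document.",
    5: "The PDF password is incorrect.",
    6: "Conversion error.",
    7: "Failed to connect to server.",
}


def describe_exit_code(code):
    """Map a PDF2SVG exit code to its description."""
    return EXIT_CODES.get(code, f"Unknown exit code {code}.")


def run_pdf2svg(args):
    """Run pdf2svg with the given arguments; return (exit code, description)."""
    result = subprocess.run(["pdf2svg", *args])
    return result.returncode, describe_exit_code(result.returncode)


if __name__ == "__main__":
    try:
        code, message = run_pdf2svg(["-o", "outfolder", "in.pdf"])
        print(code, message)
    except FileNotFoundError:
        # Raised by subprocess.run when the executable cannot be located.
        print("pdf2svg was not found on the PATH.")
```

Unlike 'if errorlevel', the dictionary lookup compares the exit code for equality, so the order of the entries does not matter.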

XML Summary Document

This section describes the XML Summary Document that can be generated using PDF2SVG and its potential use in various applications.

By default PDF2SVG generates an XML Summary Document for every PDF document. The XML Summary Document contains document-level information that is not part of SVG files that describe individual pages. The information includes general information about the document (such as author, subject, title, keywords), as well as a listing of document parts and relationships such as pages, thumbnails, annotations, and bookmarks.

The following is a sample XML snippet generated by converting this user manual to SVG:

<?xml version="1.0" encoding="UTF-8"?>

<doc name="1" ext="svg">
  <info>
    <title>Apryse PDF2SVG User Manual</title>
    <author>Apryse Systems</author>
    <subject>Apryse PDF2SVG User Manual</subject>
    <keywords />
    <creator>Acrobat PDFMaker 7.0.7 for Word</creator>
    <producer>Acrobat Distiller 7.0.5 (Windows)</producer>
  </info>
  <pages>
    <page id="1" href="1_1.svg" width="612.0000" height="792.0000">
      <thumb href="1_1_thumb.jpg" />
    </page>
    <page id="2" href="1_2.svg" width="612.0000" height="792.0000">
      <thumb href="1_2_thumb.jpg" />
    </page>
    ...
  </pages>
  <bookmarks>
    <bookmark title="2.0 Installing and Uninstalling PDF2SVG" open="true" goto="7" href="1_7.svg">
      <bookmark title="2.1 PDF2SVG Installation" open="false" goto="7" href="1_7.svg" />
      <bookmark title="2.2 Demo Version Installation" open="false" goto="7" href="1_7.svg" />
      <bookmark title="2.3 Uninstalling PDF2SVG" open="false" goto="7" href="1_7.svg" />
    ...
  </bookmarks>
</doc>

Most of the elements and attributes are self-explanatory. The 'info' element lists document information properties, the 'pages' element lists all 'page' elements that are part of the high-level document, and the 'bookmarks' element specifies the outline tree that can be used for quick navigation between pages.

The summary document can be used as a map of the abstract document that contains many SVG files representing document pages, as well as outline tree and annotations describing how different document parts are related.

In most cases, the summary document is further consumed by an XML consumer/processor (e.g. an XML DOM/SAX library or XSLT). For example, an application may read the XML summary to create database records for archiving purposes. Another application may implement interactive navigation through SVG pages using the document outline.
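
As an illustration of such a consumer, the following Python sketch reads a summary document with the standard library's `xml.etree.ElementTree`. The sample is a shortened, well-formed document modeled on the snippet above, and the function names are hypothetical. Real summary documents may contain elements not handled here.

```python
import xml.etree.ElementTree as ET

# A shortened, well-formed sample modeled on the snippet above.
SAMPLE = """\
<doc name="1" ext="svg">
  <info>
    <title>Apryse PDF2SVG User Manual</title>
    <author>Apryse Systems</author>
  </info>
  <pages>
    <page id="1" href="1_1.svg" width="612.0000" height="792.0000">
      <thumb href="1_1_thumb.jpg" />
    </page>
    <page id="2" href="1_2.svg" width="612.0000" height="792.0000">
      <thumb href="1_2_thumb.jpg" />
    </page>
  </pages>
  <bookmarks>
    <bookmark title="2.0 Installing and Uninstalling PDF2SVG" open="true" goto="7" href="1_7.svg">
      <bookmark title="2.1 PDF2SVG Installation" open="false" goto="7" href="1_7.svg" />
    </bookmark>
  </bookmarks>
</doc>
"""


def read_summary(xml_text):
    """Return (info dict, list of page dicts) from a summary document."""
    root = ET.fromstring(xml_text)
    info_element = root.find("info")
    info = {}
    if info_element is not None:
        info = {child.tag: (child.text or "") for child in info_element}
    pages = []
    for page in root.iter("page"):
        thumb = page.find("thumb")  # absent when thumbnails are disabled
        pages.append({
            "id": page.get("id"),
            "href": page.get("href"),
            "thumb": thumb.get("href") if thumb is not None else None,
        })
    return info, pages


def walk_bookmarks(parent, depth=0):
    """Yield (depth, title, href) for each bookmark in document order."""
    for bookmark in parent.findall("bookmark"):
        yield depth, bookmark.get("title"), bookmark.get("href")
        yield from walk_bookmarks(bookmark, depth + 1)


if __name__ == "__main__":
    info, pages = read_summary(SAMPLE)
    print(info["title"], "-", len(pages), "pages")
    outline_root = ET.fromstring(SAMPLE).find("bookmarks")
    for depth, title, href in walk_bookmarks(outline_root):
        print("  " * depth + f"{title} -> {href}")
```

The page list could feed an archiving database, while the depth value from `walk_bookmarks` is enough to render an indented navigation tree.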

Yet another example of an XML Summary Document consumer is an eBook generator that converts the XML Summary Document to HTML. The generated HTML would wrap the converted SVG files and would provide a web-based eBook interface for navigation between pages, including a bookmark tree, thumbnail index, etc. The end result would look like what is illustrated in the following figure:

Apryse Docs Image

The process used to create an HTML eBook wrapping the converted SVG files is illustrated in the following figure:

Apryse Docs Image

Using PDF2SVG, a PDF document is converted to a set of SVG images and their thumbnails, as well as the XML Summary Document. The fastest way to create HTML wrappers around SVG is using XSLT. XSLT is a very simple language for transforming XML documents. A simple XSLT transform may look as follows:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="html" indent="yes" doctype-public="-//W3C//DTD HTML 3.2 Final//EN" />
  <xsl:template match="/">
    <HTML>
      <HEAD>
        <TITLE>HTML SVG Wrapper</TITLE>
      </HEAD>
      <BODY>
        <xsl:apply-templates select="doc/info" />
        <HR />
        <xsl:apply-templates select="doc/pages" />
      </BODY>
    </HTML>
  </xsl:template>
  <xsl:template match="info">
    <table border="0" cellspacing="0" cellpadding="4">
      <tr>
        <td>Title:</td>
        <td>
          <xsl:value-of select="title" />
        </td>
      </tr>
      <tr>
        <td>Author:</td>
        <td>
          <xsl:value-of select="author" />
        </td>
      </tr>
      <tr>
        <td>Subject:</td>
        <td>
          <xsl:value-of select="subject" />
        </td>
      </tr>
      <tr>
        <td>Keywords:</td>
        <td>
          <xsl:value-of select="keywords" />
        </td>
      </tr>
      <tr>
        <td>Creator:</td>
        <td>
          <xsl:value-of select="creator" />
        </td>
      </tr>
      <tr>
        <td>Producer:</td>
        <td>
          <xsl:value-of select="producer" />
        </td>
      </tr>
    </table>
  </xsl:template>
  <xsl:template match="pages">
    <TABLE BORDER="1">
      <xsl:apply-templates />
    </TABLE>
  </xsl:template>
  <xsl:template match="page">
    <TR>
      <TD>
        <A TARGET="view" HREF="{@href}">
          Page
          <xsl:value-of select="@id" />
        </A>
      </TD>
      <TD>
        <A TARGET="view" HREF="{@href}">
          <IMG SRC="{thumb/@href}" />
        </A>
      </TD>
    </TR>
  </xsl:template>
</xsl:stylesheet>

The above XSLT template will create an HTML page containing general information about the document, such as its title, subject, and keywords. The HTML will also contain a thumbnail index of all pages in the document. Clicking on page labels or on thumbnails will open SVG graphics in the right pane of the browser window. The final result would look as follows:

Apryse Docs Image

To run XSLT transforms, you can use your favorite XSLT processor. As a starting point, the PDF2SVG distribution comes with a sample project illustrating how to run an XSLT transform using the Microsoft .NET Framework.
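
Where no XSLT processor is at hand, the thumbnail index can also be produced without XSLT. The sketch below is not an XSLT transform: it builds the same table rows as the 'page' template directly in Python with the standard library. The sample document and the function name `page_rows` are assumptions made for illustration.

```python
import html
import xml.etree.ElementTree as ET

# A one-page sample modeled on the XML Summary Document shown earlier.
SAMPLE = """\
<doc name="1" ext="svg">
  <pages>
    <page id="1" href="1_1.svg" width="612.0000" height="792.0000">
      <thumb href="1_1_thumb.jpg" />
    </page>
  </pages>
</doc>
"""


def page_rows(xml_text):
    """Return one HTML table row per page, mirroring the 'page' template."""
    rows = []
    for page in ET.fromstring(xml_text).iter("page"):
        # Escape attribute values before placing them into HTML.
        href = html.escape(page.get("href", ""), quote=True)
        label = html.escape(page.get("id", ""))
        thumb = page.find("thumb")
        src = html.escape(thumb.get("href", ""), quote=True) if thumb is not None else ""
        rows.append(
            f'<TR><TD><A TARGET="view" HREF="{href}">Page {label}</A></TD>'
            f'<TD><A TARGET="view" HREF="{href}"><IMG SRC="{src}" /></A></TD></TR>'
        )
    return rows


if __name__ == "__main__":
    print('<TABLE BORDER="1">')
    for row in page_rows(SAMPLE):
        print(row)
    print("</TABLE>")
```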
