Add-on modules for Python library

Some functionality in the Apryse SDK requires add-on modules. These are usually shipped separately because bundling them with the main library would make it too large.

All add-on modules require the Apryse SDK in order to function.

OCR Module

This is the default OCR module, powered by Tesseract 4.

The OCR module is packaged as a zip archive, and is meant to be expanded directly into the directory of your previous Apryse SDK download. For example, if you previously downloaded the 64-bit C/C++ package, then the OCR module would be expanded directly into the PDFNetC64 directory, overwriting files if required.

The module can be downloaded here:

The archive contains the module binary itself, as well as some sample documents for testing. There is an OCRTest sample application available in the main SDK download package that should be fully functional once this module is extracted as described above.
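
The extraction step described above can also be scripted. The following is a minimal sketch using only the Python standard library; the function name, the archive file name and the SDK path are hypothetical, and it assumes the archive's internal layout already matches the SDK directory it is extracted into.

```python
import zipfile
from pathlib import Path


def extract_module(archive_path, sdk_dir):
    """Extract a module zip archive into an existing SDK directory.

    Files that already exist in sdk_dir are overwritten.
    Returns the list of member names found in the archive.
    """
    sdk_dir = Path(sdk_dir)
    if not sdk_dir.is_dir():
        raise FileNotFoundError(f"SDK directory not found: {sdk_dir}")
    with zipfile.ZipFile(archive_path) as archive:
        names = archive.namelist()
        archive.extractall(sdk_dir)  # existing files are replaced
    return names
```

For example, extract_module("OCRModule.zip", "PDFNetC64") would expand a hypothetical OCRModule.zip into a PDFNetC64 directory located in the current working directory. The same helper applies to the other modules on this page that are distributed as zip archives.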

IRIS OCR Module

This is an enhanced OCR module, licensed separately and powered by IRIS iDRS.

The IRIS OCR module is packaged as a zip archive, and is meant to be expanded directly into the directory of your previous Apryse SDK download. For example, if you previously downloaded the 64-bit C/C++ package, then the IRIS OCR module would be expanded directly into the PDFNetC64 directory, overwriting files if required.

The module can be downloaded here:

The archive contains the module binary itself, as well as some sample documents for testing. There is an OCRTest sample application available in the main SDK download package that should be fully functional once this module is extracted as described above.

CAD Module

The CAD module is packaged as a zip archive, and is meant to be expanded directly into the directory of your previous Apryse SDK download. For example, if you previously downloaded the 64-bit C/C++ package, then the CAD module would be expanded directly into the PDFNetC64 directory, overwriting files if required.

The module can be downloaded here:

The archive contains the module binary itself, as well as a sample document for testing. There is a CAD2PDF sample application available in the main SDK download package that should be fully functional once this module is extracted as described above.

Advanced Imaging Module

The Advanced Imaging module is packaged as a zip archive, and is meant to be expanded directly into the directory of your previous Apryse SDK download. For example, if you previously downloaded the 64-bit C/C++ package, then the Advanced Imaging module would be expanded directly into the PDFNetC64 directory, overwriting files if required.

The module can be downloaded here:

The archive contains the module binary itself, as well as a sample document for testing. There is an AdvancedImagingTest sample application available in the main SDK download package that should be fully functional once this module is extracted as described above.

Supported advanced imaging formats

The module supports the following file formats.

  • AAI
  • ARW
  • CR2
  • CRW
  • CUR, ICO
  • DCM
  • DCR
  • DDS
  • HEIC, HEIF
  • MRW
  • NEF
  • ORF
  • PICT
  • PFM
  • PSB, PSD
  • RAF
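
As a rough illustration of how the list above might be used, the sketch below checks whether a file name carries one of the listed extensions before handing it to the module. The constant and function names are hypothetical, and the sketch assumes that each format name doubles as its file extension, which may not hold for every format (some files use alternative extensions).

```python
from pathlib import Path

# Extensions copied from the supported-format list above, lower-cased.
ADVANCED_IMAGING_EXTENSIONS = frozenset({
    "aai", "arw", "cr2", "crw", "cur", "ico", "dcm", "dcr", "dds",
    "heic", "heif", "mrw", "nef", "orf", "pict", "pfm", "psb", "psd", "raf",
})


def is_advanced_imaging_format(filename):
    """Return True if the file extension appears in the list above."""
    suffix = Path(filename).suffix.lstrip(".").lower()
    return suffix in ADVANCED_IMAGING_EXTENSIONS
```

The check is case-insensitive and looks only at the extension; it does not inspect the file contents.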

Structured Output Module

New in Apryse SDK 9.2, the Structured Output module provides PDF to Word, Excel, PowerPoint and HTML conversion functionality.

The Structured Output module is packaged as a zip archive, and is meant to be expanded directly into the directory of your previous Apryse SDK download. For example, if you previously downloaded the 64-bit C/C++ package, then the Structured Output module would be expanded directly into the PDFNetC64 directory, overwriting files if required.

Trial mode page limit and random characters
When in trial mode, output is limited to 6 pages, and random characters within individual words will be mixed up in the output content. Once licensed, there is no page limit and text will no longer be mixed up.

The module can be downloaded here:

The archive contains the module binary itself. There are two sample applications called PDF2OfficeTest and PDF2HtmlTest available in the main SDK download package that should be fully functional once this module is extracted as described above.

Data Extraction Module

New in Apryse SDK 9.5, the Data Extraction module provides tabular data, document structure and form fields extraction functionality.

The Data Extraction module is packaged as a zip archive, and is meant to be expanded directly into the directory of your previous Apryse SDK download. For example, if you previously downloaded the 64-bit C/C++ package, then the Data Extraction module would be expanded directly into the PDFNetC64 directory, overwriting files if required.

Trial mode page limit
When in trial mode, output is limited to 6 pages, and a random evaluation page is inserted in the output content. Once licensed, there is no page limit and the evaluation page will no longer be inserted.

The module can be downloaded here:

The archive contains the module binaries themselves. There is a sample application called DataExtractionTest available in the main SDK download package that should be fully functional once this module is extracted as described above.

When using Node.js on Windows or Linux, you can install the package via npm with this command:

npm install @pdftron/data-extraction

When using Python on Windows or Linux, you can install the package via pip with this command:

pip install --extra-index-url=https://pypi.apryse.com apryse-data-extraction

Installing these packages will allow you to call data extraction functions without any other setup work.

PDF2Word Module

Note: The PDF2Word module has been replaced by the new Structured Output module.

PDF2HTML Reflow Paragraph Module

The PDF2HTML reflow paragraph module is packaged as a zip archive, and is meant to be expanded directly into the directory of your previous Apryse SDK download. For example, if you previously downloaded the 64-bit C/C++ package, then the PDF2HTML reflow paragraph module would be expanded directly into the PDFNetC64 directory, overwriting files if required.

The module can be downloaded here:

The archive contains the module binary itself.

HTML2PDF Module

The HTML2PDF module is packaged as a zip archive, and is meant to be expanded directly into the directory of your previous Apryse SDK download. For example, if you previously downloaded the 64-bit x86 or 64-bit ARM C++ package, then the HTML2PDF module would be expanded directly into the PDFNetC64 or PDFNetCArm64 directory, overwriting files if required.

The module can be downloaded here:

The archive contains the module binary itself. There is an HTML2PDF sample application available in the main SDK download package that should be fully functional once this module is extracted as described above.
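
Because the target directory for HTML2PDF depends on the CPU architecture, a script can pick it automatically. This is a minimal sketch, assuming the machine strings commonly reported by platform.machine() on 64-bit x86 and ARM systems; the function name and the mapping table are illustrative and not part of the SDK.

```python
import platform

# Machine strings (lower-cased) mapped to the SDK directory names used above.
_SDK_DIRS = {
    "x86_64": "PDFNetC64",
    "amd64": "PDFNetC64",
    "aarch64": "PDFNetCArm64",
    "arm64": "PDFNetCArm64",
}


def sdk_directory_for(machine=None):
    """Return the SDK directory name for a machine string.

    Defaults to the machine string of the current interpreter.
    """
    machine = (machine or platform.machine()).lower()
    try:
        return _SDK_DIRS[machine]
    except KeyError:
        raise ValueError(f"Unrecognized architecture: {machine}") from None
```

Calling sdk_directory_for() with no argument uses the current machine; an unrecognized architecture raises ValueError rather than guessing a directory.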


The Linux distribution used for HTML2PDF requires shared object dependencies that may not be installed by default. Here are instructions for detecting missing dependencies and their installation:
Linux Dependencies
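
One common way to spot missing shared objects on Linux is to run ldd against a binary and look for entries marked "not found". The sketch below takes that approach; it is not taken from the linked instructions, the function names are hypothetical, and it assumes ldd is installed and that the path to the module binary is supplied by the caller.

```python
import subprocess


def parse_missing_libraries(ldd_output):
    """Return the library names that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        if "=> not found" in line:
            # Lines look like: "\tlibfoo.so.1 => not found"
            missing.append(line.split("=>")[0].strip())
    return missing


def find_missing_libraries(binary_path):
    """Run ldd on a binary and list its unresolved shared objects."""
    result = subprocess.run(
        ["ldd", str(binary_path)],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_missing_libraries(result.stdout)
```

An empty list from find_missing_libraries means ldd resolved every dependency of that binary; any names returned would need to be installed through the distribution's package manager.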

The HTML2PDF module can be used in an Azure App Service either on the Linux consumption plan or with a custom container in a Premium Service Plan. For the Premium Service Plan using a custom container, please see these instructions for installing Linux dependencies in your custom container:
Azure Linux Consumption Plan

The Chromium HTML2PDF module is supported on Windows but is not supported on the Azure App Service Windows platform. More information can be found here:
Windows Azure Services

Print To PDF Module

The PrintToPDF module enables the Apryse SDK PrintToPDFModule class, a dedicated interface that operates on any printable file type and converts it directly to PDF. This module is available on Windows systems; it makes use of the print verb associated with a given file type, and includes a high-performance (not XPS-based) Windows-certified virtual printer driver. Installation instructions are included within the PrintToPDF package (see Install.md).

The module can be downloaded here:
