Convert PDF to HTML on Server/Desktop

There are two HTML conversion modules.

Fixed position
The built-in HTML module converts PDF documents to fixed-position HTML documents.

Full reflow
The Structured Output module converts PDF documents to HTML with full reflow. This module is an optional add-on.

Convert with fixed positioning

Call the conversion function with only an input and an output path to convert a PDF document to HTML with fixed positioning, which is the default.

// Convert PDF document to HTML with fixed positioning option turned on (default)
Convert.ToHtml(filename, output_filename);

// Convert PDF document to HTML with fixed positioning option turned on (default)
Convert::ToHtml(filename, output_filename);

// Convert PDF document to HTML with fixed positioning option turned on (default)
ConvertToHtml(filename, output_filename)

// Convert PDF document to HTML with fixed positioning option turned on (default)
Convert.toHtml(filename, output_filename);

async function main() {
  // Convert PDF document to HTML with fixed positioning option turned on (default)
  await PDFNet.Convert.fileToHtml(filename, output_filename);
}
PDFNet.runWithCleanup(main);

// Convert PDF document to HTML with fixed positioning option turned on (default)
[PTConvert ToHtmlWithFilename:filename out_filename:output_filename options:nil];

// Convert PDF document to HTML with fixed positioning option turned on (default)
Convert::ToHtml($filename, $output_filename);

# Convert PDF document to HTML with fixed positioning option turned on (default)
Convert.ToHtml(filename, output_filename)

# Convert PDF document to HTML with fixed positioning option turned on (default)
Convert.toHtml(filename, output_filename)

' Convert PDF document to HTML with fixed positioning option turned on (default)
Convert.ToHtml(filename, output_filename)
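
When many files need the fixed-position default, the two-argument call can be wrapped in a small loop. The following is a minimal Python sketch and not part of the SDK: `convert_folder` and its `to_html` parameter are hypothetical names, and naming each output after the PDF's file stem is an assumption made for illustration. The converter is passed in as a callable so the loop can be exercised without the SDK installed.

```python
from pathlib import Path
from typing import Callable, List


def convert_folder(src_dir: str, out_dir: str,
                   to_html: Callable[[str, str], None]) -> List[Path]:
    """Run a two-argument PDF-to-HTML converter over every *.pdf in src_dir.

    `to_html` stands in for the SDK's two-argument Convert.ToHtml call
    (hypothetical wiring). Returns the requested output paths in sorted order.
    """
    out_root = Path(out_dir)
    out_root.mkdir(parents=True, exist_ok=True)

    outputs = []
    # glob("*.pdf") matches the lowercase extension only on
    # case-sensitive file systems.
    for pdf in sorted(Path(src_dir).glob("*.pdf")):
        # Assumption: one output path per PDF, named after its stem.
        target = out_root / pdf.stem
        to_html(str(pdf), str(target))
        outputs.append(target)
    return outputs
```

With the SDK imported and initialized, `to_html` would presumably be `Convert.ToHtml`, for example `convert_folder("pdfs", "html", Convert.ToHtml)`.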

PDF Converter (SVG, XPS, TIFF, JPG, RTF, TXT, More)
Full sample code which shows how to use PDFNet Convert for direct, high-quality conversion between PDF, XPS, EMF, SVG, TIFF, PNG, JPEG, and other image formats. Samples provided in Python, C#, C++, Java, node.js (JavaScript), Go, PHP, VB, Objective-C, Swift and Kotlin.

Convert with full reflow

Set the content reflow setting to e_reflow_full to convert a PDF document to HTML with full reflow.

The Structured Output module is an optional add-on

Available only on Desktop and Server (Windows, Linux, or Mac). For setup, see the installation instructions for the Structured Output module.

Convert.HTMLOutputOptions htmlOutputOptions = new Convert.HTMLOutputOptions();
// Set e_reflow_full content reflow setting
htmlOutputOptions.SetContentReflowSetting(Convert.HTMLOutputOptions.ContentReflowSetting.e_reflow_full);
// Convert PDF document to HTML with full reflow option turned on
// This requires the Structured Output module
Convert.ToHtml(filename, output_filename, htmlOutputOptions);

Convert::HTMLOutputOptions htmlOutputOptions;
// Set e_reflow_full content reflow setting
htmlOutputOptions.SetContentReflowSetting(Convert::HTMLOutputOptions::e_reflow_full);
// Convert PDF document to HTML with full reflow option turned on
// This requires the Structured Output module
Convert::ToHtml(filename, output_filename, htmlOutputOptions);

htmlOutputOptions := NewHTMLOutputOptions()
// Set e_reflow_full content reflow setting
htmlOutputOptions.SetContentReflowSetting(HTMLOutputOptionsE_reflow_full)
// Convert PDF document to HTML with full reflow option turned on
// This requires the Structured Output module
ConvertToHtml(filename, output_filename, htmlOutputOptions)

Convert.HTMLOutputOptions htmlOutputOptions = new Convert.HTMLOutputOptions();
// Set e_reflow_full content reflow setting
htmlOutputOptions.setContentReflowSetting(Convert.HTMLOutputOptions.e_reflow_full);
// Convert PDF document to HTML with full reflow option turned on
// This requires the Structured Output module
Convert.toHtml(filename, output_filename, htmlOutputOptions);

async function main() {
  const htmlOutputOptions = new PDFNet.Convert.HTMLOutputOptions();
  // Set e_reflow_full content reflow setting
  htmlOutputOptions.setContentReflowSetting(PDFNet.Convert.HTMLOutputOptions.ContentReflowSetting.e_reflow_full);
  // Convert PDF document to HTML with full reflow option turned on
  // This requires the Structured Output module
  await PDFNet.Convert.fileToHtml(filename, output_filename, htmlOutputOptions);
}
PDFNet.runWithCleanup(main);

$htmlOutputOptions = new HTMLOutputOptions();
// Set e_reflow_full content reflow setting
$htmlOutputOptions->SetContentReflowSetting(HTMLOutputOptions::e_reflow_full);
// Convert PDF document to HTML with full reflow option turned on
// This requires the Structured Output module
Convert::ToHtml($filename, $output_filename, $htmlOutputOptions);

htmlOutputOptions = HTMLOutputOptions()
# Set e_reflow_full content reflow setting
htmlOutputOptions.SetContentReflowSetting(HTMLOutputOptions.e_reflow_full)
# Convert PDF document to HTML with full reflow option turned on
# This requires the Structured Output module
Convert.ToHtml(filename, output_filename, htmlOutputOptions)

$htmlOutputOptions = Convert::HTMLOutputOptions.new()
# Set e_reflow_full content reflow setting
$htmlOutputOptions.SetContentReflowSetting(Convert::HTMLOutputOptions::E_reflow_full)
# Convert PDF document to HTML with full reflow option turned on
# This requires the Structured Output module
Convert.toHtml(filename, output_filename, $htmlOutputOptions)

Dim htmlOutputOptions As pdftron.PDF.Convert.HTMLOutputOptions = New pdftron.PDF.Convert.HTMLOutputOptions()
' Set e_reflow_full content reflow setting
htmlOutputOptions.SetContentReflowSetting(pdftron.PDF.Convert.HTMLOutputOptions.ContentReflowSetting.e_reflow_full)
' Convert PDF document to HTML with full reflow option turned on
' This requires the Structured Output module
Convert.ToHtml(filename, output_filename, htmlOutputOptions)
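
Because full reflow depends on an optional add-on, it can help to fail early with a clear message when the module is missing. The sketch below is one possible way to do that in Python. `to_html_full_reflow` and its `sdk` parameter are hypothetical names, and the `StructuredOutputModule.IsModuleAvailable()` check is an assumption that does not appear on this page, so verify it against the API reference for your SDK version.

```python
def to_html_full_reflow(sdk, pdf_path, html_path):
    """Convert with e_reflow_full, raising early if the add-on is missing.

    `sdk` is any namespace exposing Convert, HTMLOutputOptions and
    StructuredOutputModule (hypothetical wiring: pass the imported
    SDK module, or a stub when testing).
    """
    # Assumption: IsModuleAvailable() reports whether the optional
    # Structured Output add-on can be loaded.
    if not sdk.StructuredOutputModule.IsModuleAvailable():
        raise RuntimeError(
            "Structured Output module not found; "
            "full reflow conversion is unavailable"
        )

    options = sdk.HTMLOutputOptions()
    # Same setting as in the snippets above.
    options.SetContentReflowSetting(sdk.HTMLOutputOptions.e_reflow_full)
    sdk.Convert.ToHtml(pdf_path, html_path, options)
    return options
```

Passing the SDK in as a parameter keeps the function testable with a stub. In real use, `sdk` would be the imported SDK module, assuming it exposes these classes as module attributes.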

Convert with reflow paragraphs (deprecated)

Set the content reflow setting to e_reflow_paragraphs to convert a PDF document to HTML with reflow paragraphs.

The HTML reflow paragraphs module is an optional add-on

Available only on Desktop and Server (Windows, Linux, or Mac). For setup, see the installation instructions for the PDF2HTML reflow paragraphs module.

Convert.HTMLOutputOptions htmlOutputOptions = new Convert.HTMLOutputOptions();
// Set e_reflow_paragraphs content reflow setting
htmlOutputOptions.SetContentReflowSetting(Convert.HTMLOutputOptions.ContentReflowSetting.e_reflow_paragraphs);
// Optionally set to flow paragraphs across the entire browser window.
htmlOutputOptions.SetNoPageWidth(true);
// Convert PDF document to HTML with reflow paragraphs option turned on
// This requires the PDF2HtmlReflowParagraphsModule
Convert.ToHtml(filename, output_filename, htmlOutputOptions);

Convert::HTMLOutputOptions htmlOutputOptions;
// Set e_reflow_paragraphs content reflow setting
htmlOutputOptions.SetContentReflowSetting(Convert::HTMLOutputOptions::e_reflow_paragraphs);
// Optionally set to flow paragraphs across the entire browser window.
htmlOutputOptions.SetNoPageWidth(true);
// Convert PDF document to HTML with reflow paragraphs option turned on
// This requires the PDF2HtmlReflowParagraphsModule
Convert::ToHtml(filename, output_filename, htmlOutputOptions);

htmlOutputOptions := NewHTMLOutputOptions()
// Set e_reflow_paragraphs content reflow setting
htmlOutputOptions.SetContentReflowSetting(HTMLOutputOptionsE_reflow_paragraphs)
// Optionally set to flow paragraphs across the entire browser window.
htmlOutputOptions.SetNoPageWidth(true)
// Convert PDF document to HTML with reflow paragraphs option turned on
// This requires the PDF2HtmlReflowParagraphsModule
ConvertToHtml(filename, output_filename, htmlOutputOptions)

Convert.HTMLOutputOptions htmlOutputOptions = new Convert.HTMLOutputOptions();
// Set e_reflow_paragraphs content reflow setting
htmlOutputOptions.setContentReflowSetting(Convert.HTMLOutputOptions.e_reflow_paragraphs);
// Optionally set to flow paragraphs across the entire browser window.
htmlOutputOptions.setNoPageWidth(true);
// Convert PDF document to HTML with reflow paragraphs option turned on
// This requires the PDF2HtmlReflowParagraphsModule
Convert.toHtml(filename, output_filename, htmlOutputOptions);

$htmlOutputOptions = new HTMLOutputOptions();
// Set e_reflow_paragraphs content reflow setting
$htmlOutputOptions->SetContentReflowSetting(HTMLOutputOptions::e_reflow_paragraphs);
// Optionally set to flow paragraphs across the entire browser window.
$htmlOutputOptions->SetNoPageWidth(true);
// Convert PDF document to HTML with reflow paragraphs option turned on
// This requires the PDF2HtmlReflowParagraphsModule
Convert::ToHtml($filename, $output_filename, $htmlOutputOptions);

htmlOutputOptions = HTMLOutputOptions()
# Set e_reflow_paragraphs content reflow setting
htmlOutputOptions.SetContentReflowSetting(HTMLOutputOptions.e_reflow_paragraphs)
# Optionally set to flow paragraphs across the entire browser window.
htmlOutputOptions.SetNoPageWidth(True)
# Convert PDF document to HTML with reflow paragraphs option turned on
# This requires the PDF2HtmlReflowParagraphsModule
Convert.ToHtml(filename, output_filename, htmlOutputOptions)

$htmlOutputOptions = Convert::HTMLOutputOptions.new()
# Set e_reflow_paragraphs content reflow setting
$htmlOutputOptions.SetContentReflowSetting(Convert::HTMLOutputOptions::E_reflow_paragraphs)
# Optionally set to flow paragraphs across the entire browser window.
$htmlOutputOptions.SetNoPageWidth(true)
# Convert PDF document to HTML with reflow paragraphs option turned on
# This requires the PDF2HtmlReflowParagraphsModule
Convert.toHtml(filename, output_filename, $htmlOutputOptions)

Dim htmlOutputOptions As pdftron.PDF.Convert.HTMLOutputOptions = New pdftron.PDF.Convert.HTMLOutputOptions()
' Set e_reflow_paragraphs content reflow setting
htmlOutputOptions.SetContentReflowSetting(pdftron.PDF.Convert.HTMLOutputOptions.ContentReflowSetting.e_reflow_paragraphs)
' Optionally set to flow paragraphs across the entire browser window.
htmlOutputOptions.SetNoPageWidth(True)
' Convert PDF document to HTML with reflow paragraphs option turned on
' This requires the PDF2HtmlReflowParagraphsModule
Convert.ToHtml(filename, output_filename, htmlOutputOptions)
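
The snippets above configure the options object through a series of `Set...` calls. When the settings come from configuration data, those calls can be driven from a dictionary. The following is a rough Python sketch of that idea: `apply_settings` is a hypothetical helper, not an SDK function, and it only assumes that each setter takes a single argument, as `SetContentReflowSetting` and `SetNoPageWidth` do above.

```python
def apply_settings(options, settings):
    """Call options.<setter>(value) for each entry in `settings`.

    Hypothetical helper: works on any object whose setters take one
    argument. Raises AttributeError for a setter the object lacks, so
    a misspelled name does not pass silently.
    """
    for setter, value in settings.items():
        method = getattr(options, setter, None)
        if not callable(method):
            raise AttributeError(f"options object has no setter {setter!r}")
        method(value)
    return options
```

With the SDK available, a call might look like `apply_settings(HTMLOutputOptions(), {"SetContentReflowSetting": HTMLOutputOptions.e_reflow_paragraphs, "SetNoPageWidth": True})`, after which the returned object is passed to `Convert.ToHtml` as the third argument.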

Convert PDF to HTML - Sample Code
Full sample code which shows how to convert generic PDF documents to HTML format. Sample code provided in Python, C++, C#, Java, Node.js (JavaScript), PHP, Ruby, Go and VB.

HTMLOutputOptions

The following table shows which options apply to each conversion engine. An X marks an option that the engine supports.

Settings API	Fixed Position	Reflow Paragraphs (Deprecated)	Full Reflow
SetContentReflowSetting	X	X	X
SetDPI	X
SetExternalLinks	X
SetInternalLinks	X
SetMaximumImagePixels	X
SetReportFile	X
SetScale	X
SetSimplifyText	X
SetJPGQuality	X	X
SetPreferJPG	X	X
SetDisableVerticalSplit		X
SetImageDPI		X
SetFileConversionTimeoutSeconds		X
SetNoPageWidth		X
SetSimpleLists		X
SetTitle		X
SetConnectHyphens		X	X
SetEmbedImages		X	X
SetPages		X	X
SetPDFPassword		X	X
SetSearchableImageSetting		X	X
SetLanguage			X
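
The table can also be used programmatically, for instance to flag settings that will not apply when moving from the deprecated reflow paragraphs engine to full reflow. The sketch below transcribes the table into a Python dictionary. The engine labels and the `unsupported` helper are hypothetical names chosen for illustration, and the data is only as current as the table on this page.

```python
FIXED, PARAGRAPHS, FULL = "fixed_position", "reflow_paragraphs", "full_reflow"

# Transcribed from the table above: setter name -> engines marked with X.
SUPPORT = {
    "SetContentReflowSetting": {FIXED, PARAGRAPHS, FULL},
    "SetDPI": {FIXED},
    "SetExternalLinks": {FIXED},
    "SetInternalLinks": {FIXED},
    "SetMaximumImagePixels": {FIXED},
    "SetReportFile": {FIXED},
    "SetScale": {FIXED},
    "SetSimplifyText": {FIXED},
    "SetJPGQuality": {FIXED, PARAGRAPHS},
    "SetPreferJPG": {FIXED, PARAGRAPHS},
    "SetDisableVerticalSplit": {PARAGRAPHS},
    "SetImageDPI": {PARAGRAPHS},
    "SetFileConversionTimeoutSeconds": {PARAGRAPHS},
    "SetNoPageWidth": {PARAGRAPHS},
    "SetSimpleLists": {PARAGRAPHS},
    "SetTitle": {PARAGRAPHS},
    "SetConnectHyphens": {PARAGRAPHS, FULL},
    "SetEmbedImages": {PARAGRAPHS, FULL},
    "SetPages": {PARAGRAPHS, FULL},
    "SetPDFPassword": {PARAGRAPHS, FULL},
    "SetSearchableImageSetting": {PARAGRAPHS, FULL},
    "SetLanguage": {FULL},
}


def unsupported(engine, setters):
    """Return the setters, in input order, that the table does not mark for `engine`.

    Names missing from the table are reported as unsupported as well.
    """
    return [name for name in setters if engine not in SUPPORT.get(name, set())]
```

For example, `unsupported(FULL, ["SetNoPageWidth", "SetEmbedImages"])` returns `["SetNoPageWidth"]`, since the table marks `SetNoPageWidth` for reflow paragraphs only.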

About PDF to HTML

Code samples are provided in Python, C#, C++, Java, Node.js (JavaScript), PHP, Ruby, Objective-C, Go and VB. Need more assistance? Please contact sales.

Depending on your use case, PDF to HTML conversion can prioritize either high-fidelity rendering or content extraction. The output can therefore be displayed to users or fed into data analysis workflows.

Here are the options for PDF to HTML conversion, grouped by requirement:

PDF to HTML for the highest rendering accuracy

Here are the options for maintaining the original PDF layout and visual accuracy.

WebViewer
Converts PDF to HTML canvas in real time on the client side.

PDF to HTML/ePub
Converts PDF to fixed-layout HTML/ePub, where one PDF page becomes one HTML file.

PDF2SVG
Converts PDF to SVG, producing a vector-based image that can be embedded in an HTML file.

PDF2Image
Converts PDF to an image (PNG, JPG, TIFF, Raw), producing a raster-based image that can be embedded in an HTML file.

PDF to HTML for extracting semantic content

Here are the options for extracting semantic content from a PDF.

PDF2HTML
Converts PDF to a single HTML file that preserves the PDF content using a custom heuristic method.
