There are two HTML conversion modules, one of which is an optional add-on.
Fixed position: The built-in HTML module converts PDF documents to fixed-position HTML documents.
Full reflow: The optional Structured Output add-on converts PDF documents to HTML format with full reflow.
To convert PDF documents to HTML format with fixed positioning:

C#
// Convert PDF document to HTML with fixed positioning option turned on (default)
Convert.ToHtml(filename, output_filename);

C++
// Convert PDF document to HTML with fixed positioning option turned on (default)
Convert::ToHtml(filename, output_filename);

Go
// Convert PDF document to HTML with fixed positioning option turned on (default)
ConvertToHtml(filename, output_filename)

Java
// Convert PDF document to HTML with fixed positioning option turned on (default)
Convert.toHtml(filename, output_filename);

JavaScript
async function main() {
  // Convert PDF document to HTML with fixed positioning option turned on (default)
  await PDFNet.Convert.fileToHtml(filename, output_filename);
}
PDFNet.runWithCleanup(main);

Obj-C
// Convert PDF document to HTML with fixed positioning option turned on (default)
[PTConvert ToHtmlWithFilename:filename out_filename:output_filename options:nil];

PHP
// Convert PDF document to HTML with fixed positioning option turned on (default)
Convert::ToHtml($filename, $output_filename);

Python
# Convert PDF document to HTML with fixed positioning option turned on (default)
Convert.ToHtml(filename, output_filename)

Ruby
# Convert PDF document to HTML with fixed positioning option turned on (default)
Convert.toHtml(filename, output_filename)

VB
' Convert PDF document to HTML with fixed positioning option turned on (default)
Convert.ToHtml(filename, output_filename)
PDF Converter (SVG, XPS, TIFF, JPG, RTF, TXT, More): Full sample code that shows how to use PDFNet Convert for direct, high-quality conversion between PDF, XPS, EMF, SVG, TIFF, PNG, JPEG, and other image formats.
To convert PDF documents to HTML format with full reflow:
The Structured Output module is an optional add-on, available only on Desktop and Server (Windows, Linux, or Mac). Details on installing the Structured Output module are documented separately.
C#
Convert.HTMLOutputOptions htmlOutputOptions = new Convert.HTMLOutputOptions();
// Set e_reflow_full content reflow setting
htmlOutputOptions.SetContentReflowSetting(Convert.HTMLOutputOptions.ContentReflowSetting.e_reflow_full);
// Convert PDF document to HTML with full reflow option turned on
// (requires the Structured Output module)
Convert.ToHtml(filename, output_filename, htmlOutputOptions);

C++
Convert::HTMLOutputOptions htmlOutputOptions;
// Set e_reflow_full content reflow setting
htmlOutputOptions.SetContentReflowSetting(Convert::HTMLOutputOptions::e_reflow_full);
// Convert PDF document to HTML with full reflow option turned on
// (requires the Structured Output module)
Convert::ToHtml(filename, output_filename, htmlOutputOptions);

Go
htmlOutputOptions := NewHTMLOutputOptions()
// Set e_reflow_full content reflow setting
htmlOutputOptions.SetContentReflowSetting(HTMLOutputOptionsE_reflow_full)
// Convert PDF document to HTML with full reflow option turned on
// (requires the Structured Output module)
ConvertToHtml(filename, output_filename, htmlOutputOptions)

Java
Convert.HTMLOutputOptions htmlOutputOptions = new Convert.HTMLOutputOptions();
// Set e_reflow_full content reflow setting
htmlOutputOptions.setContentReflowSetting(Convert.HTMLOutputOptions.e_reflow_full);
// Convert PDF document to HTML with full reflow option turned on
// (requires the Structured Output module)
Convert.toHtml(filename, output_filename, htmlOutputOptions);

JavaScript
async function main() {
  const htmlOutputOptions = new PDFNet.Convert.HTMLOutputOptions();
  // Set e_reflow_full content reflow setting
  htmlOutputOptions.setContentReflowSetting(PDFNet.Convert.HTMLOutputOptions.ContentReflowSetting.e_reflow_full);
  // Convert PDF document to HTML with full reflow option turned on
  // (requires the Structured Output module)
  await PDFNet.Convert.fileToHtml(filename, output_filename, htmlOutputOptions);
}
PDFNet.runWithCleanup(main);

PHP
$htmlOutputOptions = new HTMLOutputOptions();
// Set e_reflow_full content reflow setting
$htmlOutputOptions->SetContentReflowSetting(HTMLOutputOptions::e_reflow_full);
// Convert PDF document to HTML with full reflow option turned on
// (requires the Structured Output module)
Convert::ToHtml($filename, $output_filename, $htmlOutputOptions);

Python
htmlOutputOptions = HTMLOutputOptions()
# Set e_reflow_full content reflow setting
htmlOutputOptions.SetContentReflowSetting(HTMLOutputOptions.e_reflow_full)
# Convert PDF document to HTML with full reflow option turned on
# (requires the Structured Output module)
Convert.ToHtml(filename, output_filename, htmlOutputOptions)

Ruby
$htmlOutputOptions = Convert::HTMLOutputOptions.new()
# Set e_reflow_full content reflow setting
$htmlOutputOptions.SetContentReflowSetting(Convert::HTMLOutputOptions::E_reflow_full)
# Convert PDF document to HTML with full reflow option turned on
# (requires the Structured Output module)
Convert.toHtml(filename, output_filename, $htmlOutputOptions)

VB
Dim htmlOutputOptions As pdftron.PDF.Convert.HTMLOutputOptions = New pdftron.PDF.Convert.HTMLOutputOptions()
' Set e_reflow_full content reflow setting
htmlOutputOptions.SetContentReflowSetting(pdftron.PDF.Convert.HTMLOutputOptions.ContentReflowSetting.e_reflow_full)
' Convert PDF document to HTML with full reflow option turned on
' (requires the Structured Output module)
Convert.ToHtml(filename, output_filename, htmlOutputOptions)
To convert PDF documents to HTML format with reflow paragraphs:
The HTML reflow paragraphs module is an optional add-on, available only on Desktop and Server (Windows, Linux, or Mac). Details on installing the PDF2HTML reflow paragraphs module are documented separately.
C#
Convert.HTMLOutputOptions htmlOutputOptions = new Convert.HTMLOutputOptions();
// Set e_reflow_paragraphs content reflow setting
htmlOutputOptions.SetContentReflowSetting(Convert.HTMLOutputOptions.ContentReflowSetting.e_reflow_paragraphs);
// Optionally set to flow paragraphs across the entire browser window.
htmlOutputOptions.SetNoPageWidth(true);
// Convert PDF document to HTML with reflow paragraphs option turned on
// (requires the PDF2HtmlReflowParagraphsModule)
Convert.ToHtml(filename, output_filename, htmlOutputOptions);

C++
Convert::HTMLOutputOptions htmlOutputOptions;
// Set e_reflow_paragraphs content reflow setting
htmlOutputOptions.SetContentReflowSetting(Convert::HTMLOutputOptions::e_reflow_paragraphs);
// Optionally set to flow paragraphs across the entire browser window.
htmlOutputOptions.SetNoPageWidth(true);
// Convert PDF document to HTML with reflow paragraphs option turned on
// (requires the PDF2HtmlReflowParagraphsModule)
Convert::ToHtml(filename, output_filename, htmlOutputOptions);

Go
htmlOutputOptions := NewHTMLOutputOptions()
// Set e_reflow_paragraphs content reflow setting
htmlOutputOptions.SetContentReflowSetting(HTMLOutputOptionsE_reflow_paragraphs)
// Optionally set to flow paragraphs across the entire browser window.
htmlOutputOptions.SetNoPageWidth(true)
// Convert PDF document to HTML with reflow paragraphs option turned on
// (requires the PDF2HtmlReflowParagraphsModule)
ConvertToHtml(filename, output_filename, htmlOutputOptions)

Java
Convert.HTMLOutputOptions htmlOutputOptions = new Convert.HTMLOutputOptions();
// Set e_reflow_paragraphs content reflow setting
htmlOutputOptions.setContentReflowSetting(Convert.HTMLOutputOptions.e_reflow_paragraphs);
// Optionally set to flow paragraphs across the entire browser window.
htmlOutputOptions.setNoPageWidth(true);
// Convert PDF document to HTML with reflow paragraphs option turned on
// (requires the PDF2HtmlReflowParagraphsModule)
Convert.toHtml(filename, output_filename, htmlOutputOptions);
JavaScript
async function main() {
  const htmlOutputOptions = new PDFNet.Convert.HTMLOutputOptions();
  // Set e_reflow_paragraphs content reflow setting
  htmlOutputOptions.setContentReflowSetting(PDFNet.Convert.HTMLOutputOptions.ContentReflowSetting.e_reflow_paragraphs);
  // Optionally set to flow paragraphs across the entire browser window.
  htmlOutputOptions.setNoPageWidth(true);
  // Convert PDF document to HTML with reflow paragraphs option turned on
  // (requires the PDF2HtmlReflowParagraphsModule)
  await PDFNet.Convert.fileToHtml(filename, output_filename, htmlOutputOptions);
}
PDFNet.runWithCleanup(main);
PHP
$htmlOutputOptions = new HTMLOutputOptions();
// Set e_reflow_paragraphs content reflow setting
$htmlOutputOptions->SetContentReflowSetting(HTMLOutputOptions::e_reflow_paragraphs);
// Optionally set to flow paragraphs across the entire browser window.
$htmlOutputOptions->SetNoPageWidth(true);
// Convert PDF document to HTML with reflow paragraphs option turned on
// (requires the PDF2HtmlReflowParagraphsModule)
Convert::ToHtml($filename, $output_filename, $htmlOutputOptions);

Python
htmlOutputOptions = HTMLOutputOptions()
# Set e_reflow_paragraphs content reflow setting
htmlOutputOptions.SetContentReflowSetting(HTMLOutputOptions.e_reflow_paragraphs)
# Optionally set to flow paragraphs across the entire browser window.
htmlOutputOptions.SetNoPageWidth(True)
# Convert PDF document to HTML with reflow paragraphs option turned on
# (requires the PDF2HtmlReflowParagraphsModule)
Convert.ToHtml(filename, output_filename, htmlOutputOptions)

Ruby
$htmlOutputOptions = Convert::HTMLOutputOptions.new()
# Set e_reflow_paragraphs content reflow setting
$htmlOutputOptions.SetContentReflowSetting(Convert::HTMLOutputOptions::E_reflow_paragraphs)
# Optionally set to flow paragraphs across the entire browser window.
$htmlOutputOptions.SetNoPageWidth(true)
# Convert PDF document to HTML with reflow paragraphs option turned on
# (requires the PDF2HtmlReflowParagraphsModule)
Convert.toHtml(filename, output_filename, $htmlOutputOptions)

VB
Dim htmlOutputOptions As pdftron.PDF.Convert.HTMLOutputOptions = New pdftron.PDF.Convert.HTMLOutputOptions()
' Set e_reflow_paragraphs content reflow setting
htmlOutputOptions.SetContentReflowSetting(pdftron.PDF.Convert.HTMLOutputOptions.ContentReflowSetting.e_reflow_paragraphs)
' Optionally set to flow paragraphs across the entire browser window.
htmlOutputOptions.SetNoPageWidth(True)
' Convert PDF document to HTML with reflow paragraphs option turned on
' (requires the PDF2HtmlReflowParagraphsModule)
Convert.ToHtml(filename, output_filename, htmlOutputOptions)
Convert PDF to HTML Sample Code: Full sample code that shows how to convert generic PDF documents to HTML format.
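The three conversion modes above differ only in the content reflow setting they pass and the add-on they need, so one small wrapper can cover all of them. The following is a minimal Python sketch of that idea, not part of the SDK: the names `REFLOW_MODES`, `describe_mode`, and `convert_pdf_to_html` are hypothetical, and it assumes the SDK is installed as the `apryse_sdk` package and has already been initialized with a license key. The enum names and add-on names are taken from the snippets above.

```python
# Hypothetical lookup table: mode key -> (enum attribute name, required add-on).
# Values come from the snippets in this guide; the table itself is illustrative.
REFLOW_MODES = {
    "fixed": {"setting": None, "addon": None},
    "full_reflow": {
        "setting": "e_reflow_full",
        "addon": "Structured Output module",
    },
    "reflow_paragraphs": {
        "setting": "e_reflow_paragraphs",
        "addon": "PDF2HtmlReflowParagraphsModule",
    },
}


def describe_mode(mode):
    """Return (enum attribute name or None, required add-on or None)."""
    try:
        entry = REFLOW_MODES[mode]
    except KeyError:
        raise ValueError(
            f"unknown mode {mode!r}; expected one of {sorted(REFLOW_MODES)}"
        ) from None
    return entry["setting"], entry["addon"]


def convert_pdf_to_html(filename, output_filename, mode="fixed", no_page_width=False):
    """Convert a PDF to HTML using one of the three modes described above."""
    # Assumption: the package name is `apryse_sdk`; older installs may expose
    # the same classes under a different package name. Imported lazily so the
    # lookup helpers above work without the SDK installed.
    from apryse_sdk import Convert, HTMLOutputOptions

    setting, _addon = describe_mode(mode)
    if setting is None:
        # Fixed positioning is the default, so no options object is needed.
        Convert.ToHtml(filename, output_filename)
        return

    options = HTMLOutputOptions()
    options.SetContentReflowSetting(getattr(HTMLOutputOptions, setting))
    if mode == "reflow_paragraphs" and no_page_width:
        # Optionally flow paragraphs across the entire browser window.
        options.SetNoPageWidth(True)
    Convert.ToHtml(filename, output_filename, options)
```

With the SDK initialized, a call such as `convert_pdf_to_html(filename, output_filename, mode="full_reflow")` would be expected to behave like the Python full reflow snippet above. `describe_mode` can be used beforehand to report which add-on a given mode depends on.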
The following table illustrates which options apply to which conversion engines.
Settings API                    | Fixed Position | Reflow Paragraphs (Deprecated) | Full Reflow
SetContentReflowSetting         | X              | X                              | X
SetDPI                          | X              |                                |
SetExternalLinks                | X              |                                |
SetInternalLinks                | X              |                                |
SetMaximumImagePixels           | X              |                                |
SetReportFile                   | X              |                                |
SetScale                        | X              |                                |
SetSimplifyText                 | X              |                                |
SetJPGQuality                   | X              | X                              |
SetPreferJPG                    | X              | X                              |
SetDisableVerticalSplit         |                | X                              |
SetImageDPI                     |                | X                              |
SetFileConversionTimeoutSeconds |                | X                              |
SetNoPageWidth                  |                | X                              |
SetSimpleLists                  |                | X                              |
SetTitle                        |                | X                              |
SetConnectHyphens               |                | X                              | X
SetEmbedImages                  |                | X                              | X
SetPages                        |                | X                              | X
SetPDFPassword                  |                | X                              | X
SetSearchableImageSetting       |                | X                              | X
SetLanguage                     |                |                                | X
Depending on your use case, PDF to HTML conversion can serve either high-fidelity, accurate rendering or content extraction. In other words, the output can be displayed to users or fed into data analysis workflows.
The options for PDF to HTML conversion, grouped by requirement, are listed below.
Options for maintaining the original PDF layout and visual accuracy:
WebViewer: Converts PDF to HTML canvas in real time on the client side.
PDF to HTML/ePub: Converts PDF to fixed-layout HTML/ePub, where one PDF page becomes one HTML file.
PDF2SVG: Converts PDF to SVG, producing a vector-based image that can be embedded in an HTML file.
PDF2Image: Converts PDF to an image (PNG, JPG, TIFF, Raw), producing a raster-based image that can be embedded in an HTML file.
Options for extracting semantic content from the output:
PDF2HTML: Converts PDF to a single HTML file that preserves the PDF content using a custom heuristic method.