pdftron::PDF::HTMLOutputOptions Class Reference

#include <Convert.h>

Public Types

enum  ContentReflowSetting { e_fixed_position = 0, e_reflow_paragraphs, e_reflow_full }
 
enum  SearchableImageSetting {
  e_ocr_image_text = 0, e_ocr_image, e_ocr_text, e_ocr_off,
  e_ocr_always
}
 

Public Member Functions

 HTMLOutputOptions ()
 
void SetPreferJPG (bool prefer_jpg)
 
void SetJPGQuality (UInt32 quality)
 
void SetDPI (UInt32 dpi)
 
void SetMaximumImagePixels (UInt32 max_pixels)
 
void SetContentReflowSetting (ContentReflowSetting reflow)
 
void SetScale (double scale)
 
void SetExternalLinks (bool enable)
 
void SetInternalLinks (bool enable)
 
void SetSimplifyText (bool enable)
 
void SetReportFile (const UString &path)
 
void SetTitle (const UString &title)
 
void SetImageDPI (UInt32 dpi)
 
void SetEmbedImages (bool embed)
 
void SetFileConversionTimeoutSeconds (int seconds)
 
void SetPages (int page_from, int page_to)
 
void SetPDFPassword (const UString &password)
 
void SetSearchableImageSetting (SearchableImageSetting setting)
 
void SetSimpleLists (bool enable)
 
void SetConnectHyphens (bool connect)
 
void SetDisableVerticalSplit (bool disable)
 
void SetNoPageWidth (bool enable)
 
void SetLanguage (OutputOptionsOCR::LanguageChoice language)
 

Protected Attributes

TRN_Obj m_obj
 
SDF::ObjSet m_objset
 

Friends

class Convert
 

Detailed Description

A class containing options common to the ToHtml and ToEpub functions.

Definition at line 1578 of file Convert.h.

Member Enumeration Documentation

enum pdftron::PDF::HTMLOutputOptions::ContentReflowSetting

Enumerator
e_fixed_position 
e_reflow_paragraphs 
e_reflow_full 

Definition at line 1620 of file Convert.h.

enum pdftron::PDF::HTMLOutputOptions::SearchableImageSetting

Enumerator
e_ocr_image_text 
e_ocr_image 
e_ocr_text 
e_ocr_off 
e_ocr_always 

Definition at line 1723 of file Convert.h.

Constructor & Destructor Documentation

pdftron::PDF::HTMLOutputOptions::HTMLOutputOptions ( )

Creates an HTMLOutputOptions object with default settings.

Member Function Documentation

void pdftron::PDF::HTMLOutputOptions::SetConnectHyphens ( bool  connect)

Specifies whether hyphens in the PDF should be connected. Default is false.

Note
This option is only available for e_reflow_paragraphs and e_reflow_full modes.
Parameters
connect  if true, hyphens in the PDF will be connected.

void pdftron::PDF::HTMLOutputOptions::SetContentReflowSetting ( ContentReflowSetting  reflow)

Switch between fixed (pre-paginated) and reflowable HTML generation. Default is e_fixed_position. In e_reflow_paragraphs mode (now deprecated), conversions require that the optional PDFTron HTML reflow paragraphs add-on module is available. In e_reflow_full mode, conversions require that the optional PDFTron StructuredOutput add-on module is available.

Parameters
reflow  the generated HTML will be either fixed or reflowable.
See Also
ContentReflowSetting
StructuredOutputModule
PDF2HtmlReflowParagraphsModule

void pdftron::PDF::HTMLOutputOptions::SetDisableVerticalSplit ( bool  disable)

Specifies whether to disable the detection of section columns. Default is false. Enable this if your tables are coming out as section columns.

Note
This option is only available for e_reflow_paragraphs mode. In e_reflow_full mode, columns are detected automatically.
Parameters
disable  if true, the detection of section columns is disabled.

void pdftron::PDF::HTMLOutputOptions::SetDPI ( UInt32  dpi)

The output resolution, from 1 to 1000, in Dots Per Inch (DPI) at which to render elements which cannot be directly converted. Default is 140.

Note
This option is only available for e_fixed_position mode.
Parameters
dpi  the resolution in Dots Per Inch.

void pdftron::PDF::HTMLOutputOptions::SetEmbedImages ( bool  embed)

Specifies whether images are embedded in the HTML without having to link to external files. Default is true.

Note
This option is only available for e_reflow_paragraphs and e_reflow_full modes.
Parameters
embed  if true, images are embedded in the HTML; otherwise, images are saved as external files.

void pdftron::PDF::HTMLOutputOptions::SetExternalLinks ( bool  enable)

Enable the conversion of external URL navigation. Default is false.

Parameters
enable  if true, links that specify external URLs are converted into HTML.
Note
This option is only available for e_fixed_position mode.

void pdftron::PDF::HTMLOutputOptions::SetFileConversionTimeoutSeconds ( int  seconds)

Specifies the amount of time in seconds after which the conversion fails. Default is 300. Very long files need more time to convert.

Note
This option is only available for e_reflow_paragraphs mode. The timeout feature is not necessary in other modes.
Parameters
seconds  the timeout in seconds.

void pdftron::PDF::HTMLOutputOptions::SetImageDPI ( UInt32  dpi)

Specifies the output image resolution, from 8 to 600, in Pixels Per Inch (PPI). The higher the PPI, the larger the image. Default is 192.

Note
This option is only available for e_reflow_paragraphs mode. In other modes, image resolution is determined automatically for an optimal result.
Parameters
dpi  the resolution in Pixels Per Inch.
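The documented range for SetImageDPI (8 to 600) can be enforced before the value is passed to the setter. The following is a minimal sketch and not part of the SDK: ClampImageDpi and the two constants are hypothetical names, and clamping out-of-range values (rather than rejecting them) is an assumption, since the reference does not say how the library treats them.

```cpp
#include <cassert>
#include <cstdint>

// Range quoted from the SetImageDPI description above.
constexpr std::uint32_t kMinImageDpi = 8;
constexpr std::uint32_t kMaxImageDpi = 600;

// Hypothetical helper (not an SDK function): clamps a requested
// resolution into the documented SetImageDPI range.
constexpr std::uint32_t ClampImageDpi(std::uint32_t requested) {
    return requested < kMinImageDpi ? kMinImageDpi
         : requested > kMaxImageDpi ? kMaxImageDpi
         : requested;
}
```

The result would then be handed to SetImageDPI; the same pattern applies to SetDPI with its documented range of 1 to 1000.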

void pdftron::PDF::HTMLOutputOptions::SetInternalLinks ( bool  enable)

Enable the conversion of internal document navigation. Default is false.

Parameters
enable  if true, links that specify page jumps are converted into HTML.
Note
This option is only available for e_fixed_position mode.

void pdftron::PDF::HTMLOutputOptions::SetJPGQuality ( UInt32  quality)

Specifies the compression quality to use when generating JPEG images.

Note
This option is only available for e_fixed_position and e_reflow_paragraphs modes. In e_reflow_full mode, the optimal JPEG quality is chosen automatically for best balance between size and quality.
Parameters
quality  the JPEG compression quality, from 0 (highest compression) to 100 (best quality).

void pdftron::PDF::HTMLOutputOptions::SetLanguage ( OutputOptionsOCR::LanguageChoice  language)

Specifies the OCR language. Default is automatic language detection.

Note
This option is only available for e_reflow_full mode.
Parameters
language  the OCR language.

void pdftron::PDF::HTMLOutputOptions::SetMaximumImagePixels ( UInt32  max_pixels)

Specifies the maximum image slice size in pixels. Default is 2000000.

Note
This setting no longer reduces the total number of image pixels. Instead, a lower value produces more slices, and a higher value produces fewer.
Because image compression works better with more pixels, a larger max_pixels value should generally create smaller files.
This option is only available for e_fixed_position mode.
Parameters
max_pixels  the maximum number of pixels an image slice can have.
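To see why a lower max_pixels value yields more slices, it helps to compute the smallest number of slices that could cover an image. The sketch below is plain arithmetic and makes no claim about how the converter actually partitions images: MinSliceCount is a hypothetical helper, and the real slice count may be higher than this lower bound.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an SDK function): the minimum number of
// slices needed so that no slice exceeds max_pixels. This is a lower
// bound only; the converter's own slicing strategy is not documented here.
std::uint64_t MinSliceCount(std::uint64_t width,
                            std::uint64_t height,
                            std::uint64_t max_pixels) {
    assert(max_pixels > 0);
    const std::uint64_t total = width * height;
    // Integer ceiling division: round up when there is a remainder.
    return (total + max_pixels - 1) / max_pixels;
}
```

For a 4000 x 3000 image (12,000,000 pixels), the default of 2000000 gives a bound of 6 slices, while halving the limit to 1000000 raises it to 12.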

void pdftron::PDF::HTMLOutputOptions::SetNoPageWidth ( bool  enable)

Determines whether to flow contents across the entire browser window. Default is false.

Note
This option is only available for e_reflow_paragraphs mode. In e_reflow_full mode, content always flows across the entire browser window.
Parameters
enable  if true, content will flow across the entire page.

void pdftron::PDF::HTMLOutputOptions::SetPages ( int  page_from, int  page_to)

Specifies a range of pages to be converted. By default all pages are converted. The first page has the page number of 1.

Note
This option is only available for e_reflow_paragraphs and e_reflow_full modes.
Parameters
page_from  the first page to be converted.
page_to  the last page to be converted (inclusive). Use a negative value to specify the last page in the PDF.
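The page_to convention above (1-based, inclusive, negative meaning "last page") can be illustrated with a small helper that resolves the arguments into a concrete range. This is only a sketch of the documented convention: ResolvePageRange is a hypothetical name, and page_count is assumed to be known to the caller.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper (not an SDK function): turns SetPages-style
// arguments into a concrete 1-based, inclusive {first, last} range.
// A negative page_to selects the last page of the document.
std::pair<int, int> ResolvePageRange(int page_from, int page_to, int page_count) {
    assert(page_count >= 1);
    const int last = (page_to < 0) ? page_count : page_to;
    return std::make_pair(page_from, last);
}
```

With a 10-page document, the arguments (3, -1) resolve to pages 3 through 10, and (1, 5) resolve to pages 1 through 5.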

void pdftron::PDF::HTMLOutputOptions::SetPDFPassword ( const UString &  password)

Specifies the password if the PDF requires one.

Note
This option is only available for e_reflow_paragraphs and e_reflow_full modes.
Parameters
password  the PDF password, if required; an empty string otherwise.

void pdftron::PDF::HTMLOutputOptions::SetPreferJPG ( bool  prefer_jpg)

Use JPG files rather than PNG. This will apply to all generated images. Default is true.

Note
This option is only available for e_fixed_position and e_reflow_paragraphs modes.
Parameters
prefer_jpg  if true, JPG images will be used whenever possible.

void pdftron::PDF::HTMLOutputOptions::SetReportFile ( const UString &  path)

Generate an XML file that contains additional information about the conversion process. By default no report is generated.

Parameters
path  the file path to which the XML report is written.
Note
This option is only available for e_fixed_position mode.

void pdftron::PDF::HTMLOutputOptions::SetScale ( double  scale)

Set an overall scaling of the generated HTML pages. Default is 1.0.

Parameters
scale  A number greater than 0 which is used as a scale factor. For example, calling SetScale(0.5) will reduce the HTML body of the page to half its original size, whereas SetScale(2) will double the HTML body dimensions of the page and will rescale all page content appropriately.
Note
This option is only available for e_fixed_position mode.
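The effect of the scale factor can be worked through numerically. The sketch below assumes the factor applies to each dimension of the HTML body, which is how the SetScale(2) example reads; PageSize and ScaledPageSize are hypothetical names, not SDK types.

```cpp
#include <cassert>

// Hypothetical stand-in for the dimensions of a page's HTML body.
struct PageSize {
    double width;
    double height;
};

// Hypothetical helper (not an SDK function): applies a SetScale-style
// factor to both dimensions. The factor must be greater than 0.
PageSize ScaledPageSize(double width, double height, double scale) {
    assert(scale > 0.0);
    return PageSize{width * scale, height * scale};
}
```

Under that assumption, a 612 x 792 body becomes 306 x 396 at a scale of 0.5 and 1224 x 1584 at a scale of 2.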

void pdftron::PDF::HTMLOutputOptions::SetSearchableImageSetting ( SearchableImageSetting  setting)

Specifies how scanned image pages should be converted. Default is e_ocr_image_text.

Note
This option is only available for e_reflow_paragraphs and e_reflow_full modes.
Parameters
setting  the searchable image setting.
Remarks
In e_reflow_paragraphs mode, this feature does not perform OCR, but instead it relies on pre-existing text from previous OCR. Both images and pre-existing hidden text are kept by default. In e_reflow_full mode, pre-existing OCRed content is ignored and a new OCR is performed from scratch by default. e_ocr_off can be used to disable OCR.
See Also
SearchableImageSetting

void pdftron::PDF::HTMLOutputOptions::SetSimpleLists ( bool  enable)

Determines whether to use tags for list items. Default is false.

Note
This option is only available for e_reflow_paragraphs mode. In e_reflow_full mode, list items always use tags.
Parameters
enable  if true, tags are used for list items.

void pdftron::PDF::HTMLOutputOptions::SetSimplifyText ( bool  enable)

Controls whether the converter optimizes the DOM or preserves text placement accuracy. Default is false.

Parameters
enable  if true, the converter will try to reduce DOM complexity at the expense of text placement accuracy.
Note
This option is only available for e_fixed_position mode.

void pdftron::PDF::HTMLOutputOptions::SetTitle ( const UString &  title)

Specifies the title for the output HTML.

Note
This option is only available for e_reflow_paragraphs mode. HTML titles are not supported in other modes at the moment.
Parameters
title  the title of the output HTML.
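Because most setters above apply only to some reflow modes, it can be useful to collect the Note of each one in a single lookup. The following sketch transcribes those notes into a table; ModeMask, OptionModes, kOptionModes, and IsOptionAvailable are hypothetical names that stand in for the SDK types, and the table reflects only what this page states.

```cpp
#include <cassert>
#include <cstring>

// Local stand-ins for the documented modes, as bit flags.
enum ModeMask : unsigned {
    kFixed = 1u << 0,             // e_fixed_position
    kReflowParagraphs = 1u << 1,  // e_reflow_paragraphs
    kReflowFull = 1u << 2         // e_reflow_full
};

struct OptionModes {
    const char* setter;  // setter name as listed on this page
    unsigned modes;      // OR-ed ModeMask values from its Note
};

// Transcribed from the Note of each setter documented above.
static const OptionModes kOptionModes[] = {
    {"SetConnectHyphens", kReflowParagraphs | kReflowFull},
    {"SetDisableVerticalSplit", kReflowParagraphs},
    {"SetDPI", kFixed},
    {"SetEmbedImages", kReflowParagraphs | kReflowFull},
    {"SetExternalLinks", kFixed},
    {"SetFileConversionTimeoutSeconds", kReflowParagraphs},
    {"SetImageDPI", kReflowParagraphs},
    {"SetInternalLinks", kFixed},
    {"SetJPGQuality", kFixed | kReflowParagraphs},
    {"SetLanguage", kReflowFull},
    {"SetMaximumImagePixels", kFixed},
    {"SetNoPageWidth", kReflowParagraphs},
    {"SetPages", kReflowParagraphs | kReflowFull},
    {"SetPDFPassword", kReflowParagraphs | kReflowFull},
    {"SetPreferJPG", kFixed | kReflowParagraphs},
    {"SetReportFile", kFixed},
    {"SetScale", kFixed},
    {"SetSearchableImageSetting", kReflowParagraphs | kReflowFull},
    {"SetSimpleLists", kReflowParagraphs},
    {"SetSimplifyText", kFixed},
    {"SetTitle", kReflowParagraphs},
};

// Returns true if the named setter is documented as available in the
// given mode. Unknown setter names return false.
bool IsOptionAvailable(const char* setter, ModeMask mode) {
    for (const OptionModes& entry : kOptionModes) {
        if (std::strcmp(entry.setter, setter) == 0) {
            return (entry.modes & mode) != 0;
        }
    }
    return false;
}
```

A caller could consult the table before configuring an options object, for example to warn that SetDPI has no documented effect once e_reflow_full is selected.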

Friends And Related Function Documentation

friend class Convert
friend

Definition at line 1785 of file Convert.h.

Member Data Documentation

TRN_Obj pdftron::PDF::HTMLOutputOptions::m_obj
protected

Definition at line 1784 of file Convert.h.

SDF::ObjSet pdftron::PDF::HTMLOutputOptions::m_objset
protected

Definition at line 1786 of file Convert.h.

