public static class Convert.WordOutputOptions
extends Object

java.lang.Object
   ↳ com.pdftron.pdf.Convert.WordOutputOptions

Class Overview

A class containing options common to the toWord functions.

Summary

Constants
int e_ocr_always OCR is always performed on all pages.
int e_ocr_image Deprecated. OCR will not be performed.
int e_ocr_image_text Deprecated. OCR will be performed.
int e_ocr_off OCR will not be performed.
int e_ocr_text OCR is performed on scanned pages (default).
int e_wof_doc DOC output.
int e_wof_docx DOCX output (default).
int e_wof_rtf RTF output.
int e_wof_txt TXT output.
Public Constructors
WordOutputOptions()
Creates a WordOutputOptions object with default settings.
Public Methods
void setConnectHyphens(boolean connect)
Specifies whether hyphens in the PDF should be connected.
void setCustomOCRLanguage(String ocrlang)
Specifies the custom OCR languages to use.
void setLanguage(Convert.OutputOptionsOCR.LanguageChoice language)
Specifies the OCR language.
void setPDFPassword(String password)
Specifies the password if the PDF requires one.
void setPages(int page_from, int page_to)
Specifies a range of pages to be converted.
void setPreferredOCREngine(Convert.OutputOptionsOCR.PreferredOCREngine engine)
Specifies the preferred OCR engine.
void setPrioritizeVisualAppearance(boolean appearance)
Specifies whether to prefer an exact visual replica of the PDF, at the cost of losing paragraph reflow.
void setSearchableImageSetting(int setting)
Specifies how scanned image pages should be converted.
void setWordOutputFormat(int format)
Specifies the output document format (DOCX, DOC, RTF, TXT).
Inherited Methods
From class java.lang.Object

Constants

public static final int e_ocr_always

The Constant e_ocr_always. Indicates that OCR will always be performed on all pages, and the recognized text replaces the image pixels underneath.

Constant Value: 4 (0x00000004)

public static final int e_ocr_image

The Constant e_ocr_image. Deprecated. OCR will not be performed.

Constant Value: 1 (0x00000001)

public static final int e_ocr_image_text

The Constant e_ocr_image_text. Deprecated. OCR will be performed.

Constant Value: 0 (0x00000000)

public static final int e_ocr_off

The Constant e_ocr_off. Indicates that OCR will not be performed.

Constant Value: 3 (0x00000003)

public static final int e_ocr_text

The Constant e_ocr_text. Indicates that OCR will be performed on scanned pages, and the recognized text replaces the image pixels underneath (default).

Constant Value: 2 (0x00000002)

public static final int e_wof_doc

The Constant e_wof_doc. Indicates a DOC output.

Constant Value: 1 (0x00000001)

public static final int e_wof_docx

The Constant e_wof_docx. Indicates a DOCX output (default).

Constant Value: 0 (0x00000000)

public static final int e_wof_rtf

The Constant e_wof_rtf. Indicates an RTF output.

Constant Value: 2 (0x00000002)

public static final int e_wof_txt

The Constant e_wof_txt. Indicates a TXT output.

Constant Value: 3 (0x00000003)

Public Constructors

public WordOutputOptions ()

Creates a WordOutputOptions object with default settings.

Public Methods

public void setConnectHyphens (boolean connect)

Specifies whether hyphens in the PDF should be connected. Default is false.

Parameters
connect if true, hyphens in the PDF will be connected.

public void setCustomOCRLanguage (String ocrlang)

Specifies the custom OCR languages to use. Use 3-letter ISO 639-2 language codes, separated by spaces. Example: "eng deu spa fra". The default is English.

Parameters
ocrlang the OCR language(s).
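
The space-separated format described above can be assembled from a list of codes. The following is a minimal sketch, not part of the library: the class OcrLanguages and its join method are hypothetical helpers, and the check only verifies that each entry looks like a 3-letter lowercase code, not that it is a registered ISO 639-2 code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the PDFTron API.
final class OcrLanguages {
    // Three lowercase ASCII letters: the shape of an ISO 639-2 code.
    private static final Pattern CODE = Pattern.compile("[a-z]{3}");

    /** Joins the codes with single spaces, rejecting entries that are not three lowercase letters. */
    static String join(List<String> codes) {
        for (String code : codes) {
            if (!CODE.matcher(code).matches()) {
                throw new IllegalArgumentException("Not a 3-letter code: " + code);
            }
        }
        return String.join(" ", codes);
    }

    public static void main(String[] args) {
        // The resulting string is what would be passed to setCustomOCRLanguage(String).
        System.out.println(join(Arrays.asList("eng", "deu", "spa", "fra")));
    }
}
```

The returned string would then be handed to setCustomOCRLanguage on a WordOutputOptions instance.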

public void setLanguage (Convert.OutputOptionsOCR.LanguageChoice language)

Specifies the OCR language. Default is automatic language detection.

Parameters
language the OCR language.

public void setPDFPassword (String password)

Specifies the password if the PDF requires one.

Parameters
password the PDF password, if required; an empty string otherwise.

public void setPages (int page_from, int page_to)

Specifies a range of pages to be converted. By default all pages are converted. Page numbers start at 1.

Parameters
page_from the first page to be converted.
page_to the last page to be converted (inclusive). Use a negative value to specify the last page in the PDF.
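
To make the range semantics concrete (1-based numbering, inclusive page_to, negative page_to meaning the last page), here is a small sketch of how a caller might reason about a range before passing it to setPages. The class PageRange and both methods are hypothetical. The library resolves the range internally, and its handling of out-of-range values is not described here, so the validation below is an assumption.

```java
// Hypothetical helper, not part of the PDFTron API.
final class PageRange {
    /** A negative page_to stands for the last page of the document. */
    static int resolveLastPage(int pageTo, int pageCount) {
        return pageTo < 0 ? pageCount : pageTo;
    }

    /** Number of pages in the inclusive, 1-based range [pageFrom, pageTo]. */
    static int pagesToConvert(int pageFrom, int pageTo, int pageCount) {
        if (pageFrom < 1) {
            throw new IllegalArgumentException("page_from must be at least 1");
        }
        int last = resolveLastPage(pageTo, pageCount);
        if (last < pageFrom) {
            throw new IllegalArgumentException("page_to precedes page_from");
        }
        return last - pageFrom + 1;
    }

    public static void main(String[] args) {
        // Pages 3 through the end of a 10-page document.
        System.out.println(pagesToConvert(3, -1, 10));
    }
}
```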

public void setPreferredOCREngine (Convert.OutputOptionsOCR.PreferredOCREngine engine)

Specifies the preferred OCR engine.

Parameters
engine The PreferredOCREngine to use.

public void setPrioritizeVisualAppearance (boolean appearance)

Specifies whether to prefer an exact visual replica of the PDF, at the cost of losing paragraph reflow. Default is false.

Parameters
appearance if true, visual fidelity is prioritized over reflow. False is preferred for most documents that contain paragraphs. Consider using true for documents that don't flow, such as CAD drawings or Illustrator-generated files.

public void setSearchableImageSetting (int setting)

Specifies how scanned image pages should be converted. Default is e_ocr_text.

Parameters
setting the searchable image setting.
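
Code that still passes one of the deprecated values may want to move to a current one. The sketch below mirrors the constant values documented on this page in a hypothetical class, SearchableImageSettings. The mapping is an assumption inferred from the constant descriptions only: e_ocr_image ("OCR will not be performed") is mapped to e_ocr_off, and e_ocr_image_text ("OCR will be performed") to the default e_ocr_text. The library does not document an official replacement here.

```java
// Hypothetical helper, not part of the PDFTron API.
final class SearchableImageSettings {
    // Values mirrored from the constants documented on this page.
    static final int E_OCR_IMAGE_TEXT = 0; // deprecated
    static final int E_OCR_IMAGE = 1;      // deprecated
    static final int E_OCR_TEXT = 2;       // default
    static final int E_OCR_OFF = 3;
    static final int E_OCR_ALWAYS = 4;

    /** Maps deprecated settings to an assumed current equivalent; other known values pass through. */
    static int normalize(int setting) {
        switch (setting) {
            case E_OCR_IMAGE:
                return E_OCR_OFF;  // assumption: both mean "no OCR"
            case E_OCR_IMAGE_TEXT:
                return E_OCR_TEXT; // assumption: closest match is the default
            case E_OCR_TEXT:
            case E_OCR_OFF:
            case E_OCR_ALWAYS:
                return setting;
            default:
                throw new IllegalArgumentException("Unknown searchable image setting: " + setting);
        }
    }

    public static void main(String[] args) {
        // The normalized value is what would be passed to setSearchableImageSetting(int).
        System.out.println(normalize(E_OCR_IMAGE));
    }
}
```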

public void setWordOutputFormat (int format)

Specifies the output document format (DOCX, DOC, RTF, TXT). This is most useful when the output file extension is not .docx, .doc, or .rtf.

Parameters
format the output document format (DOCX, DOC, RTF, TXT).
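
When the output file name does not end in .docx, .doc, or .rtf, the format has to be chosen explicitly. The following sketch shows one way a caller might pick a value from a file name. WordFormatChooser and its methods are hypothetical; the integer values mirror the e_wof_* constants documented on this page; falling back to DOCX for unknown extensions is an assumption based on DOCX being the documented default.

```java
import java.util.Locale;

// Hypothetical helper, not part of the PDFTron API.
final class WordFormatChooser {
    // Values mirrored from the e_wof_* constants documented on this page.
    static final int E_WOF_DOCX = 0;
    static final int E_WOF_DOC = 1;
    static final int E_WOF_RTF = 2;
    static final int E_WOF_TXT = 3;

    /** Lowercased extension without the dot, or "" if the name has none. */
    static String extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    /** True for the extensions named in the method description (.docx, .doc, .rtf). */
    static boolean hasWordExtension(String fileName) {
        String ext = extensionOf(fileName);
        return ext.equals("docx") || ext.equals("doc") || ext.equals("rtf");
    }

    /** Picks a format value from the extension, falling back to DOCX. */
    static int formatFor(String fileName) {
        switch (extensionOf(fileName)) {
            case "doc": return E_WOF_DOC;
            case "rtf": return E_WOF_RTF;
            case "txt": return E_WOF_TXT;
            default:    return E_WOF_DOCX;
        }
    }

    public static void main(String[] args) {
        // The chosen value is what would be passed to setWordOutputFormat(int).
        System.out.println(formatFor("report.TXT"));
    }
}
```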