pdftron::PDF::OCROptions Class Reference

#include <OCROptions.h>

Public Member Functions

 OCROptions ()
 
 ~OCROptions ()
 
OCROptions & AddIgnoreZonesForPage (const RectCollection &regions, int page_num)
 
OCROptions & AddLang (const UString &lang)
 
OCROptions & AddTextZonesForPage (const RectCollection &regions, int page_num)
 
OCROptions & AddDPI (int dpi)
 
OCROptions & SetUsePDFPageCoords (bool value)
 
OCROptions & SetIgnoreExistingText (bool value)
 
bool GetAutoRotate ()
 
OCROptions & SetAutoRotate (bool value)
 
UString GetOCREngine ()
 
OCROptions & SetOCREngine (const UString &value)
 

Detailed Description

Definition at line 10 of file OCROptions.h.

Constructor & Destructor Documentation

pdftron::PDF::OCROptions::OCROptions ( )
pdftron::PDF::OCROptions::~OCROptions ( )

Member Function Documentation

OCROptions& pdftron::PDF::OCROptions::AddDPI ( int  dpi)

Knowing the proper image resolution is important because it lets the OCR engine translate the pixel heights of characters into their respective font sizes. We do our best to retrieve resolution information from the input's metadata, but that information is occasionally corrupt or missing. Hence we allow a manual override of the source's resolution, which supersedes any metadata found (both explicit, as in image metadata, and implicit, as in a PDF).

Parameters
dpi: the image resolution
Returns
this object, for call chaining
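
To make the link between resolution and font size concrete, here is a minimal sketch of the arithmetic involved. PixelHeightToPoints is a hypothetical helper written for this illustration, not part of the SDK, and it assumes only the standard definition of a point as 1/72 inch.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper, not part of the PDFTron API.
// Converts a character height in pixels to a size in points,
// using the standard definition 1 pt = 1/72 inch.
double PixelHeightToPoints(double pixel_height, int dpi) {
    return pixel_height * 72.0 / static_cast<double>(dpi);
}
```

Under this conversion a 50-pixel-tall character corresponds to 12 pt at 300 DPI but 24 pt at 150 DPI, which is why missing or wrong resolution metadata can distort the font sizes an engine infers.
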
OCROptions& pdftron::PDF::OCROptions::AddIgnoreZonesForPage ( const RectCollection & regions,
int  page_num 
)

Adds a collection of ignorable regions for the given page: an optional list of page areas to exclude from analysis.

Parameters
regions: the zones to be added to the ignore list
page_num: the page number the added regions belong to
Returns
this object, for call chaining
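
One way to picture per-page ignore zones is as a map from page number to a list of rectangles. The sketch below is an assumption-laden illustration only: Zone, IgnoreZonesSketch, AddForPage, and IsIgnored are hypothetical names, and the SDK's actual storage and matching rules are not described in this reference.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical stand-in for a rectangle; the real SDK uses RectCollection.
struct Zone { double x1, y1, x2, y2; };

// Hypothetical container, not the SDK's implementation.
class IgnoreZonesSketch {
public:
    // Appends zones to whatever is already registered for page_num.
    IgnoreZonesSketch& AddForPage(const std::vector<Zone>& zones, int page_num) {
        std::vector<Zone>& dst = zones_by_page_[page_num];
        dst.insert(dst.end(), zones.begin(), zones.end());
        return *this;
    }

    // True when the point (x, y) on page_num lies inside any registered zone.
    bool IsIgnored(int page_num, double x, double y) const {
        auto it = zones_by_page_.find(page_num);
        if (it == zones_by_page_.end()) return false;
        for (const Zone& z : it->second) {
            if (x >= z.x1 && x <= z.x2 && y >= z.y1 && y <= z.y2) return true;
        }
        return false;
    }

private:
    std::map<int, std::vector<Zone>> zones_by_page_;
};
```

Keying the zones by page number keeps regions registered for one page from affecting any other page.
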
OCROptions& pdftron::PDF::OCROptions::AddLang ( const UString & lang)

Adds a language to the list of languages to be considered when processing this document.

Parameters
lang: the new language to be added to the language list
Returns
this object, for call chaining
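
The "returns this object, for call chaining" contract shared by the setters on this page can be sketched with a small stand-in class. OptionsSketch and its Langs accessor are hypothetical and exist only for this illustration; std::string stands in for UString, and the language codes used in the usage note are illustrative values rather than a list of supported languages.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in that mimics the documented chaining contract.
// It is not the SDK class and does not perform any OCR.
class OptionsSketch {
public:
    OptionsSketch& AddLang(const std::string& lang) {
        langs_.push_back(lang);
        return *this;  // returning *this is what enables chaining
    }

    OptionsSketch& SetAutoRotate(bool value) {
        auto_rotate_ = value;
        return *this;
    }

    bool GetAutoRotate() const { return auto_rotate_; }
    const std::vector<std::string>& Langs() const { return langs_; }

private:
    std::vector<std::string> langs_;
    bool auto_rotate_ = false;  // mirrors the documented default for AutoRotate
};
```

With this shape, a caller can write `opts.AddLang("eng").AddLang("deu").SetAutoRotate(true);` as a single statement, because each call returns a reference to the same object.
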
OCROptions& pdftron::PDF::OCROptions::AddTextZonesForPage ( const RectCollection & regions,
int  page_num 
)

Adds a collection of text regions of interest for the given page: an optional list of known text zones that will be used to improve OCR quality.

Parameters
regions: the zones to be added to the text region list
page_num: the page number the added regions belong to
Returns
this object, for call chaining
bool pdftron::PDF::OCROptions::GetAutoRotate ( )

Gets the value of AutoRotate from the options object. The default value is false. Setting it to true will deskew the image before conducting OCR.

Returns
a bool, the current value for AutoRotate.
UString pdftron::PDF::OCROptions::GetOCREngine ( )

Gets the value of OCREngine from the options object: the backend engine to use for OCR processing. Options include 'default', 'any', or 'iris'. The chosen module must be present and correctly licensed.

Returns
a UString, the current value for OCREngine.
OCROptions& pdftron::PDF::OCROptions::SetAutoRotate ( bool  value)

Sets the value for AutoRotate in the options object. The default value is false. Setting it to true will deskew the image before conducting OCR.

Parameters
value: the new value for AutoRotate
Returns
this object, for call chaining
OCROptions& pdftron::PDF::OCROptions::SetIgnoreExistingText ( bool  value)

Sets the value for IgnoreExistingText in the options object. The default value is false, so areas with existing text are automatically skipped during OCR. Setting it to true probably only makes sense when used with GetOCRJson/XML, because pre-existing text might end up duplicated in the document when used with ImageToPDF and ProcessPDF.

Parameters
value: the new value for IgnoreExistingText
Returns
this object, for call chaining
OCROptions& pdftron::PDF::OCROptions::SetOCREngine ( const UString & value)

Sets the value for OCREngine in the options object: the backend engine to use for OCR processing. Options include 'default', 'any', or 'iris'. The chosen module must be present and correctly licensed.

Parameters
value: the new value for OCREngine
Returns
this object, for call chaining
OCROptions& pdftron::PDF::OCROptions::SetUsePDFPageCoords ( bool  value)

Sets the value for UsePDFPageCoords in the options object, which selects the origin of the coordinate system used for input/output.

Parameters
value: if true, the origin of the coordinate system for input/output is the bottom-left corner and the y-axis points upward; otherwise the origin is the top-left corner and the y-axis points downward
Returns
this object, for call chaining
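
The two coordinate conventions differ only by a vertical flip about the page height. The sketch below shows that conversion under the assumption that both systems use the same units and that the page height is known; Box and FlipVertical are hypothetical names for this illustration, not SDK types or functions.

```cpp
#include <cassert>

// Hypothetical box type for illustration; not the SDK's Rect.
struct Box { double x1, y1, x2, y2; };

// Converts a box between a top-left-origin, y-down system and a
// bottom-left-origin, y-up system for a page of the given height.
// The mapping is its own inverse, so it works in either direction.
Box FlipVertical(const Box& b, double page_height) {
    // The larger y in the source system maps to the smaller y in the
    // target system, so y1 <= y2 is preserved after the flip.
    return Box{b.x1, page_height - b.y2, b.x2, page_height - b.y1};
}
```

For a page 792 units tall, a zone spanning y = 20 to y = 70 measured from the top maps to y = 722 to y = 772 measured from the bottom, and applying the function again restores the original values. The x-coordinates are unchanged because both conventions place the origin on the left edge.
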

The documentation for this class was generated from the following file: