Class OCROptions
Namespace: pdftron.PDF
Assembly: PDFNet.dll
Syntax
public class OCROptions : OptionsBase
Constructors
OCROptions()
Constructor.
Declaration
public OCROptions()
Methods
AddDPI(int)
Manually sets the resolution of the source image. Knowing the correct image resolution is important because it lets the OCR engine translate the pixel heights of characters into their respective font sizes. Resolution is normally read from the input's metadata, but that metadata is occasionally corrupt or missing. A value set here overrides the source's resolution and supersedes any metadata found, both explicit (as in image metadata) and implicit (as in PDF).
Declaration
public OCROptions AddDPI(int dpi)
Parameters
Type | Name | Description |
---|---|---|
int | dpi | image resolution |
Returns
Type | Description |
---|---|
OCROptions | this object, for call chaining |
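To illustrate why the resolution matters, the relationship between a character's pixel height and its font size can be sketched as follows. This is plain arithmetic based on the standard definition of a point as 1/72 inch, not the OCR engine's internal logic; the class `DpiSketch` and the helper `PixelsToPoints` are hypothetical names introduced only for this example.

```csharp
using System;

public static class DpiSketch
{
    // Hypothetical helper: converts a character height in pixels
    // to a size in points, using 1 point = 1/72 inch.
    public static double PixelsToPoints(double pixelHeight, int dpi)
    {
        if (dpi <= 0) throw new ArgumentOutOfRangeException(nameof(dpi));
        return pixelHeight * 72.0 / dpi;
    }

    public static void Main()
    {
        // The same 50-pixel-tall character implies a different font size
        // depending on the resolution assumed for the image.
        Console.WriteLine(PixelsToPoints(50, 300)); // 12
        Console.WriteLine(PixelsToPoints(50, 150)); // 24
    }
}
```

If the metadata reported 150 DPI for an image actually scanned at 300 DPI, the inferred font size would be off by a factor of two, which is the kind of case where a manual override such as `AddDPI(300)` would apply.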
AddIgnoreZonesForPage(RectCollection, int)
Adds a collection of ignorable regions for the given page. This is an optional list of page areas that will not be processed.
Declaration
public OCROptions AddIgnoreZonesForPage(RectCollection regions, int pageNum)
Parameters
Type | Name | Description |
---|---|---|
RectCollection | regions | optional list of page areas to be excluded from analysis |
int | pageNum | the page number the added regions belong to |
Returns
Type | Description |
---|---|
OCROptions | this object, for call chaining |
AddLang(string)
Adds a value to the Langs array (the list of languages).
Declaration
public OCROptions AddLang(string value)
Parameters
Type | Name | Description |
---|---|---|
string | value | The language to add to the list of languages |
Returns
Type | Description |
---|---|
OCROptions | this object, for call chaining |
AddTextZonesForPage(RectCollection, int)
Adds a collection of known text regions for the given page. This information will be used as a hint to improve OCR quality.
Declaration
public OCROptions AddTextZonesForPage(RectCollection regions, int pageNum)
Parameters
Type | Name | Description |
---|---|---|
RectCollection | regions | optional list of known text regions |
int | pageNum | the page number the added regions belong to |
Returns
Type | Description |
---|---|
OCROptions | this object, for call chaining |
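Because each of these methods returns the options object, the calls can be chained. The following is a minimal sketch of that pattern, assuming the PDFNet SDK is referenced and `PDFNet.Initialize` has already been called. The `RectCollection.AddRect(Rect)` call is taken from memory of the PDFNet OCR samples rather than from this page, and the language code `"eng"`, the rectangle coordinates, and the class name `OcrOptionsSketch` are illustrative assumptions.

```csharp
using pdftron.PDF;

public static class OcrOptionsSketch
{
    // Builds an OCROptions object with one ignore zone and one text zone on page 1.
    // All coordinates are arbitrary example values.
    public static OCROptions Build()
    {
        // Area of page 1 to exclude from analysis.
        RectCollection ignoreZones = new RectCollection();
        ignoreZones.AddRect(new Rect(424, 163, 493, 730)); // assumed overload

        // Area of page 1 known to contain text, passed as a hint.
        RectCollection textZones = new RectCollection();
        textZones.AddRect(new Rect(140, 870, 310, 920)); // assumed overload

        // Each Add* call returns the same object, so the calls chain.
        return new OCROptions()
            .AddLang("eng")
            .AddIgnoreZonesForPage(ignoreZones, 1)
            .AddTextZonesForPage(textZones, 1);
    }
}
```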
GetAutoRotate()
Gets the value of AutoRotate from the options object. The default value is false. When set to true, the image is deskewed before OCR is conducted.
Declaration
public bool GetAutoRotate()
Returns
Type | Description |
---|---|
bool | The current AutoRotate value. The default value is false; true means the image is deskewed before OCR is conducted. |
SetAutoRotate(bool)
Sets the value for AutoRotate in the options object. The default value is false. Setting it to true will deskew the image before conducting OCR.
Declaration
public OCROptions SetAutoRotate(bool value)
Parameters
Type | Name | Description |
---|---|---|
bool | value | The default value is false. Setting it to true will deskew the image before conducting OCR. |
Returns
Type | Description |
---|---|
OCROptions | this object, for call chaining |
SetIgnoreExistingText(bool)
Sets the value for IgnoreExistingText in the options object. The default value is false, so areas with existing text are automatically skipped during OCR. Setting it to true probably only makes sense with GetOCRJson/XML, because pre-existing text might end up duplicated in the document when used with ImageToPDF and ProcessPDF.
Declaration
public OCROptions SetIgnoreExistingText(bool value)
Parameters
Type | Name | Description |
---|---|---|
bool | value | The default value is false, so areas with existing text are automatically skipped during OCR. Setting it to true probably only makes sense with GetOCRJson/XML, because pre-existing text might end up duplicated in the document when used with ImageToPDF and ProcessPDF. |
Returns
Type | Description |
---|---|
OCROptions | this object, for call chaining |
SetOCREngine(string)
Sets the backend processing engine to use for OCR operations. Options include 'default', 'any', or 'iris'. The chosen module must be present and correctly licensed.
Declaration
public OCROptions SetOCREngine(string value)
Parameters
Type | Name | Description |
---|---|---|
string | value | the new value for the OCR Engine |
Returns
Type | Description |
---|---|
OCROptions | this object, for call chaining |
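The setters documented above can be combined in a single chained expression. The sketch below assumes the PDFNet SDK is referenced and initialized; the class name `OcrSetterSketch` is hypothetical, and the chosen values are examples rather than recommendations.

```csharp
using pdftron.PDF;

public static class OcrSetterSketch
{
    public static OCROptions Build()
    {
        return new OCROptions()
            .SetAutoRotate(true)           // deskew the image before OCR
            .SetIgnoreExistingText(false)  // default: skip areas that already contain text
            .SetOCREngine("default");      // one of 'default', 'any', or 'iris'
    }
}
```

`GetAutoRotate()` on the returned object would then be expected to report the value passed to `SetAutoRotate`.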
SetUsePDFPageCoords(bool)
Sets the value for UsePDFPageCoords in the options object. This selects the origin of the coordinate system used for input and output.
Declaration
public OCROptions SetUsePDFPageCoords(bool value)
Parameters
Type | Name | Description |
---|---|---|
bool | value | If true, sets the origin of the coordinate system for input/output to the bottom left corner and reverses the direction of the y-coordinate axis from downward to upward; otherwise the top left corner is used as the origin and the y-coordinate axis points downward. |
Returns
Type | Description |
---|---|
OCROptions | this object, for call chaining |
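The difference between the two conventions comes down to flipping the y-axis about the page height. The following sketch shows that conversion under the assumption that both systems use the same units and the same page height; it is not the library's own implementation, and the names `CoordSketch` and `FlipY` are hypothetical.

```csharp
using System;

public static class CoordSketch
{
    // Converts a y-coordinate measured downward from the top edge into one
    // measured upward from the bottom edge. The same formula converts back.
    public static double FlipY(double y, double pageHeight)
    {
        return pageHeight - y;
    }

    public static void Main()
    {
        double pageHeight = 792; // example page height (US Letter, in points)

        // A point 100 units below the top edge is 692 units above the bottom edge.
        Console.WriteLine(FlipY(100, pageHeight)); // 692
    }
}
```

When flipping a rectangle this way, its top and bottom edges swap roles, so the two converted y-values may need to be reordered.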