Class: DataExtractionOptions

PDFNet.DataExtractionModule.DataExtractionOptions


new DataExtractionOptions()

Options for PDFNet.DataExtractionModule.extractData, PDFNet.DataExtractionModule.extractToXLSX, and PDFNet.DataExtractionModule.extractToXLSXWithFilter.

Methods


getDeepLearningAssist()

Gets the value of DeepLearningAssist from the options object. Specifies whether deep learning is used for table recognition in the DocStructure engine. The default is false. When true, table recognition accuracy improves at the cost of increased processing time. This option affects only the DocStructure engine.
Returns:
the current value for DeepLearningAssist.
Type
boolean

getFormExtractionEngine()

Gets the value of FormExtractionEngine from the options object. Specifies the form extraction engine used in DetectAndAddFormFieldsToPDF, either 'Form' or 'FormKeyValue'. The default is 'Form'. Note: The 'FormKeyValue' engine is experimental and subject to change.
Returns:
the current value for FormExtractionEngine.
Type
string

getLanguage()

Gets the value of Language from the options object. Specifies the OCR language(s) as 3-letter ISO 639-2 language codes separated by spaces. Example: "eng deu spa fra". The default is English.
Returns:
the current value for Language.
Type
string

getOverlappingFormFieldBehavior()

Gets the value of OverlappingFormFieldBehavior from the options object. Determines which field is kept when a detected form field overlaps an existing one: the existing field ('KeepOld') or the newly detected field ('KeepNew', the default).
Returns:
the current value for OverlappingFormFieldBehavior.
Type
string

getPages()

Gets the value of Pages from the options object. Specifies a range of pages to be converted, such as "1-5". By default, all pages are converted. Page numbering starts at 1.
Returns:
the current value for Pages.
Type
string

getPDFPassword()

Gets the value of PDFPassword from the options object. Specifies the password to use if the PDF requires one. The default is no password.
Returns:
the current value for PDFPassword.
Type
string

setDeepLearningAssist(value)

Sets the value of DeepLearningAssist in the options object. Specifies whether deep learning is used for table recognition in the DocStructure engine. The default is false. When true, table recognition accuracy improves at the cost of increased processing time. This option affects only the DocStructure engine.
Parameters:
Name Type Description
value boolean the new value for DeepLearningAssist
Returns:
this object, for call chaining
Type
PDFNet.DataExtractionModule.DataExtractionOptions

setFormExtractionEngine(value)

Sets the value of FormExtractionEngine in the options object. Specifies the form extraction engine used in DetectAndAddFormFieldsToPDF, either 'Form' or 'FormKeyValue'. The default is 'Form'. Note: The 'FormKeyValue' engine is experimental and subject to change.
Parameters:
Name Type Description
value string the new value for FormExtractionEngine
Returns:
this object, for call chaining
Type
PDFNet.DataExtractionModule.DataExtractionOptions

setLanguage(value)

Sets the value of Language in the options object. Specifies the OCR language(s) as 3-letter ISO 639-2 language codes separated by spaces. Example: "eng deu spa fra". The default is English.
Parameters:
Name Type Description
value string the new value for Language
Returns:
this object, for call chaining
Type
PDFNet.DataExtractionModule.DataExtractionOptions
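
The Language value is a single space-separated string, so it can be convenient to assemble it from a list of codes. The following is a minimal sketch: toLanguageString is a hypothetical helper, not part of PDFNet, and it checks only the shape of each code (three lowercase letters, matching the documented example), not whether the code is a real ISO 639-2 entry or one the OCR engine supports.

```javascript
// Hypothetical helper (not part of PDFNet): joins language codes into the
// space-separated form described for setLanguage, e.g. "eng deu spa fra".
function toLanguageString(codes) {
  const cleaned = codes.map((code) => String(code).trim());
  for (const code of cleaned) {
    // Shape check only: three lowercase ASCII letters.
    // Assumption: lowercase is required, as in the documented example.
    if (!/^[a-z]{3}$/.test(code)) {
      throw new RangeError(`Expected a 3-letter language code, got "${code}"`);
    }
  }
  return cleaned.join(' ');
}
```

The result can then be passed to the setter, for example options.setLanguage(toLanguageString(['eng', 'deu'])).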

setOverlappingFormFieldBehavior(value)

Sets the value of OverlappingFormFieldBehavior in the options object. Determines which field is kept when a detected form field overlaps an existing one: the existing field ('KeepOld') or the newly detected field ('KeepNew', the default).
Parameters:
Name Type Description
value string the new value for OverlappingFormFieldBehavior
Returns:
this object, for call chaining
Type
PDFNet.DataExtractionModule.DataExtractionOptions
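
Both form-related options take string values from a small documented set, so a typo is easy to miss. The sketch below checks the values before calling the setters. configureFormDetection and the two constant arrays are hypothetical names introduced for illustration; whether the library itself rejects other strings is not stated in this reference, so the check here is an added safeguard, not documented behavior.

```javascript
// Values listed in this reference for the two form-related options.
const FORM_ENGINES = ['Form', 'FormKeyValue'];
const OVERLAP_BEHAVIORS = ['KeepOld', 'KeepNew'];

// Hypothetical helper (not part of PDFNet). `options` is expected to be a
// DataExtractionOptions instance; the defaults mirror the documented ones.
function configureFormDetection(options, engine = 'Form', overlap = 'KeepNew') {
  if (!FORM_ENGINES.includes(engine)) {
    throw new RangeError(`Unknown form extraction engine: "${engine}"`);
  }
  if (!OVERLAP_BEHAVIORS.includes(overlap)) {
    throw new RangeError(`Unknown overlap behavior: "${overlap}"`);
  }
  // Each setter returns the options object, so the calls can be chained.
  return options
    .setFormExtractionEngine(engine)
    .setOverlappingFormFieldBehavior(overlap);
}
```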

setPages(value)

Sets the value of Pages in the options object. Specifies a range of pages to be converted, such as "1-5". By default, all pages are converted. Page numbering starts at 1.
Parameters:
Name Type Description
value string the new value for Pages
Returns:
this object, for call chaining
Type
PDFNet.DataExtractionModule.DataExtractionOptions
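
Because page numbers are 1-based and the value is a string, building the range from numbers helps avoid off-by-one mistakes. This is a minimal sketch: toPageRange is a hypothetical helper, not part of PDFNet, and it produces only the "first-last" form shown in the description. Other range syntaxes are not covered by this reference and are not assumed here.

```javascript
// Hypothetical helper (not part of PDFNet): builds a "first-last" page
// range string such as "1-5" from 1-based page numbers.
function toPageRange(first, last) {
  if (!Number.isInteger(first) || !Number.isInteger(last)) {
    throw new TypeError('Page numbers must be integers');
  }
  // The first page of a document is page 1, so 0 and negatives are invalid.
  if (first < 1 || last < first) {
    throw new RangeError(`Invalid page range: ${first}-${last}`);
  }
  return `${first}-${last}`;
}
```

The result can be passed directly to the setter, for example options.setPages(toPageRange(1, 5)).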

setPDFPassword(value)

Sets the value of PDFPassword in the options object. Specifies the password to use if the PDF requires one. The default is no password.
Parameters:
Name Type Description
value string the new value for PDFPassword
Returns:
this object, for call chaining
Type
PDFNet.DataExtractionModule.DataExtractionOptions
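
Since every setter returns the options object, several options can be set in one chained expression. The sketch below illustrates this with a hypothetical function, configureForTables, that accepts an already constructed options object. PDFNet initialization, construction of the options object, and the extraction call itself are omitted, and the specific values are illustrative only.

```javascript
// Hypothetical helper (not part of PDFNet). `options` is expected to be a
// PDFNet.DataExtractionModule.DataExtractionOptions instance.
function configureForTables(options) {
  return options
    .setPages('1-5')              // convert only pages 1 through 5
    .setLanguage('eng deu')       // OCR in English and German
    .setDeepLearningAssist(true); // better table recognition, slower processing
}
```

The returned object is the same instance that was passed in, so it can be handed to one of the extraction functions named at the top of this page.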