new DataExtractionModule()
The class DataExtractionModule.
Static interface to the Apryse SDK's data extraction functionality.
Classes
Members
-
<static> DataExtractionEngine
-
Type:
- number
Properties:
- e_Tabular (number): 0
- e_Form (number): 1
- e_DocStructure (number): 2
- e_FormKeyValue (number): 3
Methods
-
<static> createDataExtractionOptions()
-
Creates a DataExtractionOptions object.
Returns:
A promise that resolves to a PDFNet.DataExtractionModule.DataExtractionOptions object.
- Type
- Promise.<PDFNet.DataExtractionModule.DataExtractionOptions>
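As a rough illustration of how the options object might be created and handed to an extraction call, here is a minimal sketch. The helper name `extractStructureWithOptions` is hypothetical, and `PDFNet` is assumed to be an already-initialized Apryse PDFNet namespace passed in by the caller. The setters on the options object are not described in this section, so none are called here.

```javascript
// Hypothetical helper, not part of the SDK.
// `PDFNet` is assumed to be an initialized Apryse PDFNet namespace.
async function extractStructureWithOptions(PDFNet, inputPath) {
  const { DataExtractionModule } = PDFNet;

  // Create the options object through the static factory method.
  const options = await DataExtractionModule.createDataExtractionOptions();

  // Configure `options` here if needed. Its setters belong to the
  // DataExtractionOptions class and are not covered in this section.

  // Pass the options as the optional last argument.
  return DataExtractionModule.extractDataAsString(
    inputPath,
    DataExtractionModule.DataExtractionEngine.e_DocStructure,
    options
  );
}
```

Taking `PDFNet` as a parameter keeps the sketch independent of how the library is loaded (Node.js module or browser bundle).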
-
<static> detectAndAddFormFieldsToPDF(doc [, options])
-
Perform automatic form field detection, then insert the fields into the PDF. Note: The FormKeyValue engine is experimental and subject to change.
Parameters:
- doc (PDFNet.PDFDoc | PDFNet.SDFDoc | PDFNet.FDFDoc): The PDF document in which fields are detected and into which they are inserted.
- options (PDFNet.DataExtractionModule.DataExtractionOptions, optional): Data extraction options.
Returns:
- Type
- Promise.<void>
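The detect-then-insert flow described above could look roughly like the following sketch. The helper name `addDetectedFormFields` is hypothetical. `PDFNet.PDFDoc.createFromFilePath`, `doc.save`, and `PDFNet.SDFDoc.SaveOptions.e_linearized` come from the wider PDFNet API rather than from this section, so treat their use here as an assumption about the surrounding environment.

```javascript
// Hypothetical helper, not part of the SDK.
// `PDFNet` is assumed to be an initialized Apryse PDFNet namespace.
async function addDetectedFormFields(PDFNet, inputPath, outputPath) {
  // Assumption: opening and saving use PDFDoc.createFromFilePath and
  // doc.save from the wider PDFNet API (not described in this section).
  const doc = await PDFNet.PDFDoc.createFromFilePath(inputPath);

  // Detect form fields and insert them into the same document.
  await PDFNet.DataExtractionModule.detectAndAddFormFieldsToPDF(doc);

  // Persist the modified document.
  await doc.save(outputPath, PDFNet.SDFDoc.SaveOptions.e_linearized);
  return outputPath;
}
```

Because the method modifies the document it is given, the result only reaches disk once the document is saved.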
-
<static> extractData(input_pdf_file, output_json_file, engine [, options])
-
Perform data extraction on a PDF file using the specified engine. Note: The FormKeyValue engine is experimental and subject to change.
Parameters:
- input_pdf_file (string): The source document filename.
- output_json_file (string): The resulting JSON filename.
- engine (number): The extraction engine. PDFNet.DataExtractionModule.DataExtractionEngine = { e_Tabular: 0, e_Form: 1, e_DocStructure: 2, e_FormKeyValue: 3 }
- options (PDFNet.DataExtractionModule.DataExtractionOptions, optional): Data extraction options.
Returns:
- Type
- Promise.<void>
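A minimal sketch of a file-to-file extraction call follows. The helper name `extractFormsToJsonFile` and the choice of the `e_Form` engine are illustrative assumptions, and `PDFNet` is assumed to be an already-initialized Apryse PDFNet namespace.

```javascript
// Hypothetical helper, not part of the SDK.
// `PDFNet` is assumed to be an initialized Apryse PDFNet namespace.
async function extractFormsToJsonFile(PDFNet, inputPath, outputJsonPath) {
  const { DataExtractionModule } = PDFNet;

  // Pick the engine by its enum name rather than by its numeric value.
  const engine = DataExtractionModule.DataExtractionEngine.e_Form;

  // The promise resolves with no value; the JSON is written to disk.
  await DataExtractionModule.extractData(inputPath, outputJsonPath, engine);
  return outputJsonPath;
}
```

Returning the output path is only a convenience for the caller, since the method itself resolves to nothing.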
-
<static> extractDataAsString(input_pdf_file, engine [, options])
-
Perform data extraction on a PDF file using the specified engine and return the resulting JSON string. Note: The FormKeyValue engine is experimental and subject to change.
Parameters:
- input_pdf_file (string): The source document filename.
- engine (number): The extraction engine. PDFNet.DataExtractionModule.DataExtractionEngine = { e_Tabular: 0, e_Form: 1, e_DocStructure: 2, e_FormKeyValue: 3 }
- options (PDFNet.DataExtractionModule.DataExtractionOptions, optional): Data extraction options.
Returns:
A promise that resolves to a JSON string representing the extracted results.
- Type
- Promise.<string>
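Since the result arrives as a JSON string, a caller will usually parse it before use. The sketch below shows one way to do that, guarded by an availability check. The helper name `extractTablesAsObject` is hypothetical, `PDFNet` is assumed to be an already-initialized Apryse PDFNet namespace, and nothing is assumed about the shape of the parsed result.

```javascript
// Hypothetical helper, not part of the SDK.
// `PDFNet` is assumed to be an initialized Apryse PDFNet namespace.
async function extractTablesAsObject(PDFNet, inputPath) {
  const { DataExtractionModule } = PDFNet;
  const engine = DataExtractionModule.DataExtractionEngine.e_Tabular;

  // Fail early with a clear message if the engine cannot be used.
  if (!(await DataExtractionModule.isModuleAvailable(engine))) {
    throw new Error('Tabular data extraction module is not available');
  }

  // The method resolves to a JSON string; parse it into a plain value.
  const json = await DataExtractionModule.extractDataAsString(inputPath, engine);
  return JSON.parse(json);
}
```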
-
<static> extractToXLSX(input_pdf_file, output_xlsx_file [, options])
-
Perform data extraction on a PDF in XLSX output format.
Parameters:
- input_pdf_file (string): The source document filename.
- output_xlsx_file (string): The resulting XLSX filename.
- options (PDFNet.DataExtractionModule.DataExtractionOptions, optional): Data extraction options.
Returns:
- Type
- Promise.<void>
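For batch work, the XLSX export can be called once per input file. The following sketch assumes a hypothetical helper `convertAllToXlsx` that derives each output name by swapping a trailing `.pdf` extension for `.xlsx`; that naming rule is an illustrative choice, not something the SDK prescribes. `PDFNet` is assumed to be an already-initialized Apryse PDFNet namespace.

```javascript
// Hypothetical helper, not part of the SDK.
// `PDFNet` is assumed to be an initialized Apryse PDFNet namespace.
async function convertAllToXlsx(PDFNet, pdfPaths) {
  const outputs = [];
  for (const pdfPath of pdfPaths) {
    // Illustrative naming rule: replace a trailing ".pdf" with ".xlsx".
    const xlsxPath = pdfPath.replace(/\.pdf$/i, '') + '.xlsx';

    // Process files one at a time so a failure points at a single input.
    await PDFNet.DataExtractionModule.extractToXLSX(pdfPath, xlsxPath);
    outputs.push(xlsxPath);
  }
  return outputs;
}
```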
-
<static> extractToXLSXWithFilter(input_pdf_file, output_xlsx_stream [, options])
-
Perform data extraction on a PDF and write the XLSX output to a filter.
Parameters:
- input_pdf_file (string): The source document filename.
- output_xlsx_stream (PDFNet.Filter): The resulting XLSX filter.
- options (PDFNet.DataExtractionModule.DataExtractionOptions, optional): Data extraction options.
Returns:
- Type
- Promise.<void>
-
<static> isModuleAvailable(engine)
-
Find out whether the specified data extraction module is available (and licensed).
Parameters:
- engine (number): The extraction engine. PDFNet.DataExtractionModule.DataExtractionEngine = { e_Tabular: 0, e_Form: 1, e_DocStructure: 2, e_FormKeyValue: 3 }
Returns:
A promise that resolves to true if data extraction operations can be performed.
- Type
- Promise.<boolean>
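One way to use this check is to probe every engine up front and report which ones are usable. The sketch below is a minimal illustration. The helper name `listAvailableEngines` is hypothetical, and `PDFNet` is assumed to be an already-initialized Apryse PDFNet namespace.

```javascript
// Hypothetical helper, not part of the SDK.
// `PDFNet` is assumed to be an initialized Apryse PDFNet namespace.
async function listAvailableEngines(PDFNet) {
  const { DataExtractionModule } = PDFNet;
  const engines = DataExtractionModule.DataExtractionEngine;
  const names = ['e_Tabular', 'e_Form', 'e_DocStructure', 'e_FormKeyValue'];

  const available = [];
  for (const name of names) {
    // Each call resolves to true when that engine is installed and licensed.
    if (await DataExtractionModule.isModuleAvailable(engines[name])) {
      available.push(name);
    }
  }
  return available;
}
```

Running a check like this once at startup lets an application disable unsupported features instead of failing at extraction time.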