Class DataExtractionModule
Static interface to the Apryse SDK's data extraction functionality.
Namespace: pdftron.PDF
Assembly: PDFTronDotNet.dll
Syntax
public static class DataExtractionModule
Methods
DetectAndAddFormFieldsToPDF(PDFDoc)
Declaration
public static void DetectAndAddFormFieldsToPDF(PDFDoc doc)
Parameters
Type | Name | Description |
---|---|---|
PDFDoc | doc | The PDF document to add detected form fields to |
DetectAndAddFormFieldsToPDF(PDFDoc, DataExtractionOptions)
Declaration
public static void DetectAndAddFormFieldsToPDF(PDFDoc doc, DataExtractionOptions options)
Parameters
Type | Name | Description |
---|---|---|
PDFDoc | doc | The PDF document to add detected form fields to |
DataExtractionOptions | options | Data extraction options |
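Because both overloads return void and take an open PDFDoc, a caller presumably has to save the document afterwards to keep the added fields. The following is a minimal sketch under that assumption; the license key placeholder and both file names are hypothetical, and `PDFNet.Initialize`, `PDFDoc.Save`, and `SDFDoc.SaveOptions.e_linearized` come from the wider SDK rather than from this page.

```csharp
using System;
using pdftron;
using pdftron.PDF;
using pdftron.SDF;

class DetectFormFieldsSketch
{
    static void Main()
    {
        // Placeholder license key (assumption): replace with a real key.
        PDFNet.Initialize("YOUR_LICENSE_KEY");

        // "scanned-form.pdf" is a hypothetical input file.
        using (PDFDoc doc = new PDFDoc("scanned-form.pdf"))
        {
            // Detect form fields and add them to the in-memory document.
            DataExtractionModule.DetectAndAddFormFieldsToPDF(doc);

            // The call returns nothing, so the result is saved explicitly.
            doc.Save("scanned-form-with-fields.pdf", SDFDoc.SaveOptions.e_linearized);
        }
    }
}
```

The two-argument overload is called the same way, with a DataExtractionOptions instance passed as the second argument.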
ExtractData(string, string, DataExtractionEngine)
Perform data extraction on a PDF file using the specified engine. Note: The FormKeyValue engine is experimental and subject to change.
Declaration
public static void ExtractData(string input_pdf_file, string output_json_file, DataExtractionModule.DataExtractionEngine engine)
Parameters
Type | Name | Description |
---|---|---|
string | input_pdf_file | The source document filename |
string | output_json_file | The resulting JSON filename |
DataExtractionModule.DataExtractionEngine | engine | The extraction engine |
ExtractData(string, string, DataExtractionEngine, DataExtractionOptions)
Perform data extraction on a PDF file using the specified engine. Note: The FormKeyValue engine is experimental and subject to change.
Declaration
public static void ExtractData(string input_pdf_file, string output_json_file, DataExtractionModule.DataExtractionEngine engine, DataExtractionOptions options)
Parameters
Type | Name | Description |
---|---|---|
string | input_pdf_file | The source document filename |
string | output_json_file | The resulting JSON filename |
DataExtractionModule.DataExtractionEngine | engine | The extraction engine |
DataExtractionOptions | options | Data extraction options |
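As an illustration of the file-to-file overloads, the sketch below writes the extraction result straight to a JSON file. Several names are assumptions that do not appear on this page: the enum member `e_tabular`, the `DataExtractionOptions.SetPages` setter, the license key placeholder, and the file names. Check them against the DataExtractionEngine and DataExtractionOptions references before relying on them.

```csharp
using System;
using pdftron;
using pdftron.PDF;

class ExtractDataToFileSketch
{
    static void Main()
    {
        // Placeholder license key (assumption).
        PDFNet.Initialize("YOUR_LICENSE_KEY");

        // Assumed option setter: restrict extraction to the first page.
        DataExtractionOptions options = new DataExtractionOptions();
        options.SetPages("1");

        // Hypothetical file names; e_tabular is an assumed enum member.
        DataExtractionModule.ExtractData(
            "table.pdf",
            "table.json",
            DataExtractionModule.DataExtractionEngine.e_tabular,
            options);

        Console.WriteLine("Wrote table.json");
    }
}
```

Dropping the `options` argument selects the three-argument overload.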
ExtractData(string, DataExtractionEngine)
Perform data extraction on a PDF file using the specified engine and return the resulting JSON string. Note: The FormKeyValue engine is experimental and subject to change.
Declaration
public static string ExtractData(string input_pdf_file, DataExtractionModule.DataExtractionEngine engine)
Parameters
Type | Name | Description |
---|---|---|
string | input_pdf_file | The source document filename |
DataExtractionModule.DataExtractionEngine | engine | The extraction engine |
Returns
Type | Description |
---|---|
string | JSON string representing the extracted results |
ExtractData(string, DataExtractionEngine, DataExtractionOptions)
Perform data extraction on a PDF file using the specified engine and return the resulting JSON string. Note: The FormKeyValue engine is experimental and subject to change.
Declaration
public static string ExtractData(string input_pdf_file, DataExtractionModule.DataExtractionEngine engine, DataExtractionOptions options)
Parameters
Type | Name | Description |
---|---|---|
string | input_pdf_file | The source document filename |
DataExtractionModule.DataExtractionEngine | engine | The extraction engine |
DataExtractionOptions | options | Data extraction options |
Returns
Type | Description |
---|---|
string | JSON string representing the extracted results |
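The string-returning overloads are convenient when the JSON is consumed in memory rather than written to disk. A minimal sketch follows, assuming the enum member `e_tabular` and a hypothetical input file; neither is defined on this page, and the structure of the returned JSON is not documented here, so the example only inspects the raw string.

```csharp
using System;
using pdftron;
using pdftron.PDF;

class ExtractDataToStringSketch
{
    static void Main()
    {
        // Placeholder license key (assumption).
        PDFNet.Initialize("YOUR_LICENSE_KEY");

        // Hypothetical input file; e_tabular is an assumed enum member.
        string json = DataExtractionModule.ExtractData(
            "financial.pdf",
            DataExtractionModule.DataExtractionEngine.e_tabular);

        // The JSON schema is not described on this page, so the sketch
        // reports only the size of the returned string.
        Console.WriteLine("Received " + json.Length + " characters of JSON");
    }
}
```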
ExtractToXLSX(string, string)
Perform data extraction on a PDF file and write the result in XLSX format.
Declaration
public static void ExtractToXLSX(string input_pdf_file, string output_xlsx_file)
Parameters
Type | Name | Description |
---|---|---|
string | input_pdf_file | The source document filename |
string | output_xlsx_file | The resulting XLSX filename |
ExtractToXLSX(string, string, DataExtractionOptions)
Perform data extraction on a PDF file and write the result in XLSX format.
Declaration
public static void ExtractToXLSX(string input_pdf_file, string output_xlsx_file, DataExtractionOptions options)
Parameters
Type | Name | Description |
---|---|---|
string | input_pdf_file | The source document filename |
string | output_xlsx_file | The resulting XLSX filename |
DataExtractionOptions | options | Data extraction options |
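For file-based XLSX output, a call might look like the sketch below. The file names and license key are placeholders, and `DataExtractionOptions.SetPages` is an assumed setter that is not documented on this page.

```csharp
using System;
using pdftron;
using pdftron.PDF;

class ExtractToXlsxFileSketch
{
    static void Main()
    {
        // Placeholder license key (assumption).
        PDFNet.Initialize("YOUR_LICENSE_KEY");

        // Two-argument overload: default options, hypothetical file names.
        DataExtractionModule.ExtractToXLSX("table.pdf", "table.xlsx");

        // Three-argument overload: assumed setter limits extraction to page 1.
        DataExtractionOptions options = new DataExtractionOptions();
        options.SetPages("1");
        DataExtractionModule.ExtractToXLSX("financial.pdf", "financial.xlsx", options);
    }
}
```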
ExtractToXLSX(string, Filter)
Perform data extraction on a PDF file and write the result in XLSX format to a filter.
Declaration
public static void ExtractToXLSX(string input_pdf_file, Filter output_xlsx_stream)
Parameters
Type | Name | Description |
---|---|---|
string | input_pdf_file | The source document filename |
Filter | output_xlsx_stream | The output filter that receives the resulting XLSX data |
ExtractToXLSX(string, Filter, DataExtractionOptions)
Perform data extraction on a PDF file and write the result in XLSX format to a filter.
Declaration
public static void ExtractToXLSX(string input_pdf_file, Filter output_xlsx_stream, DataExtractionOptions options)
Parameters
Type | Name | Description |
---|---|---|
string | input_pdf_file | The source document filename |
Filter | output_xlsx_stream | The output filter that receives the resulting XLSX data |
DataExtractionOptions | options | Data extraction options |
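The Filter overloads let the XLSX data stay in memory instead of going directly to a file. The sketch below assumes `pdftron.Filters.MemoryFilter` is a suitable Filter for this purpose and that `SetAsInputFilter` and `WriteToFile` behave as they do elsewhere in the SDK; none of these are described on this page, and the file names and license key are placeholders.

```csharp
using System;
using pdftron;
using pdftron.Filters;
using pdftron.PDF;

class ExtractToXlsxStreamSketch
{
    static void Main()
    {
        // Placeholder license key (assumption).
        PDFNet.Initialize("YOUR_LICENSE_KEY");

        // Output-mode in-memory filter (second argument false = not an input filter).
        MemoryFilter xlsxStream = new MemoryFilter(0, false);

        // Hypothetical input file; the XLSX bytes are written into the filter.
        DataExtractionModule.ExtractToXLSX("financial.pdf", xlsxStream);

        // Switch the filter to input mode so the written bytes can be read back,
        // then persist them to a hypothetical output file without appending.
        xlsxStream.SetAsInputFilter();
        xlsxStream.WriteToFile("financial.xlsx", false);
    }
}
```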
IsModuleAvailable(DataExtractionEngine)
Check whether the data extraction module for the specified engine is available (and licensed).
Declaration
public static bool IsModuleAvailable(DataExtractionModule.DataExtractionEngine engine)
Parameters
Type | Name | Description |
---|---|---|
DataExtractionModule.DataExtractionEngine | engine | The extraction engine |
Returns
Type | Description |
---|---|
bool | True if data extraction operations can be performed with the specified engine; otherwise false |
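A common pattern is to call IsModuleAvailable as a guard before any extraction call, so that a missing or unlicensed module is reported up front. The sketch below assumes the enum member `e_tabular`, a hypothetical module directory passed to `PDFNet.AddResourceSearchPath`, and placeholder file names; none of these are defined on this page.

```csharp
using System;
using pdftron;
using pdftron.PDF;

class ModuleAvailabilitySketch
{
    static void Main()
    {
        // Placeholder license key (assumption).
        PDFNet.Initialize("YOUR_LICENSE_KEY");

        // Hypothetical directory where the data extraction module was unpacked.
        PDFNet.AddResourceSearchPath("path/to/modules");

        // e_tabular is an assumed enum member.
        var engine = DataExtractionModule.DataExtractionEngine.e_tabular;

        if (!DataExtractionModule.IsModuleAvailable(engine))
        {
            Console.WriteLine("Data extraction module is not available or not licensed.");
            return;
        }

        // Safe to extract once the guard has passed; file names are hypothetical.
        DataExtractionModule.ExtractData("table.pdf", "table.json", engine);
    }
}
```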