Class DataExtractionModule
Static interface to the PDFTron SDK's data extraction functionality.
Namespace: pdftron.PDF
Assembly: PDFNet.dll
Syntax
public sealed class DataExtractionModule
Constructors
DataExtractionModule()
Declaration
public DataExtractionModule()
Methods
DetectAndAddFormFieldsToPDF(PDFDoc)
Perform automatic form field detection, then insert the fields into the PDF.
Declaration
public static void DetectAndAddFormFieldsToPDF(PDFDoc doc)
Parameters
Type | Name | Description |
---|---|---|
PDFDoc | doc | The PDF document in which fields are detected and into which they are inserted |
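The following is a minimal sketch of how this overload might be used, not a definitive implementation. The class name `FormFieldDetectionExample` and the helper `AddDetectedFields` are hypothetical. The sketch assumes that PDFNet has already been initialized elsewhere in the application, that the data extraction module is installed, and that saving with `SDFDoc.SaveOptions.e_linearized` is acceptable for the output (any other save option could be substituted).

```csharp
using pdftron.PDF;
using pdftron.SDF;

public static class FormFieldDetectionExample
{
    // Hypothetical helper: detects form fields in the PDF at inputPath and
    // saves the modified document to outputPath.
    // Assumes PDFNet is already initialized and the module is installed.
    public static void AddDetectedFields(string inputPath, string outputPath)
    {
        using (PDFDoc doc = new PDFDoc(inputPath))
        {
            // Detected fields are inserted into the in-memory document.
            DataExtractionModule.DetectAndAddFormFieldsToPDF(doc);

            // Persist the result; the save option is an illustrative choice.
            doc.Save(outputPath, SDFDoc.SaveOptions.e_linearized);
        }
    }
}
```

Because the method modifies the `PDFDoc` it is given, the document must be saved afterwards for the new fields to reach disk.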
DetectAndAddFormFieldsToPDF(PDFDoc, DataExtractionOptions)
Perform automatic form field detection, then insert the fields into the PDF. Note: The FormKeyValue engine is experimental and subject to change.
Declaration
public static void DetectAndAddFormFieldsToPDF(PDFDoc doc, DataExtractionOptions options)
Parameters
Type | Name | Description |
---|---|---|
PDFDoc | doc | The PDF document in which fields are detected and into which they are inserted |
DataExtractionOptions | options | Data extraction options |
ExtractData(string, string, DataExtractionEngine)
Perform data extraction on a PDF file using the specified engine. Note: The FormKeyValue engine is experimental and subject to change.
Declaration
public static void ExtractData(string input_pdf_file, string output_json_file, DataExtractionModule.DataExtractionEngine engine)
Parameters
Type | Name | Description |
---|---|---|
string | input_pdf_file | The source document filename |
string | output_json_file | The resulting JSON filename |
DataExtractionModule.DataExtractionEngine | engine | The extraction engine |
ExtractData(string, string, DataExtractionEngine, DataExtractionOptions)
Perform data extraction on a PDF file using the specified engine. Note: The FormKeyValue engine is experimental and subject to change.
Declaration
public static void ExtractData(string input_pdf_file, string output_json_file, DataExtractionModule.DataExtractionEngine engine, DataExtractionOptions options)
Parameters
Type | Name | Description |
---|---|---|
string | input_pdf_file | The source document filename |
string | output_json_file | The resulting JSON filename |
DataExtractionModule.DataExtractionEngine | engine | The extraction engine |
DataExtractionOptions | options | Data extraction options |
ExtractData(string, DataExtractionEngine)
Perform data extraction on a PDF file using the specified engine and return the resulting JSON string. Note: The FormKeyValue engine is experimental and subject to change.
Declaration
public static string ExtractData(string input_pdf_file, DataExtractionModule.DataExtractionEngine engine)
Parameters
Type | Name | Description |
---|---|---|
string | input_pdf_file | The source document filename |
DataExtractionModule.DataExtractionEngine | engine | The extraction engine |
Returns
Type | Description |
---|---|
string | JSON string representing the extracted results |
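As a hedged illustration of the in-memory overload, the sketch below requests the JSON string and parses it with `System.Text.Json` only to confirm that it is well-formed. The class `JsonExtractionExample` and the helper `DescribeResult` are hypothetical, the availability of `System.Text.Json` in the target framework is an assumption, and PDFNet is assumed to have been initialized elsewhere. The schema of the returned JSON is not described in this reference, so the sketch does not read any fields from it.

```csharp
using System.Text.Json;
using pdftron.PDF;

public static class JsonExtractionExample
{
    // Hypothetical helper: runs extraction in memory and reports the kind of
    // the top-level JSON value. Assumes PDFNet is already initialized.
    public static JsonValueKind DescribeResult(
        string inputPdf,
        DataExtractionModule.DataExtractionEngine engine)
    {
        // Returns the extracted results as a JSON string.
        string json = DataExtractionModule.ExtractData(inputPdf, engine);

        // Parse only to confirm the string is well-formed JSON; the result
        // schema is not documented here, so no fields are accessed.
        using (JsonDocument parsed = JsonDocument.Parse(json))
        {
            return parsed.RootElement.ValueKind;
        }
    }
}
```

The engine is passed in by the caller so that the sketch does not depend on any particular member of `DataExtractionModule.DataExtractionEngine`.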
ExtractData(string, DataExtractionEngine, DataExtractionOptions)
Perform data extraction on a PDF file using the specified engine and return the resulting JSON string. Note: The FormKeyValue engine is experimental and subject to change.
Declaration
public static string ExtractData(string input_pdf_file, DataExtractionModule.DataExtractionEngine engine, DataExtractionOptions options)
Parameters
Type | Name | Description |
---|---|---|
string | input_pdf_file | The source document filename |
DataExtractionModule.DataExtractionEngine | engine | The extraction engine |
DataExtractionOptions | options | Data extraction options |
Returns
Type | Description |
---|---|
string | JSON string representing the extracted results |
ExtractToXLSX(string, string)
Perform data extraction on a PDF and write the results to a file in XLSX format.
Declaration
public static void ExtractToXLSX(string input_pdf_file, string output_xlsx_file)
Parameters
Type | Name | Description |
---|---|---|
string | input_pdf_file | The source document filename |
string | output_xlsx_file | The resulting XLSX filename |
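A minimal sketch of the file-based overload follows. The class `XlsxExtractionExample` and the helper `ExtractNextToSource` are hypothetical, and placing the spreadsheet next to the source PDF is only an illustrative convention. PDFNet is assumed to have been initialized elsewhere and the module is assumed to be installed.

```csharp
using System.IO;
using pdftron.PDF;

public static class XlsxExtractionExample
{
    // Hypothetical helper: writes the extracted data to an .xlsx file that
    // sits beside the input PDF and returns the path of that file.
    // Assumes PDFNet is already initialized and the module is installed.
    public static string ExtractNextToSource(string inputPdf)
    {
        // "report.pdf" becomes "report.xlsx" in the same directory.
        string outputXlsx = Path.ChangeExtension(inputPdf, ".xlsx");

        DataExtractionModule.ExtractToXLSX(inputPdf, outputXlsx);
        return outputXlsx;
    }
}
```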
ExtractToXLSX(string, string, DataExtractionOptions)
Perform data extraction on a PDF and write the results to a file in XLSX format.
Declaration
public static void ExtractToXLSX(string input_pdf_file, string output_xlsx_file, DataExtractionOptions options)
Parameters
Type | Name | Description |
---|---|---|
string | input_pdf_file | The source document filename |
string | output_xlsx_file | The resulting XLSX filename |
DataExtractionOptions | options | Data extraction options |
ExtractToXLSX(string, Filter)
Perform data extraction on a PDF and write the results in XLSX format to a filter.
Declaration
public static void ExtractToXLSX(string input_pdf_file, Filter output_xlsx_stream)
Parameters
Type | Name | Description |
---|---|---|
string | input_pdf_file | The source document filename |
Filter | output_xlsx_stream | The filter that receives the resulting XLSX data |
ExtractToXLSX(string, Filter, DataExtractionOptions)
Perform data extraction on a PDF and write the results in XLSX format to a filter.
Declaration
public static void ExtractToXLSX(string input_pdf_file, Filter output_xlsx_stream, DataExtractionOptions options)
Parameters
Type | Name | Description |
---|---|---|
string | input_pdf_file | The source document filename |
Filter | output_xlsx_stream | The filter that receives the resulting XLSX data |
DataExtractionOptions | options | Data extraction options |
IsModuleAvailable(DataExtractionEngine)
Find out whether the specified data extraction module is available (and licensed).
Declaration
public static bool IsModuleAvailable(DataExtractionModule.DataExtractionEngine engine)
Parameters
Type | Name | Description |
---|---|---|
DataExtractionModule.DataExtractionEngine | engine | The extraction engine |
Returns
Type | Description |
---|---|
bool | true if data extraction operations can be performed with the specified engine; otherwise, false |
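One way to use this check is as a guard before calling an extraction method, so that a missing or unlicensed module produces a clear error. The sketch below is illustrative only: the class `GuardedExtractionExample`, the helper `ExtractToJsonFile`, and the choice of `InvalidOperationException` are assumptions, and PDFNet is assumed to have been initialized elsewhere.

```csharp
using System;
using pdftron.PDF;

public static class GuardedExtractionExample
{
    // Hypothetical helper: verifies that the requested engine is available
    // before writing extraction results to a JSON file.
    // Assumes PDFNet is already initialized.
    public static void ExtractToJsonFile(
        string inputPdf,
        string outputJson,
        DataExtractionModule.DataExtractionEngine engine)
    {
        if (!DataExtractionModule.IsModuleAvailable(engine))
        {
            // The exception type is an illustrative choice for this sketch.
            throw new InvalidOperationException(
                "The requested data extraction engine is not available or not licensed.");
        }

        DataExtractionModule.ExtractData(inputPdf, outputJson, engine);
    }
}
```

Checking availability per engine, rather than once for the whole module, matches the method signature, which takes the engine as its argument.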