Requirements
These packages are required to use these features in production. Trial keys have unlimited access to all features.
Apryse's Key-Value Extraction (KVE) engine helps you automatically identify and extract meaningful key-value pairs from PDFs, even when the document is unstructured or doesn't contain form fields. Whether you're processing invoices, resumes, or complex reports, KVE saves hours of manual tagging by turning documents into structured JSON.
Exclusive to the Apryse SDK, Key-Value Extraction also supports pulling data from CAD and other technical drawing title blocks.
The engine scans each page for likely key terms (e.g., labels, field names, categories) and maps them to associated values (e.g., specific data, answers, identifiers). This key-to-value mapping allows KVE to structure data from real-world documents without templates or prior annotation.
Specify the name of the input PDF file and the name of the output JSON file, then select the Generic Key Value engine:
C#
DataExtractionModule.ExtractData("newsletter.pdf", "newsletter.json", DataExtractionModule.DataExtractionEngine.e_generic_key_value);

C++
DataExtractionModule::ExtractData("newsletter.pdf", "newsletter.json", DataExtractionModule::e_GenericKeyValue);

Go
DataExtractionModuleExtractData("newsletter.pdf", "newsletter.json", DataExtractionModuleE_GenericKeyValue)

Java
DataExtractionModule.extractData("newsletter.pdf", "newsletter.json", DataExtractionModule.DataExtractionEngine.e_generic_key_value);

JavaScript
await PDFNet.DataExtractionModule.extractData("newsletter.pdf", "newsletter.json", PDFNet.DataExtractionModule.DataExtractionEngine.e_GenericKeyValue);

PHP
DataExtractionModule::ExtractData("newsletter.pdf", "newsletter.json", DataExtractionModule::e_GenericKeyValue);

Python
DataExtractionModule.ExtractData("newsletter.pdf", "newsletter.json", DataExtractionModule.e_GenericKeyValue)

Ruby
DataExtractionModule.ExtractData("newsletter.pdf", "newsletter.json", DataExtractionModule::E_GenericKeyValue)

VB
DataExtractionModule.ExtractData("newsletter.pdf", "newsletter.json", DataExtractionModule.DataExtractionEngine.e_generic_key_value)
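Once the output file has been written, it can be inspected with ordinary JSON tooling. The following is a minimal Python sketch and is not part of the Apryse SDK: the helper name `summarize_json_file` is hypothetical, and the code assumes nothing about the KVE output schema beyond the file being valid JSON.

```python
import json
from pathlib import Path


def summarize_json_file(path):
    """Load a JSON file and describe its top-level shape.

    Hypothetical helper: it makes no assumption about the KVE schema,
    only that the file contains valid JSON.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if isinstance(data, dict):
        # Report the top-level keys in a stable (sorted) order.
        return {"type": "object", "keys": sorted(data)}
    if isinstance(data, list):
        return {"type": "array", "length": len(data)}
    # Any other JSON value (string, number, boolean, null).
    return {"type": type(data).__name__}
```

Calling `summarize_json_file("newsletter.json")` after extraction gives a quick look at the top-level structure before you write code against specific fields.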
Specify the name of the input PDF file, then select the Generic Key Value engine. The extracted JSON is returned as a string instead of being written to a file:
C#
string json = DataExtractionModule.ExtractData("newsletter.pdf", DataExtractionModule.DataExtractionEngine.e_generic_key_value);

C++
UString json = DataExtractionModule::ExtractData("newsletter.pdf", DataExtractionModule::e_GenericKeyValue);

Go
json := DataExtractionModuleExtractData("newsletter.pdf", DataExtractionModuleE_GenericKeyValue).(string)

Java
String json = DataExtractionModule.extractData("newsletter.pdf", DataExtractionModule.DataExtractionEngine.e_generic_key_value);

JavaScript
const json = await PDFNet.DataExtractionModule.extractDataAsString('newsletter.pdf', PDFNet.DataExtractionModule.DataExtractionEngine.e_GenericKeyValue);

PHP
$json = DataExtractionModule::ExtractData("newsletter.pdf", DataExtractionModule::e_GenericKeyValue);

Python
json = DataExtractionModule.ExtractData("newsletter.pdf", DataExtractionModule.e_GenericKeyValue)

Ruby
json = DataExtractionModule.ExtractData("newsletter.pdf", DataExtractionModule::E_GenericKeyValue)

VB
Dim json As String = DataExtractionModule.ExtractData("newsletter.pdf", DataExtractionModule.DataExtractionEngine.e_generic_key_value)
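When the result is returned as a string, it can be parsed in memory. The sketch below is illustrative only and is not SDK code: `iter_leaves` and `leaves_from_json_string` are hypothetical helper names, and the walk is deliberately schema-agnostic because the layout of the KVE output is not described here.

```python
import json


def iter_leaves(node, path=()):
    """Yield (path, value) for every scalar in a parsed JSON document."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from iter_leaves(value, path + (key,))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from iter_leaves(value, path + (index,))
    else:
        # Scalars (str, int, float, bool, None) are the leaves.
        yield path, node


def leaves_from_json_string(json_text):
    """Parse a JSON string and return its scalar leaves as a list."""
    return list(iter_leaves(json.loads(json_text)))
```

Passing the string returned by `ExtractData` to `leaves_from_json_string` lists every scalar together with the path that leads to it, which can help when exploring an unfamiliar result.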
Select OCR Language 
Password-Protected PDFs 
Page Range 
Region of Interest 
Detect Empty Fields 
By default, `DetectAndAddFormFieldsToPDF` uses the Form Field Detection engine. You can force the function to use the Form Field Key-Value Extraction engine using the "Form Extraction Engine" option.
C#
PDFDoc doc = new PDFDoc("formfields.pdf");
DataExtractionOptions options = new DataExtractionOptions();
options.SetFormExtractionEngine("FormKeyValue");
DataExtractionModule.DetectAndAddFormFieldsToPDF(doc, options);

C++
PDFDoc doc("formfields.pdf");
DataExtractionOptions options;
options.SetFormExtractionEngine("FormKeyValue");
DataExtractionModule::DetectAndAddFormFieldsToPDF(doc, &options);

Go
doc := NewPDFDoc("formfields.pdf")
options := NewDataExtractionOptions()
options.SetFormExtractionEngine("FormKeyValue")
DataExtractionModuleDetectAndAddFormFieldsToPDF(doc, options)

Java
PDFDoc doc = new PDFDoc("formfields.pdf");
DataExtractionOptions options = new DataExtractionOptions();
options.setFormExtractionEngine("FormKeyValue");
DataExtractionModule.detectAndAddFormFieldsToPDF(doc, options);

JavaScript
const doc = await PDFNet.PDFDoc.createFromFilePath("formfields.pdf");
const options = new PDFNet.DataExtractionModule.DataExtractionOptions();
options.setFormExtractionEngine('FormKeyValue');
await PDFNet.DataExtractionModule.detectAndAddFormFieldsToPDF(doc, options);

PHP
$doc = new PDFDoc("formfields.pdf");
$options = new DataExtractionOptions();
$options->SetFormExtractionEngine("FormKeyValue");
DataExtractionModule::DetectAndAddFormFieldsToPDF($doc, $options);

Python
doc = PDFDoc("formfields.pdf")
options = DataExtractionOptions()
options.SetFormExtractionEngine("FormKeyValue")
DataExtractionModule.DetectAndAddFormFieldsToPDF(doc, options)

Ruby
doc = PDFDoc.new("formfields.pdf")
options = DataExtractionOptions.new()
options.SetFormExtractionEngine("FormKeyValue")
DataExtractionModule.DetectAndAddFormFieldsToPDF(doc, options)

VB
Dim doc As PDFDoc = New PDFDoc("formfields.pdf")
Dim options = New DataExtractionOptions()
options.SetFormExtractionEngine("FormKeyValue")
DataExtractionModule.DetectAndAddFormFieldsToPDF(doc, options)
NOTE: This option only has an effect on the `DetectAndAddFormFieldsToPDF` function. Passing this option to `ExtractData` will have no effect, as the `engine` parameter will take precedence.
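The precedence rule in the note can be summarized as a small decision function. This is a toy model written for illustration, not SDK code: `effective_engine` and the `DEFAULT_FORM_ENGINE` label are hypothetical names, and the function only encodes the behavior described above.

```python
# Hypothetical label for the default Form Field Detection engine.
DEFAULT_FORM_ENGINE = "FormFieldDetection (default)"


def effective_engine(function_name, engine_param=None, form_extraction_engine=None):
    """Return which engine applies, following the rule stated in the note."""
    if function_name == "ExtractData":
        # The explicit engine argument wins; the option is ignored.
        return engine_param
    if function_name == "DetectAndAddFormFieldsToPDF":
        # The option overrides the default engine when it is set.
        return form_extraction_engine or DEFAULT_FORM_ENGINE
    raise ValueError(f"unknown function: {function_name}")
```

For example, `effective_engine("ExtractData", "GenericKeyValue", "FormKeyValue")` returns `"GenericKeyValue"`, reflecting that the option is ignored when an engine is passed explicitly.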