Using Intelligent Data Extraction to Augment Contextual LLM Queries - Document Context Example

Example 1 - Document Context

The following simple example shows how to include document context in a query to an LLM while leveraging the document structure information contained in the PDF. The process breaks down into three steps:

  1. Extract document structure information using the Apryse Data Extraction Module.
  2. Convert the document structure information to a more concise and recognizable format. We have chosen HTML, which works well with OpenAI's GPT models.
  3. Include the context and structure information in a query to the LLM.
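
Steps 2 and 3 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in doc_context.py: the JSON layout (an `elements` list with `type`, `text`, and `rows` keys) and the helper names `structure_to_html` and `build_prompt` are hypothetical, and the actual output of the Data Extraction Module will look different. The call to the LLM itself is left out.

```python
import html
import json

# Hypothetical, simplified structure data. The real extraction output differs.
SAMPLE_JSON = """
{
  "elements": [
    {"type": "heading", "text": "Travel Expenses"},
    {"type": "paragraph", "text": "Flights & hotels for Q1."},
    {"type": "table", "rows": [["Item", "Amount"], ["Flight to Toronto", "$500.00"]]}
  ]
}
"""


def structure_to_html(structure: dict) -> str:
    """Convert the (assumed) structure dictionary into compact HTML."""
    parts = []
    for element in structure.get("elements", []):
        kind = element.get("type")
        if kind == "heading":
            parts.append(f"<h1>{html.escape(element['text'])}</h1>")
        elif kind == "paragraph":
            parts.append(f"<p>{html.escape(element['text'])}</p>")
        elif kind == "table":
            rows = []
            for row in element["rows"]:
                cells = "".join(f"<td>{html.escape(cell)}</td>" for cell in row)
                rows.append(f"<tr>{cells}</tr>")
            parts.append("<table>" + "".join(rows) + "</table>")
        # Unknown element types are skipped in this sketch.
    return "\n".join(parts)


def build_prompt(document_html: str, question: str) -> str:
    """Combine the HTML context and the user's question into one prompt string."""
    return (
        "Answer the question using only the document below.\n\n"
        f"Document (HTML):\n{document_html}\n\n"
        f"Question: {question}"
    )


document_html = structure_to_html(json.loads(SAMPLE_JSON))
prompt = build_prompt(document_html, "How much did the employee spend on airfare?")
print(prompt)
```

The resulting `prompt` string is what would be sent to the LLM. Keeping the table as HTML rows preserves which amount belongs to which line item, which is the structure information the plain text alone would lose.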

To run the example, use the following command (with your virtual environment active, if you are using one):

python3 ./doc_context.py

You should see some text indicating progress, followed by a question and answer about the document at the end. LLMs aren't guaranteed to produce identical output between runs, but you should see something similar to the following:

Extracting Document Structure from <your-absolute-path>/doc_context_guide/data/
    pdf/travel_expenses.pdf...
Extracted data to <your-absolute-path>/doc_context_guide/data/output/
    doc_context_example/travel_expenses/json/travel_expenses.json

================================================================================

Question: How much did the employee spend on airfare?

Answer: To calculate the total amount spent on airfare by the employee, we need 
to sum up all the expenses categorized under "Travel" that specifically mention 
flights. According to the expense report, these are the relevant entries:

1. Flight to Toronto, ON, Canada (03/17/2023 - 03/20/2023): $500.00
2. Flight to Boston, MA, USA (05/01/2023 - 05/02/2023): $400.75
3. Flight to Toronto, ON, Canada (06/07/2023 - 06/08/2023): $450.25
4. Flight to Miami, FL, USA (08/19/2023 - 08/24/2023): $600.35

Adding these amounts together gives:

$500.00 + $400.75 + $450.25 + $600.35 = $1951.35

Therefore, the employee spent a total of $1951.35 on airfare.
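
Since LLMs can make arithmetic mistakes, it may be worth checking a total like this one independently. The snippet below is a small sketch that re-adds the four amounts quoted in the sample answer; it uses `Decimal` rather than floats so the cents are summed exactly. The variable names are illustrative only.

```python
from decimal import Decimal

# Amounts copied from the four flight entries in the sample answer above.
airfare = [
    Decimal("500.00"),
    Decimal("400.75"),
    Decimal("450.25"),
    Decimal("600.35"),
]

# Start the sum from a Decimal zero so the result stays a Decimal.
total = sum(airfare, Decimal("0"))
print(f"${total}")
```

The printed total matches the $1951.35 reported in the sample answer.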

Next Steps

Document RAG Example

In this section, we introduce the concept of Retrieval-Augmented Generation (RAG) and show how you can break larger documents down into searchable chunks to use with your queries.
