
Using Intelligent Data Extraction to Augment Contextual LLM Queries - Setup for Linux and Windows

Setup

To run the provided examples, you will need to do some initial setup. Run all commands from within the idp_rag_guide folder unless otherwise stated. We assume you already have Python 3.5 or later installed (if not, see instructions here). Perform the following setup steps:

1. Download Sample Code

Download the sample code and unpack it. You should see an idp_rag_guide folder with the following structure:

idp_rag_guide/
├── data
│   └── pdf
│       └── travel_expenses.pdf
├── doc_context.py
├── idp_rag_utils
│   ├── bookmark_utils.py
│   ├── document_structure.py
│   └── __init__.py
├── iso32000_rag.py
└── requirements.txt
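
As a quick sanity check after unpacking, the listing above can be turned into a small script. This is only a sketch: the helper name `missing_files` is hypothetical and not part of the sample code, and the script assumes it is run from inside the idp_rag_guide folder.

```python
from pathlib import Path

# Relative paths copied from the folder listing above.
EXPECTED_FILES = [
    "data/pdf/travel_expenses.pdf",
    "doc_context.py",
    "idp_rag_utils/bookmark_utils.py",
    "idp_rag_utils/document_structure.py",
    "idp_rag_utils/__init__.py",
    "iso32000_rag.py",
    "requirements.txt",
]


def missing_files(root):
    """Return the expected files that are absent under root (hypothetical helper)."""
    root = Path(root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


# Check the current directory, which is assumed to be idp_rag_guide.
missing = missing_files(".")
if missing:
    print("Missing files:", ", ".join(missing))
else:
    print("All expected files are present.")
```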

2. Obtain an Apryse SDK license.

Your license should include our IDP offerings. If you don't already have one, you can request a demo key. A demo key is sufficient to run the simpler Document Context Example, but it cannot run the more complex Document RAG Example.

Get your Apryse trial key:

Apryse collects some data regarding your usage of the SDK for product improvement.

If you wish to continue without data collection, contact us and we will email you a no-tracking trial key for you to get started.

3. Obtain an OpenAI API Key.

Running the code included in this guide makes requests to OpenAI that are not free, so you will need a funded account.

4. Install the required Python modules.

We will do so in a virtual environment. The commands below use Linux shell syntax:

python3 -m venv idp-venv
source idp-venv/bin/activate
python3 -m pip install -r requirements.txt
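
To confirm that the interpreter you are about to use meets the stated version requirement and is actually running inside the virtual environment, a small check like the following may help. It is a sketch that uses only the standard library; the function names are illustrative and not part of the sample code.

```python
import sys

MIN_PYTHON = (3, 5)  # minimum version stated in this guide


def in_virtualenv():
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the base installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def python_ok(version=None):
    # Compare only (major, minor) against the minimum.
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= MIN_PYTHON


print("Python version OK:", python_ok())
print("Running inside a virtual environment:", in_virtualenv())
```

If the second line reports False, the activate step was probably skipped in the current shell session.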

5. Export license keys

Export your Apryse SDK license key and OpenAI API key as environment variables (Linux shell syntax shown):

export OPENAI_API_KEY=<your-api-key>
export APRYSE_SDK_LICENSE_KEY=<your-license-key>
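
Because exported variables only exist in the shell session where they were set, it can be useful to verify them from Python before running an example. The sketch below reads the two variable names used above; the helper `missing_env_vars` is hypothetical, and how the sample scripts themselves read these variables is not shown in this guide.

```python
import os

# Variable names taken from the export commands above.
REQUIRED_VARS = ("OPENAI_API_KEY", "APRYSE_SDK_LICENSE_KEY")


def missing_env_vars(environ=None):
    """Return the required variables that are unset or blank (hypothetical helper)."""
    environ = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not environ.get(name, "").strip()]


missing = missing_env_vars()
if missing:
    print("Not set in this session:", ", ".join(missing))
else:
    # Report presence only; never print the key values themselves.
    print("Both keys are set.")
```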

6. Download the Structured Output Module

Download the Structured Output Module from Apryse. If you don't already have it installed, you can download it with the following commands, shown here for Windows PowerShell:

New-Item -ItemType Directory -Force -Path apryse_sdk_modules
Set-Location apryse_sdk_modules
Invoke-WebRequest -Uri https://www.pdftron.com/downloads/StructuredOutputModuleWindows.zip -OutFile StructuredOutputModuleWindows.zip
tar -xf StructuredOutputModuleWindows.zip
Remove-Item StructuredOutputModuleWindows.zip
Set-Location ..
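
The same download, extract, and clean-up sequence can also be sketched in Python using only the standard library. The URL is the Windows archive from the commands above; no other platform's archive name is assumed here, and the function name `fetch_and_unpack` is illustrative. The layout of the extracted files is not described in this guide, so the sketch simply extracts everything into the target folder.

```python
import urllib.request
import zipfile
from pathlib import Path

# Windows archive URL, copied from the commands above.
MODULE_URL = "https://www.pdftron.com/downloads/StructuredOutputModuleWindows.zip"


def fetch_and_unpack(url, dest_dir="apryse_sdk_modules"):
    """Download a zip archive into dest_dir, extract it there, then delete the archive."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    # Name the local file after the last path segment of the URL.
    archive = dest / url.rsplit("/", 1)[-1]
    urllib.request.urlretrieve(url, str(archive))

    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)

    archive.unlink()  # mirrors the Remove-Item step
    return dest


# Example call (performs a network download when uncommented):
# fetch_and_unpack(MODULE_URL)
```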

7. Optional download

If you plan on running the Document RAG example, you should also download the ISO_32000-2 PDF standard, which is used in this sample and is available for free download from Adobe. Place the downloaded file at the following location: idp_rag_guide/data/pdf/PDF_ISO_32000-2.pdf
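
A rough way to confirm the file landed in the right place is to check that it exists and begins with the standard PDF header bytes. This is a sketch only: `looks_like_pdf` is a hypothetical helper, the path is relative to the idp_rag_guide folder, and the check does not validate the rest of the file.

```python
from pathlib import Path

# Location given in the step above, relative to idp_rag_guide.
ISO_PDF = Path("data/pdf/PDF_ISO_32000-2.pdf")


def looks_like_pdf(path):
    """Return True if path is a file whose first bytes are the PDF header."""
    path = Path(path)
    if not path.is_file():
        return False
    with path.open("rb") as fh:
        return fh.read(5) == b"%PDF-"


if looks_like_pdf(ISO_PDF):
    print("Found", ISO_PDF)
else:
    print("Not found or not a PDF:", ISO_PDF)
```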

You should now be ready to run the examples.

Next Steps

Document Context Example

In this section, we show how to run a simple example that demonstrates how to attach contextual information from a document to your queries.
