Python 3.x PDF library integration

Welcome to Apryse. Python 3.x for the Apryse SDK is cross-platform and supported on Windows, Linux and macOS.

There are three ways to use Apryse with Python:

This guide will help you get started using the precompiled Python wrappers for 3.x.


This guide will help you set up serverless AWS Lambda functions that use the Apryse SDK. Your free trial includes unlimited trial usage and support from solution engineers.

Prerequisites

Initial setup

In this particular guide, we will demonstrate how to set up an AWS Lambda function to use Apryse SDK.

First, prepare a zip package with apryse-sdk embedded and your lambda_function.py.

From the command line, check your Python3 version. This information will be needed when you create your function later.

sh

python3 --version
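Lambda lists its Python runtimes by major and minor version (for example `python3.11`), so the minor version reported above is the part that matters. As a minimal sketch, the same check can be done from Python itself; the helper name `lambda_runtime_name` is hypothetical and not part of the Apryse SDK or AWS tooling.

```python
import sys


def lambda_runtime_name(version_info=sys.version_info):
    """Return a Lambda-style runtime identifier such as 'python3.11'."""
    major, minor = version_info[0], version_info[1]
    return f"python{major}.{minor}"


if __name__ == "__main__":
    # Prints the identifier for the interpreter running this script.
    print(lambda_runtime_name())
```

Run it with the same `python3` you will use for `pip install`, so the packaged wheels match the runtime you select later.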

From the command line:

sh

mkdir YOUR_FUNCTION_FOLDER
cd YOUR_FUNCTION_FOLDER
python3 -m pip install --target . apryse-sdk --extra-index-url=https://pypi.apryse.com

Copy your Lambda source (e.g. lambda_function.py) into YOUR_FUNCTION_FOLDER. Then, from inside that folder, zip the package before uploading it to your AWS Lambda account.

sh

zip -r ./YOUR_FUNCTION_FOLDER.zip .
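If the `zip` command is not available on your machine, the same package can be built with Python's standard `zipfile` module. The sketch below is one possible equivalent, not an official Apryse script; `zip_folder` is a hypothetical helper name. It stores paths relative to the folder so that lambda_function.py ends up at the root of the archive, as it does with the command above.

```python
import os
import zipfile


def zip_folder(folder, zip_path):
    """Zip the contents of `folder` so its files sit at the archive root."""
    zip_abs = os.path.abspath(zip_path)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(folder):
            for name in files:
                full_path = os.path.join(root, name)
                if os.path.abspath(full_path) == zip_abs:
                    continue  # do not add the archive to itself
                # Store the path relative to the folder, not the full path.
                archive.write(full_path, os.path.relpath(full_path, folder))
    return zip_path


if __name__ == "__main__":
    zip_folder(".", "./YOUR_FUNCTION_FOLDER.zip")
```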

You can now upload YOUR_FUNCTION_FOLDER.zip to your AWS Lambda.

Second, create a lambda function in your AWS account and upload the zip package:

  • Lambda > Functions > Create function > Author from scratch > Function name [YOUR_FUNCTION_NAME] > Runtime [Python 3.x] (choose the version that matches the Python3 version you checked above) > Create function
  • Upload from .zip file > Upload [your zip package] > Save
  • Add trigger > API Gateway > Create an API > REST > Security [Open]
  • Configuration > General configuration > Edit > Memory [choose 10240 MB]

Integrate into your application

Once you have followed the initial setup instructions, you can begin calling Apryse SDK APIs in your lambda function source. For example:

Python

from base64 import b64encode, b64decode
import json
from apryse_sdk import *

def lambda_handler(event, context):
    if event["httpMethod"] == "GET":
        return {
            'statusCode': 200,
            'body': json.dumps('Hello from Apryse!')
        }
    elif event["httpMethod"] == "POST":
        try:
            body = json.loads(event["body"])
            # Pass the key if you use apryse-sdk 9.1.0 and above; otherwise use PDFNet.Initialize()
            PDFNet.Initialize("YOUR_APRYSE_LICENSE_KEY")
            # Your AWS Lambda function logic goes here.
            # It should set base64_string to the base64-encoded result.
            base64_string = ""  # placeholder until your logic fills it in
            message = {
                'statusCode': 200,
                'headers': {'Content-Type': 'application/json'},
                'body': json.dumps(base64_string),
            }
            return message
        except Exception as e:
            message = {
                'statusCode': 500,
                # the body must be a string, so convert the exception
                'body': str(e)
            }
            return message
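With an API Gateway proxy trigger, the `body` field of the returned dictionary needs to be a string, and an exception object placed there cannot be serialized to JSON. One way to keep this consistent is to route every response through a small helper. This is a sketch under that assumption; `make_response` and `error_response` are hypothetical names, not Apryse or AWS APIs.

```python
import json


def make_response(status_code, payload):
    """Build a proxy-style response whose body is always a JSON string."""
    return {
        "statusCode": status_code,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }


def error_response(exc):
    """Turn an exception into a 500 response with a readable message."""
    return make_response(500, {"error": str(exc)})
```

Inside the handler, the `except` branch could then simply `return error_response(e)`.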

Run Sample Code

You can replace the Python script of your AWS Lambda function with the following code, or download the source code from our GitHub repository. This snippet shows how to process a client request to convert an Office document to PDF and send the output back to the client.

Python

# This example shows how to create AWS Lambda functions using Apryse SDK.
# The client posts a REST API request containing base64-encoded data.
# The server processes the request and responds with the base64-encoded OfficeToPDF output.
from base64 import b64encode, b64decode
import json
from apryse_sdk import *

def lambda_handler(event, context):
    if event["httpMethod"] == "GET":
        return {
            'statusCode': 200,
            'body': json.dumps('Hello, please send base64 doc to use this Lambda!')
        }
    elif event["httpMethod"] == "POST":
        try:
            body = json.loads(event["body"])
            base64str = body["file"]["data"]
            filename = body["file"]["filename"]
            base64_bytes = b64decode(base64str)
            # save the input document to /tmp, the writable directory in Lambda
            output_path = '/tmp/'
            input_filename = filename.split('.')[0] + '.docx'
            with open(output_path + input_filename, 'wb') as open_file:
                open_file.write(base64_bytes)
            # Start with a PDFDoc
            # Pass the key if you use apryse-sdk 9.1.0 and above; otherwise use PDFNet.Initialize()
            PDFNet.Initialize("YOUR_APRYSE_LICENSE_KEY")
            pdfdoc = PDFDoc()
            # perform the conversion with no optional parameters
            Convert.OfficeToPDF(pdfdoc, output_path + input_filename, None)
            # save the result to /tmp
            output_filename = filename.split('.')[0] + '.pdf'
            pdfdoc.Save(output_path + output_filename, SDFDoc.e_linearized)
            # read the PDF back and encode it for the response
            with open(output_path + output_filename, 'rb') as open_file:
                byte_content = open_file.read()
            base64_bytes = b64encode(byte_content)
            base64_string = base64_bytes.decode('utf-8')
            print("Sending " + output_filename)
            message = {
                'statusCode': 200,
                'headers': {'Content-Type': 'application/json'},
                'body': json.dumps(base64_string),
            }
            return message
        except Exception as e:
            print(e)
            message = {
                'statusCode': 500,
                # the body must be a string, so convert the exception
                'body': str(e)
            }
            return message
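The request-parsing step of the handler above can be isolated and tested without the Apryse SDK installed. The sketch below is a hypothetical, slightly stricter variant rather than the sample's own code: `parse_request` is an assumed helper name, it drops any directory components from the client-supplied filename, and it rejects invalid base64. Note that it keeps inner dots in the name (`q1.final.docx` becomes `q1.final`), whereas `filename.split('.')[0]` in the sample would keep only `q1`.

```python
import json
import os
from base64 import b64decode


def parse_request(event):
    """Extract (filename_stem, raw_bytes) from an API Gateway proxy event."""
    body = json.loads(event["body"])
    file_info = body["file"]
    # basename drops any directory components a client might send
    safe_name = os.path.basename(file_info["filename"])
    stem, _ext = os.path.splitext(safe_name)
    if not stem:
        raise ValueError("filename is empty")
    # validate=True raises an error on characters outside the base64 alphabet
    raw_bytes = b64decode(file_info["data"], validate=True)
    return stem, raw_bytes
```

The returned stem can then be combined with `/tmp/` to build the input and output paths used by the conversion step.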

Testing

After you have uploaded your zip package to your AWS Lambda function and obtained its API endpoint from Configuration > Triggers, you can run a simple test using the REST API.

To use this function to convert an Office document to PDF, the client needs to post a REST API request to the server. The request must include JSON data structured as in the code below.

Python

json_data = {
    "file": {
        "encoding": "base64",
        "data": base64_string_of_your_office_document,
        "filename": filename,
        "content-type": "application/pdf"
    }
}
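As a minimal sketch of how a client might fill in that structure, the helper below base64-encodes the document bytes and serializes the result. `build_payload` is a hypothetical name and the byte string is placeholder data; in practice you would read the bytes from your Office file.

```python
import json
from base64 import b64encode


def build_payload(file_bytes, filename):
    """Wrap raw document bytes in the request structure shown above."""
    return {
        "file": {
            "encoding": "base64",
            # b64encode returns bytes; JSON needs a text string
            "data": b64encode(file_bytes).decode("ascii"),
            "filename": filename,
            "content-type": "application/pdf",
        }
    }


# Placeholder bytes stand in for the contents of a real .docx file.
request_body = json.dumps(build_payload(b"example bytes", "simple-word_2007.docx"))
```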

When the server receives the client's request, it sends back a response containing the base64-encoded PDF output as JSON. All the client needs to do is decode that data into a PDF file. That's it!

The sample Python client code is available in our GitHub repository. After cloning the repository and installing the necessary packages, refer to /client/README.txt for detailed instructions. Navigate to the client folder, run the following command, watch the response to the client's request in the console, and check the result in the output folder:

sh

python AWSLambdaExample.py --url <YOUR PUBLISHED FUNCTION URL>

The client will send a REST API request to convert /input/simple-word_2007.docx to pdf and the server will send back the encoded data, which will then be saved as pdf in the output folder.
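For readers who want to see the round trip without the sample client, here is a rough standard-library sketch. It is not the code in AWSLambdaExample.py; `post_document` and `save_pdf_response` are hypothetical names, and the decoding step assumes the response body is a JSON-encoded base64 string, as produced by `json.dumps(base64_string)` in the sample handler.

```python
import json
import urllib.request
from base64 import b64decode


def post_document(url, payload):
    """POST a JSON payload to the function URL and return the response text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


def save_pdf_response(response_text, output_path):
    """Decode a JSON-encoded base64 string and write the bytes to output_path."""
    base64_string = json.loads(response_text)
    pdf_bytes = b64decode(base64_string)
    with open(output_path, "wb") as pdf_file:
        pdf_file.write(pdf_bytes)
    return len(pdf_bytes)
```

A call would look like `save_pdf_response(post_document(url, payload), "output/result.pdf")`, where `url` is your published function URL and `payload` follows the request structure described earlier.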

You can experiment with your own Office document by putting it inside the input folder:

sh

python AWSLambdaExample.py --url <YOUR PUBLISHED FUNCTION URL> --filename <YOUR OFFICE FILENAME>

We have shown how to set up an AWS Lambda function using Apryse SDK. You can now experiment with making your own functions and URLs, and fully utilize Apryse SDK. If you have any questions, please don't hesitate to contact us!

