Search & Replace PDF Text and Images - Python Sample Code

Sample code for using the Apryse SDK to search for and replace text strings and images inside existing PDF files (e.g. business cards and other PDF templates). Unlike PDF forms, the ContentReplacer works on actual PDF content and is not limited to static rectangular annotation regions. Samples are provided in Python, C++, C#, Java, Node.js (JavaScript), PHP, Ruby, Go and VB. Learn more about our Server SDK and PDF Editing & Manipulation Library.

Target strings in the original PDF document must be enclosed in square brackets when you use ContentReplacer methods such as AddString(). Otherwise, the content replacer will not recognize the string as a placeholder to replace.

For example, you add a placeholder to the PDF document: [NAME]. In the code, you map the tag inside the square brackets to its replacement text: replacer.AddString("NAME", "John Smith"). After processing, the tag and both square brackets are replaced with "John Smith".
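
The bracket rule above can be illustrated on an ordinary Python string. This is only an analogy under stated assumptions: fill_template is a hypothetical helper, it does not read or write a PDF, and it is not how the SDK is implemented. Leaving unknown tags untouched is a choice made for this sketch, not documented SDK behavior.

```python
import re

# Hypothetical helper: mimics, on a plain string, the substitution rule
# described above. It is not part of the Apryse SDK.
def fill_template(text, values):
    """Replace each [TAG] whose TAG is a key in `values`; leave other text alone."""
    def substitute(match):
        tag = match.group(1)
        # Assumption of this sketch: unknown tags stay in place, brackets included.
        return values.get(tag, match.group(0))
    return re.sub(r"\[([^\[\]]+)\]", substitute, text)

card = "[NAME], [JOB_TITLE] - NAME without brackets is ignored"
filled = fill_template(card, {"NAME": "John Smith", "JOB_TITLE": "Software Developer"})
print(filled)
```

The bare word NAME is left alone because only the bracketed form counts as a placeholder, which mirrors the requirement stated above.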

#---------------------------------------------------------------------------------------
# Copyright (c) 2001-2023 by Apryse Software Inc. All Rights Reserved.
# Consult LICENSE.txt regarding license information.
#---------------------------------------------------------------------------------------

import site
site.addsitedir("../../../PDFNetC/Lib")
import sys
from PDFNetPython import *

sys.path.append("../../LicenseKey/PYTHON")
from LicenseKey import *

# Relative path to the folder containing the test files.
input_path = "../../TestFiles/"
output_path = "../../TestFiles/Output/"

#-----------------------------------------------------------------------------------------
# The sample code illustrates how to use the ContentReplacer class to make using
# 'template' pdf documents easier.
#-----------------------------------------------------------------------------------------
def main():
    PDFNet.Initialize(LicenseKey)

    # Example 1) Update a business card template with personalized info

    doc = PDFDoc(input_path + "BusinessCardTemplate.pdf")
    doc.InitSecurityHandler()

    # first, replace the image on the first page
    replacer = ContentReplacer()
    page = doc.GetPage(1)
    img = Image.Create(doc.GetSDFDoc(), input_path + "peppers.jpg")
    replacer.AddImage(page.GetMediaBox(), img.GetSDFObj())
    # next, register a replacement for each [TAG] placeholder; these are
    # applied to the same page object passed to Process() below
    replacer.AddString("NAME", "John Smith")
    replacer.AddString("QUALIFICATIONS", "Philosophy Doctor")
    replacer.AddString("JOB_TITLE", "Software Developer")
    replacer.AddString("ADDRESS_LINE1", "#100 123 Software Rd")
    replacer.AddString("ADDRESS_LINE2", "Vancouver, BC")
    replacer.AddString("PHONE_OFFICE", "604-730-8989")
    replacer.AddString("PHONE_MOBILE", "604-765-4321")
    replacer.AddString("EMAIL", "info@pdftron.com")
    replacer.AddString("WEBSITE_URL", "http://www.pdftron.com")
    # finally, apply all queued replacements to the page
    replacer.Process(page)

    doc.Save(output_path + "BusinessCard.pdf", SDFDoc.e_linearized)
    doc.Close()

    print("Done. Result saved in BusinessCard.pdf")

    # Example 2) Replace text in a region with new text

    doc = PDFDoc(input_path + "newsletter.pdf")
    doc.InitSecurityHandler()

    replacer = ContentReplacer()
    page = doc.GetPage(1)
    # the target region here is the page's whole media box
    replacer.AddText(page.GetMediaBox(), "hello hello hello hello hello hello hello hello hello hello")
    replacer.Process(page)

    doc.Save(output_path + "ContentReplaced.pdf", SDFDoc.e_linearized)
    doc.Close()

    print("Done. Result saved in ContentReplaced.pdf")
    PDFNet.Terminate()
    print("Done.")

if __name__ == '__main__':
    main()
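
The repeated AddString calls in Example 1 could also be driven from a dictionary. The following is a minimal sketch, not part of the sample: add_strings and RecordingReplacer are hypothetical names introduced here, and the stand-in class exists only so the snippet runs without the SDK installed. The only SDK call assumed is AddString(tag, value), as used in the sample above.

```python
# Hypothetical helper, not part of the Apryse SDK: registers every tag/value
# pair from a dict through the AddString(tag, value) call used in the sample.
def add_strings(replacer, fields):
    for tag, value in fields.items():
        replacer.AddString(tag, value)
    return replacer

# Stand-in used only so this sketch runs without the SDK; in the sample,
# `replacer` would be a real ContentReplacer instance.
class RecordingReplacer:
    def __init__(self):
        self.calls = []

    def AddString(self, tag, value):
        # Record the call instead of queuing a PDF replacement.
        self.calls.append((tag, value))

card_fields = {
    "NAME": "John Smith",
    "JOB_TITLE": "Software Developer",
    "EMAIL": "info@pdftron.com",
}

recorder = add_strings(RecordingReplacer(), card_fields)
print(recorder.calls)
```

With the SDK available, the same helper would presumably be called as add_strings(ContentReplacer(), card_fields), followed by replacer.Process(page) as in Example 1.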
