Redact PDFs - Ruby Sample Code

Sample code for using the Apryse SDK to remove potentially sensitive content from PDF documents. 'pdftron.PDF.Redactor' ensures that if any portion of an image, text, or vector graphics falls within a redaction region, that portion is destroyed rather than merely hidden with clipping or image masks. Sample code is provided in Python, C++, C#, Java, Node.js (JavaScript), PHP, Ruby and VB.


#---------------------------------------------------------------------------------------
# Copyright (c) 2001-2023 by Apryse Software Inc. All Rights Reserved.
# Consult LICENSE.txt regarding license information.
#---------------------------------------------------------------------------------------

require '../../../PDFNetC/Lib/PDFNetRuby'
include PDFNetRuby
require '../../LicenseKey/RUBY/LicenseKey'

$stdout.sync = true

# PDF Redactor is a separately licensable add-on that offers options to remove
# (not just cover or obscure) content within a region of a PDF.
# With printed pages, redaction involves blacking out or cutting out areas of
# the printed page. With electronic documents that use formats such as PDF,
# redaction typically involves removing sensitive content within documents for
# safe distribution to courts, patent and government institutions, the media,
# customers, vendors or any other audience with restricted access to the content.
#
# The redaction process in PDFNet consists of two steps:
#
# a) Content identification: A user applies redact annotations that specify the
# pieces or regions of content that should be removed. The content for redaction
# can be identified either interactively (e.g. using 'pdftron.PDF.PDFViewCtrl'
# as shown in the PDFView sample) or programmatically (e.g. using 'pdftron.PDF.TextSearch'
# or 'pdftron.PDF.TextExtractor'). Until the next step is performed, the user
# can see, move and redefine these annotations.
# b) Content removal: Using 'pdftron.PDF.Redactor.Redact' the user instructs
# PDFNet to apply the redact regions, after which the content in the area specified
# by the redact annotations is removed. The redaction function includes a number of
# options to control the style of the redaction overlay (including color, text,
# font, border, transparency, etc.).
#
# PDFTron Redactor makes sure that if a portion of an image, text, or vector graphics
# is contained in a redaction region, that portion of the image or path data is
# destroyed and is not simply hidden with clipping or image masks. The PDFNet API can
# also be used to review and remove metadata and other content that can exist in a PDF
# document, including XML Forms Architecture (XFA) content and Extensible Metadata
# Platform (XMP) content.

# Opens the input PDF, applies the redactions in 'vec' using the overlay
# appearance 'app', and saves the result as a linearized PDF. Nothing is
# written if the document's security handler cannot be initialized.
def Redact(input, output, vec, app)
  doc = PDFDoc.new(input)
  if doc.InitSecurityHandler
    Redactor.Redact(doc, vec, app, false, true)
    doc.Save(output, SDFDoc::E_linearized)
  end
end

# Relative path to the folder containing the test files.
input_path = "../../TestFiles/"
output_path = "../../TestFiles/Output/"

PDFNet.Initialize(PDFTronLicense.Key)

# Each Redaction is given a page number, a rectangle, a flag selecting
# negative redaction, and the overlay text.
vec = [Redaction.new(1, Rect.new(100, 100, 550, 600), false, "Top Secret"),
       Redaction.new(2, Rect.new(30, 30, 450, 450), true, "Negative Redaction"),
       Redaction.new(2, Rect.new(0, 0, 100, 100), false, "Positive"),
       Redaction.new(2, Rect.new(100, 100, 200, 200), false, "Positive"),
       Redaction.new(2, Rect.new(300, 300, 400, 400), false, ""),
       Redaction.new(2, Rect.new(500, 500, 600, 600), false, ""),
       Redaction.new(3, Rect.new(0, 0, 700, 20), false, "")]

# Overlay style settings for the redacted regions.
app = Appearance.new
app.RedactionOverlay = true
app.Border = false
app.ShowRedactedContentRegions = true

Redact(input_path + "newsletter.pdf", output_path + "redacted.pdf", vec, app)
PDFNet.Terminate
puts "Done..."
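
The sample above hard-codes its redaction rectangles, while the comments note that step (a), content identification, can also be done programmatically. The following is a minimal sketch of that idea in plain Ruby and does not call the SDK. It assumes a text extraction step (not shown) has already produced a list of words with page numbers and bounding boxes. `WordBox`, `RegionSpec`, `find_regions` and the one-point padding are hypothetical names and choices made for illustration; they are not part of the Apryse API.

```ruby
# Hypothetical input record: one extracted word with its page number and
# bounding box in page coordinates (x1, y1 = lower-left; x2, y2 = upper-right).
WordBox = Struct.new(:page, :text, :x1, :y1, :x2, :y2)

# Plain-data description of one region to redact. The fields mirror the
# arguments the sample passes to Redaction.new (page, rectangle, negative
# flag, overlay text), but this struct is NOT part of the SDK.
RegionSpec = Struct.new(:page, :x1, :y1, :x2, :y2, :negative, :overlay)

# Returns one RegionSpec per word whose text matches `pattern`. Each box is
# grown by `pad` points on every side (an assumed safety margin).
def find_regions(words, pattern, overlay: "", pad: 1.0)
  words.select { |w| w.text.match?(pattern) }.map do |w|
    RegionSpec.new(w.page, w.x1 - pad, w.y1 - pad, w.x2 + pad, w.y2 + pad,
                   false, overlay)
  end
end

# Made-up example data standing in for the output of a text extractor.
words = [
  WordBox.new(1, "Invoice",     72.0, 700.0, 120.0, 712.0),
  WordBox.new(1, "123-45-6789", 130.0, 700.0, 200.0, 712.0),
  WordBox.new(2, "987-65-4321", 72.0, 650.0, 142.0, 662.0)
]

# Match words that look like a US social security number.
ssn_like = /\A\d{3}-\d{2}-\d{4}\z/
regions = find_regions(words, ssn_like, overlay: "REDACTED")

regions.each do |r|
  puts format("page %d: (%.1f, %.1f) to (%.1f, %.1f) overlay=%s",
              r.page, r.x1, r.y1, r.x2, r.y2, r.overlay)
end

# With the SDK loaded as in the sample above, each spec could then be turned
# into a redaction, for example:
#   vec = regions.map do |r|
#     Redaction.new(r.page, Rect.new(r.x1, r.y1, r.x2, r.y2), r.negative, r.overlay)
#   end
```

Keeping the identified regions as plain data until the last moment makes it possible to review or log them before any content is removed, since step (b) cannot be undone once the output file is saved.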
