Convert to PDF/UA - Ruby Sample Code

Sample code for using the Apryse SDK to programmatically convert generic PDF documents into ISO-compliant, veraPDF-valid PDF/UA files. Supports PDF/UA-1. Sample code is provided in Python, C++, C#, Java, Node.js (JavaScript), PHP, Ruby and VB.

Learn more about our Server SDK and PDF/UA Library.

#---------------------------------------------------------------------------------------
# Copyright (c) 2001-2024 by Apryse Software Inc. All Rights Reserved.
# Consult LICENSE.txt regarding license information.
#---------------------------------------------------------------------------------------

require '../../../PDFNetC/Lib/PDFNetRuby'
include PDFNetRuby
require '../../LicenseKey/RUBY/LicenseKey'

$stdout.sync = true

#---------------------------------------------------------------------------------------
# The following sample illustrates how to make sure a file meets the PDF/UA standard, using the PDFUAConformance class object.
# Note: this feature is currently experimental and subject to change
#
# DataExtractionModule is required (Mac users can use StructuredOutputModule instead)
# https://docs.apryse.com/documentation/core/info/modules/#data-extraction-module
# https://docs.apryse.com/documentation/core/info/modules/#structured-output-module (Mac)
#---------------------------------------------------------------------------------------

# Relative path to the folder containing the test files.
$input_path = "../../TestFiles/"
$output_path = "../../TestFiles/Output/"

# DataExtraction library location, replace if desired, should point to a folder that includes the contents of <DataExtractionModuleRoot>/Lib.
# If using default, unzip the DataExtraction zip to the parent folder of Samples, and merge with existing "Lib" folder.
$extraction_module_path = "../../../PDFNetC/Lib/"

def main()
  input_file1 = $input_path + "autotag_input.pdf"
  input_file2 = $input_path + "table.pdf"
  output_file1 = $output_path + "autotag_pdfua.pdf"
  output_file2 = $output_path + "table_pdfua_linearized.pdf"

  PDFNet.Initialize(PDFTronLicense.Key)

  puts "AutoConverting..."

  PDFNet.AddResourceSearchPath($extraction_module_path)

  if !DataExtractionModule.IsModuleAvailable(DataExtractionModule::E_DocStructure) then
    puts ""
    puts "Unable to run Data Extraction: PDFTron SDK Structured Output module not available."
    puts "-----------------------------------------------------------------------------"
    puts "The Data Extraction suite is an optional add-on, available for download"
    puts "at https://docs.apryse.com/documentation/core/info/modules/. If you have already"
    puts "downloaded this module, ensure that the SDK is able to find the required files"
    puts "using the PDFNet.AddResourceSearchPath() function."
    puts ""
    PDFNet.Terminate
    return
  end

  begin
    pdf_ua = PDFUAConformance.new()

    puts "Simple Conversion..."

    # Perform conversion using default options
    pdf_ua.AutoConvert(input_file1, output_file1)

    puts "Converting With Options..."

    pdf_ua_opts = PDFUAOptions.new()
    pdf_ua_opts.SetSaveLinearized(true) # Linearize when saving output
    # Note: if file is password protected, you can use pdf_ua_opts.SetPassword()

    # Perform conversion using the options we specify
    pdf_ua.AutoConvert(input_file2, output_file2, pdf_ua_opts)
  rescue => error
    puts error.message
  end

  PDFNet.Terminate
  puts "PDFUAConformance test completed."
end

main()
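
In the sample above, both AutoConvert calls share a single begin/rescue block, so an error while converting the first file skips the second. When converting several files, it may be preferable to handle errors per file. The following is a minimal sketch of that idea, not part of the Apryse SDK: the helper name convert_each and the shape of the jobs list are assumptions made for illustration, and the actual conversion is passed in as a block so the helper does not depend on any SDK class.

```ruby
# Hypothetical helper (not an Apryse API): runs each conversion job on its own
# so that a failure on one file does not prevent the remaining files from
# being processed.
#
# jobs  - an array of [input_file, output_file] pairs
# block - receives input_file and output_file and performs the conversion
#
# Returns a hash that maps each failed input file to its error message.
def convert_each(jobs)
  failures = {}
  jobs.each do |input_file, output_file|
    begin
      yield input_file, output_file
    rescue StandardError => error
      # Record the failure and continue with the next job.
      failures[input_file] = error.message
    end
  end
  failures
end

# Intended use inside main(), after pdf_ua has been created as in the sample
# (left commented out because it requires the SDK to be loaded):
#
#   jobs = [[input_file1, output_file1], [input_file2, output_file2]]
#   failures = convert_each(jobs) do |input, output|
#     pdf_ua.AutoConvert(input, output)
#   end
#   failures.each { |file, message| puts "#{file}: #{message}" }
```

Passing the conversion as a block keeps the error handling separate from the SDK call, so the same helper could wrap either the default or the options-based form of AutoConvert shown in the sample.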
