PDF2Office - Convert PDF to DOCX, XLSX - Ruby Sample Code

Sample code for using the Apryse SDK to programmatically convert generic PDF documents to Word, Excel, and PowerPoint. Samples are provided in Python, C++, C#, Go, Java, Node.js (JavaScript), PHP, Ruby, and VB.

To run this sample:

  1. Complete the Get started with Server SDK process in your language/framework.
  2. Download the Structured Output Module.
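
Before running the sample, it can help to confirm that the files it expects are where it looks for them. The following is a minimal sketch, not part of the Apryse SDK: `missing_paths` and `REQUIRED_PATHS` are hypothetical names, the relative paths are copied from the sample's own `require` and search-path lines, and the `.rb` suffix on the license key file is an assumption.

```ruby
# Hypothetical preflight helper; not part of the Apryse SDK.
# Returns the entries of `paths` that do not exist when resolved
# relative to `base_dir` (the sample's own directory by default).
def missing_paths(paths, base_dir = Dir.pwd)
  paths.reject { |path| File.exist?(File.expand_path(path, base_dir)) }
end

# Locations the sample refers to. The ".rb" suffix is an assumption;
# adjust the list to match your own SDK layout.
REQUIRED_PATHS = [
  "../../../PDFNetC/Lib",
  "../../LicenseKey/RUBY/LicenseKey.rb",
  "../../TestFiles/paragraphs_and_tables.pdf"
].freeze

# Report anything that is missing; prints nothing when all paths exist.
missing_paths(REQUIRED_PATHS).each do |path|
  puts "Missing: #{path}"
end
```

Run it from the same directory as the sample so the relative paths resolve the same way.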

Learn more about our Server SDK and PDF to Office Conversion

#---------------------------------------------------------------------------------------
# Copyright (c) 2001-2023 by Apryse Software Inc. All Rights Reserved.
# Consult LICENSE.txt regarding license information.
#---------------------------------------------------------------------------------------

require '../../../PDFNetC/Lib/PDFNetRuby'
include PDFNetRuby
require '../../LicenseKey/RUBY/LicenseKey'

$stdout.sync = true

#---------------------------------------------------------------------------------------
# The following sample illustrates how to use the PDF.Convert utility class to convert
# documents and files to Word, Excel and PowerPoint.
#
# The Structured Output module is an optional PDFNet Add-on that can be used to convert PDF
# and other documents into Word, Excel, PowerPoint and HTML format.
#
# The PDFTron SDK Structured Output module can be downloaded from
# https://docs.apryse.com/core/info/modules/
#
# Please contact us if you have any questions.
#---------------------------------------------------------------------------------------

# Relative path to the folder containing the test files.
$inputPath = "../../TestFiles/"
$outputPath = "../../TestFiles/Output/"

def main()
    # The first step in every application using PDFNet is to initialize the
    # library. The library is usually initialized only once, but calling
    # Initialize() multiple times is also fine.
    PDFNet.Initialize(PDFTronLicense.Key)

    PDFNet.AddResourceSearchPath("../../../PDFNetC/Lib/")

    if !StructuredOutputModule.IsModuleAvailable() then
        puts ""
        puts "Unable to run the sample: PDFTron SDK Structured Output module not available."
        puts "-----------------------------------------------------------------------------"
        puts "The Structured Output module is an optional add-on, available for download"
        puts "at https://docs.apryse.com/core/info/modules/. If you have already"
        puts "downloaded this module, ensure that the SDK is able to find the required files"
        puts "using the PDFNet::AddResourceSearchPath() function."
        puts ""
        return
    end

    #-----------------------------------------------------------------------------------

    begin
        # Convert PDF document to Word
        puts "Converting PDF to Word"

        $outputFile = $outputPath + "paragraphs_and_tables.docx"

        Convert.ToWord($inputPath + "paragraphs_and_tables.pdf", $outputFile)

        puts "Result saved in " + $outputFile
    rescue => error
        puts "Unable to convert PDF document to Word, error: " + error.message
    end

    #-----------------------------------------------------------------------------------

    begin
        # Convert PDF document to Word with options
        puts "Converting PDF to Word with options"

        $outputFile = $outputPath + "paragraphs_and_tables_first_page.docx"

        $wordOutputOptions = Convert::WordOutputOptions.new()

        # Convert only the first page
        $wordOutputOptions.SetPages(1, 1)

        Convert.ToWord($inputPath + "paragraphs_and_tables.pdf", $outputFile, $wordOutputOptions)
        puts "Result saved in " + $outputFile
    rescue => error
        puts "Unable to convert PDF document to Word, error: " + error.message
    end

    #-----------------------------------------------------------------------------------

    begin
        # Convert PDF document to Excel
        puts "Converting PDF to Excel"

        $outputFile = $outputPath + "paragraphs_and_tables.xlsx"

        Convert.ToExcel($inputPath + "paragraphs_and_tables.pdf", $outputFile)

        puts "Result saved in " + $outputFile
    rescue => error
        puts "Unable to convert PDF document to Excel, error: " + error.message
    end

    #-----------------------------------------------------------------------------------

    begin
        # Convert PDF document to Excel with options
        puts "Converting PDF to Excel with options"

        $outputFile = $outputPath + "paragraphs_and_tables_second_page.xlsx"

        $excelOutputOptions = Convert::ExcelOutputOptions.new()

        # Convert only the second page
        $excelOutputOptions.SetPages(2, 2)

        Convert.ToExcel($inputPath + "paragraphs_and_tables.pdf", $outputFile, $excelOutputOptions)
        puts "Result saved in " + $outputFile
    rescue => error
        puts "Unable to convert PDF document to Excel, error: " + error.message
    end

    #-----------------------------------------------------------------------------------

    begin
        # Convert PDF document to PowerPoint
        puts "Converting PDF to PowerPoint"

        $outputFile = $outputPath + "paragraphs_and_tables.pptx"

        Convert.ToPowerPoint($inputPath + "paragraphs_and_tables.pdf", $outputFile)

        puts "Result saved in " + $outputFile
    rescue => error
        puts "Unable to convert PDF document to PowerPoint, error: " + error.message
    end

    #-----------------------------------------------------------------------------------

    begin
        # Convert PDF document to PowerPoint with options
        puts "Converting PDF to PowerPoint with options"

        $outputFile = $outputPath + "paragraphs_and_tables_first_page.pptx"

        $powerPointOutputOptions = Convert::PowerPointOutputOptions.new()

        # Convert only the first page
        $powerPointOutputOptions.SetPages(1, 1)

        Convert.ToPowerPoint($inputPath + "paragraphs_and_tables.pdf", $outputFile, $powerPointOutputOptions)
        puts "Result saved in " + $outputFile
    rescue => error
        puts "Unable to convert PDF document to PowerPoint, error: " + error.message
    end

    #-----------------------------------------------------------------------------------

    PDFNet.Terminate
    puts "Done."
end

main()
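
The sample repeats the same begin/rescue pattern for each conversion so that one failed step does not stop the others. One possible way to factor that pattern out is sketched below. `run_conversion` is a hypothetical helper, not part of the Apryse SDK, and the `fake_convert` lambda stands in for the real `Convert.ToWord`, `Convert.ToExcel`, and `Convert.ToPowerPoint` calls so that the sketch runs without the SDK installed.

```ruby
# Hypothetical helper (not part of the Apryse SDK): runs one conversion step,
# prints progress messages similar to the sample's, and reports a failure
# without aborting the remaining steps. Returns true on success, false on error.
def run_conversion(label, output_file)
  puts "Converting PDF to #{label}"
  yield output_file
  puts "Result saved in #{output_file}"
  true
rescue => error
  puts "Unable to convert PDF document to #{label}, error: #{error.message}"
  false
end

# Stand-in for an SDK call; it fails on an empty output path so that the
# error branch is exercised. In the real sample the block body would be,
# for example, Convert.ToWord(input_file, out).
fake_convert = ->(out) { raise ArgumentError, "empty output path" if out.empty? }

# Each entry pairs a label with an output file; the second one fails on purpose.
steps = [
  ["Word", "paragraphs_and_tables.docx"],
  ["Excel", ""]
]

results = steps.map do |label, out|
  run_conversion(label, out) { |path| fake_convert.call(path) }
end

puts "Succeeded: #{results.count(true)} of #{results.length}"
```

Returning a boolean from each step lets the caller decide whether a partial failure should change the script's exit status, which the original sample does not do.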
