Compress PDF Image JBIG2 - Ruby Sample Code

Sample code for using Apryse SDK to recompress bitonal (black and white) images in existing PDF documents using JBIG2 compression (lossless or lossy). The sample shows how to specify hint information for the image encoder; it is not meant to be a generic PDF optimization tool. To demonstrate the achievable compression, we recompressed a document containing 17 scanned pages. The original input document is ~1.4MB and uses standard CCITT Fax compression. Lossless JBIG2 compression shrank the file size to 641KB, while lossy JBIG2 compression shrank it to 176KB.
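To put the quoted sizes in perspective, the following minimal sketch works out the approximate savings. It assumes the "~1.4MB" original is about 1400KB, which is a rounding of the figure above and not a measured value; `percent_saved` is a hypothetical helper and not part of the SDK.

```ruby
# Approximate size of the original CCITT Fax document, in KB.
# Assumption: "~1.4MB" is treated as 1400 KB for illustration only.
ORIGINAL_KB = 1400.0

# Hypothetical helper: percentage by which the file shrank, to one decimal.
def percent_saved(original_kb, compressed_kb)
  ((1.0 - compressed_kb.to_f / original_kb) * 100).round(1)
end

# Output sizes quoted in the description above.
{ "lossless JBIG2" => 641, "lossy JBIG2" => 176 }.each do |label, kb|
  puts "#{label}: #{kb} KB, roughly #{percent_saved(ORIGINAL_KB, kb)}% smaller"
end
```

Under that assumption, lossless recompression saves a little over half of the file and lossy recompression saves well over four fifths; the exact figures depend on the true size of the original.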

Learn more about our Server SDK.

#---------------------------------------------------------------------------------------
# Copyright (c) 2001-2023 by Apryse Software Inc. All Rights Reserved.
# Consult LICENSE.txt regarding license information.
#---------------------------------------------------------------------------------------

require '../../../PDFNetC/Lib/PDFNetRuby'
include PDFNetRuby
require '../../LicenseKey/RUBY/LicenseKey'

$stdout.sync = true

# This sample project illustrates how to recompress bi-tonal images in an
# existing PDF document using JBIG2 compression. The sample is not intended
# to be a generic PDF optimization tool.
#
# You can download the entire document using the following link:
# http://www.pdftron.com/net/samplecode/data/US061222892.pdf

PDFNet.Initialize(PDFTronLicense.Key)

pdf_doc = PDFDoc.new("../../TestFiles/US061222892-a.pdf")
pdf_doc.InitSecurityHandler

cos_doc = pdf_doc.GetSDFDoc
num_objs = cos_doc.XRefSize

# Walk every entry in the cross-reference table, starting at object 1
i = 1
while i < num_objs do
  obj = cos_doc.GetObj(i)
  if !obj.nil? and !obj.IsFree and obj.IsStream
    # Process only images
    itr = obj.Find("Subtype")
    if !itr.HasNext or itr.Value.GetName != "Image"
      i = i + 1
      next
    end

    input_image = Image.new(obj)
    # Process only gray-scale images
    if input_image.GetComponentNum != 1
      i = i + 1
      next
    end

    # Skip images that are already compressed using JBIG2
    itr = obj.Find("Filter")
    if itr.HasNext and itr.Value.IsName and itr.Value.GetName == "JBIG2Decode"
      i = i + 1
      next
    end

    # Read the decoded (uncompressed) image data
    filter = obj.GetDecodedStream
    reader = FilterReader.new(filter)

    hint_set = ObjSet.new # hint to image encoder to use JBIG2 compression
    hint = hint_set.CreateArray

    hint.PushBackName("JBIG2")
    hint.PushBackName("Lossless")

    new_image = Image.Create(cos_doc, reader,
                             input_image.GetImageWidth,
                             input_image.GetImageHeight,
                             1,
                             ColorSpace.CreateDeviceGray,
                             hint)

    # Carry over the entries that affect how the image is rendered
    new_img_obj = new_image.GetSDFObj
    itr = obj.Find("Decode")
    if itr.HasNext
      new_img_obj.Put("Decode", itr.Value)
    end
    itr = obj.Find("ImageMask")
    if itr.HasNext
      new_img_obj.Put("ImageMask", itr.Value)
    end
    itr = obj.Find("Mask")
    if itr.HasNext
      new_img_obj.Put("Mask", itr.Value)
    end

    # Swap object numbers so existing references point at the new image
    cos_doc.Swap(i, new_img_obj.GetObjNum)
  end
  i = i + 1
end

pdf_doc.Save("../../TestFiles/Output/US061222892_JBIG2.pdf", SDFDoc::E_remove_unused)
pdf_doc.Close
PDFNet.Terminate
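
The three checks that decide whether a stream is recompressed can be read as a single predicate. The sketch below restates them in plain Ruby, with no SDK calls, so the selection logic can be tried in isolation. `ImageInfo` and `recompress_candidate?` are hypothetical names introduced for illustration, and the `filter` field is simplified to a single name (the sample only skips a stream when its Filter entry is a name equal to JBIG2Decode).

```ruby
# Hypothetical stand-in for the values the sample reads from each stream
# object: the Subtype name, the number of color components, and the
# Filter name. This is not an SDK type.
ImageInfo = Struct.new(:subtype, :components, :filter, keyword_init: true)

# Returns true when a stream would be recompressed by the sample's rules.
def recompress_candidate?(info)
  return false unless info.subtype == "Image"   # process only images
  return false unless info.components == 1      # single-component images only
  return false if info.filter == "JBIG2Decode"  # already JBIG2-compressed
  true
end

# Illustrative inputs, not taken from a real document.
streams = [
  ImageInfo.new(subtype: "Image", components: 1, filter: "CCITTFaxDecode"),
  ImageInfo.new(subtype: "Image", components: 3, filter: "DCTDecode"),
  ImageInfo.new(subtype: "Image", components: 1, filter: "JBIG2Decode"),
  ImageInfo.new(subtype: "Form",  components: nil, filter: "FlateDecode")
]

streams.each do |s|
  verdict = recompress_candidate?(s) ? "recompress" : "skip"
  puts "#{s.subtype}/#{s.filter}: #{verdict}"
end
```

Only the first entry passes all three checks; the color image, the image that is already JBIG2, and the non-image stream are skipped.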
