Compare PDFs by overlaying them and creating a comparison PDF

WebViewer can take two PDF files and output the visual difference between them by overlaying the PDFs and generating a new PDF from the result. The generated PDF preserves text and searchability. This is useful when you want to see the difference between two versions of a document (two revisions of a blueprint, for example). Check out the demo.

In our config.js file (see this guide for more information on config files), we first wait for WebViewer to finish initializing, which is signaled by the viewerLoaded event. Once that event fires, we can initialize the full API and load the documents into memory.

We'll start by writing a function that takes a URL and resolves with a document, and then use that function to load two sample documents.

The following code snippets are written using ES6+ syntax, which will only work in modern browsers. You may, however, transpile this code down to ES5 to ensure proper browser support. See this guide for more details.

window.addEventListener('viewerLoaded', async () => {
  // initialize PDFNet
  await PDFNet.initialize('Insert commercial license key here after purchase');

  // load the file at `url` and resolve with its underlying PDFDoc
  const getDocument = async (url) => {
    const newDoc = await Core.createDocument(url);
    return await newDoc.getPDFDoc();
  };

  const [doc1, doc2] = await Promise.all([
    getDocument('https://s3.amazonaws.com/pdftron/pdftron/example/test_doc_1.pdf'),
    getDocument('https://s3.amazonaws.com/pdftron/pdftron/example/test_doc_2.pdf')
  ]);
});
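Promise.all rejects as soon as either download fails, and the error does not say which URL was at fault. The sketch below shows one possible way to report every failed URL together. It assumes a loader with the same shape as getDocument above; loadAllDocuments is a hypothetical helper name used for illustration and is not part of the WebViewer API.

```javascript
// Hypothetical helper: loads every URL with the supplied loader and
// reports all failures together instead of stopping at the first one.
// `loadOne` is assumed to behave like `getDocument`: (url) => Promise<doc>.
const loadAllDocuments = async (urls, loadOne) => {
  const results = await Promise.allSettled(urls.map((url) => loadOne(url)));

  // Keep each failed result next to the URL that produced it.
  const failures = results
    .map((result, i) => ({ result, url: urls[i] }))
    .filter(({ result }) => result.status === 'rejected');

  if (failures.length > 0) {
    const details = failures
      .map(({ result, url }) => `${url}: ${result.reason}`)
      .join('; ');
    throw new Error(`Failed to load ${failures.length} document(s): ${details}`);
  }

  // Results come back in the same order as `urls`.
  return results.map((result) => result.value);
};
```

Inside the viewerLoaded handler this could be called as `const [doc1, doc2] = await loadAllDocuments([urlA, urlB], getDocument);`, where urlA and urlB stand for the two sample URLs.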

Now we need to get the pages that we want to diff. In this example, we will diff all pages. We'll write a helper function to help us get the pages into an array, and then use that function to get the pages for both our documents.

JavaScript

// inside `viewerLoaded`
const getPageArray = async (doc) => {
  const arr = [];
  // start iterating at page 1
  const itr = await doc.getPageIterator(1);

  // await each step so the iterator has advanced before it is checked again
  for (itr; await itr.hasNext(); await itr.next()) {
    const page = await itr.current();
    arr.push(page);
  }

  return arr;
};

const [doc1Pages, doc2Pages] = await Promise.all([
  getPageArray(doc1),
  getPageArray(doc2)
]);
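To try the iteration pattern without a running WebViewer instance, the sketch below drains any object that exposes the same three async methods used by getPageArray (hasNext, current and next). Both drainIterator and makeFakeIterator are hypothetical names; the fake iterator is only an in-memory stand-in and makes no claim about how the PDFNet iterator is implemented.

```javascript
// Drains an iterator that exposes async hasNext(), current() and next(),
// the three methods used by getPageArray.
const drainIterator = async (itr) => {
  const items = [];
  while (await itr.hasNext()) {
    items.push(await itr.current());
    await itr.next(); // advance only after the current item is stored
  }
  return items;
};

// Hypothetical in-memory stand-in for a page iterator (illustration only).
const makeFakeIterator = (values) => {
  let index = 0;
  return {
    hasNext: async () => index < values.length,
    current: async () => values[index],
    next: async () => { index += 1; },
  };
};
```

Calling `drainIterator(makeFakeIterator(['p1', 'p2']))` resolves to an array holding both items in order, and an empty input resolves to an empty array.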

Now we can create a new blank document, and fill it with the diffed images from our two documents. Once that is done, we can tell WebViewer to display this new diffed document.

// inside `viewerLoaded`
const newDoc = await PDFNet.PDFDoc.create();
await newDoc.lock();

// we'll loop over the doc with the most pages
const biggestLength = Math.max(doc1Pages.length, doc2Pages.length);

for (let i = 0; i < biggestLength; i++) {
  let page1 = doc1Pages[i];
  let page2 = doc2Pages[i];

  // handle the case where one document has more pages than the other
  if (!page1) {
    page1 = await doc1.pageCreate(); // create a blank page
  }
  if (!page2) {
    page2 = await doc2.pageCreate(); // create a blank page
  }
  await newDoc.appendVisualDiff(page1, page2);
}

await newDoc.unlock();

// display the document!
// instance is a global variable that's automatically defined inside the config file.
instance.UI.loadDocument(newDoc);
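The padding and locking logic can also be factored into two small helpers, which makes the page-count mismatch easy to test and guarantees the unlock call runs even if a diff step throws. This is a minimal sketch under stated assumptions: pairPages and withLock are hypothetical names, the blank-page factories are assumed to be async functions such as `() => doc1.pageCreate()`, and `doc` is assumed to expose async lock() and unlock() methods.

```javascript
// Pairs pages by index and pads the shorter list with blank pages.
// `makeBlankA` / `makeBlankB` are assumed to be async factories,
// for example () => doc1.pageCreate().
const pairPages = async (pagesA, pagesB, makeBlankA, makeBlankB) => {
  const length = Math.max(pagesA.length, pagesB.length);
  const pairs = [];
  for (let i = 0; i < length; i++) {
    const a = i < pagesA.length ? pagesA[i] : await makeBlankA();
    const b = i < pagesB.length ? pagesB[i] : await makeBlankB();
    pairs.push([a, b]);
  }
  return pairs;
};

// Runs `work` while the document is locked and always unlocks afterwards,
// even if `work` throws. Assumes `doc` exposes async lock() and unlock().
const withLock = async (doc, work) => {
  await doc.lock();
  try {
    return await work();
  } finally {
    await doc.unlock();
  }
};
```

With these in place, the loop body reduces to calling appendVisualDiff on each pair returned by pairPages, wrapped in a single withLock call on the new document.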

The full code sample should look like this:

JavaScript (v8.0+)

WebViewer({
  fullAPI: true,
  path: '/lib',
}, document.getElementById('viewer')).then(instance => {

  const { PDFNet } = instance.Core;

  instance.UI.addEventListener('viewerLoaded', async () => {
    // initialize PDFNet
    await PDFNet.initialize();

    // load the file at `url` and resolve with its underlying PDFDoc
    const getDocument = async (url) => {
      const newDoc = await instance.Core.createDocument(url);
      return await newDoc.getPDFDoc();
    };

    const [doc1, doc2] = await Promise.all([
      getDocument('https://s3.amazonaws.com/pdftron/pdftron/example/test_doc_1.pdf'),
      getDocument('https://s3.amazonaws.com/pdftron/pdftron/example/test_doc_2.pdf')
    ]);

    // collect every page of a document into an array
    const getPageArray = async (doc) => {
      const arr = [];
      const itr = await doc.getPageIterator(1);

      for (itr; await itr.hasNext(); await itr.next()) {
        const page = await itr.current();
        arr.push(page);
      }

      return arr;
    };

    const [doc1Pages, doc2Pages] = await Promise.all([
      getPageArray(doc1),
      getPageArray(doc2)
    ]);

    console.log(doc1Pages, doc2Pages);

    const newDoc = await PDFNet.PDFDoc.create();
    await newDoc.lock();

    // we'll loop over the doc with the most pages
    const biggestLength = Math.max(doc1Pages.length, doc2Pages.length);

    for (let i = 0; i < biggestLength; i++) {
      let page1 = doc1Pages[i];
      let page2 = doc2Pages[i];

      // handle the case where one document has more pages than the other
      if (!page1) {
        page1 = await doc1.pageCreate(); // create a blank page
      }
      if (!page2) {
        page2 = await doc2.pageCreate(); // create a blank page
      }
      await newDoc.appendVisualDiff(page1, page2);
    }

    await newDoc.unlock();

    // display the document!
    // here, instance is the WebViewer instance passed to the `then` callback above.
    instance.UI.loadDocument(newDoc);

  });
});

WebViewer should now display the diffed document, like the image below.

[Image: the diffed document displayed in WebViewer]

In this example:

  • Blue represents content that is in document one and not document two.
  • Red represents content in document two that is not in document one.
  • Black represents overlap.

Behind the scenes, WebViewer blends the two documents using the Porter/Duff 'darken' operator and displays the output.
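The color legend follows from how a 'darken' blend works: for each color channel, the result is the smaller of the two input values. The sketch below illustrates this with plain RGB triples. It assumes that content from document one is drawn in pure blue, content from document two in pure red, and empty areas in white; these tints are inferred from the legend and are not a documented rendering detail.

```javascript
// Darken blend: keep the darker (smaller) value of each RGB channel.
const darken = (a, b) => a.map((channel, i) => Math.min(channel, b[i]));

const WHITE = [255, 255, 255]; // empty page area
const BLUE = [0, 0, 255];      // assumed tint for document one
const RED = [255, 0, 0];       // assumed tint for document two

darken(BLUE, WHITE); // [0, 0, 255]  content only in document one stays blue
darken(WHITE, RED);  // [255, 0, 0]  content only in document two stays red
darken(BLUE, RED);   // [0, 0, 0]    overlapping content turns black
```

Because the operation takes a per-channel minimum, the order of the two inputs does not matter, which is why swapping the documents only swaps the blue and red regions.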
