Compare PDFs using pixel-by-pixel image comparison

If you would prefer to implement your own diffing algorithm, we provide APIs to retrieve image data from documents. You can use these images to compare the pixels between two documents by overlaying them and generating the output image.

Please note that the following code snippets are very generic and assume both documents are the same size and have the same number of pages. Make sure you handle these cases yourself if you plan to use this approach in your own project.
See this link for a working sample.

The setup is similar to the previous example, except this time we don't need to enable the full API. We can rewrite our getDocument function to look like this:

const [doc1, doc2] = await Promise.all([
  Core.createDocument('https://s3.amazonaws.com/pdftron/pdftron/example/test_doc_1.pdf'),
  Core.createDocument('https://s3.amazonaws.com/pdftron/pdftron/example/test_doc_2.pdf')
]);

Now we can write a function to get image data from these documents.

JavaScript

const getImageData = (doc, pageIndex = 0) => {
  return new Promise(resolve => {
    doc.loadCanvasAsync({
      pageIndex,
      drawComplete: (pageCanvas) => {
        const ctx = pageCanvas.getContext('2d');
        const imageData = ctx.getImageData(0, 0, pageCanvas.width, pageCanvas.height);
        resolve(imageData);
      }
    });
  });
};

// get image data for the first page of both our documents
const [imageData1, imageData2] = await Promise.all([
  getImageData(doc1, 0),
  getImageData(doc2, 0)
]);
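
Because these snippets assume both pages have the same dimensions, it can help to verify that before comparing any pixels. The following is a minimal sketch and not part of the WebViewer API: `assertComparable` is a hypothetical helper name, and it relies only on the standard `width`, `height`, and `data` properties of an `ImageData` object.

```javascript
// Hypothetical helper (not a WebViewer API): checks that two ImageData-like
// objects ({ width, height, data }) can be compared index by index.
function assertComparable(a, b) {
  if (a.width !== b.width || a.height !== b.height) {
    throw new Error(
      `Page sizes differ: ${a.width}x${a.height} vs ${b.width}x${b.height}`
    );
  }
  if (a.data.length !== b.data.length) {
    throw new Error('Pixel buffers differ in length');
  }
  return true;
}
```

Calling `assertComparable(imageData1, imageData2)` right after retrieving the image data would surface a size mismatch early. How to recover from one (scaling, padding, or skipping the page) is left to your project.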

Now we can loop over these pixels and compare them however we wish.

JavaScript

// Get the actual pixels from the ImageData object
const pixelData1 = imageData1.data;
const pixelData2 = imageData2.data;

const newImageData = new Uint8ClampedArray(pixelData1.length);

// Each pixel occupies four consecutive entries (r, g, b, a), so step by 4.
// Loop over the pixel array's length; the ImageData object itself has no length.
for (let i = 0; i < pixelData1.length; i += 4) {
  // rgba values for each pixel in imageData1 (document 1)
  const r1 = pixelData1[i];
  const g1 = pixelData1[i + 1];
  const b1 = pixelData1[i + 2];
  const a1 = pixelData1[i + 3];

  // rgba values for each pixel in imageData2 (document 2)
  const r2 = pixelData2[i];
  const g2 = pixelData2[i + 1];
  const b2 = pixelData2[i + 2];
  const a2 = pixelData2[i + 3];

  // Implement your own diffing algorithm here
  newImageData[i] = someDiffFunction(r1, r2);
  newImageData[i + 1] = someDiffFunction(g1, g2);
  newImageData[i + 2] = someDiffFunction(b1, b2);
  newImageData[i + 3] = someDiffFunction(a1, a2);
}

// Here you could create a new canvas with your diffed pixels,
// and open it with WebViewer
const canvas = document.createElement('canvas');
canvas.width = imageData1.width;
canvas.height = imageData1.height;
canvas.getContext('2d').putImageData(new ImageData(newImageData, imageData1.width), 0, 0);
canvas.toBlob((blob) => {
  readerControl.loadDocument(blob, { filename: 'image.png' });
});
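
The snippet above leaves `someDiffFunction` undefined on purpose. As one possible starting point, the sketch below compares whole pixels instead of single channels and paints every differing pixel red. The name `diffPixels`, the `threshold` parameter, and the choice of red are all assumptions made for illustration. The function also assumes both arrays hold RGBA data of the same length.

```javascript
// Hypothetical stand-in for someDiffFunction, applied per pixel rather than
// per channel. Assumes both buffers are RGBA and have the same length.
function diffPixels(pixelData1, pixelData2, threshold = 0) {
  const out = new Uint8ClampedArray(pixelData1.length);
  let changedPixels = 0;

  for (let i = 0; i < pixelData1.length; i += 4) {
    // Largest absolute difference across the four channels of this pixel
    const delta = Math.max(
      Math.abs(pixelData1[i] - pixelData2[i]),
      Math.abs(pixelData1[i + 1] - pixelData2[i + 1]),
      Math.abs(pixelData1[i + 2] - pixelData2[i + 2]),
      Math.abs(pixelData1[i + 3] - pixelData2[i + 3])
    );

    if (delta > threshold) {
      // Differing pixel: mark it as opaque red
      out[i] = 255;
      out[i + 1] = 0;
      out[i + 2] = 0;
      out[i + 3] = 255;
      changedPixels++;
    } else {
      // Matching pixel: copy it unchanged from document 1
      out[i] = pixelData1[i];
      out[i + 1] = pixelData1[i + 1];
      out[i + 2] = pixelData1[i + 2];
      out[i + 3] = pixelData1[i + 3];
    }
  }

  return { data: out, changedPixels };
}
```

The returned `data` array could be passed to `new ImageData(result.data, imageData1.width)` in place of `newImageData`, and `changedPixels` gives a rough measure of how much the two pages differ. A nonzero `threshold` may help ignore small anti-aliasing differences between renders.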

A demo of a similar approach can be found here.
