Get file data without a viewer

A PDF can be loaded into either a Document or a PDFDoc object. Each object has its own function for acquiring the document data.

Getting data from a Document object

To get document data from a new Document object, use createDocument. The following example retrieves document data from an external URL.

const licenseKey = 'Insert commercial license key here after purchase';
const documentURL = 'Enter Document URL Here';
const extension = 'pdf'; // pass extension in the options if documentURL has no file extension

// create a Core.Document instance
Core.createDocument(documentURL, { l: licenseKey, extension })
  .then(doc => {
    // optionally perform some document processing using read/write operations
    // found under 'Editing Page Content' or 'Page Manipulation'

    // return the promise so that getFileData errors reach the catch below
    return doc.getFileData().then(data => {
      const arr = new Uint8Array(data);
      const blob = new Blob([arr], { type: 'application/pdf' });
      // add code for handling Blob here
    });
  })
  .catch(err => {
    // report the failure instead of silently swallowing it
    console.error(err);
  });
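The "add code for handling Blob here" step depends on the application. As one possible sketch, the helpers below wrap the returned bytes in a Blob and offer it as a browser download. The names toPdfBlob and downloadBlob are hypothetical and not part of the SDK; downloadBlob assumes a browser environment with a DOM.

```javascript
// Hypothetical helper (not part of the SDK): wrap the raw bytes returned by
// getFileData or saveMemoryBuffer in a Blob with the PDF MIME type.
function toPdfBlob(data) {
  const arr = new Uint8Array(data);
  return new Blob([arr], { type: 'application/pdf' });
}

// Hypothetical helper (browser only): offer the Blob to the user as a file
// download by clicking a temporary link that points at an object URL.
function downloadBlob(blob, fileName) {
  const url = URL.createObjectURL(blob);
  const link = document.createElement('a');
  link.href = url;
  link.download = fileName;
  document.body.appendChild(link);
  link.click();
  link.remove();
  // Release the object URL once the click has been dispatched.
  setTimeout(() => URL.revokeObjectURL(url), 0);
}
```

Inside the getFileData callback, this could be used as downloadBlob(toPdfBlob(data), 'output.pdf'), where the file name is a placeholder.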

Document objects also have a getPDFDoc function for retrieving the associated PDFDoc object, which exposes saveMemoryBuffer as another way to get the data. However, getFileData is preferred over saveMemoryBuffer in most cases because it gives easy control over the output and is safer: it ensures everything is loaded and takes care of locks. The only reasons to use saveMemoryBuffer instead are that there isn't an easy way of acquiring a Document object, or that you need other PDFDoc features for controlling the download process.

Getting data from a PDFDoc object

Another way of getting document data is to use the saveMemoryBuffer function found on PDFDoc objects, as in the following example.

JavaScript

const licenseKey = 'Insert commercial license key here after purchase';

async function main() {
  try {
    const documentURL = 'Enter Document URL Here';
    const pdfDoc = await PDFNet.PDFDoc.createFromURL(documentURL);

    pdfDoc.initSecurityHandler();
    pdfDoc.lock();

    // optionally perform some document processing using read write operations
    // found under 'Editing Page Content' or 'Page Manipulation'

    const data = await pdfDoc.saveMemoryBuffer(PDFNet.SDFDoc.SaveOptions.e_remove_unused);
    const arr = new Uint8Array(data);
    const blob = new Blob([arr], { type: 'application/pdf' });
  } catch (err) {
    console.log(err.stack);
  }
}
PDFNet.runWithCleanup(main, licenseKey);

After acquiring a PDFDoc object, use the saveMemoryBuffer function to retrieve the document data. saveMemoryBuffer takes an enum flag, similar to the flags accepted by getFileData. The flags are defined in PDFNet.SDFDoc.SaveOptions:

  • e_remove_unused
  • e_hex_strings
  • e_omit_xref
  • e_linearized
  • e_compatibility

There is also an e_incremental flag, but it is ignored when using saveMemoryBuffer. Because saveMemoryBuffer modifies the PDFDoc data stored in memory, it is best practice to acquire a write lock on the document before calling it.
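If more than one save option is needed, the flags can usually be combined into a single value. The sketch below assumes the SaveOptions values are bit flags that can be joined with a bitwise OR. The SaveOptions object here is a local stand-in so the snippet runs on its own, and its numeric values are assumptions for illustration only; real code should read the flags from PDFNet.SDFDoc.SaveOptions. The combineFlags helper is hypothetical.

```javascript
// Local stand-in for PDFNet.SDFDoc.SaveOptions so this sketch is runnable
// without the SDK. The numeric values are assumed for illustration; do not
// hard-code them in real code.
const SaveOptions = {
  e_incremental: 1,
  e_remove_unused: 2,
  e_hex_strings: 4,
  e_omit_xref: 8,
  e_linearized: 16,
  e_compatibility: 32,
};

// Hypothetical helper: OR any number of flags into one bit field.
function combineFlags(...flags) {
  return flags.reduce((acc, flag) => acc | flag, 0);
}

// Request both unused-object removal and linearized output.
const flags = combineFlags(SaveOptions.e_remove_unused, SaveOptions.e_linearized);

// Check whether a particular option is set in the combined value.
const isLinearized = (flags & SaveOptions.e_linearized) !== 0;
console.log(flags, isLinearized);
```

With the SDK loaded, the combined value would be passed in place of the single flag, for example pdfDoc.saveMemoryBuffer(flags).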
