Get file data without a viewer

A PDF can be loaded into a Document or PDFDoc object, and each object provides its own function for retrieving the document data.

Getting data from a Document object

The following example creates a new Document object with createDocument and retrieves the document data from an external URL.

const licenseKey = 'Insert commercial license key here after purchase';
const documentURL = 'Enter Document URL Here';
const extension = 'pdf'; // pass extension to option if there is no file extension in documentURL

// create Core.Document instance
Core.createDocument(documentURL, { l: licenseKey, extension })
  .then(doc => {

    // optionally perform some document processing using read write operations 
    // found under 'Editing Page Content' or 'Page Manipulation'

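    // return the promise so a getFileData failure reaches the catch below
    return doc.getFileData().then(data => {
      const arr = new Uint8Array(data);
      const blob = new Blob([arr], { type: 'application/pdf' });
      // add code for handling Blob here
    });
  })
  .catch(err => {
    // log failures from createDocument or getFileData instead of swallowing them
    console.error(err);
  });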

Document objects also have a getPDFDoc function for retrieving the associated PDFDoc object, on which saveMemoryBuffer can be called. However, getFileData is preferred over saveMemoryBuffer in most cases because it gives easy control over the output and is safer: it ensures everything is loaded and takes care of locks. The only reasons to use saveMemoryBuffer instead are that there isn't an easy way of acquiring a Document object, or that you need other PDFDoc features for controlling the download process.
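The createDocument example above leaves the Blob handling open. One possible way to fill it in is sketched below, assuming a browser-like environment with the standard Blob and URL APIs. The helper names toPdfBlob and downloadBlob and the default filename are hypothetical and not part of the SDK, and the sample bytes stand in for the data that getFileData would return.

```javascript
// Wrap raw document bytes (an ArrayBuffer or typed array) in a PDF Blob.
// toPdfBlob is a hypothetical helper, not an SDK function.
function toPdfBlob(data) {
  return new Blob([new Uint8Array(data)], { type: 'application/pdf' });
}

// Browser-only: offer the Blob to the user as a file download.
// downloadBlob and the default filename are placeholders for illustration.
function downloadBlob(blob, filename = 'document.pdf') {
  const url = URL.createObjectURL(blob);
  const link = document.createElement('a');
  link.href = url;
  link.download = filename;
  document.body.appendChild(link);
  link.click();
  link.remove();
  URL.revokeObjectURL(url);
}

// Stand-in bytes ("%PDF") so the sketch runs without a real document.
const sampleData = new Uint8Array([0x25, 0x50, 0x44, 0x46]).buffer;
const sampleBlob = toPdfBlob(sampleData);
console.log(sampleBlob.size, sampleBlob.type);
```

Inside the getFileData callback, the same helpers could be called as downloadBlob(toPdfBlob(data)). Uploading the Blob to a server instead of downloading it would follow the same pattern.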

Getting data from a PDFDoc object

Make sure you have Full API enabled in WebViewer.

Another way of getting document data is to use the saveMemoryBuffer function found on PDFDoc objects, as in the example below:

const licenseKey = 'Insert commercial license key here after purchase';

async function main() {
  try {
    const documentURL = 'Enter Document URL Here';
    const pdfDoc = await PDFNet.PDFDoc.createFromURL(documentURL);
    
    pdfDoc.initSecurityHandler();
    pdfDoc.lock();
    
    // optionally perform some document processing using read write operations 
    // found under 'Editing Page Content' or 'Page Manipulation'

    const data = await pdfDoc.saveMemoryBuffer(PDFNet.SDFDoc.SaveOptions.e_remove_unused);
    const arr = new Uint8Array(data);
    const blob = new Blob([arr], { type: 'application/pdf' });
    // add code for handling Blob here
  } catch(err) {
    console.log(err.stack);
  }
}
PDFNet.runWithCleanup(main, licenseKey);

After acquiring a PDFDoc object, use the saveMemoryBuffer function to retrieve the document data. saveMemoryBuffer takes an enum flag, similar to the flags accepted by getFileData. The flags are defined in PDFNet.SDFDoc.SaveOptions:

  • e_remove_unused
  • e_hex_strings
  • e_omit_xref
  • e_linearized
  • e_compatibility

There is also an e_incremental flag, but it is ignored when using saveMemoryBuffer.

saveMemoryBuffer modifies the PDFDoc data stored in memory, so it is best practice to acquire a write lock on the document before calling it.
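As a rough illustration of how these flags might be combined, the sketch below treats them as bit flags joined with bitwise OR. It assumes the SaveOptions values are distinct powers of two. The SaveOptions object here is a stand-in with assumed numeric values so that the snippet runs without the SDK, and combineFlags and hasFlag are hypothetical helpers, not part of PDFNet.

```javascript
// Stand-in for PDFNet.SDFDoc.SaveOptions so the sketch runs without the SDK.
// The numeric values are assumptions for illustration; use the real enum in practice.
const SaveOptions = {
  e_incremental: 1,
  e_remove_unused: 2,
  e_hex_strings: 4,
  e_omit_xref: 8,
  e_linearized: 16,
  e_compatibility: 32,
};

// Combine any number of flags into a single bit mask (hypothetical helper).
function combineFlags(...flags) {
  return flags.reduce((mask, flag) => mask | flag, 0);
}

// Check whether a mask contains a given flag (hypothetical helper).
function hasFlag(mask, flag) {
  return (mask & flag) === flag;
}

// Request both unused-object removal and linearization in one value.
const flags = combineFlags(SaveOptions.e_remove_unused, SaveOptions.e_linearized);
console.log(hasFlag(flags, SaveOptions.e_linearized));
```

With the real SDK, the equivalent value would be built from PDFNet.SDFDoc.SaveOptions.e_remove_unused | PDFNet.SDFDoc.SaveOptions.e_linearized and passed to saveMemoryBuffer in place of the single flag used in the example above.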
