Open a PDF in Python

To open a PDF document:

# assumption: the SDK is installed as the "apryse_sdk" package
from apryse_sdk import PDFDoc, MappedFile, FilterReader

# open document from the filesystem
doc = PDFDoc(filename)

# optionally read a PDF document from a stream
file = MappedFile(filename)
doc_stream = PDFDoc(file)

# or pass in a memory buffer
file_sz = file.FileSize()
file_reader = FilterReader(file)
mem = file_reader.Read(file_sz)
doc_mem = PDFDoc(bytearray(mem), file_sz)

# loading from a URL requires an additional
# module, e.g. "requests"
import requests

url = 'https://myserver.com/myfile.pdf'
file_content = requests.get(url)
# stop here on an HTTP error instead of parsing an error page as a PDF
file_content.raise_for_status()
doc_url = PDFDoc(bytearray(file_content.content), len(file_content.content))
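
A download can fail or return something other than a PDF (an HTML error page, for instance). The following is a minimal sketch of a more defensive download. It uses the standard library's urllib.request in place of requests so that it runs without extra packages. fetch_pdf_bytes and its opener parameter are hypothetical names introduced for illustration; they are not part of Apryse SDK.

```python
import urllib.request


def fetch_pdf_bytes(url, timeout=10, opener=urllib.request.urlopen):
    """Download url and return its body as a bytearray.

    Hypothetical helper, not part of Apryse SDK. The opener parameter
    exists only so the network call can be swapped out in tests.
    """
    # With the default opener, urlopen raises URLError or HTTPError on
    # connection problems and on HTTP error statuses.
    with opener(url, timeout=timeout) as response:
        data = response.read()
    # A well-formed PDF file begins with the marker "%PDF-".
    if not data.startswith(b"%PDF-"):
        raise ValueError("response from %s does not look like a PDF" % url)
    return bytearray(data)
```

The returned buffer can then be handed to the constructor in the same way as in the snippet above, for example buf = fetch_pdf_bytes(url) followed by PDFDoc(buf, len(buf)).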

Read & write a PDF file from/to memory buffer
Full source code illustrating how to read and write a PDF document from and to a memory buffer. This is useful for applications that work with dynamic PDF documents that never need to be saved to or read from disk.

About opening a document

The PDFDoc constructor creates a PDF document from scratch:

doc = PDFDoc()

PDFDoc.Close()
When you are finished with a PDFDoc object, the PDFDoc.Close() method should be called to clean up memory, file handles, and resources.
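
One way to make sure Close() runs even when an exception interrupts processing is a small context manager. This is only a sketch: closing_doc is a hypothetical helper, not part of Apryse SDK, and it assumes nothing about the object beyond the Close() method described here.

```python
from contextlib import contextmanager


@contextmanager
def closing_doc(doc):
    """Yield doc, then call doc.Close() on exit, even after an error.

    Hypothetical helper: works with any object that exposes Close().
    """
    try:
        yield doc
    finally:
        doc.Close()
```

With this helper, a block such as "with closing_doc(PDFDoc()) as doc:" releases the document when the block ends. The standard contextlib.closing does not fit here because it calls close() in lowercase, whereas the method described above is spelled Close().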

A newly-created document does not yet contain any pages. See the accessing pages section for details on creating new pages and working with existing pages.

Using Apryse SDK, you can open a document from a serialized file, from a memory buffer, or from a Filter stream.

To open an existing PDF document from a file, specify its file path in the PDFDoc constructor:

doc = PDFDoc(filename)

Here's how to open an existing PDF document from a memory buffer:

file = MappedFile(filename)
file_sz = file.FileSize()
file_reader = FilterReader(file)
mem = file_reader.Read(file_sz)
doc = PDFDoc(bytearray(mem), file_sz)
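
The buffer does not have to come from MappedFile and FilterReader. As a sketch, assuming the bytes are read with Python's built-in open(), the helper below returns the two values that the buffer-based constructor call above takes. read_pdf_buffer is a hypothetical name, not part of Apryse SDK.

```python
def read_pdf_buffer(path):
    """Read a whole file into memory and return (buffer, size).

    Hypothetical helper, not part of Apryse SDK.
    """
    with open(path, "rb") as handle:
        buffer = bytearray(handle.read())
    return buffer, len(buffer)
```

A call such as buf, size = read_pdf_buffer(filename) can then be followed by PDFDoc(buf, size), mirroring the snippet above.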

It's also easy to open a PDF document from a MemoryFilter or a custom Filter.

After creating a PDFDoc object, it's good practice to call InitSecurityHandler() on it. If the document is encrypted, calling the method will decrypt it. If the document is not encrypted, calling the method is harmless.

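# wrapped in a function so that the early return is valid Python
def open_document(filename):
    doc = PDFDoc(filename)
    if not doc.InitSecurityHandler():
        print("Document authentication error...")
        return None
    return doc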
