Version 11.8.0 Changelog (October 8th, 2025)

New In This Release

New Features

  • Added a new document classification engine to the Data Extraction Module (DataExtractionModule.e_DocClassification). Document classification is an AI-trained SDK API that assigns each uploaded file to one of 19 predefined categories, so you can validate intake, route files to the right workflow, or add metadata for later processing. The output is structured JSON containing the predicted label and a confidence score, for easy integration into your solution.
  • Added CAD Title Block Extraction, an AI-powered API that automatically detects and extracts key information from engineering drawings. Building on our Key-Value Extraction, it is designed to help architecture, engineering, and construction (AEC) projects by eliminating time-consuming manual searches and data entry.
  • Node 24 is now supported.
  • Python 3.13 is now supported for DataExtraction.

See our latest Release Notes for more about our new Smart Data Extraction with Server features, including supporting documentation to use document classification and CAD title block extraction.
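
As a rough illustration of the "route to the right workflow" use case, the sketch below consumes a classification result and picks a workflow. The JSON field names (`label`, `confidence`), the category names, and the workflow names are all hypothetical placeholders, not the SDK's actual schema; check the Data Extraction documentation for the real output format.

```python
import json

# Hypothetical mapping from predicted label to workflow name.
# The labels here are placeholders, not the SDK's real category list.
ROUTES = {"invoice": "accounts-payable", "resume": "hr-intake"}


def route_document(result_json, min_confidence=0.8):
    """Pick a workflow from a classification result, or fall back to manual review."""
    result = json.loads(result_json)
    label = result["label"]            # assumed field name
    confidence = result["confidence"]  # assumed field name, 0.0 to 1.0
    # Low-confidence or unmapped predictions go to a human instead of a workflow.
    if confidence < min_confidence or label not in ROUTES:
        return "manual-review"
    return ROUTES[label]


sample = '{"label": "invoice", "confidence": 0.93}'
print(route_document(sample))
```

Sending low-confidence predictions to manual review is one simple way to use the confidence score; the 0.8 threshold is arbitrary and would need tuning against real documents.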

New Options

  • Added an option to the Generic Key-Value extraction engine to control whether empty fields are recognized (DataExtractionOptions.SetDetectEmptyFields()).

End of Support

  • Node 8 and Node 9 are no longer supported.

Changed Behavior

  • [pdf] An exception is no longer raised when rendering an empty object.

Improvements

  • [pdf] Added support for formatting PDF widget content when JavaScript is disabled. If the widget's presentation depends on an AFNumber_Format() function call, the widget's text is now formatted according to the function's parameters.
  • [pdf] Added APIs for setting font name and size in the redaction annotation.
  • [pdf] Added support for repairing corrupt PDF files in which regular stream objects were incorrectly marked as type "ObjStm". Previously, pages could turn blank after PDFDoc.InsertPages() when the input was corrupt.
  • [pdfa] Improved the output file size for conversion when a color space needs to be added.
  • [pdf] Improved handling of documents with missing color space resources.
  • [pdf] Improved handling of corrupted documents with empty objects.
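
To give a sense of what formatting "according to the function's parameters" involves, here is a minimal Python sketch of the text that an AFNumber_Format() call describes. It is an approximation based on the Acrobat form-scripting conventions as commonly documented, not the SDK's implementation; the function name `af_number_format` and the style tables in the comments are assumptions made for illustration.

```python
def af_number_format(value, n_dec, sep_style, neg_style=0, currency="", prepend=True):
    """Approximate the display text described by AFNumber_Format parameters."""
    # Assumed separator styles:
    #   0 -> 1,234.56   1 -> 1234.56   2 -> 1.234,56   3 -> 1234,56
    thousands, decimal = {0: (",", "."), 1: ("", "."), 2: (".", ","), 3: ("", ",")}[sep_style]

    negative = value < 0
    text = f"{abs(value):,.{n_dec}f}"      # e.g. "1,234.50"
    whole, _, frac = text.partition(".")
    whole = whole.replace(",", thousands)   # swap in the requested grouping mark
    body = whole + (decimal + frac if frac else "")

    if currency:
        body = currency + body if prepend else body + currency

    if negative:
        # Assumed negative styles: 0 = minus sign, 1 = red text only,
        # 2 = parentheses, 3 = red parentheses. Colour cannot be shown
        # in plain text, so style 1 returns the bare number here.
        if neg_style in (2, 3):
            body = f"({body})"
        elif neg_style == 0:
            body = "-" + body
    return body


print(af_number_format(1234.5, 2, 0, currency="$"))
print(af_number_format(-1234.56, 1, 1, neg_style=2))
```

The first call formats 1234.5 with two decimals, comma grouping, and a prepended dollar sign; the second rounds to one decimal with no grouping and wraps the negative value in parentheses.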

Bugfixes

  • [pdf] Fixed a rare memory leak caused by duplicated PDF dictionary keys.
  • [pdf] Fixed a minor memory leak in TransPDF::ApplyXLIFF and FindReplace APIs.
  • [pdf] Fixed loss of PDF content when calling FindReplace::FindReplaceText.
  • [pdf] Fixed a null pointer dereference crash in TransPDF::ApplyXLIFF and FindReplace::FindReplaceText.
  • [pdf] Fixed the FindReplace.findReplaceText Java interface and the FindReplaceTest.java sample.
  • [pdf] Fixed an exception in TransPDF::ApplyXLIFF() caused by italic styling applied to CJK characters, which do not support italicization.
  • [pdf] Fixed an issue where TransPDF::ApplyXLIFF() reported an incorrect error message when an italic variant of a character was unavailable.
  • [pdf] Fixed an issue where TransPDF.ApplyXLIFF() could fail with an "Invalid charcode" error when Type3 fonts were present in PDF files.
  • [pdf] Fixed an issue where imported free text annotations were vertically centered incorrectly.
  • [pdf] Fixed an issue where annotation text size could be incorrect.
  • [pdf] Fixed a crash that occurred when a document was optimized, linearized, and then rendered to a page bitmap.
  • [pdf] Fixed an issue where TextExtractor incorrectly inserted two extra spaces between words when SetRightToLeftLanguage(true) was set.
  • [pdf] Fixed font parsing to properly handle CMaps referencing another embedded CMap via the usecmap operator.
  • [pdfa] Fixed corruption that could occur when converting certain files with missing XMP metadata in a specific order.
  • [pdfa] Fixed incorrect detection of "e_PDFA 351: Embedded composite (Type0) font program does not define all font glyphs." for Type 2 CIDFonts.
  • [xfdf] Fixed an issue in PDFDoc.FDFExtract() when working with files containing self-referencing links.
  • [image] Fixed a bug where .jpc files weren't being converted into PDFs in some cases.
  • [.net] Fixed an issue where some strings behaved differently on Windows and Linux.

Office Fidelity

  • [doc] Fixed an issue that caused missing content in some DOC documents with endnotes.
  • [docx] Fixed an issue where East Asian characters in a text box were incorrectly rotated.
  • [office] Improved text layout accuracy when the font family in the document is not available on the system.
  • [docx] Fixed an issue causing missing highlight annotations for document comments.
  • [office] Improved font substitution on Linux for documents with non-covered characters.
  • [docx] Fixed an issue where drop caps backgrounds were extended across a page.
  • [office] Added support for error bars in office charts.
  • [docx] Improved handling of floating objects with top-and-bottom wrapping in multi-column text.
  • [ppt] Fixed an issue where unexpected bullet points appeared in PPT documents.
  • [office] Added support for "horizontal stripes" and "vertical stripes" pattern fills.
  • [docx] Fixed an issue with missing headers and footers on pages containing endnotes.
  • [xlsx] Fixed a rare issue with missing borders in Excel sheets.
  • [office] Fixed a rare crash in office documents containing math matrix elements.
  • [docx] Fixed an issue with incorrect date format in some Word documents.
  • [docx] Fixed an issue where some words in RTL text could wrap to the next line unnecessarily.
  • [docx] Fixed an issue where the teardrop shape was incorrectly rendered when using Crop to Shape in Word documents.
  • [docx] Fixed an issue where table cell contents were missing when a table had thick borders.
  • [office] Fixed superscript/subscript scale for Type 1 fonts.
  • [docx] Fixed an issue with missing section breaks in some Word documents.
  • [docx] Fixed an issue with incorrect column balancing when the columns contained paragraphs with "Keep with next" and "Keep lines together" flags.
  • [docx] Fixed an issue where paragraphs with borders had incorrect spacing between them.
  • [office] Improved font substitution for style variants on macOS.
  • [doc] Improved memory efficiency and parsing performance for binary DOC documents.
  • [docx] Fixed an issue with overlapping text in framed paragraphs.
  • [xlsx] Improved conversion performance for some Excel documents with extensive conditional formatting.
  • [docx] Fixed an issue where floating tables that got bumped to the next page were not rendered correctly because their width was set to zero.
  • [docx] Fixed several issues with comment annotations being highlighted incorrectly.
  • [office] Fixed a rare issue with a table being pushed to the next page indefinitely.
  • [docx] Fixed an issue with incorrectly positioned floating elements in headers/footers.
  • [docx] Fixed an issue where the first table rows would unexpectedly repeat on the next page.
  • [docx] Fixed layout issues with En Space (U+2002) and Em Space (U+2003) characters.
  • [docx] Improved table cell comment annotations.
  • [docx] Fixed an issue with missing content in documents with `IF` fields spanning section breaks.
  • [docx] Fixed an issue where some framed paragraphs were misplaced.
  • [office] Added support for pattern fills in some previously unsupported cases.
  • [xls] Fixed a rare "XML parse error" exception thrown during XLS conversion.
  • [office] Improved font substitution for Wingdings 3 and Webdings fonts.
  • [ppt] Fixed an issue with incorrect slide background colors in legacy PPT documents.
  • [docx] Fixed an issue with missing blank pages in some Word documents.
  • [docx] Enhanced support for “Crop to Shape” in Word documents, ensuring accurate rendering of cropped images and shapes.
  • [pptx] Fixed incorrect font size in some PowerPoint tables.
  • [xlsx] Fixed incorrect currency value formatting in some XLSX files.
  • [office] Fixed an issue causing incorrectly colored math content.
  • [doc] Fixed incorrect paragraph alignment in some legacy Word documents.

Fixes and improvements for the Structured Output Module

  • [office] Improved text encoding detection for complex glyphs like ligatures.
  • [office] Improved encoding recovery for self-intersected glyphs.
  • [office] Significant improvements to hyperlink detection with an emphasis on multiline hyperlinks.
  • [docx] Improved text transparency conversion.
  • [docx] Fixed conversion of PDF annotations to Microsoft Word comments.
  • [pptx] Fixed tab placement. Space characters are now preferred for small gaps.
  • [office] Fixed a bug preventing the accurate detection of vector glyphs in a document.
  • [docx] Fixed a bug causing borderless table headers to be detected as page headers in a document.
