Version 11.0.0 Changelog (October 30th, 2024)

New In This Release

PDF/UA-1 Conversion

  • A new experimental interface to automatically convert and add accessible tags to existing PDF files so that they conform to PDF/UA-1. (PDFUAConformance class and PDFUAConformance.AutoConvert())
  • This autoconversion feature requires that DataExtractionModule be present and accessible from the SDK.
  • Note that full conformance with PDF/UA requires the generated output to undergo manual review and editing as needed.

Barcode Module

  • Added a new Barcode Module that can detect a wide range of barcode types in PDF files and provides output in JSON format. (BarcodeModule.ExtractBarcodes() or BarcodeModule.ExtractBarcodesAsString())

New Options

  • Added an option to export comments from Word documents as PDF annotations. (OfficeToPDFOptions.SetDisplayComments())

Improvements:

  • [ocr] The latest OCR module has significantly improved quality and run speed.
  • [.net] Added support for .NET 9.
  • [cad] Updated CAD module binaries to use ODA version 25.8.
  • [pdf] Improved the exception message in TextSearch.Run() for when the search string is empty. Previously it could throw a "The instance hasn't been initialized yet." exception.
  • [pdf] Reduced the memory required when loading PDF files with static XFA containing element names that are split into many parts. Previously loading these files could lead to very high memory usage.
  • [pdf] Improved support for parsing corrupt PDF files containing TJ operators where the operator argument has an extra unmatched left square bracket. Previously, rendering a file with this error could lead to missing text or other content.
  • [pdf] Improved support for rendering corrupt PDFs where there is an unexpected operator "n" within a text block. Previously this could cause words to incorrectly overlap with each other.
  • [cad] When converting CAD to PDF, hidden layers are now retained.
  • [pdf] Improved handling of corrupt PDF files with invalid path building commands in BT/ET blocks. Previously this could lead to rendering errors including incorrectly positioned text elements.
  • [pdf] Improved handling for corrupt PDF files containing garbage data after trailers or inconsistent trailer dictionaries. Previously, such files could fail to load with an "Attempt to load free object" exception.
  • [pdfa] Conversion to PDF/A-1 with transparency flattening enabled now supports flattening partially transparent text. Previously this text could become opaque after conversion.
  • [pdf] Adjusted PDF incremental download to more efficiently fetch document info which can be required for rendering.
  • [cad] Added support for using embedded DWF fonts on Linux. Previously, missing this support could cause incorrect glyphs or glyph spacing in the output.
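The TextSearch.Run() item above concerns an empty search string. Callers may still prefer to reject an empty pattern before it reaches the SDK. The following is a minimal plain-Java sketch of such a guard; SearchPatternGuard and requireSearchable are hypothetical names introduced for illustration, not part of the SDK, and no SDK call is made.

```java
// Hypothetical helper for illustration; not part of the SDK.
final class SearchPatternGuard {
    private SearchPatternGuard() {}

    /** Returns the pattern unchanged, or throws if it is null or empty. */
    static String requireSearchable(String pattern) {
        if (pattern == null || pattern.isEmpty()) {
            throw new IllegalArgumentException("Search string must not be empty");
        }
        return pattern;
    }
}
```

A caller would pass the user-supplied string through requireSearchable() before starting a search, so the failure is reported at the call site with a message of the application's choosing.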

Bugfixes:

  • [pdf] Fixed an issue where files with fields that have JavaScript actions containing whitespace-only values were being calculated incorrectly. Previously this could result in unexpected changes to unrelated field values.
  • [pdf] Fixed an error where font styles (e.g., Bold, Italic) could fail to be applied during Freetext annotation appearance generation if they were supplied via the annotation's default resources (DR) dictionary.
  • [pdfa] Fixed an issue with conversions to PDF/A-2 or higher which could occasionally lead to relevant characters mapped to glyph index 0 (the .notdef glyph) being removed from the content.
  • [pdfa] Fixed an issue with conversions to PDF/A-2 or higher where the output file could have a validation error: "tintTransform is different in Separations with the same colorant name is not fixed in the output file", or a similar error for alternateSpace.
  • [pdf] Fixed an issue with redacting rare PDF text that uses fonts with inaccurate or broken geometry information. Previously this could cause some text to be erroneously redacted or missed during redaction.
  • [pdfa] Fixed an issue with PDF/A conversion where the output file would have a validation error: "Corrupt content stream". This could happen with input files generated by merging different PDFs.
  • [pdf] Fixed an issue with the appearance of applied redactions that include text. Previously, the top and bottom alignment of the text was noticeably offset from the top or bottom edge of the redaction rectangle.
  • [svg] Fixed an issue with PDF to SVG conversion where, in rare cases, content could be placed incorrectly. Previously, this could lead to content piling up on top of itself in one corner of the SVG canvas.
  • [cad] Fixed an issue with DGN to PDF conversion where complex 3D objects could cause very high memory usage, sometimes leading to memory exhaustion.
  • [html] Forced rendering of the text/html part of the content in multipart EML files. Previously, the module would render the part that was defined first, which was typically plain text.
  • [html] Fixed an issue handling HTML files with a large number of nested divs on Windows. Previously, an "HTML2PDF module crashed. To get more information, enable logging using SetLogFilePath." exception could be thrown.
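The multipart EML item above describes preferring the text/html part over whichever part happens to be defined first. The selection rule can be sketched in plain Java as follows. This is only an illustration of the rule, not the module's implementation: EmlBodySelector, Part, and chooseBody are hypothetical names, and the fallback to the first part when no text/html part exists is an assumption.

```java
import java.util.List;
import java.util.Locale;
import java.util.Optional;

// Hypothetical sketch; not the HTML2PDF module's actual code.
final class EmlBodySelector {
    /** Hypothetical stand-in for one MIME body part. */
    static final class Part {
        final String contentType;
        final String body;

        Part(String contentType, String body) {
            this.contentType = contentType;
            this.body = body;
        }
    }

    /** Picks the first text/html part; otherwise falls back to the first part. */
    static Optional<Part> chooseBody(List<Part> parts) {
        for (Part p : parts) {
            if (isType(p, "text/html")) {
                return Optional.of(p);
            }
        }
        // Assumed fallback: keep the old "first part wins" behavior.
        return parts.stream().findFirst();
    }

    private static boolean isType(Part p, String mime) {
        // Content types may carry parameters, e.g. "text/html; charset=utf-8".
        String ct = p.contentType.trim().toLowerCase(Locale.ROOT);
        return ct.equals(mime) || ct.startsWith(mime + ";");
    }
}
```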

Office Fidelity:

  • [doc] Fixed an issue where the left and right borders of table cells were incorrectly displayed in DOC documents.
  • [office] Improved the accuracy of displaying the category axis with correct tick marks, labels, and axis crossing values in XLS documents.
  • [xlsx] Fixed an issue with an unexpected exception thrown due to an incomplete dataBar element in an Excel document.
  • [xlsx] Fixed a crash during Excel document conversion caused by a document with an empty print area.
  • [xlsx] Fixed an Excel document conversion that could run indefinitely due to page breaks outside the print area.
  • [ppt] Fixed an issue where the absence of a specified text flow type in OOXML led to incorrect text rotation and alignment.
  • [docx] Improved the spacing between paragraphs with the same default style.
  • [docx] Fixed a crash caused by paragraph style inheritance cycles.
  • [docx] Fixed a bug where a list item bullet could persist after a template fill with a false conditional.
  • [docx] Improved the handling of floating objects when the distance from text setting is non-zero.
  • [docx] Fixed a rare infinite pagination bug related to placing right-to-left text in a table.
  • [xlsx] Improved handling of Excel workbooks with an invalid print area.
  • [xlsx] Fixed an issue with missing print titles on the first page of Excel documents.
  • [office] Improved accuracy of data label placement in pie charts.
  • [office] Fixed incorrect intervals on a time axis in some charts.
  • [xlsx] Fixed an exception thrown due to invalid cell references in Excel documents.
  • [xlsx] Optimized Excel conversion speed for documents with a large number of hidden rows.
  • [xlsx] Fixed an issue where drawings in hidden cells were incorrectly visible in Excel documents.
  • [docx] Improved the handling of documents with tables containing dataBinding elements in structured document tags (SDT).
  • [docx] Fixed a rare issue with incorrect table edge borders.
  • [docx] Fixed handling of documents that have page breaks in footnotes.
  • [docx] Fixed a rare crash caused by nested paragraphs in some Word documents.
  • [office] Fixed an issue where a bar chart was displayed in a uniform color instead of the intended multiple colors.
  • [docx] Fixed issues with incorrect interaction of tables with floating elements in Word documents.
  • [doc] Fixed an issue where the table border color displayed incorrectly in some DOC documents.
  • [xlsx] Fixed an issue with extra space added to Excel sheets containing empty columns.
  • [xlsx] Fixed incorrect scaling of some Excel sheets with the "Fit sheet to page" option enabled.
  • [xls] Fixed a rare issue where text within cells was truncated or certain rows were missing in XLS documents.
  • [docx] Fixed an issue with missing paragraph borders between some paragraphs.
  • [docx] Fixed an issue with incorrect paragraph spacing in SmartArt shapes.
  • [docx] Improved the spacing between paragraphs with the same style when one paragraph is in an SDT.
  • [docx] Improved the clipping of text box contents.
  • [docx] Fixed an issue with incorrectly positioned images inside shape groups.
  • [docx] Added application of image color effects to images inside shape groups.
  • [pptx] Fixed an issue with extra spacing added to the first paragraphs in some text boxes.
  • [office] Fixed an issue where an EMF image was partially or completely missing due to an incorrectly applied clipping.
  • [docx] Added proper handling of decimal tabs in table cells.
  • [xlsx] Fixed an issue where the ExcelMaxAllowedCellCount option did not work correctly with some Excel documents containing empty styled tables.
  • [doc] Fixed an issue where hyperlinks were missing in Word documents due to unexpected placement of hyperlink elements within the OOXML tree.
  • [docx] Null input values in template replacement values are now treated as empty instead of resulting in an error.
  • [xls] Fixed an issue where incorrect page margins caused the headers and footers in XLS documents to either overlap or disappear.
  • [doc] Fixed an issue with an extra page in DOC documents caused by incorrect handling of the "Allow row to break across pages" property.
  • [docx] Improved line breaking around hyphens in older Word documents.
  • [docx] Improved the balancing of columns of different widths.
  • [office] Improved visual appearance of dual-axis scatter charts.
  • [xlsx] Fixed an issue where large numbers were replaced with # in shrink-to-fit cells.
  • [xlsx] Fixed incorrectly rounded large numbers in some Excel documents.
  • [doc] Fixed an issue where text in the table of contents was incorrectly formatted in DOC documents.
  • [xlsx] Implemented dynamic fitting and rounding of numbers in Excel cells.
  • [xlsx] Fixed a crash when converting documents with chart sheets while the SetExcelMaxAllowedCellCount option is set.
  • [docx] Fixed the unreadable content bug in DOCX to DOCX templating when filling table cells.
  • [ppt] Fixed an issue with incorrectly displayed custom geometry shapes in PPT documents.
  • [docx] Improved the positioning of floating objects with oversized content.
  • [docx] Fixed an issue where the 3D shape contours were not appearing in DOCX documents.
  • [docx] Fixed a rare crash in documents with list numbering in Normal style.
  • [docx] Fixed an issue with an extra blank page added to the document when a paragraph with the "Page break before" option is placed at the bottom of the page.
  • [docx] Improved the application of cell formatting in spreadsheets. Previously some empty cells were missing formatting.
  • [doc] Fixed an issue where certain WMF images were duplicated in DOC documents.
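One item above mentions a crash caused by paragraph style inheritance cycles (for example, style A based on B, B based on C, and C based on A). A common way to make such a lookup safe is to track visited styles while walking the chain. The sketch below shows that general technique in plain Java; it is not the SDK's implementation, and StyleChain, resolve, and the basedOn map are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of cycle-safe style resolution; not SDK code.
final class StyleChain {
    private StyleChain() {}

    /**
     * Returns the inheritance chain starting at styleId, in order.
     * The walk stops when a style has no known parent or when a
     * style repeats, so a cyclic definition cannot loop forever.
     */
    static List<String> resolve(Map<String, String> basedOn, String styleId) {
        Set<String> seen = new LinkedHashSet<>();
        String current = styleId;
        // Set.add returns false for a style already visited, which ends the walk.
        while (current != null && seen.add(current)) {
            current = basedOn.get(current);
        }
        return new ArrayList<>(seen);
    }
}
```

With the cyclic map A → B → C → A, resolve() visits each style once and returns; an unguarded walk over the same map would never terminate.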

Fixes and improvements for the Structured Output Module

  • [docx] Improved the detection of standard office bar charts and variants.
  • [docx] Improved the detection of Chinese language.
  • [docx] Improved the optical character recognition preprocessing of vector text.
  • [docx] Improved the column detection of left to right aligned text.
  • [docx] Improved the stability of graphic color detection.
  • [docx] Improved the detection of header content.
  • [docx] Improved detection of white text located on a dark background.
  • [docx] Improved handling of text where the text and background color match.
  • [docx] Improved table detection.
  • [docx] Improved the detection of diagrams.
  • [docx] Improved detection of black text located on a grey background.
  • [office] Improved language and page orientation detection.
  • [docx] Fixed an issue causing Latin characters in a Chinese document to be misplaced.
  • [docx] Fixed an issue where a large graphic element caused text recovery failure.
  • [docx] Fixed an issue preventing the detection of the correct bounds of a graphic element.
  • [docx] Fixed a performance issue where dense vector graphics prevented successful optical character recognition of a file.
  • [docx] Fixed an issue causing a conversion delay for complex one-page documents.
  • [docx] Fixed a bug preventing the rendering of the first page of a detected Table of Contents.
