Releases · mini-software/MiniPdf

12 Mar 06:34

shps951023

v0.13.0

b8c2d1b

v0.13.0 — Excel Print Fidelity, Theme Colors & Online Demo

Highlights

This release improves Excel-to-PDF conversion fidelity with print area support, page setup honoring, theme color rendering, and fit-to-page scaling. It also introduces a Blazor WASM online demo for browser-based conversion. The XLSX benchmark expands from 180 to 191 test cases, and the average overall score rises from 96.82% to 96.89%.

New Features

Excel Print Area Support

• Parse _xlnm.Print_Area defined names from workbook XML to determine printable cell ranges
• Trim rows, columns, merged cells, images, and charts to the defined print area before rendering
• Column-only ranges (e.g. $A:$L) handled with full-row expansion
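
The print-area handling above can be pictured with a small sketch. It is written in Python purely for illustration (MiniPdf itself is C#), and `parse_print_area` and `col_to_index` are hypothetical names, not the library's API. It assumes the defined name holds a single range such as `Sheet1!$A$1:$L$40` or a column-only range such as `Sheet1!$A:$L`.

```python
import re

# Hypothetical helper, not MiniPdf code: split a print-area reference
# into 1-based (first_col, first_row, last_col, last_row) bounds.
_REF = re.compile(r"^\$?([A-Z]+)\$?(\d+)?:\$?([A-Z]+)\$?(\d+)?$")


def col_to_index(letters: str) -> int:
    """Convert column letters (A, Z, AA, ...) to a 1-based index."""
    index = 0
    for ch in letters:
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index


def parse_print_area(ref: str, max_row: int = 1_048_576):
    """Return (first_col, first_row, last_col, last_row) for one range."""
    cell_range = ref.rsplit("!", 1)[-1]  # drop the sheet name, if present
    match = _REF.match(cell_range)
    if match is None:
        raise ValueError(f"unsupported print area: {ref!r}")
    c1, r1, c2, r2 = match.groups()
    first_row = int(r1) if r1 else 1       # column-only: expand to all rows
    last_row = int(r2) if r2 else max_row  # assumption: sheet row limit
    return col_to_index(c1), first_row, col_to_index(c2), last_row
```

A renderer could then drop any row, column, merged cell, image, or chart anchor that falls outside the returned bounds.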

Excel Page Setup & Paper Size

• Read pageSetup element: orientation (landscape/portrait), paper size (A4/A3/Letter), print scale, and custom margins
• Apply sheet-level margins (marginLeftPt, marginRightPt, marginTopPt, marginBottomPt) to PDF output
• Support A3 (842×1191pt) and A4 (595×842pt) paper sizes with landscape auto-swap
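
As a rough illustration of the page setup rules above, the following Python sketch picks a page size and derives the usable width from the margins. The function names and the dictionary are assumptions made for this example; only the A3 and A4 dimensions come from the release notes.

```python
# Paper sizes in points, as listed in the release notes.
PAPER_SIZES_PT = {"A4": (595, 842), "A3": (842, 1191)}


def page_size(paper: str, landscape: bool = False):
    """Return (width, height) in points, swapping the sides for landscape."""
    width, height = PAPER_SIZES_PT[paper]
    return (height, width) if landscape else (width, height)


def usable_width(paper: str, landscape: bool, margin_left: float, margin_right: float) -> float:
    """Width left for content after the sheet-level margins are applied."""
    width, _ = page_size(paper, landscape)
    return width - margin_left - margin_right
```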

Theme Color Support for Fills & Fonts

• Parse theme1.xml to extract theme color palette
• Resolve theme + tint color attributes on font and fill elements via HSL luminance adjustment (ECMA-376 §18.8.19)
• New ResolveColorElement() and ApplyTint() helpers for accurate themed color rendering
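
The tint step can be sketched as follows. This is an illustrative Python version of the luminance adjustment described in ECMA-376, not the library's `ApplyTint()` implementation; it assumes a 6-digit `RRGGBB` input and a tint in the range -1.0 to 1.0. A negative tint darkens by scaling luminance toward 0, and a positive tint lightens by moving luminance toward 1.

```python
import colorsys


def apply_tint(rgb_hex: str, tint: float) -> str:
    """Adjust the HSL luminance of an RRGGBB color by an OOXML-style tint."""
    r, g, b = (int(rgb_hex[i:i + 2], 16) / 255 for i in (0, 2, 4))
    h, l, s = colorsys.rgb_to_hls(r, g, b)  # note: colorsys orders it H, L, S
    if tint < 0:
        l = l * (1.0 + tint)          # darken
    else:
        l = l * (1.0 - tint) + tint   # lighten
    r, g, b = colorsys.hls_to_rgb(h, l, s)
    return "".join(f"{round(c * 255):02X}" for c in (r, g, b))
```

For example, a tint of 0.5 applied to black and a tint of -0.5 applied to white both land on mid gray.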

Horizontal Centering

• Parse horizontalCentered print option from sheet XML
• Center sheet content horizontally within the usable page width when enabled
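
Horizontal centering reduces to a single offset calculation. The Python sketch below is a minimal illustration under the assumption that content wider than the usable area simply stays left-aligned; `centered_left_x` is a hypothetical name.

```python
def centered_left_x(page_width: float, margin_left: float,
                    margin_right: float, content_width: float) -> float:
    """Left edge for content centered within the usable page width."""
    usable = page_width - margin_left - margin_right
    # Assumption: content wider than the usable area is not shifted left.
    return margin_left + max(0.0, (usable - content_width) / 2)
```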

Online Demo (MiniPdf.Web)

• New Blazor WebAssembly project for browser-based XLSX/DOCX → PDF conversion
• Multi-language support with language switcher (EN/ZH-CN/ZH-TW/JA/KO/FR/IT)
• GitHub Pages deployment with CI workflow
• Accept specific MIME types for file input (.xlsx, .docx)
• Statcounter analytics integration

Improvements

Fit-to-Page Scaling

• fitToPage + fitToWidth: auto-scale column widths to fit all columns in one page width, with proportional row height reduction
• fitToPage + fitToHeight: recalculate print scale so all rows fit within the target number of vertical pages
• Cell font sizes optionally scaled by print scale factor (ScaleCellFonts) for accurate auto-row-height when compressed
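
The two fit-to-page modes above both come down to a scale factor. The following Python sketch shows one plausible way to compute it; the function names are hypothetical, and the rule that content is never scaled up is an assumption of this example rather than something the release notes state.

```python
def fit_to_width_scale(column_widths, usable_width: float) -> float:
    """Scale that makes all columns fit in one page width."""
    total = sum(column_widths)
    if total == 0 or total <= usable_width:
        return 1.0  # assumption: never enlarge content
    return usable_width / total


def fit_to_height_scale(row_heights, usable_height: float, pages_tall: int = 1) -> float:
    """Scale that makes all rows fit within the target number of pages."""
    total = sum(row_heights)
    target = usable_height * pages_tall
    if total == 0 or total <= target:
        return 1.0
    return target / total


def scale_columns(column_widths, scale: float):
    """Apply the scale to every column width (row heights shrink the same way)."""
    return [w * scale for w in column_widths]
```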

Print Scale Rendering

• Apply print scale factor to font sizes, column padding, column widths, and row heights
• Explicit column widths from CharUnitsToPoints scaled to match print-scaled content
• Cell-level font size scaling for width calculations and text clipping

Chart Rendering Improvements

• Scale dominant twoCellAnchor charts (spanning >50% of sheet rows) to fill usable page width, matching LibreOffice output
• Inline charts (anchored within data area) rendered at anchor position without scale-up
• Overflow pages for dominant charts taller than one page
• Cap non-dominant chart height to 85% of page to prevent overflow

Column & Row Handling

• Trim trailing style-only columns (background/borders only, no text) to prevent excessive blank pages
• Skip hidden columns (width 0) entirely during rendering
• Skip column groups where no row has any text content (avoids blank pages for style-only ranges)
• Track customHeight rows and scope auto-row-height expansion to fitToPage sheets only
• Remove trailing empty pages (no text, images, rectangles, or lines) after column group rendering

Hidden Sheet Filtering

• Read state attribute from sheet elements; skip sheets marked as hidden or veryHidden
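
To make the hidden-sheet rule concrete, here is a small Python sketch that reads the `state` attribute from `workbook.xml` using only the standard library. It illustrates the filtering idea and is not the code in ExcelReader.cs; `visible_sheet_names` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# SpreadsheetML main namespace used by workbook.xml.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def visible_sheet_names(workbook_xml: str):
    """Return sheet names whose state is neither hidden nor veryHidden."""
    root = ET.fromstring(workbook_xml)
    names = []
    for sheet in root.findall("m:sheets/m:sheet", NS):
        # A missing state attribute means the sheet is visible.
        if sheet.get("state", "visible") in ("hidden", "veryHidden"):
            continue
        names.append(sheet.get("name"))
    return names
```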

Benchmark

| Format | Cases | Avg Score | 🟢 Pass | 🟡 Warning | 🔴 Fail |
| --- | --- | --- | --- | --- | --- |
| DOCX | 180 | 97.62% | 178 | 2 | 0 |
| XLSX | 191 (+11) | 96.89% (+0.07%) | 169 | 22 | 0 |

Other Changes

• Added 11 new XLSX benchmark test cases (classic181–classic191) including payroll calculator scenarios
• New PdfDocument.RemoveEmptyPages() and RemoveLastPage() internal helpers
• Added online demo link to README files in multiple languages
• New Run-Benchmark_issues.ps1 script for issue-specific benchmark testing

Files Changed

• ExcelReader.cs — +467/−59 lines: theme colors, print area parsing, page setup, hidden sheet filtering, fill color resolution
• ExcelToPdfConverter.cs — +363/−17 lines: print area trimming, fit-to-page scaling, print scale rendering, chart scaling, horizontal centering, column trimming
• PdfDocument.cs — +17 lines: RemoveEmptyPages() and RemoveLastPage() methods
• MiniPdf.Web — +1,350 lines: new Blazor WASM online demo with i18n support
• 308 files changed total (including benchmark images, reports, test scripts, web project)

Full Changelog: v0.12.0...v0.13.0

08 Mar 12:46

shps951023

v0.12.0

cbe8b70

v0.12.0 — Helvetica Font Metrics, Header/Footer & Paragraph Borders

Highlights

This release brings precise font metrics, header/footer support, and paragraph borders to the DOCX-to-PDF converter, while also improving Excel-to-PDF layout accuracy. The DOCX benchmark expands from 150 to 180 test cases, and the average overall score rises from 96.96% to 97.62%.

New Features

DOCX Header & Footer Support

Parse headerReference / footerReference from sectPr and read header/footer .xml parts from the DOCX archive
Render header text centered in the top margin and footer text centered in the bottom margin on every page (9pt gray)

Paragraph Borders

Parse pBdr element with top/bottom/left/right border edges including width (sz in eighths of a point) and color
Render paragraph borders as PDF lines around the paragraph bounding box
New model records: DocxBorders, DocxBorderEdge
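
The border width unit is easy to get wrong, so a short Python sketch may help. It reads a `w:pBdr` fragment and converts `w:sz` from eighths of a point to points. The function name and the tuple layout are assumptions for this example and do not mirror the `DocxBorders` or `DocxBorderEdge` records.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def read_paragraph_borders(pbdr_xml: str):
    """Map each present border side to (width_in_points, color)."""
    root = ET.fromstring(pbdr_xml)
    edges = {}
    for side in ("top", "bottom", "left", "right"):
        edge = root.find(f"{{{W}}}{side}")
        if edge is None:
            continue
        sz = int(edge.get(f"{{{W}}}sz", "0"))     # eighths of a point
        color = edge.get(f"{{{W}}}color", "auto")
        edges[side] = (sz / 8.0, color)
    return edges
```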

Improvements

Helvetica Font Metrics Engine (replaces fixed-width estimation)

NEW: EstimateTextWidth() using real Helvetica character widths (ASCII 32–126 glyph table, CJK full-width 1000 units)
NEW: GetHelveticaCharWidth() lookup with proper handling of CJK Unified Ideographs, Hiragana, Katakana, and fullwidth forms
Replaced all avgCharWidth = fontSize * 0.47f approximations across paragraphs, tables, tab stops, and word wrapping
Tab stop expansion now uses actual text width measurement instead of column-count estimation
Word wrapping decisions based on real glyph widths instead of character count × fixed width
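
The difference between the old fixed-width estimate and glyph-based measurement can be sketched in Python. The width table below is a small excerpt of standard Helvetica advance widths (in 1/1000 em units); the fallback of 556 for unlisted characters and the function names are assumptions of this example, not the behavior of `EstimateTextWidth()`.

```python
# Excerpt of standard Helvetica advance widths, in 1/1000 em units.
HELVETICA_WIDTHS = {
    " ": 278, ".": 278, "H": 722, "W": 944,
    "e": 556, "i": 222, "l": 222, "o": 556,
}
FALLBACK_WIDTH = 556   # assumption: used for characters missing from the excerpt
FULL_WIDTH = 1000      # CJK and fullwidth forms occupy a full em


def is_full_width(ch: str) -> bool:
    """CJK Unified Ideographs, Hiragana, Katakana, and fullwidth forms."""
    cp = ord(ch)
    return (0x4E00 <= cp <= 0x9FFF or 0x3040 <= cp <= 0x30FF
            or 0xFF00 <= cp <= 0xFFEF)


def estimate_text_width(text: str, font_size: float) -> float:
    """Sum per-glyph widths and scale from 1/1000 em units to points."""
    units = sum(FULL_WIDTH if is_full_width(ch)
                else HELVETICA_WIDTHS.get(ch, FALLBACK_WIDTH)
                for ch in text)
    return units * font_size / 1000.0


def old_estimate(text: str, font_size: float) -> float:
    """The replaced approximation: every character is 0.47 x font size wide."""
    return len(text) * font_size * 0.47
```

With this table, "Hello" at 10pt measures 22.78pt against 23.5pt from the old estimate, while a run of narrow glyphs such as "iiiii" measures 11.1pt against the same 23.5pt, which is where fixed-width wrapping went wrong.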

Text Positioning & Layout

Ascent-aware positioning: text top aligns with margin using AscentRatio (1.075) at top-of-page boundaries
Font metrics factor tuned from 1.18 → 1.17 for tighter line spacing
Default paragraph spacing changed from fontSize * 0.35f to fixed 8f points for closer match to Word rendering
Bullet lists: render bullets as small filled rectangles instead of text glyphs — eliminates Helvetica • vs Symbol U+F0B7 text-extraction discrepancy

Tab Stop & Leader Improvements

Leader dot/hyphen/underscore fill count computed from actual glyph widths with Calibri-equivalent 0.725× scale factor
Tab leader lines use maxWidth (Tz operator) to compress expanded dot runs to fit intended tab position
Tab-stop-aware word wrap: extends effective line width when tab positions exceed available width, preventing premature line breaks

Table Rendering

Grid-line border drawing: borders drawn once per row boundary (shared edges) instead of per-cell — eliminates double-line artifacts
Cell text alignment: center and right alignment now supported within table cells
Nested table flattening: each nested row's cell text is joined into a single paragraph (instead of one paragraph per cell)

Excel-to-PDF Improvements

Overflow page accumulation: virtual overflow pages now emitted at end of sheet (matching LibreOffice layout) instead of inline per-row
Default column width: uses sheet's DefaultColumnWidth when available instead of hardcoded 8.43
Wrap-width calculation: accounts for cell content margins (~11pt) and Calibri fitting scale for more accurate row-height estimation
Pie chart legend: now renders category name text labels alongside color swatches

Benchmark

| Format | Test Cases | Average Score | Excellent (≥90%) | Acceptable (80–90%) | Needs Improvement (<80%) |
| --- | --- | --- | --- | --- | --- |
| DOCX | 180 (+30) | 97.62% (+0.66%) | 178 | 2 | 0 |
| XLSX | 180 | 96.82% | 165 | 13 | 2 |

Other Changes

Removed 26 obsolete XLSX reference PDFs (classic35–classic60) to streamline the benchmark suite
Added 30 new DOCX benchmark test cases (classic121–classic150) covering additional paragraph, table, and formatting scenarios

Files Changed

DocxReader.cs — +103 lines: header/footer parsing, paragraph borders, nested table flattening
DocxToPdfConverter.cs — +299/−80 lines: Helvetica metrics engine, border rendering, ascent positioning, bullet rendering, tab leader improvements
ExcelToPdfConverter.cs — +62/−62 lines: overflow page accumulation, pie chart legend text, wrap-width fix
576 files changed total (including benchmark images, reports, test scripts)

Full Changelog: v0.11.0...v0.12.0

07 Mar 04:41

shps951023

v0.11.0

39f2f9d

v0.11.0 — Word (.docx) to PDF Conversion

Highlights

This release adds DOCX-to-PDF conversion — a brand-new, zero-dependency Word document renderer. MiniPdf can now convert .docx files to PDF with paragraph, table, and image support, achieving a 96.96% average overall score across 150 benchmark test cases compared to LibreOffice reference output.

New Features

DOCX Reader (DocxReader.cs, 791 lines)

Full OOXML paragraph parsing: text runs, bold/italic/underline/strikethrough, font sizes, font colors, highlight colors
Heading styles (Heading1–Heading9) with automatic font size mapping from styles.xml
Paragraph alignment (left, center, right, justify) and indentation (left, right, hanging, firstLine)
Bullet and numbered list support with numId/ilvl detection
Tab stop parsing with position and alignment (left, center, right, decimal)
Paragraph shading / background color support
Table parsing with cell content, borders, shading, column spans (gridSpan), and grid column widths
Table border support: reads tblBorders and tcBorders from document and style definitions
Default paragraph properties (pPrDefault) applied from styles.xml
Embedded image extraction via relationships with EMU-to-point dimension conversion
Page layout reading from sectPr: page size, margins, orientation
Page break detection (w:br type page and lastRenderedPageBreak)
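
The EMU-to-point conversion mentioned in the list above follows directly from the unit definitions: one inch is 914,400 EMU and 72 points, so one point is 12,700 EMU. A minimal Python sketch, with hypothetical names:

```python
EMU_PER_INCH = 914_400
POINTS_PER_INCH = 72
EMU_PER_POINT = EMU_PER_INCH // POINTS_PER_INCH  # 12,700


def emu_to_points(emu: int) -> float:
    """Convert an OOXML drawing dimension from EMU to PDF points."""
    return emu / EMU_PER_POINT
```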

DOCX-to-PDF Converter (DocxToPdfConverter.cs, 730 lines)

Paragraph rendering with mixed formatting runs, line wrapping, and proper line spacing
Heading rendering with bold weight and scaled font sizes
Text alignment: left, center, right, justified
List rendering with bullet (•) and numbered (1., 2., …) prefixes at correct indentation
Tab stop handling with leader positioning
Paragraph shading rendered as filled rectangles behind text
Table rendering with cell borders, shading fills, column-width distribution, and automatic row height
Table border rendering with configurable stroke widths
Image embedding as inline JPEG XObjects with aspect-ratio-aware scaling
Page layout support: reads page dimensions and margins from DOCX sectPr
Automatic page breaks: content overflow and explicit w:br type="page" handling
Configurable ConversionOptions: font size, margins, line spacing, page dimensions

Unified API

MiniPdf.ConvertToPdf() now auto-detects .docx files by extension — no API change needed for existing callers
New MiniPdf.ConvertDocxToPdf(Stream) method for stream-based DOCX conversion
Updated NuGet description and tags to include word and docx

Tests

9 new unit tests in DocxToPdfConverterTests.cs covering: simple documents, bold text, tables, empty documents, multi-paragraph, stream input, and file output
150 DOCX benchmark test cases with visual comparison against LibreOffice reference PDFs

Benchmark

150 DOCX test cases (classic01–classic120 + themed documents): single paragraph, multiple paragraphs, headings, bold/italic, font sizes, font colors, alignment, bullet lists, numbered lists, simple tables, table shading, merged cells, mixed content, images, long documents, multi-page tables, comprehensive reports, and more
Average Overall Score: 0.9696 (text similarity + visual comparison vs LibreOffice)
147 Excellent (≥ 90%), 3 Acceptable (80–90%), 0 Needs Improvement (< 80%)
Benchmark scripts: Run-Benchmark_docx.ps1, generate_reference_pdfs_docx.py, compare_pdfs.py with DOCX mode

Other Changes

Added .gitattributes to configure GitHub Linguist for Python scripts
Updated README badges: replaced .NET badge with Gitee link across all language variants (EN, zh-CN, zh-TW, ja, ko, fr, it)
README now includes DOCX benchmark visual comparison table with MiniPdf vs Reference side-by-side images

Files Changed

DocxReader.cs — +791 lines (new): OOXML document parser
DocxToPdfConverter.cs — +730 lines (new): DOCX-to-PDF rendering engine
MiniPdf.cs — +26 lines: .docx auto-detection and ConvertDocxToPdf() API
MiniPdf.csproj — updated description and package tags
340 files changed total (including benchmark images, reports, and scripts)

Full Changelog: v0.10.0...v0.11.0

06 Mar 01:15

shps951023

v0.10.0

e0678ae

v0.10.0 — Word (.docx) to PDF Conversion

Highlights

This release adds DOCX-to-PDF conversion — a brand-new, zero-dependency Word document renderer. MiniPdf can now convert .docx files to PDF with paragraph, table, and image support, achieving a 97.4% average overall score across 60 benchmark test cases compared to LibreOffice reference output.

New Features

DOCX Reader (DocxReader.cs — 727 lines)

• Full OOXML paragraph parsing: text runs, bold/italic/underline/strikethrough, font sizes, font colors, highlight colors
• Heading styles (Heading1–Heading9) with automatic font size mapping from styles.xml
• Paragraph alignment (left, center, right, justify) and indentation (left, right, hanging, firstLine)
• Bullet and numbered list support with numId/ilvl detection
• Tab stop parsing with position and alignment (left, center, right, decimal)
• Paragraph shading / background color support
• Table parsing with cell content, borders, shading, column spans (gridSpan), and grid column widths
• Embedded image extraction via relationships with EMU-to-point dimension conversion
• Page layout reading from sectPr: page size, margins, orientation
• Page break detection (w:br type page and lastRenderedPageBreak)

DOCX-to-PDF Converter (DocxToPdfConverter.cs, 682 lines)

• Paragraph rendering with mixed formatting runs, line wrapping, and proper line spacing
• Heading rendering with bold weight and scaled font sizes
• Text alignment: left, center, right, justified
• List rendering with bullet (•) and numbered (1., 2., …) prefixes at correct indentation
• Tab stop handling with leader positioning
• Paragraph shading rendered as filled rectangles behind text
• Table rendering with cell borders, shading fills, column-width distribution, and automatic row height
• Image embedding as inline JPEG XObjects with aspect-ratio-aware scaling
• Page layout support: reads page dimensions and margins from DOCX sectPr
• Automatic page breaks: content overflow and explicit w:br type="page" handling
• Configurable ConversionOptions: font size, margins, line spacing, page dimensions

Unified API

• MiniPdf.ConvertToPdf() now auto-detects .docx files by extension — no API change needed for existing callers
• New MiniPdf.ConvertDocxToPdf(Stream) method for stream-based DOCX conversion
• Updated NuGet description and tags to include word and docx

Tests

• 9 new unit tests in DocxToPdfConverterTests.cs covering: simple documents, bold text, tables, empty documents, multi-paragraph, stream input, and file output
• 60 DOCX benchmark test cases with visual comparison against LibreOffice reference PDFs

Benchmark

• 60 DOCX test cases (classic01–classic60): single paragraph, multiple paragraphs, headings, bold/italic, font sizes, font colors, alignment, bullet lists, numbered lists, simple tables, table shading, mixed content, images, long documents, multi-page tables, comprehensive reports, and more
• Average Overall Score: 0.9739 (text similarity + visual comparison vs LibreOffice)
• Benchmark scripts: Run-Benchmark_docx.ps1, generate_reference_pdfs_docx.py, compare_pdfs.py with DOCX mode

Other Changes

• Added .gitattributes to configure GitHub Linguist for Python scripts
• Updated README badges: replaced .NET badge with Gitee link across all language variants (EN, zh-CN, zh-TW, ja, ko, fr, it)
• README now includes DOCX benchmark visual comparison table with MiniPdf vs Reference side-by-side images

Files Changed

• DocxReader.cs — +727 lines (new): OOXML document parser
• DocxToPdfConverter.cs — +682 lines (new): DOCX-to-PDF rendering engine
• MiniPdf.cs — +23 lines: .docx auto-detection and ConvertDocxToPdf() API
• MiniPdf.csproj — updated description and package tags
• 192 files changed total (including benchmark images, reports, and scripts)

Full Changelog: v0.9.0...v0.10.0

05 Mar 10:48

shps951023

v0.9.0

f869038

v0.9.0 — Multi-Font Unicode & Horizontal Scaling

Highlights

This release introduces multi-font embedding for full Unicode coverage and horizontal text scaling to prevent column overflow, significantly expanding support for multilingual, emoji, and symbol-heavy spreadsheets.

New Features

Multi-Font Embedding Engine

Replaced the single hardcoded Arial CID font with a dynamic multi-font system that discovers and embeds multiple system fonts at runtime
Cross-platform font discovery: Windows (YaHei, JhengHei, Malgun Gothic, Segoe UI, Segoe UI Emoji, Segoe UI Symbol), macOS (PingFang, Apple SD Gothic Neo, Apple Color Emoji), Linux (Noto Sans CJK, Noto Color Emoji, WenQuanYi)
Characters are automatically split into runs by font slot — e.g. CJK in F2, Korean in F3, emoji in F4 — with proper Td advances within the same BT/ET block
Full TrueType/TTC font parsing: cmap format 4 & 12, hmtx glyph widths, head/OS2/hhea metrics, glyf table subsetting
CIDToGIDMap streams for correct glyph mapping with ZLib compression
ToUnicode CMap with UTF-16 surrogate pair support for non-BMP code points (emoji, CJK Ext-B)
Font subsetting: zeros out unused glyph outlines to reduce embedded font size
Glyph outline validation (HasGlyphOutline) to detect placeholder/empty glyphs, enabling proper font fallback
Emoji range detection (IsEmojiRange) to prefer dedicated emoji fonts over CJK fonts with placeholder glyphs
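
The surrogate-pair requirement for the ToUnicode CMap can be shown with a short Python sketch. Code points above U+FFFF must be written as two UTF-16 code units. The helper names and the single-entry `bfchar` formatting are illustrative assumptions, not the output format of PdfWriter.cs.

```python
def utf16_units(code_point: int):
    """Split a Unicode code point into one or two UTF-16 code units."""
    if code_point < 0x10000:
        return [code_point]
    offset = code_point - 0x10000
    return [0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)]


def to_unicode_hex(code_point: int) -> str:
    """Hex string for the destination side of a ToUnicode mapping."""
    return "".join(f"{unit:04X}" for unit in utf16_units(code_point))


def bfchar_line(cid: int, code_point: int) -> str:
    """One bfchar entry mapping a CID to its Unicode value."""
    return f"<{cid:04X}> <{to_unicode_hex(code_point)}>"
```

For U+1F600 this yields the pair D83D DE00, so text extraction tools can recover the emoji even though the glyph is addressed by a 16-bit CID.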

Arabic Text Shaping

Built-in Arabic Presentation Forms-B shaping engine with contextual form selection (isolated, initial, medial, final)
Arabic joining type analysis (Non-Joining, Right-Joining, Dual-Joining, Join-Causing, Transparent)
Lam-Alef ligature handling (ﻻ ﻵ ﻷ ﻹ)

Horizontal Text Scaling (Tz Operator)

Added MaxWidth property to PdfTextBlock for per-cell width constraints
When text exceeds the column width, the PDF Tz (horizontal scaling) operator compresses text to fit — keeping all characters intact for text extraction while preventing visual overflow
Helvetica width table (MeasureTextWidth) with standard character widths in 1/1000 em units
Applied to both WinAnsi (F1) and Unicode (Fn) text rendering paths
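
The scaling rule described above can be sketched in a few lines of Python. The PDF `Tz` operator takes a percentage, where 100 means no scaling. The function name is hypothetical, and this sketch assumes text is only ever compressed, never stretched.

```python
def horizontal_scale_percent(text_width: float, max_width: float) -> float:
    """Percentage to pass to the PDF Tz operator so text fits max_width."""
    if text_width <= 0 or text_width <= max_width:
        return 100.0  # text already fits; leave it unscaled
    return max_width / text_width * 100.0
```

A 200pt string in a 100pt cell would be drawn with `50.00 Tz`, keeping every character in the content stream for text extraction.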

Improvements

Adjusted default margins: left/right 50pt → 54pt, column padding 2pt → 3pt for better visual balance
Fill rectangles no longer include extra columnPadding width, matching LibreOffice cell boundary rendering more closely
Proper Unicode code point enumeration with surrogate pair handling

Benchmark

16 new multilingual & emoji test cases (classic151–classic166):
- Multilingual greetings, emoji sampler, currency symbols, math symbols, diacritical marks, RTL/BiDi text, CJK extended, emoji skin tones, ZWJ emoji sequences, punctuation marks, box drawing, CJK+emoji styled, Cyrillic alphabets, Indic scripts, Southeast Asian scripts, emoji progress
180 total test cases, average overall score: 0.9652

Files Changed

PdfWriter.cs — +676 lines: multi-font engine, Arabic shaping, TrueType parsing, font subsetting
ExcelToPdfConverter.cs — horizontal scaling integration, margin adjustments
PdfTextBlock.cs — MaxWidth property
PdfPage.cs — maxWidth parameter on AddText

Full Changelog: v0.8.0...v0.9.0

05 Mar 01:56

shps951023

v0.8.0

f62bc00

v0.8.0

Highlights

This release improves Excel-to-PDF fidelity with better text measurement, vertical alignment support, merged-cell rendering, and clipping support — raising the benchmark average score to 0.9712 across 150 test cases.

What's Changed

Rendering Improvements

Vertical alignment support — cells with top, center, or bottom vertical alignment in Excel are now rendered with the correct Y-position in PDF, matching LibreOffice behavior
Merged cell rendering — fill rectangles, borders, and text alignment (right/center) now correctly span the full merged column range instead of being limited to the first column
Calibri-scaled text measurement — introduced a dedicated CalibriFittingScale constant (0.86) and MeasureScaledWidth() method so that column-fit checks and text truncation use a consistent Calibri-to-Helvetica scale factor
Cell-level font size for fitting — text truncation and numeric reformatting now use each cell's actual font size instead of the global default, fixing clipping for cells with non-default sizes
Auto-expand row height for large fonts — rows containing cells with font sizes larger than the default now auto-grow (≈1.3× font size) to prevent text overlap
Column padding tightened — default ColumnPadding reduced from 4pt to 2pt for more compact table rendering that better matches LibreOffice output

Text & Number Formatting

Boolean cell alignment — cells with type "b" (boolean) now default to center alignment under the General format, matching Excel/LibreOffice behavior
Negative number parenthesis handling: negative number formatting no longer prepends a minus sign when the format already uses parentheses for negative display
Numeric reformat for all cells — FitNumericText() is now applied to all numeric cells (not just clipped ones), matching LibreOffice's General format auto-shrink behavior

PDF Engine

Clipping rectangle support — PdfTextBlock and PdfPage.AddText() now accept an optional clipRect parameter; PdfWriter emits PDF q/Q graphics state save/restore with a clipping path (re W n), so text is visually clipped but remains fully extractable
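
The operator sequence named above (`q`, `re W n`, `Q`) can be illustrated with a Python sketch that builds the content-stream fragment as text. The function name, the `/F1` font resource, and the layout of the operators are assumptions for this example; the actual PdfWriter output may differ.

```python
def clipped_text_ops(text: str, x: float, y: float, font_size: float, clip):
    """Content-stream operators that draw text inside a clipping rectangle."""
    cx, cy, cw, ch = clip
    # Escape the characters that are special inside a PDF literal string.
    escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    return "\n".join([
        "q",                             # save graphics state
        f"{cx} {cy} {cw} {ch} re W n",   # rectangle path, set as clip, no paint
        "BT",
        f"/F1 {font_size} Tf",
        f"{x} {y} Td",
        f"({escaped}) Tj",
        "ET",
        "Q",                             # restore state, which removes the clip
    ])
```

Because the clip only affects painting, the full string stays in the stream and remains extractable.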

Chart Rendering

Legend text removed — legend labels in bar/column and pie charts are no longer emitted as extractable PDF text, matching LibreOffice's behavior of rendering legends as vector graphics only

Benchmark & Tooling

Added comprehensive benchmark analysis scripts for PDF comparison (compare_pdfs.py, analysis helpers)
150 test cases with an average overall score of 0.9712

Stats

5 commits since v0.7.0
5 source files changed: ExcelReader.cs, ExcelToPdfConverter.cs, PdfPage.cs, PdfTextBlock.cs, PdfWriter.cs
+171 / −51 lines in source code

Full Changelog: v0.7.0...v0.8.0

04 Mar 15:05

shps951023

v0.7.1

7dca007

v0.7.1 — Rich Cell Styling, Combo Charts & Number Formatting Overhaul

v0.7.0 — Rich Cell Styling, Combo Charts & Number Formatting Overhaul

Highlights

This release brings substantial fidelity improvements to the Excel-to-PDF conversion engine, adding support for cell borders, font sizes, bold/italic text, horizontal alignment, explicit row heights, combo (overlay) charts, and a comprehensive rewrite of date/time and number formatting to match LibreOffice output.

New Features

Cell border rendering — Read and draw left/right/top/bottom borders from .xlsx styles, with support for thin, medium, and thick stroke widths and per-side colors.
Font style support — Extract font size, bold, and italic properties from the Excel font table; render text at the per-cell font size instead of a fixed global size.
Horizontal text alignment — Parse alignment from cellXf styles; support left, center, and right alignment with automatic "general" resolution (numbers right-align, text left-aligns).
Explicit row heights — Read per-row ht attributes and defaultRowHeight from sheetFormatPr; use them in page layout instead of computing row height solely from font size.
Combo chart (overlay series) — Detect secondary chart types in the plot area (e.g., a line series overlaid on a bar chart); render overlay lines with markers and merge all series into the legend.
Non-solid fill patterns — Approximate darkGray, mediumGray, lightGray, gray125, gray0625, and various hatching patterns as tint-blended solid fills.

Improvements

Date/time format engine — Full ConvertExcelDateFormat() converter that maps Excel format codes (d, dd, m, mm, yy, yyyy, h, hh, s, ss, AM/PM, A/P) to .NET DateTime format strings with correct month-vs-minute disambiguation. Built-in format IDs 14–22 now each produce the correct date/time style.
Number format improvements
- Zero-padding for integer formats (e.g., "0000" → 0042).
- Correct negative sign placement for currency formats (e.g., -$180,000.00 instead of $-180,000.00).
- Multi-section format handling (positive;negative;zero) with proper sign tracking.
General numeric display — FormatGeneral() now uses scientific notation for integers ≥ 1e10 (matching LibreOffice) and rounds near-integer values to shorter representations.
Chart title centering — Chart titles are now horizontally centered over the plot area.
Column width logic — Moved numeric text fitting (FitNumericText) inside the clipping branch so it only applies when content actually needs clipping; single-column sheets still expand to page width.
Default font size changed from 10pt to 11pt to match Excel's default.

Changed Files

| File | Change |
| --- | --- |
| ExcelReader.cs | +679 lines: font styles, borders, row heights, overlay charts, date/number formatting |
| ExcelToPdfConverter.cs | +168 lines: border drawing, alignment rendering, row height layout, combo charts |
| README.md | Updated benchmark score images |

New Internal Types

FontStyleInfo(Color, Size, Bold, Italic) — font metadata record
BorderSide(Style, Color) — single border edge
CellBorderInfo(Left, Right, Top, Bottom) — four-side cell border
ExcelCell extended with Alignment, FontSize, Bold, Italic, Border
ExcelSheet extended with RowHeights, DefaultRowHeight
ExcelChartInfo extended with OverlaySeries, OverlayChartType

Full Changelog: v0.6.0...v0.7.0

04 Mar 05:38

shps951023

v0.6.0

d0fa089

v0.6.0

Highlights

Benchmark quality score improved from 94.7% → 96.5% across 120 test cases. Excellent-tier results rose from 97 → 108, while Needs-Improvement cases dropped from 2 → 1.

New Features

Excel Reader Enhancements

Number format support — reads numFmt entries from styles.xml and applies formatting (currency, percentage, date, scientific, custom format codes) to numeric cell values
Cell fill / background color — parses solid patternFill colors from styles.xml and renders them as colored rectangles behind cell text
Merged cells — reads <mergeCells> regions; text width and clipping now respect the merged column span

Chart Rendering

Scatter / Bubble chart — new dedicated RenderScatterChart with numeric X and Y axes
Radar chart — rendered via RenderLineChart with spoke labels around the center
Horizontal bar chart — new RenderHorizontalBarChart (categories on Y-axis)
Stacked & percent-stacked support for bar and area charts, including reversed legend order to match Excel's bottom-to-top stacking
Pie / Doughnut legends — vertical category legend with color swatches below the chart
Data label percentages on pie/doughnut charts (showPercent)
Axis value formatting using the chart's numFmt formatCode (e.g. #,##0)
Chart title clipping — long titles are now truncated to fit the chart width

PDF Writer

Unified CJK / Unicode rendering — when a text block contains any non-WinAnsi character, the entire block is now rendered in the CID font (F2), eliminating the ~3 pt Y-offset between Type1 and CIDFontType2 that caused PyMuPDF to split spans across lines
Improved font metrics — updated FontBBox, Ascent, Descent, CapHeight to more accurate values
Full-width character detection (IsFullWidthCharPdf) for correct CJK/Hangul width calculation

Text Layout & Clipping

FittingChars / FitNumericText — pixel-accurate character fitting replaces the old maxChars integer estimate, improving truncation fidelity across varying font sizes
Single-column overflow — clips text at the page right edge and calculates virtual row height from wrapping at the default column width (matches LibreOffice behavior)
Multi-page row splitting — rows taller than the usable page height are now split across pages line-by-line

Defaults Changed

| Setting | Old | New |
| --- | --- | --- |
| MarginTop / MarginBottom | 50 pt | 72 pt (1 inch) |
| LineSpacing | 1.6 | 1.5 |

Project Reorganization

Moved translated README files to documents folder
Moved utility scripts (Run-Benchmark.ps1, analyze_overflow.py, etc.) to scripts
Removed obsolete .github_issue_body.md and root-level PdfPage.cs
Added 40+ benchmark analysis & debugging Python scripts

Full Changelog: v0.5.0...v0.6.0

03 Mar 08:39

shps951023

v0.5.0

ac78c21

v0.5.0

Highlights

Excel Chart Rendering — MiniPdf now parses and renders charts embedded in .xlsx files directly to PDF, covering bar, line, area, pie, doughnut, scatter, radar, stacked, combo, stock (OHLC), bubble, and 3D chart types. Charts are rendered as native PDF vector primitives (rectangles + lines), producing lightweight, searchable output without rasterization.

New Features

Chart-to-PDF conversion — Automatically detects and renders c:chartSpace elements from .xlsx DrawingML, including:
- Bar / Column (grouped, stacked, percent-stacked, 3D)
- Line (with markers, multi-series)
- Area (stacked, percent-stacked)
- Pie / Doughnut (with slice labels)
- Scatter (with trendlines) / Bubble / Radar
- Combo (bar + line) / Stock OHLC
- Chart sheet support
PdfPage.AddRectangle() — Draw filled rectangles with custom PdfColor
PdfPage.AddLine() — Draw line segments with custom color and stroke width
New PDF primitives — PdfRectBlock and PdfLineBlock records for rectangle and line rendering in the PDF content stream
Chart data resolution — ExcelReader resolves numRef, strRef, and numLit references to extract series data, categories, and axis titles from worksheet cells
Nice axis scaling — Auto-calculated round-number axis labels with gridlines
Chart legend rendering — Multi-series charts display color-coded legends
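
"Nice" axis scaling is commonly done with the classic nice-numbers approach, which snaps the tick step to 1, 2, 5, or 10 times a power of ten. The Python sketch below shows that general technique; it is an assumption that MiniPdf works this way, and the function names and the default of six ticks are hypothetical.

```python
import math


def nice_number(value: float, round_result: bool) -> float:
    """Snap a positive value to 1, 2, 5, or 10 times a power of ten."""
    exponent = math.floor(math.log10(value))
    fraction = value / 10 ** exponent
    if round_result:
        nice = 1 if fraction < 1.5 else 2 if fraction < 3 else 5 if fraction < 7 else 10
    else:
        nice = 1 if fraction <= 1 else 2 if fraction <= 2 else 5 if fraction <= 5 else 10
    return nice * 10 ** exponent


def nice_axis(data_min: float, data_max: float, max_ticks: int = 6):
    """Return (axis_min, axis_max, step) with round-number tick labels."""
    if data_max <= data_min:
        data_max = data_min + 1  # avoid a zero-length range
    span = nice_number(data_max - data_min, round_result=False)
    step = nice_number(span / (max_ticks - 1), round_result=True)
    axis_min = math.floor(data_min / step) * step
    axis_max = math.ceil(data_max / step) * step
    return axis_min, axis_max, step
```

Gridlines would then be drawn at every multiple of `step` between `axis_min` and `axis_max`.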

Improvements

Text clipping at column boundaries — Cell text is now truncated when the adjacent cell contains content, matching LibreOffice behavior (previously text always overflowed)
Explicit newline handling — Cells with Alt+Enter line breaks (\n) are rendered as multi-line text
Chart overflow pages — Right-anchored charts produce overflow pages to match LibreOffice page count
README updated — Benchmark expanded from 90 → 120 test cases (30 new chart cases: classic91–classic120)

Benchmark

| Category | Count | Threshold |
| --- | --- | --- |
| Excellent | 97 | ≥ 90% |
| Acceptable | 21 | 70%–90% |
| Needs Improvement | 2 | < 70% |

Average overall score: 94.7% (text 40% + visual 40% + page count 20%)
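
The weighting and the tier thresholds stated for this release can be written out as a short Python sketch. The weights and thresholds come from the notes above; the function names are hypothetical and this is not the code in compare_pdfs.py.

```python
# Weights from the release notes: text 40%, visual 40%, page count 20%.
WEIGHTS = {"text": 0.4, "visual": 0.4, "pages": 0.2}


def overall_score(text: float, visual: float, pages: float) -> float:
    """Weighted average of the three component scores (each in 0..1)."""
    return (WEIGHTS["text"] * text
            + WEIGHTS["visual"] * visual
            + WEIGHTS["pages"] * pages)


def tier(score: float) -> str:
    """Classify a score using the v0.5.0 thresholds."""
    if score >= 0.90:
        return "Excellent"
    if score >= 0.70:
        return "Acceptable"
    return "Needs Improvement"
```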

Stats

2 commits, 259 files changed, +6,798 / −1,266 lines
30 new chart test cases with reference PDFs and visual comparison images

Full Changelog: v0.4.1...v0.5.0

03 Mar 01:36

shps951023

v0.4.1

de32dc3

v0.4.1

v0.4.0 — March 3, 2026

Breaking Changes

Namespace renamed from MiniPdf to MiniSoftware.
Update all using MiniPdf; statements to using MiniSoftware;.

New Features

Unicode / CJK text rendering — PdfWriter now detects non-Latin-1 characters and automatically embeds a composite Identity-H CIDFont (Type0/CIDFontType2) with a ToUnicode CMap, enabling correct rendering of CJK and other Unicode text without manual font setup.
Multi-framework targeting — The library now targets net6.0, net8.0, and net9.0 (previously only net9.0), ensuring broad compatibility across .NET LTS releases.

Improvements

Benchmark AI visual-comparison threshold raised from 0.90 → 0.97 for more accurate pass/fail decisions.
compare_pdfs.py gains --ai-compare, --ai-max-pages, and --ai-threshold CLI flags for fine-grained control over AI visual comparison.
Added check_clip.cs test to validate character-width calculations used for text clipping.

Internal / Tooling

Added helper scripts: analyze_overflow.py, analyze_xlsx.py, check_reference.py, fix_code.py.
Added update_readme_images.py to automate the Visual Comparison table in all README variants.
Updated benchmark report images and comparison JSON/Markdown reports.

What's Changed

| Commit | Summary |
| --- | --- |
| 61b5de2 | refactor: update namespace from MiniPdf to MiniSoftware across all files |
| 7e94618 | fix: update TargetFrameworks to support net6.0, net8.0, net9.0 |
| 586f265 | Enhance PDF comparison step with AI options |
| 234f273 | Update AI threshold in benchmark; add check_clip test for character width calculations |
| 6635d0b | Add script to update Visual Comparison table in README.md |

Full Changelog: v0.3.0...v0.4.0
