Feature request: Add OMML (Office Math Markup Language) parsing support #859

@bgreenwell

Is your feature request related to a problem? Please describe.

I'm building a terminal-based .docx viewer and am frustrated when display equations
(standalone math formulas) don't appear at their correct positions in the document flow. This
happens because, during parsing, docx-rs currently skips paragraphs that contain only OMML
(Office Math Markup Language) content.

For example, given a document containing:

  • Paragraph 1: "Introduction text"
  • Paragraph 2: [Display equation: A=πr²]
  • Paragraph 3: "Conclusion text"

The equation paragraph is completely missing from the parsed DocumentChild structure, making it impossible to preserve the correct document order.

Describe the solution you'd like

Add support for parsing OMML content so that equations are included in the document structure. This could be implemented in one of two ways:

Option 1: New DocumentChild variant

  pub enum DocumentChild {
      Paragraph(Paragraph),
      Table(Table),
      Equation(Equation),  // New variant for OMML equations
      // ... existing variants
  }
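
To illustrate how a consumer might use Option 1, here is a minimal sketch. The `Paragraph`, `Table`, and `Equation` structs are hypothetical, simplified stand-ins (not the actual docx-rs types), and `render` is an illustrative helper. The point is only that a single match over the children keeps equations in document order:

```rust
// Hypothetical, simplified stand-ins for illustration only;
// the real docx-rs types carry far more data.
#[derive(Debug, Clone, PartialEq)]
pub struct Paragraph {
    pub text: String,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Table {
    pub rows: usize,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Equation {
    pub omml: String, // raw OMML XML, left uninterpreted (assumption)
}

#[derive(Debug, Clone, PartialEq)]
pub enum DocumentChild {
    Paragraph(Paragraph),
    Table(Table),
    Equation(Equation), // proposed variant
}

/// Renders each child to one line of text, preserving document order.
pub fn render(children: &[DocumentChild]) -> Vec<String> {
    children
        .iter()
        .map(|child| match child {
            DocumentChild::Paragraph(p) => p.text.clone(),
            DocumentChild::Table(t) => format!("[table: {} rows]", t.rows),
            DocumentChild::Equation(e) => format!("[equation: {}]", e.omml),
        })
        .collect()
}
```

With this shape, a viewer that does not understand OMML can still emit a placeholder at the right position instead of losing the paragraph.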

Option 2: Include OMML within Paragraph children

Expose equation content as part of the paragraph's child elements when equations are present.
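
A rough sketch of what Option 2 could look like, again using hypothetical, simplified types rather than the real docx-rs definitions. The `Math` variant and the `is_equation_only` helper are assumptions made for illustration:

```rust
// Hypothetical, simplified stand-ins; not the actual docx-rs definitions.
#[derive(Debug, Clone, PartialEq)]
pub enum ParagraphChild {
    Run(String),  // ordinary text run
    Math(String), // proposed: raw OMML for one <m:oMath> element
}

#[derive(Debug, Clone, PartialEq, Default)]
pub struct Paragraph {
    pub children: Vec<ParagraphChild>,
}

impl Paragraph {
    /// True when the paragraph holds math and no text runs,
    /// i.e. the case this issue reports as being skipped.
    pub fn is_equation_only(&self) -> bool {
        !self.children.is_empty()
            && self
                .children
                .iter()
                .all(|c| matches!(c, ParagraphChild::Math(_)))
    }
}
```

This variant also covers inline equations that sit between text runs, which a top-level `DocumentChild::Equation` alone would not represent.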

Either approach would allow document processing tools to maintain equation positions in the output.

Describe alternatives you've considered

Current workaround: Parse the document twice:

  1. Use docx-rs for regular content (paragraphs, tables, etc.)
  2. Separately parse word/document.xml directly using quick-xml to extract OMML equations
  3. Merge the two based on paragraph indices

Problems with this approach:

  • Index misalignment because docx-rs skips equation-only paragraphs
  • Requires maintaining parallel parsing logic
  • Fragile and error-prone
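
The index misalignment can be reproduced with a small model. This is a sketch only: `Block`, `parsed_paragraphs`, and `equation_indices` are hypothetical names that mimic the reported behaviour and do not call docx-rs or quick-xml:

```rust
/// What sits in word/document.xml, in order (heavily simplified).
#[derive(Debug, Clone, PartialEq)]
pub enum Block {
    Text(&'static str),
    EquationOnly(&'static str),
}

/// Models the reported behaviour: equation-only paragraphs are skipped.
pub fn parsed_paragraphs(blocks: &[Block]) -> Vec<&'static str> {
    blocks
        .iter()
        .filter_map(|b| match b {
            Block::Text(t) => Some(*t),
            Block::EquationOnly(_) => None,
        })
        .collect()
}

/// Paragraph indices of equations, as a separate raw-XML pass would see them.
pub fn equation_indices(blocks: &[Block]) -> Vec<usize> {
    blocks
        .iter()
        .enumerate()
        .filter_map(|(i, b)| match b {
            Block::EquationOnly(_) => Some(i),
            Block::Text(_) => None,
        })
        .collect()
}
```

For the three-paragraph example, the raw pass reports the equation at index 1, but index 1 in the parsed list is "Conclusion text". The merge step has to shift every later index by the number of skipped paragraphs before it.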

Other libraries: I looked at other Rust DOCX libraries, but docx-rs has the best feature set
for our needs aside from this limitation.

Additional context

  • I'm working on https://github.com/bgreenwell/doxx, a terminal .docx viewer for developers
  • Can provide sample .docx files demonstrating the issue
  • OMML elements appear as <m:oMath> (inline) and <m:oMathPara> (display) in word/document.xml
  • This would benefit the broader Rust ecosystem: document viewers, converters (docx →
    markdown/html), and document processing tools
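
To make the inline/display distinction concrete, here is a deliberately naive, std-only sketch that classifies math elements by tag name. `MathKind`, `find_math`, and `starts_tag` are hypothetical names. A real implementation would use a proper XML parser, and this scan assumes well-formed input with no self-closing `<m:oMathPara/>`:

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum MathKind {
    Inline,  // <m:oMath> outside any <m:oMathPara>
    Display, // <m:oMathPara>
}

/// True if `s` begins with the opening tag `name` (e.g. "<m:oMath"),
/// and not merely with a longer tag name sharing that prefix.
fn starts_tag(s: &str, name: &str) -> bool {
    match s.strip_prefix(name) {
        Some(after) => {
            after.starts_with(|c: char| c == '>' || c == '/' || c.is_whitespace())
        }
        None => false,
    }
}

/// Naive scan: returns the kind of each top-level math element, in order.
pub fn find_math(xml: &str) -> Vec<MathKind> {
    let mut found = Vec::new();
    let mut rest = xml;
    let mut in_para = false; // inside an <m:oMathPara> element
    while let Some(pos) = rest.find('<') {
        rest = &rest[pos..];
        if starts_tag(rest, "<m:oMathPara") {
            found.push(MathKind::Display);
            in_para = true;
        } else if rest.starts_with("</m:oMathPara>") {
            in_para = false;
        } else if starts_tag(rest, "<m:oMath") && !in_para {
            // An <m:oMath> nested in <m:oMathPara> was already counted as Display.
            found.push(MathKind::Inline);
        }
        rest = &rest[1..]; // step past this '<' (one ASCII byte)
    }
    found
}
```

The nesting check matters because a display equation is itself an `<m:oMath>` wrapped in `<m:oMathPara>`, so counting tag names without tracking the wrapper would report it twice.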

I'm happy to contribute a PR if you're open to this feature!
