Standalone equations are partially confused with regular text items #730

@cau-git

Bug

The layout detector misclassifies some clearly standalone equations as text items. The post-processing needs to be improved so that it resolves these cases better when competing proposals cover the same element with different labels and confidence scores.
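To make the "competing proposals" case concrete, here is a minimal sketch of one way such conflicts could be resolved: a greedy, cross-label non-maximum suppression with an optional per-label score bonus. This is not Docling's actual post-processing; the `Proposal` class, `resolve_overlaps`, the `iou_threshold` default, and the `label_bias` parameter are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional

Box = tuple[float, float, float, float]  # (left, top, right, bottom)


@dataclass(frozen=True)
class Proposal:
    """One layout proposal: a labeled bounding box with a confidence score."""
    label: str
    confidence: float
    box: Box


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = min(a[2], b[2]) - max(a[0], b[0])
    inter_h = min(a[3], b[3]) - max(a[1], b[1])
    if inter_w <= 0 or inter_h <= 0:
        return 0.0
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def resolve_overlaps(
    proposals: list[Proposal],
    iou_threshold: float = 0.8,  # assumed value, not taken from Docling
    label_bias: Optional[dict[str, float]] = None,
) -> list[Proposal]:
    """Keep one proposal per element, preferring the highest adjusted score.

    `label_bias` is a hypothetical knob: an additive bonus per label, e.g.
    to favor "formula" when it competes closely with "text".
    """
    bias = label_bias or {}

    def score(p: Proposal) -> float:
        return p.confidence + bias.get(p.label, 0.0)

    kept: list[Proposal] = []
    # Visit proposals from best to worst adjusted score; drop any proposal
    # that overlaps an already-kept one by at least the threshold.
    for candidate in sorted(proposals, key=score, reverse=True):
        if all(iou(candidate.box, k.box) < iou_threshold for k in kept):
            kept.append(candidate)
    return kept
```

With two near-identical boxes labeled `text` (confidence 0.62) and `formula` (confidence 0.58), the plain greedy pass keeps `text`, which mirrors the behavior reported here. Passing something like `label_bias={"formula": 0.1}` flips the outcome, which shows why the resolution rule, and not only the detector, determines the final label.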

Steps to reproduce

Convert the provided example PDF and observe missed formulas.
code_and_formulas_2.pdf
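A minimal reproduction sketch, assuming the attached PDF is saved locally as `code_and_formulas_2.pdf` and that formulas appear among `result.document.texts` with a `formula` label in the versions listed below. The helper names `count_labels` and `convert_and_count` are made up for this example.

```python
from collections import Counter


def count_labels(items):
    """Count items by label; accepts any objects exposing a `.label` attribute."""
    # Labels may be plain strings or enum members; use the enum value if present.
    return Counter(str(getattr(item.label, "value", item.label)) for item in items)


def convert_and_count(pdf_path):
    """Convert a PDF with Docling and tally the labels of its text-like items."""
    # Imported lazily so that count_labels stays usable without Docling installed.
    from docling.document_converter import DocumentConverter

    result = DocumentConverter().convert(pdf_path)
    # Assumption: formulas are stored alongside other text items in `texts`.
    return count_labels(result.document.texts)
```

Calling `convert_and_count("code_and_formulas_2.pdf")` and comparing the `formula` count against the number of equations visible in the PDF should make the missed formulas easy to spot.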

Docling version


Docling version: 2.15.1
Docling Core version: 2.14.0
Docling IBM Models version: 3.1.2
Docling Parse version: 3.0.0

Python version

Any

Metadata

Labels

bug, layout
