Update/retrain layout model to correctly identify single-column reference pages #908

@PeterStaar-IBM

Bug

Currently, the layout model sometimes classifies reference lists as tables, in particular on single-column reference pages.

Steps to reproduce

example 1: https://arxiv.org/pdf/2106.09685

example 2: https://arxiv.org/pdf/2501.12948

Docling version

Docling version: 2.18.0
Docling Core version: 2.17.1
Docling IBM Models version: 3.3.0
Docling Parse version: 3.2.0
Python: cpython-312 (3.12.6)
Platform: macOS-15.3-arm64-arm-64bit
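
To check the two examples above, something like the following could be used. It is a minimal sketch that assumes the `DocumentConverter` usage shown in the Docling README and that each detected table carries `prov` entries with a `page_no`; `tables_per_page` and `reproduce` are hypothetical helper names, not part of Docling.

```python
from collections import Counter


def tables_per_page(doc):
    """Count detected tables per page for a DoclingDocument-like object.

    Assumption: every item in `doc.tables` has a `prov` list whose
    entries expose a `page_no` attribute.
    """
    counts = Counter()
    for table in doc.tables:
        for prov in table.prov:
            counts[prov.page_no] += 1
    # Sort by page number so the output is easy to compare between runs.
    return dict(sorted(counts.items()))


def reproduce(source):
    """Convert `source` (URL or local path) and report tables per page."""
    # Imported lazily so tables_per_page can be used without Docling installed.
    from docling.document_converter import DocumentConverter

    result = DocumentConverter().convert(source)
    return tables_per_page(result.document)
```

Calling `reproduce("https://arxiv.org/pdf/2106.09685")` and `reproduce("https://arxiv.org/pdf/2501.12948")` should then show whether any tables are reported on the reference pages, where none are expected.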

Labels

bug, layout
