Improve book/spine recognition?

I've been playing around with the code to try to improve recognizing books.

These are issues I've noticed:

1. Sometimes two or more books are grouped into one. This seems to have several causes, such as the configuration of the edge detection, lighting issues, and the fact that grayscale is used.
2. Slivers are often captured as books when they clearly are not. I think some could be removed simply by checking the width: if a detected region is only a few pixels wide, discard it.
3. Sometimes a spine is split in two. I've noticed this when the spine has a horizontal line in it or a strong specular highlight, basically anything that produces a horizontal edge.
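For issues 2 and 3, a rough post-processing sketch of what I have in mind. It assumes the detector returns boxes as `(x, y, width, height)` tuples, which may not match the project's actual data structure, and the function names and thresholds (`min_width`, `min_overlap`, `max_gap`) are made up for illustration and would need tuning.

```python
from typing import List, Tuple

# Hypothetical box format: (x, y, width, height) in pixels.
Box = Tuple[int, int, int, int]


def drop_slivers(boxes: List[Box], min_width: int = 8) -> List[Box]:
    """Discard regions too narrow to plausibly be a book spine."""
    return [b for b in boxes if b[2] >= min_width]


def horizontal_overlap(a: Box, b: Box) -> float:
    """Fraction of the narrower box's width that is shared with the other box."""
    narrower = min(a[2], b[2])
    if narrower <= 0:
        return 0.0
    left = max(a[0], b[0])
    right = min(a[0] + a[2], b[0] + b[2])
    return max(0, right - left) / narrower


def merge_split_spines(
    boxes: List[Box], min_overlap: float = 0.8, max_gap: int = 10
) -> List[Box]:
    """Merge boxes that sit on top of each other in the same column.

    Two boxes are merged when they share most of their horizontal extent and
    the vertical gap between them is small, which is what a spine cut by a
    horizontal edge tends to look like.
    """
    merged: List[Box] = []
    for box in sorted(boxes, key=lambda b: (b[0], b[1])):
        for i, m in enumerate(merged):
            # Vertical distance between the two boxes (negative if they overlap).
            gap = max(box[1], m[1]) - min(box[1] + box[3], m[1] + m[3])
            if horizontal_overlap(box, m) >= min_overlap and gap <= max_gap:
                x0 = min(box[0], m[0])
                y0 = min(box[1], m[1])
                x1 = max(box[0] + box[2], m[0] + m[2])
                y1 = max(box[1] + box[3], m[1] + m[3])
                merged[i] = (x0, y0, x1 - x0, y1 - y0)
                break
        else:
            merged.append(box)
    return merged


def clean_boxes(boxes: List[Box]) -> List[Box]:
    """Apply both fixes: remove slivers first, then rejoin split spines."""
    return merge_split_spines(drop_slivers(boxes))
```

This would not help with issue 1 (grouped books), since that needs a better split rather than a filter.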


I'm wondering if there is a much better way to detect the spines, and maybe even books in general, such as using AI. If a NN were set up, it would just be a matter of training it, and the training data might be auto-generated by combining spine images into "shelves".
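To make the auto-generation idea concrete, here is a minimal sketch of just the layout step: given the sizes of individual spine images, it places them side by side in random order and records where each one ends up, which gives the ground-truth boxes for training for free. The function name, the `(width, height)` input format, and the bottom alignment are my assumptions; pasting the actual pixels would be done with whatever image library the project uses.

```python
import random
from typing import List, Optional, Tuple

# Hypothetical label format: (spine_index, x, y, width, height).
Label = Tuple[int, int, int, int, int]


def layout_shelf(
    spine_sizes: List[Tuple[int, int]], gap: int = 0, seed: Optional[int] = None
) -> Tuple[Tuple[int, int], List[Label]]:
    """Arrange spines left to right in random order on a synthetic shelf.

    spine_sizes holds one (width, height) pair per spine image.
    Returns the shelf canvas size as (width, height) plus one label per spine.
    """
    if not spine_sizes:
        raise ValueError("need at least one spine")

    rng = random.Random(seed)
    order = list(range(len(spine_sizes)))
    rng.shuffle(order)

    shelf_height = max(h for _, h in spine_sizes)
    labels: List[Label] = []
    x = 0
    for idx in order:
        w, h = spine_sizes[idx]
        y = shelf_height - h  # bottom-align, as books standing on a shelf
        labels.append((idx, x, y, w, h))
        x += w + gap

    shelf_width = x - gap  # no trailing gap after the last spine
    return (shelf_width, shelf_height), labels
```

Each call with a different `seed` gives a different shelf from the same set of spines, so a small collection of spine images could produce many training samples.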


I'm not sure how the current algorithm can be improved to make it more robust. I get a lot of false hits and strange results that make it a bit unusable. I'm not saying it can't get the data, but it's too inaccurate to be fully automated.

Some AI models can already detect things like books, so I wonder if one could be used plug-and-play: detect the books, clip them out of the image, and feed the crops into the OCR to get info about them.

E.g., https://www.freecodecamp.org/news/how-to-detect-objects-in-images-using-yolov8/

https://cocodataset.org/#explore

You can see it is pretty good at recognizing books.
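A minimal sketch of the plug-and-play idea, assuming the `ultralytics` package and its pretrained `yolov8n.pt` COCO weights (COCO has a `book` class). I have not tried this on shelf photos; the attribute names (`names`, `boxes.cls`, `boxes.conf`, `boxes.xyxy`, `orig_shape`) are how I understand the ultralytics results API and should be checked against its docs. The helper names and the `min_conf` threshold are made up.

```python
from typing import Iterable, List, Tuple

# Hypothetical detection format: (class_name, confidence, (x1, y1, x2, y2)).
Detection = Tuple[str, float, Tuple[float, float, float, float]]
CropBox = Tuple[int, int, int, int]


def keep_books(
    detections: Iterable[Detection],
    image_size: Tuple[int, int],
    min_conf: float = 0.25,
    label: str = "book",
) -> List[CropBox]:
    """Keep confident detections of `label` and clip them to the image.

    image_size is (width, height). Returns integer (x1, y1, x2, y2) boxes
    that can be used directly to crop the image before OCR.
    """
    width, height = image_size
    crops: List[CropBox] = []
    for name, conf, (x1, y1, x2, y2) in detections:
        if name != label or conf < min_conf:
            continue
        box = (
            max(0, int(round(x1))),
            max(0, int(round(y1))),
            min(width, int(round(x2))),
            min(height, int(round(y2))),
        )
        if box[2] > box[0] and box[3] > box[1]:  # skip empty boxes
            crops.append(box)
    return crops


def detect_books(image_path: str, weights: str = "yolov8n.pt") -> List[CropBox]:
    """Run a pretrained YOLOv8 model and return crop boxes for books."""
    # Imported here so keep_books stays usable without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)
    result = model(image_path)[0]
    detections = [
        (result.names[int(cls)], float(conf), tuple(xyxy))
        for cls, conf, xyxy in zip(
            result.boxes.cls.tolist(),
            result.boxes.conf.tolist(),
            result.boxes.xyxy.tolist(),
        )
    ]
    height, width = result.orig_shape
    return keep_books(detections, (width, height))
```

Each returned box could then be cropped out and passed to the existing OCR step. One open question is whether a general-purpose model separates tightly packed spines or just returns one box for a whole row of books.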


Improve book/spine recognition? #8