Improve book/spine recognition? #8

@Aphexus

I've been playing around with the code to try to improve recognizing books.

These are issues I've noticed:

  1. Sometimes two or more books are grouped together. This seems to have several causes, such as the configuration of the edge detection, lighting issues, and the fact that grayscale is used.
  2. Often, slivers are captured as books when they clearly are not. I think some could be removed simply by checking the width of each candidate and discarding any that are only a few pixels wide.
  3. Sometimes a spine is split in two. I've noticed this when the spine has a horizontal line in it or when there is a strong specular highlight. Essentially, anything that produces a strong horizontal edge can cause it.
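
A minimal sketch of how items 2 and 3 might be handled as a post-processing step, assuming the detector's output can be reduced to axis-aligned pixel boxes. The `Box` type, the function names, and the thresholds (`min_width`, `min_overlap`, `max_gap`) are all hypothetical and not taken from the project's code; the values would need tuning per image resolution.

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    """Axis-aligned candidate region in pixels (hypothetical representation)."""
    x: int
    y: int
    w: int
    h: int


def drop_slivers(boxes: List[Box], min_width: int = 8) -> List[Box]:
    """Discard candidates narrower than min_width pixels (assumed threshold)."""
    return [b for b in boxes if b.w >= min_width]


def merge_vertical_splits(
    boxes: List[Box], min_overlap: float = 0.8, max_gap: int = 10
) -> List[Box]:
    """Merge boxes that occupy the same column and are stacked within max_gap pixels.

    This targets a spine that was cut in two by a horizontal edge.
    """
    merged: List[Box] = []
    for box in sorted(boxes, key=lambda b: (b.x, b.y)):
        for i, other in enumerate(merged):
            # Width shared by the two boxes along the x axis.
            overlap = min(box.x + box.w, other.x + other.w) - max(box.x, other.x)
            # Vertical distance between them (negative if they overlap).
            gap = max(box.y, other.y) - min(box.y + box.h, other.y + other.h)
            if overlap >= min_overlap * min(box.w, other.w) and gap <= max_gap:
                x0, y0 = min(box.x, other.x), min(box.y, other.y)
                x1 = max(box.x + box.w, other.x + other.w)
                y1 = max(box.y + box.h, other.y + other.h)
                merged[i] = Box(x0, y0, x1 - x0, y1 - y0)
                break
        else:
            merged.append(box)
    return merged


candidates = [Box(0, 0, 40, 100), Box(0, 104, 40, 90), Box(45, 0, 3, 200), Box(60, 0, 30, 200)]
cleaned = merge_vertical_splits(drop_slivers(candidates))
```

Here the 3-pixel-wide candidate is dropped and the two stacked pieces at `x = 0` are joined into one spine. A photo with several shelves would need `max_gap` kept smaller than the distance between shelves, otherwise books on adjacent shelves could be merged.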

I'm wondering whether there is a substantially better way to detect the spines, and perhaps books in general, such as using AI. If a neural network were set up, it would just be a matter of training it, and the training data might be generated automatically by combining spine images into "shelves".
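
One way the auto-generated "shelves" idea could look, as a rough sketch: given only the sizes of individual spine images, compute where each would be placed on a synthetic shelf and emit labels in the normalized `class cx cy w h` form used by YOLO-style trainers. The function name, the single class id `0`, and the random gap between spines are assumptions made for illustration.

```python
import random
from typing import List, Tuple

Size = Tuple[int, int]                           # (width, height) of one spine image
PixelBox = Tuple[int, int, int, int]             # (x, y, width, height) on the shelf
Label = Tuple[int, float, float, float, float]   # class id, cx, cy, w, h (normalized)


def layout_shelf(
    spines: List[Size], max_gap: int = 0, seed: int = 0
) -> Tuple[Size, List[PixelBox], List[Label]]:
    """Place spines left to right with bottoms aligned, as on a real shelf."""
    rng = random.Random(seed)
    shelf_h = max(h for _, h in spines)
    boxes: List[PixelBox] = []
    x = 0
    for w, h in spines:
        x += rng.randint(0, max_gap)          # optional random gap before each spine
        boxes.append((x, shelf_h - h, w, h))  # y is measured from the top of the canvas
        x += w
    shelf_w = x
    labels: List[Label] = [
        (0, (bx + bw / 2) / shelf_w, (by + bh / 2) / shelf_h, bw / shelf_w, bh / shelf_h)
        for bx, by, bw, bh in boxes
    ]
    return (shelf_w, shelf_h), boxes, labels


canvas, boxes, labels = layout_shelf([(40, 200), (60, 100)])
```

The pixel boxes give the paste positions for composing the actual training image (for example with Pillow's `Image.paste`), and the labels can be written one per line to the matching annotation file. Real photos also contain perspective, tilt, and lighting variation that this layout does not model.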

I'm not sure how we can improve the current algorithm to make it more robust. I get many false positives and odd results that make it hard to use. It can extract the data, but it is too inaccurate to run fully automated.

Some AI models can already detect objects such as books, so I wonder whether one could be used plug-and-play: detect the books, crop them out of the image, and feed the crops into the OCR to get information about them.

E.g., https://www.freecodecamp.org/news/how-to-detect-objects-in-images-using-yolov8/

https://cocodataset.org/#explore

You can see it is pretty good at recognizing books.
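
A rough sketch of that detect, crop, and OCR pipeline, assuming the `ultralytics`, `Pillow`, and `pytesseract` packages are installed and that the COCO-pretrained `yolov8n.pt` weights are used, whose class list includes `book`. The function names and the `min_conf` threshold are hypothetical. The third-party imports are placed inside the functions so that the pure filtering step can be used without them.

```python
from typing import Iterable, List, Tuple

BoxXYXY = Tuple[float, float, float, float]   # (x1, y1, x2, y2) in pixels
Detection = Tuple[str, float, BoxXYXY]        # (class name, confidence, box)


def select_books(
    detections: Iterable[Detection], min_conf: float = 0.25
) -> List[Tuple[int, ...]]:
    """Keep 'book' detections above min_conf as integer crop boxes, left to right."""
    boxes = [
        tuple(round(v) for v in box)
        for name, conf, box in detections
        if name == "book" and conf >= min_conf
    ]
    return sorted(boxes)


def detect(image_path: str, weights: str = "yolov8n.pt") -> List[Detection]:
    """Run a pretrained YOLOv8 model and flatten its first result."""
    from ultralytics import YOLO  # third-party, imported lazily

    result = YOLO(weights)(image_path)[0]
    return [
        (result.names[int(cls)], float(conf), tuple(float(v) for v in xyxy))
        for xyxy, cls, conf in zip(
            result.boxes.xyxy.tolist(),
            result.boxes.cls.tolist(),
            result.boxes.conf.tolist(),
        )
    ]


def ocr_books(image_path: str) -> List[str]:
    """Crop each detected book and pass the crop to Tesseract."""
    from PIL import Image  # third-party, imported lazily
    import pytesseract     # third-party, needs the tesseract binary installed

    image = Image.open(image_path)
    return [
        pytesseract.image_to_string(image.crop(box))
        for box in select_books(detect(image_path))
    ]
```

Calling `ocr_books("shelf.jpg")` (a placeholder path) would return one text string per detected book. Spine text is usually rotated, so the crops may need rotating before OCR, and a general-purpose model may still group tightly packed spines into one box, in which case fine-tuning on shelf images would be the next thing to try.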
