Image and data visualization distinction #91

Open
renesteeman opened this issue Jan 23, 2025 · 3 comments

Comments

@renesteeman

Is it possible to differentiate between an image, such as a photo, logo, or face, and a visual representation of data, such as a bar chart, line graph, or diagram?

@JulioZhao97
Collaborator

@renesteeman Could you please explain in more detail? I did not follow your question.

@renesteeman
Author

@JulioZhao97 Sure. So for my use case, I need a way to distinguish between two types of figures. The first are those that present some sort of useful information to extract, such as graphs and diagrams. The second type would consist of company logos, decorative pictures on slides, pictures of a person, etc. that do not contribute any usable information. This is mainly for corporate documents such as investor presentations. The goal is to get the page numbers for where graphs or diagrams can be found so that their contents can be analyzed without having to look over entire files.

So the question is whether the output class 'figure' can be split into the sub-classes 'data visualization' and 'image'.

@JulioZhao97
Collaborator

@renesteeman Sorry for the late reply (Chinese New Year). For now, we do not distinguish images into figure, diagram, logo, and so on. I suggest either:

  1. Train your own model on a dataset that fits your requirements, or
  2. Use a classifier or an LVLM (large vision-language model) to sort the detected images into diagram, logo, and so on.
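
A minimal sketch of how the second suggestion could be wired up for the page-number use case described above. Everything here is an assumption for illustration: `Detection`, `pages_with_data_visualizations`, the label strings, and the `classify_crop` callback are hypothetical names, not part of this project's API. The classifier is passed in as a plain function, so it could wrap an image classifier or an LVLM prompt.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple

# Hypothetical types and names; none of this is the project's actual API.
BBox = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in page pixels


@dataclass(frozen=True)
class Detection:
    page: int    # 1-based page number
    label: str   # layout class reported by the detector, e.g. "figure"
    bbox: BBox   # region to crop and hand to the second-stage classifier


def pages_with_data_visualizations(
    detections: Iterable[Detection],
    classify_crop: Callable[[Detection], str],
    figure_label: str = "figure",
    wanted: str = "data_visualization",
) -> List[int]:
    """Return sorted page numbers with at least one figure classified as `wanted`."""
    pages = set()
    for det in detections:
        if det.label != figure_label:
            continue  # ignore text, tables, titles, and other non-figure regions
        if classify_crop(det) == wanted:
            pages.add(det.page)
    return sorted(pages)


if __name__ == "__main__":
    # Stub classifier standing in for a real image classifier or LVLM call.
    fake_labels = {
        (1, (40.0, 20.0, 120.0, 60.0)): "image",                 # company logo
        (2, (50.0, 200.0, 550.0, 600.0)): "data_visualization",  # bar chart
        (3, (30.0, 30.0, 300.0, 400.0)): "image",                # portrait photo
    }

    def stub_classifier(det: Detection) -> str:
        return fake_labels[(det.page, det.bbox)]

    detections = [
        Detection(1, "figure", (40.0, 20.0, 120.0, 60.0)),
        Detection(2, "figure", (50.0, 200.0, 550.0, 600.0)),
        Detection(2, "text", (50.0, 100.0, 550.0, 180.0)),
        Detection(3, "figure", (30.0, 30.0, 300.0, 400.0)),
    ]
    print(pages_with_data_visualizations(detections, stub_classifier))  # [2]
```

Keeping the classifier behind a callback means the page-collection logic stays the same whether the second stage is a small fine-tuned classifier or an LVLM; only `classify_crop` changes.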
