Conversation

@tsbalzhanov tsbalzhanov commented Aug 30, 2025

Hello

This PR adds an option to include similarity scores in the result of the hard-negative mining function.
This option might be helpful for fine-tuning the parameters of the mining function without recalculating the scores, or for moving the negative-selection logic outside of the mining function.

Tsyren Balzhanov
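
To illustrate the motivation above: once the scores are part of the mined output, different selection thresholds can be tried without re-running the model. The snippet below is only a rough sketch under assumptions. The row layout (`positive_score`, `candidates`) and the helper `select_negatives` with its `max_score` and `margin` arguments are hypothetical and do not reflect the actual output format of the library or of this PR.

```python
# Hypothetical mined rows: each candidate keeps its similarity score to the query.
mined = [
    {
        "query": "q1",
        "positive_score": 0.82,
        "candidates": [("d3", 0.80), ("d7", 0.55), ("d9", 0.31)],
    },
]


def select_negatives(row, max_score=None, margin=None, num_negatives=2):
    """Pick negatives from precomputed scores (illustrative helper, not library code)."""
    kept = []
    for text, score in row["candidates"]:
        # Drop candidates scoring above an absolute ceiling.
        if max_score is not None and score > max_score:
            continue
        # Drop candidates that score too close to the positive (likely false negatives).
        if margin is not None and score > row["positive_score"] - margin:
            continue
        kept.append((text, score))
    # Hardest (highest-scoring) remaining candidates first.
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:num_negatives]


# Try two different settings on the same scores, with no re-encoding needed.
with_margin = select_negatives(mined[0], margin=0.1)
with_ceiling = select_negatives(mined[0], max_score=0.5)
```

Because the scores are already stored, each new setting is just a cheap filter over the existing rows.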

@tsbalzhanov tsbalzhanov marked this pull request as ready for review August 30, 2025 14:46
@tomaarsen
Member

Hello!

I think it's indeed a good idea to also allow exporting scores, but so far I've introduced that via the n-tuple-scores output format: https://github.com/UKPLab/sentence-transformers/blob/1def8d3d6289e72bfa6a6a48592b1342053e6ff2/sentence_transformers/util/hard_negatives.py#L209

If we instead add a parameter akin to include_scores, then we'll have to deprecate the n-tuple-scores presumably. That's not really an issue, though. I'll do some more thinking on it.

  • Tom Aarsen

@tsbalzhanov
Author

@tomaarsen

Hi, did you decide what to do with output_format=n-tuple-scores?
I've just updated the PR: I rebased it on the current master branch and made output_format=n-tuple with include_scores=True equivalent to output_format=n-tuple-scores.

@tomaarsen
Member

Apologies for the delay. I think it would be preferable indeed to move towards output_scores and deprecate n-tuple-scores. If n-tuple-scores is passed, we can simply give a warning and set output_format="n-tuple" and include_scores=True indeed.

I want to share that I'll be taking 3 weeks off starting Monday, so I won't be able to move this PR forward in the meantime. Apologies for this.

  • Tom Aarsen

@tsbalzhanov
Author

> If n-tuple-scores is passed, we can simply give a warning and set output_format="n-tuple" and include_scores=True indeed.

Okay, I've implemented this.
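
For readers following the thread, the fallback discussed above (warn on n-tuple-scores, then treat it as output_format="n-tuple" with include_scores=True) could look roughly like the sketch below. This is not the code from this PR: the helper name `resolve_output_options`, the warning text, and the choice of `DeprecationWarning` are assumptions made for illustration only.

```python
import warnings


def resolve_output_options(output_format: str, include_scores: bool = False) -> tuple[str, bool]:
    """Map the legacy "n-tuple-scores" value onto the new pair of options.

    Hypothetical helper: the name and the warning category are assumptions,
    not taken from the library.
    """
    if output_format == "n-tuple-scores":
        warnings.warn(
            'output_format="n-tuple-scores" is deprecated; '
            'use output_format="n-tuple" with include_scores=True instead.',
            DeprecationWarning,
            stacklevel=2,
        )
        return "n-tuple", True
    # Any other format passes through unchanged.
    return output_format, include_scores


# The legacy value is rewritten; other values are left as they are.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy = resolve_output_options("n-tuple-scores")
passthrough = resolve_output_options("triplet")
```

Normalizing the options once at the top of the function keeps the rest of the mining logic free of special cases for the deprecated value.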
