@Pikauba Pikauba commented Dec 12, 2023

This merge request addresses the bug in #28.

As stated in the issue, there is a clear problem with the current assignment process in diarize.py.

Especially with those lines.

As I explained in #28 (comment), we have to refactor the algorithm in the `__call__` method of the ASRDiarizationPipeline.

The idea is to use intersection over union (IoU) to match the diarization segments against the ASR segment timestamps. Each ASR segment is assigned the speaker of the diarization segment with the highest IoU.

It is possible to set a threshold to ignore IoU matches below a specific value, and we can assign a specific "no match" label when no clear match is found between an ASR segment and any of the available diarization segments.
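To make the idea concrete, here is a minimal sketch of IoU-based matching with a threshold and a "no match" label. It is not the code from this pull request: the function names, the `start`/`end`/`speaker` dictionary keys, the `iou_threshold` and `no_match_label` parameters, and the default label are all assumptions chosen for illustration.

```python
def interval_iou(a_start, a_end, b_start, b_end):
    """Intersection over union of two time intervals, in seconds."""
    intersection = max(0.0, min(a_end, b_end) - max(a_start, b_start))
    union = (a_end - a_start) + (b_end - b_start) - intersection
    return intersection / union if union > 0 else 0.0


def assign_speakers(asr_segments, diarization_segments,
                    iou_threshold=0.0, no_match_label="NO_MATCH"):
    """Label each ASR segment with the speaker of its best-IoU diarization segment.

    Segments are plain dicts with hypothetical "start"/"end" keys
    (plus "speaker" for diarization segments).
    """
    results = []
    for asr in asr_segments:
        best_speaker, best_iou = None, 0.0
        for dia in diarization_segments:
            iou = interval_iou(asr["start"], asr["end"], dia["start"], dia["end"])
            if iou > best_iou:
                best_speaker, best_iou = dia["speaker"], iou
        # No overlap at all, or best overlap below the threshold: no clear match.
        if best_speaker is None or best_iou < iou_threshold:
            best_speaker = no_match_label
        results.append({**asr, "speaker": best_speaker, "iou": best_iou})
    return results


asr = [
    {"start": 0.0, "end": 2.0, "text": "hello"},
    {"start": 2.0, "end": 4.0, "text": "hi there"},
    {"start": 10.0, "end": 11.0, "text": "bye"},
]
dia = [
    {"start": 0.0, "end": 2.5, "speaker": "SPEAKER_00"},
    {"start": 2.5, "end": 4.0, "speaker": "SPEAKER_01"},
]
labelled = assign_speakers(asr, dia)
```

In this toy input the third ASR segment overlaps no diarization segment, so it receives the "no match" label; raising `iou_threshold` makes weaker overlaps fall back to that label as well.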

I removed the same-speaker squashing part, but we can probably do some refactoring to re-implement it in this pull request.

I would like feedback on this pull request, as I am open to improving it or making changes I may have forgotten to take into account.

@Pikauba Pikauba changed the title from "ASR segment spearker match using IoU to adress issue: [#28](https://github.com/huggingface/speechbox/issues/28)" to "ASR segment speaker match using IoU to address the issue #28" Dec 12, 2023

2010b9 commented May 20, 2024

Thanks for doing this! I've tried your code, but I'm having the same issue mentioned in #28 (comment). I don't know why it happens, but I haven't looked thoroughly at the code yet.
