ASR segment speaker match using IoU to address the issue #28 #35
This pull request addresses the bug in #28.
As stated in the issue, there is a clear problem with the current speaker assignment process in diarize.py.
Especially with those lines.
As I explained in #28 (comment), we have to refactor the algorithm in the call method of the ASRDiarizationPipeline.
The idea is to use intersection over union (IoU) to match the diarization segments against the ASR segment timestamps. Each ASR segment is assigned the speaker of the diarization segment with which it has the highest IoU.
It is possible to set a threshold to ignore IoU matches lower than a specific value, and we can assign a specific "no match" label when no clear match is found between an ASR segment and any of the available diarization segments.
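To make the matching rule concrete, here is a minimal sketch of the idea, not the exact code in this pull request. The function names, the dict keys (`start`, `end`, `speaker`), the `iou_threshold` parameter name, and the `NO_SPEAKER` label are all assumptions chosen for illustration.

```python
from typing import Any, Dict, List


def interval_iou(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Intersection over union of two time intervals, in [0, 1]."""
    intersection = max(0.0, min(a_end, b_end) - max(a_start, b_start))
    union = (a_end - a_start) + (b_end - b_start) - intersection
    return intersection / union if union > 0 else 0.0


def assign_speakers(
    asr_segments: List[Dict[str, Any]],
    diarization_segments: List[Dict[str, Any]],
    iou_threshold: float = 0.0,          # hypothetical parameter name
    no_match_label: str = "NO_SPEAKER",  # hypothetical label
) -> List[Dict[str, Any]]:
    """Give each ASR segment the speaker of the best-overlapping diarization segment."""
    results = []
    for asr in asr_segments:
        best_label, best_iou = no_match_label, 0.0
        for diar in diarization_segments:
            iou = interval_iou(asr["start"], asr["end"], diar["start"], diar["end"])
            # Strict comparison: segments with zero overlap never match.
            if iou > best_iou:
                best_label, best_iou = diar["speaker"], iou
        # Discard weak matches that fall below the threshold.
        if best_iou < iou_threshold:
            best_label = no_match_label
        results.append({**asr, "speaker": best_label, "iou": best_iou})
    return results
```

For example, an ASR segment spanning 0.0 to 2.0 s compared with a diarization segment spanning 0.0 to 2.5 s has an intersection of 2.0 s and a union of 2.5 s, so its IoU is 0.8. An ASR segment that overlaps no diarization segment keeps the "no match" label regardless of the threshold.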
I removed the same-speaker squashing step, but we can probably do some refactoring to re-implement it in this pull request.
I would welcome feedback on this pull request. I am open to improving it and to making changes for anything I may have forgotten to take into account.