Add Search::LexiconfreeRNNTTimesyncBeamSearch #179

Draft
larissakl wants to merge 2 commits into master from rnnt-timesync-beam-search

@larissakl
Contributor

Adds a lexicon-free, time-synchronous beam-search algorithm for standard (non-monotonic) Transducers. At each timestep, multiple non-blank labels can be predicted (the maximum number is controlled by a hyperparameter); a hypothesis is finished for the current timestep once it has emitted a blank label. In each iteration of the inner loop of a timestep, all active inner hypotheses are first extended with blank so that they become outer hypotheses. Then the inner hyps are extended by non-blank tokens and pruned. If there are already more than max-beam-size outer hyps, all inner hyps that score worse than the worst of the max-beam-size best outer hyps are removed. If no inner hyps are left, the inner loop stops. At the end of a timestep, the outer hyps are pruned again, this time based on their length-normalized score.
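
To make the inner loop easier to follow, here is a minimal Python sketch of a single timestep as described above. It is not the code in this PR: `Hyp`, `search_step`, `toy_scores`, the `score_fn(t, labels)` interface, the blank index `0`, and the normalization `score / (len(labels) + 1)` are all assumptions made for illustration. The sketch also assumes that hypotheses which still have not emitted blank after the maximum number of labels are only extended with blank, and it leaves out recombination of hypotheses with identical label sequences.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

BLANK = 0  # assumed blank label index


@dataclass(frozen=True)
class Hyp:
    labels: Tuple[int, ...] = ()
    score: float = 0.0  # accumulated log probability

    def normalized(self) -> float:
        # Assumed length normalization; +1 avoids division by zero for empty hyps.
        return self.score / (len(self.labels) + 1)


# Hypothetical scorer interface: log probabilities per label for a given
# timestep and label history.
ScoreFn = Callable[[int, Tuple[int, ...]], Dict[int, float]]


def search_step(t: int, beam: List[Hyp], score_fn: ScoreFn,
                max_beam_size: int, max_labels_per_step: int) -> List[Hyp]:
    inner = list(beam)      # hyps that have not emitted blank at timestep t yet
    outer: List[Hyp] = []   # hyps that are finished for timestep t
    for n in range(max_labels_per_step + 1):
        scores = [score_fn(t, h.labels) for h in inner]
        # 1) Extend every active inner hyp with blank; the result is an outer hyp.
        outer.extend(Hyp(h.labels, h.score + s[BLANK])
                     for h, s in zip(inner, scores))
        if n == max_labels_per_step:
            break  # label budget for this timestep is used up
        # 2) Extend inner hyps with non-blank labels and prune to the beam size.
        extended = [Hyp(h.labels + (label,), h.score + logp)
                    for h, s in zip(inner, scores)
                    for label, logp in s.items() if label != BLANK]
        inner = sorted(extended, key=lambda h: h.score, reverse=True)[:max_beam_size]
        # 3) With more than max_beam_size outer hyps, drop inner hyps that are
        #    worse than the worst of the max_beam_size best outer hyps.
        if len(outer) > max_beam_size:
            best_outer = sorted((h.score for h in outer), reverse=True)
            threshold = best_outer[max_beam_size - 1]
            inner = [h for h in inner if h.score >= threshold]
        if not inner:
            break
    # Final pruning of the outer hyps by length-normalized score.
    return sorted(outer, key=Hyp.normalized, reverse=True)[:max_beam_size]


def toy_scores(t: int, labels: Tuple[int, ...]) -> Dict[int, float]:
    # Toy distribution that ignores its inputs; a real scorer would not.
    return {BLANK: math.log(0.5), 1: math.log(0.3), 2: math.log(0.2)}


beam = [Hyp()]
for t in range(2):
    beam = search_step(t, beam, toy_scores, max_beam_size=2, max_labels_per_step=1)
for hyp in beam:
    print(hyp.labels, round(hyp.score, 3))
```

The threshold check in step 3 is what lets the inner loop stop early: once enough good outer hyps exist, inner hyps that cannot compete are discarded, and an empty inner set ends the loop before the label budget is reached.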

The implementation is based on PyTorch's RNNTBeamSearch.

Major To-Dos:

  • Integration of sub-scorers and intermediate pruning
  • Correct handling of sentence-end
  • Testing
