@virajsabhaya23

Fixed Critical Memory Leak in Word-Level Timestamp Alignment

Problem Identified

  • Memory leak in the find_alignment() function: forward hooks persist when an exception is raised
  • PyTorch forward hooks were not removed when an error occurred during inference
  • Leaked hooks accumulate across failed calls and degrade model performance over time (sketched below)
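
To make the symptom concrete, the toy example below registers a forward hook before each of several failing inference attempts and shows the hook count growing. It is a minimal sketch: nn.Linear stands in for a cross-attention layer, and _forward_hooks is PyTorch's internal hook registry, inspected here purely for illustration.

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 4)  # stand-in for a cross-attention layer

def capture(module, inputs, output):
    pass  # a real hook would stash attention weights here

for _ in range(3):
    model.register_forward_hook(capture)  # attached before each attempt
    try:
        model(torch.randn(2, 3))  # wrong input shape raises RuntimeError
    except RuntimeError:
        pass  # no cleanup ran, so the hook leaked

print(len(model._forward_hooks))  # 3: one leaked hook per failed attempt
```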

Root Cause Analysis

  • Location: whisper/timing.py, lines 187-204
  • Issue: hook cleanup code placed after the forward pass, outside any exception handling (see the sketch after this list)
  • Impact:
    ○ Hooks remain attached to cross-attention layers on failure
    ○ Memory consumption increases with each failed alignment attempt
    ○ Model state becomes polluted across subsequent inference calls
    ○ Performance degradation in production environments
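
The shape of the bug, as a self-contained sketch (the nn.Linear module and the names here are stand-ins, not the actual code from timing.py): cleanup sits on the straight-line path after the forward pass, so any exception skips it.

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 4)  # stand-in for a cross-attention block
weights = []

def capture(module, inputs, output):
    weights.append(output.detach())  # keep the attention-like output

# Pre-fix shape: registration, forward pass, and cleanup run in sequence.
hook = model.register_forward_hook(capture)
output = model(torch.randn(2, 4))  # an exception here jumps past remove()
hook.remove()  # reached only on the success path
```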

Solution Implemented

  • Wrapped hook registration and the forward pass in a try/finally block
  • Before: hooks were cleaned up only on successful execution
  • After: hooks are guaranteed to be removed regardless of exceptions
  • Ensures proper resource cleanup on every code path (see the sketch below)
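
The same sketch with the fix applied; the list of stand-in decoder blocks is assumed for illustration, matching the cross-attention layers the PR description mentions. The try/finally block guarantees the handles are removed whether or not the forward pass raises.

```python
import torch
import torch.nn as nn

blocks = nn.ModuleList([nn.Linear(4, 4) for _ in range(2)])  # stand-in decoder blocks
weights = []

def capture(module, inputs, output):
    weights.append(output.detach())

hooks = [block.register_forward_hook(capture) for block in blocks]
try:
    x = torch.randn(2, 4)
    for block in blocks:
        x = block(x)  # forward pass; may raise
finally:
    for hook in hooks:
        hook.remove()  # runs on success and on exception alike
```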

Technical Details

  • Modified function: find_alignment()
  • Protection added:
    ○ try block wraps the model forward pass and attention weight extraction
    ○ finally block ensures all registered hooks are removed
    ○ Exception-safe cleanup prevents memory leaks (a quick check appears below)
    ○ No performance impact on the normal execution path
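
A quick way to confirm the exception-safe behavior, again as a sketch with a toy module (and using PyTorch's private _forward_hooks registry only for inspection): force a failing forward pass inside the fixed pattern and check that no hooks remain.

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 4)
hook = model.register_forward_hook(lambda module, inputs, output: None)
try:
    model(torch.randn(2, 3))  # wrong input shape raises RuntimeError
except RuntimeError:
    pass
finally:
    hook.remove()

assert len(model._forward_hooks) == 0  # no hook survives the failed call
```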
