MIT HAN Lab

streaming-llm Public

[ICLR 2024] Efficient Streaming Language Models with Attention Sinks

Python 7.2k 398

llm-awq Public

[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Python 3.5k 314

efficientvit Public

Efficient vision foundation models for high-resolution generation and perception.

Python 3.3k 242

bevfusion Public archive

[ICRA'23] BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird's-Eye View Representation

Python 3.1k 575

temporal-shift-module Public

[ICCV 2019] TSM: Temporal Shift Module for Efficient Video Understanding

Python 2.2k 425

once-for-all Public

[ICLR 2020] Once for All: Train One Network and Specialize it for Efficient Deployment

Python 1.9k 343
