@mit-han-lab

MIT HAN Lab

Efficient AI Computing. PI: Song Han

Pinned

  1. streaming-llm Public

    [ICLR 2024] Efficient Streaming Language Models with Attention Sinks

    Python · 6.8k stars · 377 forks

  2. smoothquant Public

    [ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models

    Python · 1.3k stars · 161 forks

  3. llm-awq Public

    [MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

    Python · 2.8k stars · 232 forks

  4. bevfusion Public archive

    [ICRA'23] BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird's-Eye View Representation

    Python · 2.5k stars · 444 forks

  5. once-for-all Public

    [ICLR 2020] Once for All: Train One Network and Specialize it for Efficient Deployment

    Python · 1.9k stars · 338 forks

  6. temporal-shift-module Public

    [ICCV 2019] TSM: Temporal Shift Module for Efficient Video Understanding

    Python · 2.1k stars · 420 forks

Repositories

Showing 10 of 58 repositories
  • nunchaku Public

    [ICLR 2025 Spotlight] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models

    Cuda · 701 stars · Apache-2.0 · 45 forks · 37 issues (2 issues need help) · 2 pull requests · Updated Feb 22, 2025
  • torchquantum Public

    A PyTorch-based framework for Quantum Classical Simulation, Quantum Machine Learning, Quantum Neural Networks, Parameterized Quantum Circuits with support for easy deployments on real quantum computers.

    Jupyter Notebook · 1,399 stars · MIT · 213 forks · 61 issues (4 issues need help) · 9 pull requests · Updated Feb 21, 2025
  • omniserve Public

    [MLSys'25] QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving; [MLSys'25] LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention

    C++ · 514 stars · Apache-2.0 · 31 forks · 34 issues · 4 pull requests · Updated Feb 21, 2025
  • VisCompare Public

    A WebUI for Side-by-Side Comparison of Media (Images/Videos) Across Multiple Folders

    Python · 19 stars · Apache-2.0 · 2 forks · 0 issues · 0 pull requests · Updated Feb 21, 2025
  • deepcompressor Public

    Model Compression Toolbox for Large Language Models and Diffusion Models

    Python · 337 stars · Apache-2.0 · 26 forks · 28 issues · 0 pull requests · Updated Feb 21, 2025
  • Block-Sparse-Attention Public

    A sparse attention kernel supporting mixed sparse patterns

    C++ · 142 stars · BSD-3-Clause · 4 forks · 5 issues · 0 pull requests · Updated Feb 13, 2025
  • llm-awq Public

    [MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

    Python · 2,763 stars · MIT · 232 forks · 148 issues · 8 pull requests · Updated Feb 11, 2025
  • duo-attention Public

    [ICLR 2025] DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads

    Python · 428 stars · MIT · 26 forks · 9 issues · 0 pull requests · Updated Feb 11, 2025
  • efficientvit Public

    Efficient vision foundation models for high-resolution generation and perception.

    Python · 2,650 stars · Apache-2.0 · 213 forks · 102 issues · 0 pull requests · Updated Jan 24, 2025
  • patch_conv Public

    Patch convolution to avoid large GPU memory usage of Conv2D

    Python · 85 stars · MIT · 6 forks · 2 issues · 1 pull request · Updated Jan 23, 2025