clownrat6

🤡

A holistic joke

🤡🐀 clownrat6

🤡 "V i L L A i N" without dawn

73 followers · 102 following

🤡 School
🤡 Gotham
clownrat6.github.io
@clownrat66

Highlights

Pro

clownrat6/README.md

Hi there 👋

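I am a fourth-year PhD student (direct-entry PhD program) in Computer Science and Technology at Peking University, expected to graduate in 2026. I received my bachelor's degree from the School of Electronic and Information Engineering at South China University of Technology (class of 2021).

Motto: Unite knowledge with action and investigate things to extend knowledge; aim high and keep your feet on the ground.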

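📌 Research Interests

My research focuses on multimodal large models and image/video understanding, specifically:

Multimodal large models (video understanding), including:
- General video understanding: Qwen2.5-VL core contributor
- Audio-visual understanding: VideoLLaMA2; CMM
- Streaming video understanding: VideoLLaMA3
- Long video understanding: Inf-CL (CVPR 2025 Highlight)
- Fine-grained video understanding: VideoRefer (CVPR 2025)
Image/video segmentation, including:
- Weakly supervised segmentation: OCR (CVPR 2023)
- Video instance segmentation: TAR (ICCV 2025)
- Multimodal segmentation: WiCo (IJCAI 2023, Neurocomputing 2024); PVD (AAAI 2024); BriVIS (AAAI 2025)
- Medical image segmentation: Fused U-Net (Medical Physics 2021)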

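📈 Academic Achievements

I have published 20+ papers to date. Total Google Scholar citations: [count not shown].

The open-source projects I have contributed to have attracted wide attention. GitHub star counts for representative projects are as follows:

💬 Contact

If you are interested in my research, feel free to get in touch about collaboration or internship / full-time opportunities 🙏🙏. My contact email: [email protected]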

📎 Homepages

Personal Page: https://clownrat6.github.io (updated recently🔥)
Google Scholar: https://scholar.google.com/citations?user=Jkkp8JAAAAAJ

🔥 News

2021.03: I joined SenseTime in Shenzhen as a research intern to develop the MMSegmentation toolkit.

💻 Selected Research Papers

My full paper list is available on my personal homepage.

Pinned

DAMO-NLP-SG/VideoLLaMA2 Public

VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs

Python · 1.2k stars · 81 forks
DAMO-NLP-SG/VideoLLaMA3 Public

Frontier Multimodal Foundation Models for Image and Video Understanding

Jupyter Notebook · 897 stars · 67 forks
DAMO-NLP-SG/Inf-CLIP Public

[CVPR 2025 Highlight] The official CLIP training codebase of Inf-CL: "Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss". A super memory-efficiency CLIP training sc…

Python · 260 stars · 11 forks
DAMO-NLP-SG/CMM Public

✨✨The Curse of Multi-Modalities (CMM): Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio

Python · 46 stars · 2 forks
OpenVIS Public

[AAAI 2025] Open-vocabulary Video Instance Segmentation Codebase built upon Detectron2, which is really easy to use.

Python · 23 stars
mmsegmentation Public

Forked from open-mmlab/mmsegmentation

OpenMMLab Semantic Segmentation Toolbox and Benchmark.

Python · 4 stars · 2 forks