Highlights
🎉 253 commits from 93 contributors, including 29 new contributors!
- DeepSeek enhancements:
  - Support for DeepSeek Multi-Token Prediction, a 1.69x speedup in low-QPS scenarios (#12755)
  - AMD support: DeepSeek tunings yielding a 17% latency reduction (#13199)
  - Use FlashAttention-3 for MLA (#12807)
  - Align the expert selection code path with the official implementation (#13474)
  - Optimize moe_align_block_size for deepseek_v3 (#12850)
- V1 Engine:
  - LoRA support (#10957, #12883)
  - Logprobs and prompt logprobs support (#9880), min_p sampling support (#13191), and logit_bias in the V1 sampler (#13079); see the sampling sketch after this list
  - Use msgpack for core request serialization (#12918)
  - Pipeline parallelism support (#12996, #13353, #13472, #13417, #13315)
  - Metrics enhancements: GPU prefix cache hit rate % gauge (#12592), iteration_tokens_total histogram (#13288), and several request timing histograms (#12644)
  - Initial speculative decoding support with ngrams (#12193, #13365)
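For anyone who wants to try the new sampling options right away, here is a minimal offline sketch using the `LLM`/`SamplingParams` API. The model name is only a placeholder, and opting into the V1 engine via the `VLLM_USE_V1=1` environment variable is an assumption about this release rather than a guarantee.

```python
# Minimal sketch: exercising min_p sampling and (prompt) logprobs with the offline API.
# Assumptions: a local GPU, any small instruct model (placeholder name below), and
# opting into the V1 engine via the VLLM_USE_V1=1 environment variable.
import os

os.environ.setdefault("VLLM_USE_V1", "1")  # assumed opt-in flag for the V1 engine

from vllm import LLM, SamplingParams

llm = LLM(model="Qwen/Qwen2.5-1.5B-Instruct")  # placeholder model

params = SamplingParams(
    temperature=0.8,
    min_p=0.05,         # new in V1 (#13191): drop tokens below 5% of the top token's probability
    logprobs=5,         # return top-5 logprobs per generated token (#9880)
    prompt_logprobs=1,  # also return logprobs for the prompt tokens
    max_tokens=64,
)

for out in llm.generate(["Explain multi-token prediction in one sentence."], params):
    print(out.outputs[0].text)
    print(out.outputs[0].logprobs)  # per-token dicts of {token_id: Logprob}
```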
Model Support
- Enhancements to Qwen2.5-VL: BNB support (#12944), LoRA (#13261), optimizations (#13155)
- Support Unsloth Dynamic 4-bit BnB quantization (#12974); see the loading sketch after this list
- IBM/NASA Prithvi Geospatial model (#12830)
- Support Mamba2 (Codestral Mamba) (#9292), Bamba Model (#10909)
- Ultravox Model: Support v0.5 Release (#12912)
- `transformers` backend: quantization support (#12960)
- VLM: merged multi-modal processors for Mllama (#11427), GLM4V (#12449), and Molmo (#12966); text-only and vision variants of the same architecture separated (#13157)
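The Unsloth dynamic 4-bit checkpoints load through the existing bitsandbytes path; below is a minimal sketch. It assumes bitsandbytes is installed and uses one of Unsloth's published 4-bit repo names purely as an illustrative placeholder.

```python
# Minimal sketch: loading a dynamic 4-bit BnB checkpoint (#12974).
# Assumptions: a CUDA GPU, `pip install bitsandbytes`, and an illustrative
# Unsloth 4-bit repo name; substitute any -bnb-4bit checkpoint you actually use.
from vllm import LLM, SamplingParams

llm = LLM(
    model="unsloth/Llama-3.2-1B-Instruct-bnb-4bit",  # placeholder checkpoint
    quantization="bitsandbytes",
    load_format="bitsandbytes",
)

out = llm.generate(["Hello!"], SamplingParams(max_tokens=16))
print(out[0].outputs[0].text)
```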
Hardware Support
- Pluggable platform-specific scheduler (#13161)
- NVIDIA: Support nvfp4 quantization (#12784)
- AMD:
  - Per-Token-Activation Per-Channel-Weight FP8 quantization inferencing (#12501)
  - Initial ROCm support in V1 (#12790)
- TPU: V1 Support (#13049)
- Neuron: Support Longer Sequences in NKI-based Flash PagedAttention and Improve Efficiency (#12921)
- Gaudi:
  - Contiguous cache fetch support (#12139)
  - Long-context + LoRA support (#12812)
Engine Features
- Add sleep and wake up endpoint and v1 support (#12987)
- Add `/v1/audio/transcriptions` OpenAI API endpoint (#12909); see the client sketch after this list
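A minimal client-side sketch for the new transcription endpoint follows; it assumes a vLLM OpenAI-compatible server is already running at localhost:8000 with a Whisper-style speech model, and the model name and audio path are placeholders.

```python
# Minimal sketch: calling the new /v1/audio/transcriptions endpoint (#12909).
# Assumptions: a running server, e.g. `vllm serve openai/whisper-large-v3`, reachable
# at http://localhost:8000; the model name and audio file path are placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

with open("sample.wav", "rb") as audio:
    transcription = client.audio.transcriptions.create(
        model="openai/whisper-large-v3",  # must match the model the server was started with
        file=audio,
    )

print(transcription.text)
```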
Performance
- Use the moe_wna16 kernel by default for MoEs with many experts (#13236)
- Reduce TTFT with concurrent partial prefills (#10235)
Others
- Make vLLM compatible with veRL (#12824)
- Fixes for cases of FA2 illegal memory access error (#12848)
- Choice-based structured output with xgrammar (#12632); see the client sketch after this list
- Run v1 benchmark and integrate with PyTorch OSS benchmark database (#13068)
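For the choice-based structured output, a minimal client sketch follows; it assumes a running vLLM OpenAI-compatible server and passes the allowed choices through `extra_body` as `guided_choice`, with the model name as a placeholder.

```python
# Minimal sketch: constraining output to a fixed set of choices with xgrammar (#12632).
# Assumptions: a running vLLM OpenAI-compatible server at http://localhost:8000 and a
# placeholder model name; `guided_choice` is forwarded via extra_body.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

resp = client.chat.completions.create(
    model="Qwen/Qwen2.5-7B-Instruct",  # placeholder; use the model the server is serving
    messages=[{"role": "user", "content": "Is the Earth flat? Answer yes or no."}],
    extra_body={"guided_choice": ["yes", "no"]},
)

print(resp.choices[0].message.content)  # one of "yes" or "no"
```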
What's Changed
- [Misc] Update w2 scale loading for GPTQMarlinMoE by @dsikka in #12757
- [Docs] Add Google Cloud Slides by @simon-mo in #12814
- [Attention] Use FA3 for MLA on Hopper by @LucasWilkinson in #12807
- [misc] Reduce number of config file requests to HuggingFace by @khluu in #12797
- [Misc] Remove unnecessary decode call by @DarkLight1337 in #12833
- [Kernel] Make rotary_embedding ops more flexible with input shape by @Isotr0py in #12777
- [torch.compile] PyTorch 2.6 and nightly compatibility by @youkaichao in #12393
- [Doc] double quote cmake package in build.inc.md by @jitseklomp in #12840
- [Bugfix] Fix unsupported FA version check for Turing GPU by @Isotr0py in #12828
- [V1] LoRA Support by @varun-sundar-rabindranath in #10957
- Add Bamba Model by @fabianlim in #10909
- [MISC] Check space in the file names in the pre commit checks by @houseroad in #12804
- [misc] Revert # 12833 by @khluu in #12857
- [Bugfix] FA2 illegal memory access by @LucasWilkinson in #12848
- Make vllm compatible with verl by @ZSL98 in #12824
- [Bugfix] Missing quant_config in deepseek embedding layer by @SzymonOzog in #12836
- Prevent unnecessary requests to huggingface hub by @maxdebayser in #12837
- [MISC][EASY] Break check file names into entry and args in the pre-commit hooks by @houseroad in #12880
- [Misc] Remove unnecessary detokenization in multimodal processing by @DarkLight1337 in #12868
- [Model] Add support for partial rotary embeddings in Phi3 model by @garg-amit in #12718
- [V1] Logprobs and prompt logprobs support by @afeldman-nm in #9880
- [ROCm] [Feature] [Doc] [Dockerfile] [BugFix] Support Per-Token-Activation Per-Channel-Weight FP8 Quantization Inferencing by @tjtanaa in #12501
- [V1] LM Eval With Streaming Integration Tests by @robertgshaw2-redhat in #11590
- [Bugfix] Fix disagg hang caused by the prefill and decode communication issues by @houseroad in #12723
- [V1][Minor] Remove outdated comment by @WoosukKwon in #12928
- [V1] Move KV block hashes from Request to KVCacheManager by @WoosukKwon in #12922
- [Bugfix] Fix Qwen2_5_VLForConditionalGeneration packed_modules_mapping by @jeejeelee in #12905
- [Misc] Fix typo in the example file by @DK-DARKmatter in #12896
- [Bugfix] Fix multi-round chat error when mistral tokenizer is used by @zifeitong in #12859
- [bugfix] respect distributed_executor_backend in world_size=1 by @youkaichao in #12934
- [Misc] Add offline test for disaggregated prefill by @Shaoting-Feng in #12418
- [V1][Minor] Move cascade attn logic outside _prepare_inputs by @WoosukKwon in #12943
- [Build] Make pypi install work on CPU platform by @wangxiyuan in #12874
- [Hardware][Intel-Gaudi] Enable long-contexts + LoRA support for Intel Gaudi by @SanjuCSudhakaran in #12812
- [misc] Add LoRA to benchmark_serving by @varun-sundar-rabindranath in #12898
- [Misc] Log time consumption on weight downloading by @waltforme in #12926
- [CI] Resolve transformers-neuronx version conflict by @liangfu in #12925
- [Doc] Correct HF repository for TeleChat2 models by @waltforme in #12949
- [Misc] Add qwen2.5-vl BNB support by @Isotr0py in #12944
- [CI/Build] Auto-fix Markdown files by @DarkLight1337 in #12941
- [Bugfix] Remove unused seq_group_metadata_list from ModelInputForGPU by @ShangmingCai in #12935
- [bugfix] fix early import of flash attention by @youkaichao in #12959
- [VLM] Merged multi-modal processor for GLM4V by @jeejeelee in #12449
- [V1][Minor] Remove outdated comment by @WoosukKwon in #12968
- [RFC] [Mistral] FP8 format by @patrickvonplaten in #10130
- [V1] Cache `uses_mrope` in GPUModelRunner by @WoosukKwon in #12969
- [core] port pynvml into vllm codebase by @youkaichao in #12963
- [MISC] Always import version library first in the vllm package by @houseroad in #12979
- [core] improve error handling when wake up from sleep mode by @youkaichao in #12981
- [core][rlhf] add colocate example for RLHF by @youkaichao in #12984
- [V1] Use msgpack for core request serialization by @njhill in #12918
- [Bugfix][Platform] Check whether selected backend is None in get_attn_backend_cls() by @terrytangyuan in #12975
- [core] fix sleep mode and pytorch checkpoint compatibility by @youkaichao in #13001
- [Doc] Add link to tool_choice tracking issue in tool_calling.md by @terrytangyuan in #13003
- [misc] Add retries with exponential backoff for HF file existence check by @khluu in #13008
- [Bugfix] Clean up and fix multi-modal processors by @DarkLight1337 in #13012
- Fix seed parameter behavior in vLLM by @SmartManoj in #13007
- [Model] Ultravox Model: Support v0.5 Release by @farzadab in #12912
- [misc] Fix setup.py condition to avoid AMD from being mistaken with CPU by @khluu in #13022
- [V1][Minor] Move scheduler outputs to a separate file by @WoosukKwon in #13062
- [Docs] Announce Meta Meetup by @simon-mo in #13065
- [Bugfix] Support missing tool parameters in mistral tokenizer by @fgreinacher in #12884
- [Benchmark] Add BurstGPT to benchmark_serving by @WoosukKwon in #13063
- [Core] Don't do platform detection at import time by @russellb in #12933
- [Misc] LoRA - Refactor Punica ops tests by @varun-sundar-rabindranath in #12970
- [Bugfix]: Reasoning output bug according to the chat template change by @gaocegege in #13025
- [V1][Metrics] Add GPU prefix cache hit rate % gauge by @comaniac in #12592
- [executor] init `local_rank` as device index by @MengqingCao in #13027
- [ROCm] Using a more precise memory profiling by @gshtras in #12624
- [Build] Fix cuda link target of cumem_allocator in CPU env by @guoyuhong in #12863
- [Platform] add pre_register_and_update function by @wangxiyuan in #12432
- [Bugfix] fix flaky test by @SmartManoj in #13089
- [V1][Metrics] Add several request timing histograms by @markmc in #12644
- Set `torch_dtype` in `TransformersModel` by @hmellor in #13088
- [Misc] Fix typo at comments at metrics.py by @je1lee in #13024
- [Bugfix] Do not use resource module on Windows (#12858) by @MoonRide303 in #13029
- [BugFix] Pop instead of del CUDA_VISIBLE_DEVICES by @HollowMan6 in #12962
- Fix initializing GGUF weights for ColumnParallelLinear when using tensor parallel > 1 by @SzymonOzog in #13023
- [CI/Build][Bugfix] Fix CPU backend default threads num by @bigPYJ1151 in #13077
- [Doc] Improve OpenVINO installation doc by @hmellor in #13102
- [Bugfix] Guided decoding falls back to outlines when fails to import xgrammar by @terrytangyuan in #12976
- [Misc] Move pre-commit suggestion back to the end by @russellb in #13114
- [RFC][vllm-API] Support tokenizer registry for customized tokenizer in vLLM by @youngkent in #12518
- [Model] IBM/NASA Prithvi Geospatial model by @christian-pinto in #12830
- [ci] Add more source file dependencies for some tests by @khluu in #13123
- [Neuron][Kernel] Support Longer Sequences in NKI-based Flash PagedAttention and Improve Efficiency by @lingfanyu in #12921
- Bump helm/kind-action from 1.10.0 to 1.12.0 by @dependabot in #11612
- Bump actions/stale from 9.0.0 to 9.1.0 by @dependabot in #12462
- Bump helm/chart-testing-action from 2.6.1 to 2.7.0 by @dependabot in #12463
- Bump actions/setup-python from 5.3.0 to 5.4.0 by @dependabot in #12672
- Further reduce the HTTP calls to huggingface.co by @maxdebayser in #13107
- [Misc] AMD Build Improvements by @842974287 in #12923
- [Bug] [V1] Try fetching stop_reason from EngineOutput before checking the request by @bnellnm in #13108
- [Bugfix] Fix num video tokens calculation for Qwen2-VL by @DarkLight1337 in #13148
- [Frontend] Generate valid tool call IDs when using `tokenizer-mode=mistral` by @rafvasq in #12332
- [Misc] Delete unused LoRA modules by @jeejeelee in #13151
- Introduce VLLM_CUDART_SO_PATH to allow users specify the .so path by @houseroad in #12998
- [CI/Build] Use mypy matcher for pre-commit CI job by @russellb in #13162
- [CORE] [QUANT] Support for GPTQModel's `dynamic` quantization per module override/control by @Qubitium in #7086
- [Bugfix] Allow fallback to AWQ from AWQMarlin at per-layer granularity by @mgoin in #13119
- [CI] Fix failing FP8 cpu offload test by @mgoin in #13170
- [V1][Bugfix] Copy encoder input ids to fix set iteration issue during VLM abort by @andoorve in #13173
- [CI/Build] Ignore ruff warning up007 by @russellb in #13182
- [perf-benchmark] cleanup unused Docker images and volumes in H100 benchmark instance by @khluu in #12706
- [NVIDIA] Support nvfp4 quantization by @kaixih in #12784
- [Bugfix][Example] Fix GCed profiling server for TPU by @mgoin in #12792
- [VLM] Implement merged multimodal processor for Mllama by @Isotr0py in #11427
- Simplify logic of locating CUDART so file path by @houseroad in #13203
- [Build] Automatically use the wheel of the base commit with Python-only build by @comaniac in #13178
- [Bugfix] deepseek_r1_reasoning_parser put reason content in wrong field in certain edge case by @LikeSundayLikeRain in #13097
- [Frontend] Move CLI code into vllm.cmd package by @russellb in #12971
- Allow Unsloth Dynamic 4bit BnB quants to work by @danielhanchen in #12974
- [CI/Build] Allow ruff to auto-fix some issues by @russellb in #13180
- [V1][core] Implement pipeline parallel on Ray by @ruisearch42 in #12996
- [VLM] Remove input processor from clip and siglip by @Isotr0py in #13165
- [Frontend] Pass pre-created socket to uvicorn by @russellb in #13113
- [V1] Clarify input processing and multimodal feature caching logic by @ywang96 in #13211
- [VLM] Merged multi-modal processor for Molmo by @DarkLight1337 in #12966
- [V1][Core] Add worker_base for v1 worker by @AoyuQC in #12816
- [Misc] Qwen2.5-VL Optimization by @wulipc in #13155
- [VLM] Separate text-only and vision variants of the same model architecture by @DarkLight1337 in #13157
- [Bugfix] Missing Content Type returns 500 Internal Server Error by @vaibhavjainwiz in #13193
- [Frontend] Add `/v1/audio/transcriptions` OpenAI API endpoint by @NickLucche in #12909
- Add label if pre-commit passes by @hmellor in #12527
- Optimize moe_align_block_size for deepseek_v3 by @mgoin in #12850
- [Kernel][Bugfix] Refactor and Fix CUTLASS 2:4 Sparse Kernels by @tlrmchlsmth in #13198
- Revert "Add label if pre-commit passes" by @hmellor in #13242
- [ROCm] Avoid using the default stream on ROCm as it is a performance killer by @gshtras in #13238
- [Kernel] Fix awq error when n is not divisible by 128 by @jinzhen-lin in #13227
- [V1] Consolidate MM cache size to vllm.envs by @ywang96 in #13239
- [Bugfix/CI] Turn test_compressed_tensors_2of4_sparse back on by @tlrmchlsmth in #13250
- [Bugfix][CI] Inherit codespell settings from pyproject.toml in the pre-commit-config by @tlrmchlsmth in #13237
- [Bugfix] Offline example of disaggregated prefill by @XiaobingSuper in #13214
- [Misc] Remove redundant statements in scheduler.py by @WrRan in #13229
- Consolidate Llama model usage in tests by @hmellor in #13094
- Expand MLA to support most types of quantization by @mgoin in #13181
- [V1] LoRA - Enable Serving Usecase by @varun-sundar-rabindranath in #12883
- [ROCm][V1] Add initial ROCm support to V1 by @SageMoore in #12790
- [Bugfix][V1] GPUModelRunner._update_states should return True when there is a finished request in batch by @imkero in #13126
- [WIP] TPU V1 Support Refactored by @alexm-redhat in #13049
- [Frontend] Optionally remove memory buffer used for uploading to URLs in run_batch by @pooyadavoodi in #12927
- [Bugfix] Fix missing parentheses by @xu-song in #13263
- [Misc] Log time consumption of sleep and wake-up by @waltforme in #13115
- [VLM] Keep track of whether prompt replacements have been applied by @DarkLight1337 in #13215
- [V1] Simplify GPUModelRunner._update_states check by @njhill in #13265
- Support logit_bias in v1 Sampler by @houseroad in #13079
- [Core] choice-based structured output with xgrammar by @russellb in #12632
- [Hardware][Gaudi][Bugfix] Fix error for guided decoding by @zhouyu5 in #12317
- [Quant][Perf] Use moe_wna16 kernel by default for MoEs with many experts by @mgoin in #13236
- [Core] Reduce TTFT with concurrent partial prefills by @joerunde in #10235
- [V1][Core] min_p sampling support by @AoyuQC in #13191
- [V1][CI] Fix failed v1-test because of min_p by @WoosukKwon in #13316
- [V1][Sampler] Don't apply temp for greedy-only by @njhill in #13311
- [V1][PP] Fix memory profiling in PP by @WoosukKwon in #13315
- [Bugfix][AMD] Update torch_bindings so that scaled_fp4_quant isn't build on ROCm by @SageMoore in #13235
- [Bugfix][Docs] Fix offline Whisper by @NickLucche in #13274
- [Bugfix] Massage MLA's usage of flash attn for RoCM by @tlrmchlsmth in #13310
- [BugFix] Don't scan entire cache dir when loading model by @njhill in #13302
- [Bugfix] Fix search start_index of stop_checker by @xu-song in #13280
- [Bugfix] Fix qwen2.5-vl image processor by @Isotr0py in #13286
- [V1][Metrics] Add iteration_tokens_total histogram from V0 by @markmc in #13288
- [AMD] [Model] DeepSeek tunings by @rasmith in #13199
- [V1][PP] Run engine busy loop with batch queue by @comaniac in #13064
- [ci/build] update flashinfer by @youkaichao in #13323
- [Doc] [2/N] Add Fuyu E2E example for multimodal processor by @DarkLight1337 in #13331
- [V1][Spec Decode] Ngram Spec Decode by @LiuXiaoxuanPKU in #12193
- [Quant] Add `SupportsQuant` to phi3 and clip by @kylesayrs in #13104
- [Bugfix] Pin xgrammar to 0.1.11 by @mgoin in #13338
- [BugFix] Enhance test_pos_encoding to support execution on multi-devices by @wchen61 in #13187
- [V1] Update doc and examples for H2O-VL by @ywang96 in #13349
- [ci] skip failed tests for flashinfer by @youkaichao in #13352
- [platform] add base class for communicators by @youkaichao in #13208
- [Bugfix] Fix 2 Node and Spec Decode tests by @DarkLight1337 in #13341
- [Docs] Change myenv to vllm. Update python_env_setup.inc.md by @arkylin in #13325
- [V1][BugFix] Add `__init__.py` to v1/spec_decode/ by @WoosukKwon in #13359
- [V1][PP] Cache Intermediate Tensors by @WoosukKwon in #13353
- [Bugfix][Platform][CPU] Fix cuda platform detection on CPU backend edge case by @Isotr0py in #13358
- [V1][BugFix] Clean up rejection sampler & Fix warning msg by @WoosukKwon in #13362
- [V1][Misc] Avoid unnecessary log output by @jeejeelee in #13289
- [Feature][Spec Decode] Simplify the use of Eagle Spec Decode by @ShangmingCai in #12304
- Fix spelling error in index.md by @yankooo in #13369
- Run v1 benchmark and integrate with PyTorch OSS benchmark database by @huydhn in #13068
- [MISC] tiny fixes by @MengqingCao in #13378
- [VLM] Check required fields before initializing field config in `DictEmbeddingItems` by @DarkLight1337 in #13380
- [Model] Support Mamba2 (Codestral Mamba) by @tlrmchlsmth in #9292
- [Bugfix] fix xpu communicator by @yma11 in #13368
- [Bugfix] Fix VLLM_USE_MODELSCOPE issue by @r4ntix in #13384
- [V1] Get input tokens from scheduler by @WoosukKwon in #13339
- [V1][PP] Fix intermediate tensor values by @comaniac in #13417
- [V1][Spec decode] Move drafter to model runner by @WoosukKwon in #13363
- [Bugfix][CI][V1] Work around V1 + CUDA Graph + torch._scaled_mm fallback issue by @tlrmchlsmth in #13425
- [Misc] Remove dangling references to `SamplingType.BEAM` by @hmellor in #13402
- [Model] Enable quantization support for `transformers` backend by @Isotr0py in #12960
- [ROCm] fix get_device_name for rocm by @divakar-amd in #13438
- [v1] fix parallel config rank by @youkaichao in #13445
- [Quant] Molmo SupportsQuant by @kylesayrs in #13336
- [Quant] Arctic SupportsQuant by @kylesayrs in #13366
- [Bugfix] Only print out chat template when supplied by @terrytangyuan in #13444
- [core] fix sleep mode in pytorch 2.6 by @youkaichao in #13456
- [Quant] Aria SupportsQuant by @kylesayrs in #13416
- [V1][PP] Fix & Pin Ray version in requirements-cuda.txt by @WoosukKwon in #13436
- Add outlines fallback when JSON schema has enum by @mgoin in #13449
- [Bugfix] Ensure LoRA path from the request can be included in err msg by @terrytangyuan in #13450
- [Bugfix] Fix failing transformers dynamic module resolving with spawn multiproc method by @Isotr0py in #13403
- [Doc]: Improve feature tables by @hmellor in #13224
- [Bugfix] Remove noisy error logging during local model loading by @Isotr0py in #13458
- [ROCm] Make amdsmi import optional for other platforms by @DarkLight1337 in #13460
- [Bugfix] Handle content type with optional parameters by @zifeitong in #13383
- [Bugfix] Fix invalid rotary embedding unit test by @liangfu in #13431
- [CI/Build] migrate static project metadata from setup.py to pyproject.toml by @dtrifiro in #8772
- [V1][PP] Enable true PP with Ray executor by @WoosukKwon in #13472
- [misc] fix debugging code by @youkaichao in #13487
- [V1][Tests] Adding additional testing for multimodal models to V1 by @andoorve in #13308
- [V1] Optimize handling of sampling metadata and req_ids list by @njhill in #13244
- Pin Ray version to 2.40.0 by @WoosukKwon in #13490
- [V1][Spec Decode] Optimize N-gram matching with Numba by @WoosukKwon in #13365
- [Misc] Remove dangling references to `--use-v2-block-manager` by @hmellor in #13492
- [Hardware][Gaudi][Feature] Support Contiguous Cache Fetch by @zhouyu5 in #12139
- [perf-benchmark] Allow premerge ECR by @khluu in #13509
- [ROCm][MoE configs] mi325 mixtral & mi300 qwen_moe by @divakar-amd in #13503
- [Doc] Add clarification note regarding paligemma by @ywang96 in #13511
- [1/n][CI] Load models in CI from S3 instead of HF by @khluu in #13205
- [perf-benchmark] Fix ECR path for premerge benchmark by @khluu in #13512
- Refactor GPUModelRunnerBase load_model method to include device param by @Zzhiter in #13037
- [Bugfix] Fix Positive Feature Layers in Llava Models by @alex-jw-brooks in #13514
- [Model][Speculative Decoding] DeepSeek MTP spec decode by @luccafong in #12755
- [V1][Core] Generic mechanism for handling engine utility methods by @njhill in #13060
- [Feature] Pluggable platform-specific scheduler by @yannicks1 in #13161
- [CI/Build] force writing version file by @dtrifiro in #13544
- [doc] clarify profiling is only for developers by @youkaichao in #13554
- [VLM][Bugfix] Pass processor kwargs properly on init by @DarkLight1337 in #13516
- [Bugfix] Fix device ordinal when initializing spec_decode_sampler under multi-node setup by @ShangmingCai in #13269
- [doc] clarify multi-node serving doc by @youkaichao in #13558
- Fix copyright year to auto get current year by @wilsonwu in #13561
- [MISC] Logging the message about Ray teardown by @comaniac in #13502
- [Misc] Avoid calling unnecessary `hf_list_repo_files` for local model path by @Isotr0py in #13348
- [BugFix] Avoid error traceback in logs when V1 `LLM` terminates by @njhill in #13565
- [3/n][CI] Load Quantization test models with S3 by @khluu in #13570
- [Misc] Qwen2.5 VL support LoRA by @jeejeelee in #13261
- [ci] Add AWS creds for AMD by @khluu in #13572
- [ROCm][MoE] mi300 mixtral8x7B perf for specific BS by @divakar-amd in #13577
- [core] add sleep and wake up endpoint and v1 support by @youkaichao in #12987
- [bugfix] spec decode worker get tp group only when initialized by @simon-mo in #13578
- [Misc] Warn if the vLLM version can't be retrieved by @alex-jw-brooks in #13501
- [Misc] add mm_processor_kwargs to extra_body for Qwen2.5-VL by @wulipc in #13533
- [ROCm] MI300A compile targets deprecation by @gshtras in #13560
- [API Server] Add port number range validation by @terrytangyuan in #13506
- [CI/Build] Use uv in the Dockerfile by @mgoin in #13566
- [ci] Fix spec decode test by @khluu in #13600
- [2/n][ci] S3: Use full model path by @khluu in #13564
- [Kernel] LoRA - Refactor sgmv kernels by @varun-sundar-rabindranath in #13110
- Merge similar examples in `offline_inference` into single `basic` example by @hmellor in #12737
- [Bugfix] Fix deepseekv3 grouped topk error by @Chen-XiaoBing in #13474
New Contributors
- @jitseklomp made their first contribution in #12840
- @fabianlim made their first contribution in #10909
- @ZSL98 made their first contribution in #12824
- @SzymonOzog made their first contribution in #12836
- @DK-DARKmatter made their first contribution in #12896
- @Shaoting-Feng made their first contribution in #12418
- @SmartManoj made their first contribution in #13007
- @farzadab made their first contribution in #12912
- @je1lee made their first contribution in #13024
- @MoonRide303 made their first contribution in #13029
- @christian-pinto made their first contribution in #12830
- @lingfanyu made their first contribution in #12921
- @842974287 made their first contribution in #12923
- @kaixih made their first contribution in #12784
- @LikeSundayLikeRain made their first contribution in #13097
- @danielhanchen made their first contribution in #12974
- @AoyuQC made their first contribution in #12816
- @wulipc made their first contribution in #13155
- @vaibhavjainwiz made their first contribution in #13193
- @xu-song made their first contribution in #13263
- @zhouyu5 made their first contribution in #12317
- @arkylin made their first contribution in #13325
- @yankooo made their first contribution in #13369
- @huydhn made their first contribution in #13068
- @r4ntix made their first contribution in #13384
- @Zzhiter made their first contribution in #13037
- @luccafong made their first contribution in #12755
- @wilsonwu made their first contribution in #13561
- @Chen-XiaoBing made their first contribution in #13474
Full Changelog: v0.7.2...v0.7.3