[Bug]: Some voice files report errors when using the MDXC model #189

Yale-7 · 2025-03-12T15:00:59Z

Describe the bug

When I use certain MDXC models, such as "model_bs_roformer_ep_368_sdr_12.9628.ckpt" and "mel_band_roformer_kim_ft2_bleedless_unwa.ckpt", separation fails with an error. I suspect it is related to the sampling rate: I had previously set the sampling rate to 16000 when loading the model, and many models failed. After switching back to the default sampling rate, some models run successfully, such as "MDX23C-8KFFT-InstVoc_HQ.ckpt", but others still fail. The original video is attached as an example.
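To help rule the sampling rate in or out, the extracted WAV file can be inspected before it is passed to the separator. The following is a minimal sketch using only Python's standard `wave` module; `describe_wav` is a hypothetical helper name, not part of audio-separator, and it assumes the file is uncompressed PCM (as produced by `-acodec pcm_s16le`).

```python
import wave


def describe_wav(path):
    """Return (sample_rate, channels, n_frames, duration_seconds) for a PCM WAV file.

    Hypothetical helper for debugging; not part of audio-separator.
    """
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        frames = wav.getnframes()
    # Duration follows directly from frame count and sample rate.
    return rate, channels, frames, frames / rate
```

Calling `describe_wav` on the WAV written by the extraction step shows whether the file really has the sample rate and channel count that were requested.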

Have you searched for existing issues? 🔎

I have searched and found no existing issues.

Screenshots or Videos

https://github.com/user-attachments/assets/7b583840-517f-4b5f-beea-38d641979f0a
https://github.com/user-attachments/assets/b8ead5d2-4c34-448b-a53c-4531c50c9529

Logs

2025-03-12 14:39:37.255 | INFO     | __main__:extract_audio_from_video:25 - Extracting audio from /workspace/a.mp4: ffmpeg -i /workspace/a.mp4 -vn -acodec pcm_s16le -ar 44100 -ac 2 ./output/audio_20250312_143937_253147.wav
2025-03-12 14:39:37.319 | INFO     | __main__:extract_audio_from_video:27 - Extracting successfully: ./output/audio_20250312_143937_253147.wav
2025-03-12 14:39:37,321 - INFO - separator - Separator version 0.30.1 instantiating with output_dir: ./output/, output_format: WAV
2025-03-12 14:39:37,321 - INFO - separator - Using model directory from model_file_dir parameter: /workspace/Audio-Tools/audio-separator-models
2025-03-12 14:39:37,322 - INFO - separator - Operating System: Linux #144-Ubuntu SMP Fri Feb 7 20:47:38 UTC 2025
2025-03-12 14:39:37,326 - INFO - separator - System: Linux Node: f82d4d9b4fed3-drtbm Release: 5.15.0-133-generic Machine: x86_64 Proc: x86_64
2025-03-12 14:39:37,326 - INFO - separator - Python Version: 3.10.13
2025-03-12 14:39:37,326 - INFO - separator - PyTorch Version: 2.6.0+cu124
2025-03-12 14:39:37,364 - INFO - separator - FFmpeg installed: ffmpeg version 7.1.1 Copyright (c) 2000-2025 the FFmpeg developers
2025-03-12 14:39:37,366 - INFO - separator - ONNX Runtime GPU package installed with version: 1.21.0
2025-03-12 14:39:37,454 - INFO - separator - CUDA is available in Torch, setting Torch device to CUDA
2025-03-12 14:39:37,455 - INFO - separator - ONNXruntime has CUDAExecutionProvider available, enabling acceleration
2025-03-12 14:39:37,455 - INFO - separator - Loading model mel_band_roformer_kim_ft2_bleedless_unwa.ckpt...
2025-03-12 14:39:44,810 - INFO - mdxc_separator - MDXC Separator initialisation complete
2025-03-12 14:39:44,811 - INFO - separator - Load model duration: 00:00:07
2025-03-12 14:39:44,811 - INFO - separator - Starting separation process for audio_file_path: /workspace/a.mp4
  0%|                                                                                                                                                                      | 0/1 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/workspace/Audio-Tools/test.py", line 113, in <module>
    vocals_file_path, instrumental_file_path = separate_audio_pipeline(file_path, output_dir)
  File "/workspace/Audio-Tools/test.py", line 103, in separate_audio_pipeline
    vocals_file_path, instrumental_file_path = separate_audio(file_path, output_dir, model_name, output_type)
  File "/workspace/Audio-Tools/test.py", line 81, in separate_audio
    output_files = separator.separate(audio_path, output_names)
  File "/opt/conda/envs/audio-tools/lib/python3.10/site-packages/audio_separator/separator/separator.py", line 769, in separate
    output_files = self.model_instance.separate(audio_file_path, custom_output_names)
  File "/opt/conda/envs/audio-tools/lib/python3.10/site-packages/audio_separator/separator/architectures/mdxc_separator.py", line 138, in separate
    source = self.demix(mix=mix)
  File "/opt/conda/envs/audio-tools/lib/python3.10/site-packages/audio_separator/separator/architectures/mdxc_separator.py", line 258, in demix
    result = self.overlap_add(result, x, window, result.shape[-1] - chunk_size, length)
  File "/opt/conda/envs/audio-tools/lib/python3.10/site-packages/audio_separator/separator/architectures/mdxc_separator.py", line 194, in overlap_add
    result[..., start : start + length] += x[..., :length] * weights[:length]
RuntimeError: The size of tensor a (288414) must match the size of tensor b (352800) at non-singleton dimension 1
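The two sizes in the RuntimeError can be read as durations, assuming the audio is at the 44100 Hz rate shown in the ffmpeg command above: 352800 samples is 8.0 s and 288414 samples is 6.54 s. That would be consistent with a final chunk that is shorter than the window applied to it, though this is an interpretation of the numbers and not a confirmed diagnosis. The sketch below shows the arithmetic and one way such an overlap-add step could clamp its length. It is a hypothetical 1-D illustration on plain lists, not audio-separator's `overlap_add`.

```python
SAMPLE_RATE = 44100  # assumption: taken from "-ar 44100" in the ffmpeg log line


def samples_to_seconds(n_samples, sample_rate=SAMPLE_RATE):
    """Convert a sample count to a duration in seconds."""
    return n_samples / sample_rate


def clamped_overlap_add(result, x, weights, start, length):
    """Add x[i] * weights[i] into result[start + i] without overrunning any buffer.

    Hypothetical helper: the number of samples written is limited to the
    shortest of the requested length, the room left in `result`, and the
    sizes of `x` and `weights`. Returns the number of samples written.
    """
    safe = max(0, min(length, len(result) - start, len(x), len(weights)))
    for i in range(safe):
        result[start + i] += x[i] * weights[i]
    return safe
```

For example, `samples_to_seconds(352800)` and `samples_to_seconds(288414)` give the two durations quoted above, and `clamped_overlap_add` writes only as many samples as all three buffers can supply instead of raising on a size mismatch.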

System Info

Operating System: ubuntu-2204
Python version: python 3.10
Other...

Additional Information

No response

Yale-7 added the bug label Mar 12, 2025
