The NVIDIA driver on your system is too old (found version 11080). 

 NVIDIA-SMI 520.61.05    Driver Version: 520.61.05    CUDA Version: 11.8 
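
For readers comparing the two numbers above: the `11080` in the error message is the integer form of the CUDA version the driver supports, encoded as `1000 * major + 10 * minor`, so it matches the `CUDA Version: 11.8` that nvidia-smi reports. The error presumably means the PyTorch build inside the container was compiled against a newer CUDA runtime than 11.8, though the log does not say which one. A minimal sketch of the decoding follows; the helper name `decode_cuda_version` is hypothetical and not part of any library.

```python
from typing import Tuple


def decode_cuda_version(encoded: int) -> Tuple[int, int]:
    """Split a CUDA version integer (1000 * major + 10 * minor) into (major, minor)."""
    major = encoded // 1000
    minor = (encoded % 1000) // 10
    return major, minor


# 11080 is the value reported in the error message above.
major, minor = decode_cuda_version(11080)
print(f"Driver supports up to CUDA {major}.{minor}")
```

If the decoded version is lower than the CUDA version the container's PyTorch was built with, the options are the ones the error message lists: update the host driver, or use a build compiled for the installed driver's CUDA version.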

export MODEL_PATH=Yi-34B-Chat-4bits
01ai/Yi-34B-Chat-4bits  $ export MODEL_ID=01-ai/Yi-34B-Chat-4bits
01ai/Yi-34B-Chat-4bits  $ docker run -it --gpus=all --net=host --shm-size=1g \
> -v $MODEL_PATH:$MODEL_PATH \
> -e DEVICE=cuda:1 \
> -e NCCL_DEBUG=INFO \
> docker.io/vectorchai/scalellm:latest --logtostderr --model_path=$MODEL_PATH --model_id=$MODEL_ID  --model_type=Yi
I20231129 08:13:34.992501     7 main.cpp:135] Using devices: cuda:1
W20231129 08:13:34.993809     7 args_overrider.cpp:132] Overwriting model_type from llama to Yi
I20231129 08:13:34.993916     7 engine.cpp:91] Initializing model from: /data4/candowu/modelscope/01ai/Yi-34B-Chat-4bits
W20231129 08:13:34.993944     7 model_loader.cpp:162] Failed to find tokenizer.json, use tokenizer.model instead. Please consider using fast tokenizer for better performance.
I20231129 08:13:35.245934     7 engine.cpp:98] Initializing model with dtype: Half
I20231129 08:13:35.245993     7 engine.cpp:107] Initializing model with ModelArgs: [model_type: Yi, dtype: float16, hidden_size: 7168, hidden_act: silu, intermediate_size: 20480, n_layers: 60, n_heads: 56, n_kv_heads: 8, vocab_size: 64000, rms_norm_eps: 1e-05, layer_norm_eps: 0, rotary_dim: 0, rope_theta: 5e+06, rope_scaling: 1, rotary_pct: 1, max_position_embeddings: 4096, bos_token_id: 1, eos_token_id: 2, use_parallel_residual: 0, attn_qkv_clip: 0, attn_qk_ln: 0, attn_alibi: 0, alibi_bias_max: 0, no_bias: 0, residual_post_layernorm: 0], QuantArgs: [quant_method: awq, bits: 4, group_size: 128, desc_act: 0, true_sequential: 0]
terminate called after throwing an instance of 'c10::Error'
  what():  The NVIDIA driver on your system is too old (found version 11080). Please update your GPU driver by downloading and installing a new version from the URL: http://www.nvidia.com/Download/index.aspx Alternatively, go to: https://pytorch.org to install a PyTorch version that has been compiled with your version of the CUDA driver.
Exception raised from device_count_impl at ../c10/cuda/CUDAFunctions.cpp:53 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x6b (0x7f2c0dc6e38b in /app/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0xbf (0x7f2c0dc68f3f in /app/lib/libc10.so)
frame #2: c10::cuda::device_count_ensure_non_zero() + 0x18c (0x7f2c0e0535dc in /app/lib/libc10_cuda.so)


The NVIDIA driver on your system is too old (found version 11080). #20
