I tried to run the neural_compressor/language-modeling example as shown below, exactly as in the README. I have a 24 GB GPU, but the run fails with a CUDA out-of-memory error. The model has only 125M parameters. Is this normal? How much GPU memory do I need?
python run_clm.py \
--model_name_or_path EleutherAI/gpt-neo-125M \
--dataset_name wikitext \
--dataset_config_name wikitext-2-raw-v1 \
--apply_quantization \
--quantization_approach aware_training \
--apply_pruning \
--target_sparsity 0.02 \
--num_train_epochs 4 \
--max_train_samples 100 \
--do_train \
--do_eval \
--verify_loading \
--output_dir /tmp/clm_output
/home/chang/anaconda3/envs/openvino/lib/python3.9/site-packages/transformers/optimization.py:391: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning
warnings.warn(
0%| | 0/52 [00:00<?, ?it/s]
2023-04-10 13:44:00 [INFO] Fx trace of the entire model failed. We will conduct auto quantization
/home/chang/anaconda3/envs/openvino/lib/python3.9/site-packages/torch/ao/quantization/observer.py:214: UserWarning: Please use quant_min and quant_max to specify the range for observers. reduce_range will be deprecated in a future release of PyTorch.
warnings.warn(
2023-04-10 13:44:02 [INFO] current target ratio is 0.0
2023-04-10 13:44:03 [INFO] current sparsity ratio is 0.0
/home/chang/anaconda3/envs/openvino/lib/python3.9/site-packages/torch/ao/quantization/fake_quantize.py:309: UserWarning: _aminmax is deprecated as of PyTorch 1.11 and will be removed in a future release. Use aminmax instead. This warning will only appear once per process. (Triggered internally at ../aten/src/ATen/native/ReduceAllOps.cpp:45.)
return torch.fused_moving_avg_obs_fake_quant(
/home/chang/anaconda3/envs/openvino/lib/python3.9/site-packages/torch/ao/quantization/fake_quantize.py:309: UserWarning: _aminmax is deprecated as of PyTorch 1.11 and will be removed in a future release. Use aminmax instead. This warning will only appear once per process. (Triggered internally at ../aten/src/ATen/native/TensorCompare.cpp:568.)
return torch.fused_moving_avg_obs_fake_quant(
Traceback (most recent call last):
File "/home/chang/AI/llm/optimum-intel/examples/neural_compressor/language-modeling/run_clm.py", line 732, in <module>
main()
File "/home/chang/AI/llm/optimum-intel/examples/neural_compressor/language-modeling/run_clm.py", line 654, in main
train_result = trainer.train(resume_from_checkpoint=checkpoint)
File "/home/chang/anaconda3/envs/openvino/lib/python3.9/site-packages/transformers/trainer.py", line 1633, in train
return inner_training_loop(
File "/home/chang/AI/llm/optimum-intel/optimum/intel/neural_compressor/trainer.py", line 411, in _inner_training_loop
tr_loss_step = self.training_step(model, inputs)
File "/home/chang/anaconda3/envs/openvino/lib/python3.9/site-packages/transformers/trainer.py", line 2645, in training_step
loss = self.compute_loss(model, inputs)
File "/home/chang/AI/llm/optimum-intel/optimum/intel/neural_compressor/trainer.py", line 699, in compute_loss
outputs = model(**inputs)
File "/home/chang/anaconda3/envs/openvino/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/chang/anaconda3/envs/openvino/lib/python3.9/site-packages/transformers/models/gpt_neo/modeling_gpt_neo.py", line 756, in forward
lm_logits = self.lm_head(hidden_states)
File "/home/chang/anaconda3/envs/openvino/lib/python3.9/site-packages/torch/fx/graph_module.py", line 658, in call_wrapped
return self._wrapped_call(self, *args, **kwargs)
File "/home/chang/anaconda3/envs/openvino/lib/python3.9/site-packages/torch/fx/graph_module.py", line 277, in __call__
raise e
File "/home/chang/anaconda3/envs/openvino/lib/python3.9/site-packages/torch/fx/graph_module.py", line 267, in __call__
return super(self.cls, obj).__call__(*args, **kwargs) # type: ignore[misc]
File "/home/chang/anaconda3/envs/openvino/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "<eval_with_key>.439", line 7, in forward
File "/home/chang/anaconda3/envs/openvino/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1215, in _call_impl
hook_result = hook(self, input, result)
File "/home/chang/anaconda3/envs/openvino/lib/python3.9/site-packages/neural_compressor/adaptor/torch_utils/util.py", line 84, in output_scale_hook
module.output_observer(output)
File "/home/chang/anaconda3/envs/openvino/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/chang/anaconda3/envs/openvino/lib/python3.9/site-packages/torch/ao/quantization/fake_quantize.py", line 309, in forward
return torch.fused_moving_avg_obs_fake_quant(
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.54 GiB (GPU 0; 23.68 GiB total capacity; 20.09 GiB already allocated; 1.05 GiB free; 20.36 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
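A rough back-of-the-envelope estimate may help put the failing allocation in context. The traceback ends in the fake-quantize observer attached to the output of lm_head, so the tensor being allocated is plausibly a copy of the logits. The sketch below sizes one float32 logits tensor under assumptions that are not stated in the report: a per-device batch size of 8 (the Trainer default, and consistent with the 52 steps in the progress bar for 100 samples over 4 epochs), a block size of 1024 tokens, and a vocabulary of 50257 entries for gpt-neo-125M. The helper names are hypothetical and exist only for this illustration.

```python
import math

def logits_bytes(batch_size, seq_len, vocab_size, bytes_per_elem=4):
    """Size in bytes of one [batch, seq_len, vocab] tensor (float32 by default)."""
    return batch_size * seq_len * vocab_size * bytes_per_elem

def to_gib(num_bytes):
    """Convert bytes to GiB, the unit used in the CUDA error message."""
    return num_bytes / 2**30

# Assumed values, not taken from the report:
BATCH_SIZE = 8       # Trainer default for per_device_train_batch_size
BLOCK_SIZE = 1024    # assumed tokens per training sample
VOCAB_SIZE = 50257   # assumed vocabulary size of gpt-neo-125M

# Sanity check on the batch-size assumption: 100 samples, 4 epochs, 52 steps.
steps = math.ceil(100 / BATCH_SIZE) * 4
print("steps:", steps)

# One logits tensor at the assumed settings, then with a smaller batch.
print("batch 8:", round(to_gib(logits_bytes(BATCH_SIZE, BLOCK_SIZE, VOCAB_SIZE)), 2), "GiB")
print("batch 1:", round(to_gib(logits_bytes(1, BLOCK_SIZE, VOCAB_SIZE)), 2), "GiB")
```

Under these assumptions a single logits tensor is about 1.5 GiB, close to the 1.54 GiB allocation that failed. Training can hold several tensors of this shape at once (the logits, the fake-quantized copy, the values used for the loss, and the gradients), so memory use may be dominated by activations rather than by the 125M parameters. If that reading is right, a smaller batch size or shorter sequences should reduce the requirement roughly in proportion.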