Open
Labels: bug (Something isn't working), windows (Concerns running code on Windows)
Description
Bug report checklist
I am using the sample provided in this repo to fine-tune the model on some historical data I have.
Describe the bug
I get the following error when running the fine-tuning script:
C:\GIT\MyPython\training>python train.py --config chronos-bolt-tiny.yaml --model-id amazon/chronos-bolt-tiny --no-random-init --max-steps 1000 --learning-rate 0.001
2025-02-25 09:47:32,958 - C:\GIT\MyPython\training\train.py - INFO - Using SEED: 1733703252
2025-02-25 09:47:33,007 - C:\GIT\MyPython\training\train.py - INFO - Logging dir: output\run-0
2025-02-25 09:47:33,007 - C:\GIT\MyPython\training\train.py - INFO - Loading and filtering 4 datasets for training: ['series1-monthly.arrow', 'series1-weekly.arrow', 'series2-monthly.arrow', 'series2-weekly.arrow']
2025-02-25 09:47:33,007 - C:\GIT\MyPython\training\train.py - INFO - Mixing probabilities: [0.9, 0.7, 0.5, 0.1]
2025-02-25 09:47:33,012 - C:\GIT\MyPython\training\train.py - INFO - Initializing model
2025-02-25 09:47:33,012 - C:\GIT\MyPython\training\train.py - INFO - Using pretrained initialization from amazon/chronos-bolt-tiny
The new embeddings will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
2025-02-25 09:47:39,536 - C:\GIT\MyPython\training\train.py - INFO - Training
0%| | 0/1000 [00:00<?, ?it/s]Traceback (most recent call last):
File "C:\GIT\MyPython\training\train.py", line 702, in <module>
app()
File "C:\GIT\MyPython\.venv\Lib\site-packages\typer\main.py", line 340, in __call__
raise e
File "C:\GIT\MyPython\.venv\Lib\site-packages\typer\main.py", line 323, in __call__
return get_command(self)(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\GIT\MyPython\.venv\Lib\site-packages\click\core.py", line 1161, in __call__
return self.main(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\GIT\MyPython\.venv\Lib\site-packages\typer\core.py", line 680, in main
return _main(
^^^^^^
File "C:\GIT\MyPython\.venv\Lib\site-packages\typer\core.py", line 198, in _main
rv = self.invoke(ctx)
^^^^^^^^^^^^^^^^
File "C:\GIT\MyPython\.venv\Lib\site-packages\click\core.py", line 1443, in invoke
return ctx.invoke(self.callback, **ctx.params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\GIT\MyPython\.venv\Lib\site-packages\click\core.py", line 788, in invoke
return __callback(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\GIT\MyPython\.venv\Lib\site-packages\typer\main.py", line 698, in wrapper
return callback(**use_params)
^^^^^^^^^^^^^^^^^^^^^^
File "C:\GIT\MyPython\.venv\Lib\site-packages\typer_config\decorators.py", line 96, in wrapped
return cmd(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^
File "C:\GIT\MyPython\training\train.py", line 689, in main
trainer.train()
File "C:\GIT\MyPython\.venv\Lib\site-packages\transformers\trainer.py", line 2241, in train
return inner_training_loop(
^^^^^^^^^^^^^^^^^^^^
File "C:\GIT\MyPython\.venv\Lib\site-packages\transformers\trainer.py", line 2500, in _inner_training_loop
batch_samples, num_items_in_batch = self.get_batch_samples(epoch_iterator, num_batches)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\GIT\MyPython\.venv\Lib\site-packages\transformers\trainer.py", line 5180, in get_batch_samples
batch_samples += [next(epoch_iterator)]
^^^^^^^^^^^^^^^^^^^^
File "C:\GIT\MyPython\.venv\Lib\site-packages\accelerate\data_loader.py", line 792, in __iter__
main_iterator = self.base_dataloader.__iter__()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\GIT\MyPython\.venv\Lib\site-packages\torch\utils\data\dataloader.py", line 491, in __iter__
return self._get_iterator()
^^^^^^^^^^^^^^^^^^^^
File "C:\GIT\MyPython\.venv\Lib\site-packages\torch\utils\data\dataloader.py", line 422, in _get_iterator
return _MultiProcessingDataLoaderIter(self)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\GIT\MyPython\.venv\Lib\site-packages\torch\utils\data\dataloader.py", line 1146, in __init__
w.start()
File "C:\Python311\Lib\multiprocessing\process.py", line 121, in start
self._popen = self._Popen(self)
^^^^^^^^^^^^^^^^^
File "C:\Python311\Lib\multiprocessing\context.py", line 224, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Python311\Lib\multiprocessing\context.py", line 336, in _Popen
return Popen(process_obj)
^^^^^^^^^^^^^^^^^^
File "C:\Python311\Lib\multiprocessing\popen_spawn_win32.py", line 94, in __init__
reduction.dump(process_obj, to_child)
File "C:\Python311\Lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
File "<stringsource>", line 2, in pyarrow.lib._RecordBatchFileReader.__reduce_cython__
TypeError: no default __reduce__ due to non-trivial __cinit__
0%| | 0/1000 [00:05<?, ?it/s]
PS C:\GIT\MyPython\training> Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Python311\Lib\multiprocessing\spawn.py", line 120, in spawn_main
exitcode = _main(fd, parent_sentinel)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Python311\Lib\multiprocessing\spawn.py", line 130, in _main
self = reduction.pickle.load(from_parent)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
EOFError: Ran out of input
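For context on the failure: on Windows, DataLoader worker processes are started with the spawn method, which pickles the dataset object to send it to each worker. The dataset here holds a pyarrow `_RecordBatchFileReader`, which cannot be pickled, hence the `TypeError: no default __reduce__ due to non-trivial __cinit__` above. A minimal stdlib-only sketch of the same failure mode (the `UnpicklableReader` class is hypothetical, standing in for the pyarrow reader):

```python
import io
import pickle


class UnpicklableReader:
    """Hypothetical stand-in for pyarrow.lib._RecordBatchFileReader:
    it wraps a live I/O handle, so it refuses to be pickled."""

    def __init__(self, buf: bytes):
        self._stream = io.BytesIO(buf)  # live handle, like an open Arrow file

    def __reduce__(self):
        # Mirrors the pyarrow error seen in the traceback above
        raise TypeError("no default __reduce__ due to non-trivial __cinit__")


reader = UnpicklableReader(b"arrow bytes")
try:
    # This is what spawn-based multiprocessing does to the dataset
    # when the DataLoader starts its worker processes.
    pickle.dumps(reader)
except TypeError as e:
    print(f"pickling failed: {e}")
```

If that is the cause, setting the Trainer's `dataloader_num_workers` to 0 (so no worker processes are spawned and nothing is pickled) should avoid the crash on Windows, at the cost of loading data in the main process.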
Expected behavior
I expect the fine-tuning run to complete successfully.
To reproduce
I generated 4 files using the sample code provided in this repo.
Each file contains 4 series of numbers extracted from a DataFrame via:
data.to_numpy().T
and converted to Arrow files using the convert_to_arrow method.
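To illustrate the reshaping step above: the transpose turns a DataFrame whose columns are the individual series into one row per series. A small stdlib-only sketch (plain lists stand in for the DataFrame/ndarray, and the values are made up):

```python
# Rows are timesteps, columns are the 4 series.
rows = [
    [1.0, 10.0, 100.0, 1000.0],
    [2.0, 20.0, 200.0, 2000.0],
    [3.0, 30.0, 300.0, 3000.0],
]

# Equivalent of data.to_numpy().T: one inner list per series.
series = [list(col) for col in zip(*rows)]
print(series[0])  # first series across all timesteps -> [1.0, 2.0, 3.0]
```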
Environment description
Operating system: Windows
Python version: 3.11.3
CUDA version: 12.6
PyTorch version: 2.6.0
HuggingFace transformers version: 4.49.0
HuggingFace accelerate version: