A Python-based training pipeline for fine-tuning GPT models on the Nvidia Nemotron dataset.
- Multi-phase training with progressive learning rates (see the sketch after this list)
- Memory-optimized data loading and processing
- CUDA optimizations for better GPU utilization
- Dataset validation and analysis tools
- Progress tracking with detailed statistics
- Checkpoint management and safe model saving
- Interactive dataset management system
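The progressive learning-rate feature steps the rate between training phases. The actual phases and rates are defined in model_trainer_v_1_1.py; the sketch below is only a minimal illustration of the pattern, and every name and value in it is a placeholder, not the script's real configuration.

```python
import torch

# Illustrative phases only -- the real schedule lives in
# model_trainer_v_1_1.py. Each phase trains with its own learning rate.
PHASES = [
    {"name": "warmup", "epochs": 1, "lr": 5e-6},
    {"name": "main",   "epochs": 3, "lr": 3e-5},
    {"name": "refine", "epochs": 1, "lr": 1e-5},
]

def run_phases(model, train_one_epoch):
    """Run each phase with a fresh optimizer at that phase's rate."""
    for phase in PHASES:
        optimizer = torch.optim.AdamW(model.parameters(), lr=phase["lr"])
        for _ in range(phase["epochs"]):
            train_one_epoch(model, optimizer)
```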
- Python 3.8+
- PyTorch with CUDA support
- Transformers library
- Datasets library
- tqdm for progress bars
- CUDA-capable GPU with 8GB+ VRAM
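To verify the GPU requirement once PyTorch is installed, a quick standalone check (not part of the pipeline itself):

```python
import torch

# Report whether a CUDA device is visible and how much VRAM it has.
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"GPU: {props.name}, VRAM: {props.total_memory / 1e9:.1f} GB")
else:
    print("No CUDA device found; training will be slow or fail.")
```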
```bash
pip install torch transformers datasets tqdm
```
- Set up your environment variables:

```python
os.environ["HUGGINGFACE_TOKEN"] = "your_token_here"
```
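How the script consumes this token is defined in model_trainer_v_1_1.py; one common pattern, shown here purely as an assumption, is to pass it to huggingface_hub.login() so model and dataset downloads are authenticated:

```python
import os
from huggingface_hub import login

# Assumed usage: authenticate Hugging Face downloads with the token.
login(token=os.environ["HUGGINGFACE_TOKEN"])
```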
- Run the training script:

```bash
python model_trainer_v_1_1.py
```
- Replace [User] with the username you are currently logged in as.
- Follow the interactive prompts to manage datasets and start training.
- Chat with the trained model by running chat.py.
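chat.py ships with the repository; its internals aren't documented here, but a minimal interactive loop over a fine-tuned causal LM looks roughly like the sketch below, where the checkpoint path and generation settings are placeholders:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder path: point this at your fine-tuned checkpoint directory.
CHECKPOINT = "./checkpoints/latest"

tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
model = AutoModelForCausalLM.from_pretrained(CHECKPOINT)
model.eval()

while True:
    prompt = input("You: ")
    if prompt.strip().lower() in {"quit", "exit"}:
        break
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output = model.generate(**inputs, max_new_tokens=100)
    # Decode only the tokens generated after the prompt.
    reply = tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
    print("AI:", reply)
```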
The training pipeline is configured through the DATASET_CONFIGS dictionary in the script. Training parameters such as batch size, learning rate, and model checkpointing frequency can be adjusted in the train() function, as sketched below.
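The real schema of DATASET_CONFIGS is whatever model_trainer_v_1_1.py defines; the entry below is a hypothetical sketch of the shape such a config usually takes, with every key and value invented for illustration:

```python
# Hypothetical schema -- field names and values are illustrative only;
# see model_trainer_v_1_1.py for the actual structure.
DATASET_CONFIGS = {
    "nemotron": {
        "path": "nvidia/<dataset-id>",  # placeholder Hugging Face dataset ID
        "text_field": "text",           # column containing the training text
        "max_length": 1024,             # max tokens per training example
    },
}

# Knobs of this kind are what train() exposes for tuning:
BATCH_SIZE = 4          # sequences per step; lower this if you hit OOM on 8 GB
LEARNING_RATE = 3e-5
CHECKPOINT_EVERY = 500  # optimizer steps between checkpoint saves
```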
Licensed under the BSD 3-Clause License.