GPT Model Trainer

A Python-based training pipeline for fine-tuning GPT models on the Nvidia Nemotron dataset.

Features

  • Multi-phase training with progressive learning rates (see the sketch after this list)
  • Memory-optimized data loading and processing
  • CUDA optimizations for better GPU utilization
  • Dataset validation and analysis tools
  • Progress tracking with detailed statistics
  • Checkpoint management and safe model saving
  • Interactive dataset management system
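
For the multi-phase training mentioned above, a progressive learning-rate schedule can be expressed with PyTorch's built-in schedulers. The following is a minimal sketch, not the script's actual implementation; the model stand-in, phase boundaries, and rates are all assumed example values.

import torch

# Stand-in model; the real script fine-tunes a GPT model instead.
model = torch.nn.Linear(768, 768)
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

# Two milestones split training into three phases; after each milestone
# the learning rate is multiplied by gamma (all values are examples):
# steps 0-999 at 5e-5, steps 1000-2999 at 1e-5, steps 3000+ at 2e-6.
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[1000, 3000], gamma=0.2)

for step in range(4000):
    optimizer.step()   # parameter update (loss and backward pass omitted)
    scheduler.step()   # advance the progressive schedule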

Requirements

  • Python 3.8+
  • PyTorch with CUDA support
  • Transformers library
  • Datasets library
  • tqdm for progress bars
  • CUDA-capable GPU with 8GB+ VRAM

Installation

pip install torch transformers datasets tqdm
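
Note that on some platforms the plain pip command above installs a CPU-only build of PyTorch; see pytorch.org for the CUDA-specific install command. Since the pipeline requires a CUDA-capable GPU, a quick way to verify your install:

import torch

print(torch.cuda.is_available())           # should print True on a working setup
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))   # name of the detected GPU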

Usage

  1. Set up your environment variables:
os.environ["HUGGINGFACE_TOKEN"] = "your_token_here"
  2. Run the training script:
python model_trainer_v_1_1.py
  3. Replace [User] with the username you are currently logged in as.
  4. Follow the interactive prompts to manage datasets and start training.
  5. Chat with the AI by running chat.py; a minimal sketch of such a chat loop follows below.
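
The snippet below is only a sketch of what an interactive loop along the lines of chat.py could look like, assuming the fine-tuned model was saved with save_pretrained; the checkpoint path "./checkpoints/final" and the generation parameters are placeholders, not values from this repository.

from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical checkpoint path; point this at wherever the trainer saved your model.
tokenizer = AutoTokenizer.from_pretrained("./checkpoints/final")
model = AutoModelForCausalLM.from_pretrained("./checkpoints/final").to("cuda")

while True:
    prompt = input("You: ")
    if prompt.strip().lower() in {"quit", "exit"}:
        break
    inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
    outputs = model.generate(**inputs, max_new_tokens=100, do_sample=True)
    print("AI:", tokenizer.decode(outputs[0], skip_special_tokens=True))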

Configuration

The training pipeline is configured through the DATASET_CONFIGS dictionary in the script. You can modify training parameters like batch size, learning rate, and model checkpointing frequency in the train() function.
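
As a hypothetical illustration of the shape such a configuration might take (the actual keys in DATASET_CONFIGS and the parameters inside train() may differ):

# Hypothetical shape only; the real DATASET_CONFIGS in the script may use
# different keys and values.
DATASET_CONFIGS = {
    "nemotron": {
        "dataset_id": "nvidia/nemotron",  # placeholder, not a verified Hub ID
        "text_column": "text",
        "max_length": 1024,
    },
}

# Parameters of this kind live in train(); the values here are examples only.
BATCH_SIZE = 4
LEARNING_RATE = 2e-5
CHECKPOINT_EVERY = 500  # save a checkpoint every N steps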

License

BSD 3-Clause License.
