Building Generative AI Agents with Deep Reinforcement Learning

This repository accompanies the research study entitled "Building Generative AI Agents with Deep Reinforcement Learning". The objective of this project is to evaluate the effectiveness of integrating deep reinforcement learning (DRL) with generative large language models (LLMs) for constructing autonomous agents capable of sequential, multimodal reasoning and decision-making in financial environments.

We empirically compare two agent architectures:

  • A DRL-enhanced generative agent that combines policy learning with generative modeling.
  • A static generative agent that operates without reinforcement-based optimization.

The agents are assessed in the context of stock market trading simulations using historical financial data spanning four decades.

Research Objectives

This study investigates the following core research questions:

  1. RQ1: How can deep reinforcement learning be effectively combined with generative large language models to construct autonomous agents capable of complex, multimodal reasoning and decision-making?

  2. RQ2: What architectural frameworks and parallelization techniques are most effective in accelerating the training of generative AI agents using DRL, without sacrificing sample efficiency or convergence stability?

  3. RQ3: How can multi-agent reinforcement learning (MARL) frameworks be enhanced with generative capabilities to support cooperative, competitive, or hierarchical behavior in complex environments?

Methodological Overview

The methodology consists of the following major components:

  • Data Source: Historical stock market data from over 1,000 publicly listed firms across a 40-year period. Derived features include technical indicators, price volatility, and temporal embeddings.

  • DRL Agent Architecture: Implements a policy trained with Proximal Policy Optimization (PPO) inside a Gym-compatible trading environment. The reward function encodes portfolio returns adjusted for risk and transaction costs (a minimal environment sketch follows this list).

  • Static Agent Baseline: Uses the same input features but selects actions via a supervised learning model trained to predict directional price movement, with no reinforcement signal (an illustrative baseline sketch also follows this list).

  • Evaluation Metrics: Performance is assessed using cumulative return, Sharpe ratio, maximum drawdown, and, where applicable, action entropy and interpretability of generative outputs (metric computations are sketched after this list).

  • Implementation Strategy: Both agents are trained and evaluated in controlled simulation environments. Statistical comparisons are conducted over multiple seeds and market segments to ensure robustness.
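
For concreteness, the sketch below shows one way such a trading environment and PPO training loop could look, using the gymnasium API and stable-baselines3. All names (TradingEnv, the feature dimensionality, the cost and risk constants) are illustrative assumptions, not the repository's actual implementation.

```python
# Minimal sketch of a Gym-compatible trading environment trained with PPO.
# TradingEnv, the feature width, and the cost/risk constants are assumptions.
import numpy as np
import gymnasium as gym
from gymnasium import spaces
from stable_baselines3 import PPO  # assumed DRL library; the paper only names PPO


class TradingEnv(gym.Env):
    """Single-asset environment: actions are {0: hold, 1: long, 2: flat}."""

    def __init__(self, prices, features, transaction_cost=1e-3, risk_penalty=0.1):
        super().__init__()
        self.prices, self.features = prices, features
        self.transaction_cost, self.risk_penalty = transaction_cost, risk_penalty
        self.action_space = spaces.Discrete(3)
        self.observation_space = spaces.Box(
            low=-np.inf, high=np.inf, shape=(features.shape[1],), dtype=np.float32)

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        self.t, self.position, self.returns = 0, 0, []
        return self.features[self.t].astype(np.float32), {}

    def step(self, action):
        prev_position = self.position
        self.position = {0: self.position, 1: 1, 2: 0}[int(action)]
        self.t += 1
        price_return = self.prices[self.t] / self.prices[self.t - 1] - 1.0
        # Reward: position return, minus a cost on position changes, minus a
        # running volatility penalty -- mirroring "portfolio returns adjusted
        # for risk and transaction costs".
        reward = self.position * price_return
        reward -= self.transaction_cost * abs(self.position - prev_position)
        self.returns.append(reward)
        reward -= self.risk_penalty * float(np.std(self.returns))
        terminated = self.t >= len(self.prices) - 1
        return self.features[self.t].astype(np.float32), reward, terminated, False, {}


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    prices = 100 * np.exp(np.cumsum(rng.normal(0, 0.01, 1000)))  # synthetic prices
    features = rng.normal(size=(1000, 8)).astype(np.float32)     # stand-in indicators
    model = PPO("MlpPolicy", TradingEnv(prices, features), verbose=0)
    model.learn(total_timesteps=10_000)
```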
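The static baseline can likewise be sketched as a plain supervised classifier over the same features. LogisticRegression and the chronological train/test split below are illustrative choices; the paper does not specify the model.

```python
# Sketch of the static baseline: a supervised classifier mapping features to a
# directional (up/down) label, with no reinforcement signal. Model choice is
# illustrative; the repository's actual baseline may differ.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
features = rng.normal(size=(1000, 8))                    # stand-in features
prices = 100 * np.exp(np.cumsum(rng.normal(0, 0.01, 1001)))
labels = (np.diff(prices) > 0).astype(int)               # 1 if next price rises

split = 800                                              # chronological split
clf = LogisticRegression(max_iter=1000).fit(features[:split], labels[:split])
actions = clf.predict(features[split:])                  # 1 -> long, 0 -> flat
```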
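The headline metrics are standard and can be computed directly from a series of per-step returns, as in the sketch below; the 252-trading-day annualization factor is an assumption.

```python
# Cumulative return, Sharpe ratio, and maximum drawdown from per-step returns.
import numpy as np


def cumulative_return(returns: np.ndarray) -> float:
    """Total compounded return over the episode."""
    return float(np.prod(1.0 + returns) - 1.0)


def sharpe_ratio(returns: np.ndarray, risk_free=0.0, periods_per_year=252) -> float:
    """Annualized mean excess return divided by return volatility."""
    excess = returns - risk_free / periods_per_year
    return float(np.sqrt(periods_per_year) * excess.mean() / excess.std(ddof=1))


def max_drawdown(returns: np.ndarray) -> float:
    """Largest peak-to-trough decline of the compounded equity curve."""
    equity = np.cumprod(1.0 + returns)
    peaks = np.maximum.accumulate(equity)
    return float(np.max(1.0 - equity / peaks))


if __name__ == "__main__":
    daily = np.random.default_rng(0).normal(5e-4, 0.01, 252)
    print(cumulative_return(daily), sharpe_ratio(daily), max_drawdown(daily))
```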

Experimental Goals

This repository enables reproduction of the experiments described in the paper, including:

  • Training both DRL-based and static agents under equivalent data and environmental conditions.
  • Evaluating agent behavior in high-volatility, non-stationary time-series domains.
  • Analyzing the comparative advantages of reinforcement-based adaptation in generative systems.

Installation and Execution

To run the experiments:

  1. Clone the repository.
  2. Create a Python virtual environment and install dependencies listed in requirements.txt.
  3. Execute the training and evaluation scripts (train.py and evaluate.py) with appropriate configuration files.
  4. Results will be logged in designated output directories for statistical and visual analysis.

Citation

If you use this codebase or build upon this work, please cite the following:

@unpublished{saxena2025generative,
  title={Building Generative AI Agents with Deep Reinforcement Learning},
  author={Saxena, Aditya},
  year={2025},
  note={Manuscript, Toronto Metropolitan University}
}
