Project Prometheus is an autonomous AI framework designed to accelerate early-stage scientific discovery. The system leverages a hierarchical, multi-agent architecture to replicate the full scientific method: it synthesizes knowledge from a research corpus, formulates novel hypotheses guided by an expert AI consultant, conducts multi-stage virtual experiments, learns from the results, and publishes its findings in a peer-review-ready scientific report.
During a recent benchmark mission, the V3.1+ platform autonomously designed a novel inhibitor for the Epidermal Growth Factor Receptor (EGFR), a key anti-cancer target. The final molecule achieved a Composite Score of 9.661, a significant leap in performance over both the baseline drug (Erlotinib) and the previous V2.0 architecture, demonstrating a state-of-the-art capability for intelligent, multi-objective optimization.
(The final champion molecule designed by the V3.1+ architecture for the EGFR target, achieving a record-breaking composite score.)
The platform's latest mission targeted the Epidermal Growth Factor Receptor (EGFR, PDB: 1M17), a well-understood protein in cancer development. The objective was to benchmark the new V3.1+ architecture against the first-generation inhibitor Erlotinib and the results from the legacy V2.0 platform.
The AI's task was to design a novel molecule that balances four competing objectives:
- Binding Affinity: Potency against the target, as predicted by molecular docking.
- Interaction Fingerprint (IFP): The quality of specific, key chemical bonds formed with the protein.
- Drug-Likeness (QED): Favorable physicochemical properties.
- Synthetic Accessibility (SA Score): Feasibility of laboratory synthesis.
Note on the Composite Score: The AI's decision-making is guided by a weighted composite score. This score is transparently defined in the
config.tomlfile and allows the system to balance competing objectives. For the EGFR mission, the formula was:
Score = (-1.0 * Affinity) + (5.0 * QED) + (-1.0 * SA Score) + (10.0 * IFP Score)The negative weights for Affinity and SA Score correctly reward lower values (stronger binding, easier synthesis), while the high weight for the IFP Score strongly incentivizes the formation of specific, high-quality chemical bonds.
The V3.1+ platform successfully designed a novel quinazolin-4-one scaffold with a heavily fluorinated aniline fragment. This candidate achieved a predicted binding affinity of -8.7 kcal mol⁻¹ and a final Composite Score of 9.661. This result not only surpassed the Erlotinib baseline but also definitively exceeded the V2.0 champion's score of 8.746, proving the superiority of the new architecture.
It is important to note that this record-breaking result was achieved in a deliberately constrained, 5-cycle benchmark run. The mission was executed using a Tier 1 developer API key, which imposes a strict rate limit of 30,000 tokens per minute. This demonstrates the platform's exceptional efficiency, but also suggests that with access to higher-tier resources and more extensive, longer-duration runs, there is significant potential for even further optimization and discovery.
At the conclusion of its campaign, Prometheus's ReportSynthesizerAgent produced a full, peer-review-quality scientific paper detailing its methods, results, and limitations.
For full transparency, the complete, final report—with embedded images—and the raw, unabridged mission log are curated as permanent artifacts of this benchmark run.
➡️ Read the full scientific report generated by the AI
➡️ Analyze the raw mission log
Project Prometheus uses a hierarchical, multi-agent architecture orchestrated by a central mission script. This design is the result of a rapid, iterative development cycle aimed at solving the limitations of earlier versions, particularly the "potency plateau" and the risk of generative AI hallucination.
The diagram below provides a high-level overview of the workflow for a single discovery cycle.
The core components of this loop are:
- "Principal Investigator" (GPT-5): A high-level strategic consultant that analyzes the full experiment history and proposes novel, state-of-the-art chemical strategies and valid starting scaffolds.
- "Postdoc" (Gemini 2.5 Pro): A hypothesis generator that uses its massive context window to synthesize the PI's advice with the literature knowledge base, decorating the provided scaffolds to create concrete, testable molecules. An iterative feedback loop allows it to self-correct chemically invalid proposals.
- Experimenter Agent: Executes high-throughput in silico screening, performing molecular docking with Smina and Open Babel.
- Scoring Agent: A multi-faceted evaluation engine that calculates the inputs for the composite score.
- MD Validator Agent: Runs GPU-accelerated Molecular Dynamics simulations with OpenMM on the most promising champion candidates to validate binding-pose stability.
- Report Synthesizer Agent: An agentic, section-by-section workflow that uses GPT-5 to write a final, publication-ready scientific report.
This hierarchical workflow grounds the creativity of large language models in the empirical reality of physics-based simulation at every stage, creating a robust, self-correcting discovery engine.
The AI's decision-making is guided by a transparent, multi-objective composite score defined in the config.toml file. This score allows the system to balance the competing demands of potency, interaction quality, and drug-like properties. For the EGFR mission, the formula was:
Score = (-1.0 * Affinity) + (5.0 * QED) + (-1.0 * SA Score) + (10.0 * IFP Score)
- Binding Affinity: The raw docking score from Smina (kcal mol⁻¹). The negative weight rewards stronger (more negative) binding.
- Drug-Likeness (QED): A score from 0-1 calculated by RDKit. The high positive weight encourages molecules with favorable physicochemical properties.
- Synthetic Accessibility (SA Score): A heuristic score from RDKit where lower is better. The negative weight rewards molecules that are predicted to be easier to synthesize.
- Interaction Fingerprint (IFP Score): A custom score that rewards the formation of specific, high-quality chemical bonds with key residues in the protein's binding pocket, as calculated by ProLIF.
Project Prometheus is a powerful tool designed for beneficial scientific discovery. It is licensed and distributed under a framework designed to encourage responsible research while preventing misuse.
Project Prometheus is licensed under the PolyForm-Noncommercial-1.0.0 License (see LICENSE file for full details).
In simple terms:
- You are free to view, download, modify, and distribute this software for any non-commercial purpose (e.g., personal experimentation, academic research).
- You are not permitted to use this software for any commercial purpose without a separate, written commercial license. Please contact the author for commercial licensing inquiries.
By using this software, you agree that you will not use it, or any output it generates, for any of the following purposes:
- The design, development, or production of any chemical, biological, or nuclear weapons.
- Any military application.
- Activities that violate local or international laws or regulations.
Furthermore, users should be aware that this platform operates using third-party Large Language Models (LLMs) via API keys that you must provide. These upstream providers, such as Google and OpenAI, have their own robust safety policies and filters. Attempts to misuse this framework for malicious purposes will likely violate their Terms of Service and may be blocked or flagged by their internal safety mechanisms.
While Prometheus can run on modern multi-core CPUs, the performance of the various simulation and AI components is hardware-dependent.
- CPU: A modern multi-core processor (e.g., Apple Silicon M-series, Intel i7/i9, AMD Ryzen 7/9) is recommended for a smooth experience.
- GPU: The Molecular Dynamics (MD) validation step is computationally intensive. A GPU is highly recommended to complete this step in a reasonable timeframe. The system is confirmed to leverage Apple Silicon GPUs via Metal.
- Development Environment: The V3.1+ platform was successfully developed and tested on a MacBook Pro with an M3 Pro chip.
- macOS or Linux
- Git
- Miniforge (for
mamba)
Create and activate the environment with all heavy scientific dependencies:
mamba create -n prometheus -c conda-forge python=3.10 rdkit smina openbabel "openmm=8.0.0" pdbfixer openmmforcefields
mamba activate prometheusYour prompt will now begin with (prometheus).
# Clone the repository
git clone https://github.com/mescuwa/project-prometheus.git
cd project-prometheus
# Install remaining Python libraries
pip install -r requirements.txtCreate a .env file in the project root and add your API keys:
GEMINI_API_KEY="YOUR_GOOGLE_API_KEY"
OPENAI_API_KEY="YOUR_OPENAI_API_KEY"
A Note on API Costs: A full mission run involves multiple calls to powerful large language models (GPT-5 and Gemini 2.5 Pro) and will incur API costs. A typical 5-cycle mission may cost a few dollars, but this can vary based on model usage and the complexity of the run. Please refer to the official pricing pages for the most up-to-date information.
Place your PDF literature into data/literature/ and run:
python scripts/build_knowledge_base.pypython scripts/run.pyTimeline & Cost Expectation: The benchmark 5-cycle EGFR mission completed in approximately 35 minutes on a MacBook Pro (M3 Pro) with a Tier 1 OpenAI API key. A full run will incur API costs that may vary. Please refer to the official OpenAI and Google Gemini pricing pages for the latest information.
All outputs are written to a timestamped directory in reports/. For a full-precision run, edit config.toml and set quick_test = false.
Prometheus stands on the shoulders of giants. The AI's reasoning is guided by a knowledge base built from these seminal papers:
- Lynch, T. J., et al. (2004). Activating Mutations in the Epidermal Growth Factor Receptor Underlying Responsiveness of Non–Small-Cell Lung Cancer to Gefitinib. NEJM, 350(21), 2129–2139. https://doi.org/10.1056/NEJMoa040938
- Pao, W., et al. (2004). EGF receptor gene mutations are common in lung cancers from "never smokers" and are associated with sensitivity of tumors to gefitinib and erlotinib. PNAS, 101(36), 13306–13311. https://doi.org/10.1073/pnas.0405220101
- Pao, W., et al. (2005). Acquired Resistance of Lung Adenocarcinomas to Gefitinib or Erlotinib Is Associated with a Second Mutation in the EGFR Kinase Domain. PLoS Medicine, 2(3), e73. https://doi.org/10.1371/journal.pmed.0020073
- Engelman, J. A., et al. (2007). MET Amplification Leads to Gefitinib Resistance in Lung Cancer by Activating ERBB3 Signaling. Science, 316(5827), 1039–1043. https://doi.org/10.1126/science.1141478
- Jänne, P. A., et al. (2015). Osimertinib or Platinum–Pemetrexed in EGFR T790M–Positive Lung Cancer. NEJM, 372(18), 1689–1699. https://doi.org/10.1056/NEJMoa1411817
- Protein Structure: Epidermal Growth Factor Receptor (EGFR) kinase domain coordinates obtained from the RCSB Protein Data Bank.
The V3.1+ architecture represents a quantum leap in capability over the legacy V2.0 platform, which was released under the MIT license. The previous version, while successful, was based on a simpler architecture that lacked the sophisticated strategic reasoning and self-correction mechanisms of the current system.
To fully appreciate the scale of the improvement, we have archived the key artifacts from the final V2.0 mission. We invite you to compare the final AI-generated reports from both versions to see the evolution of the platform's scientific reasoning and communication abilities.

