Markdown for the AI Era. Develop, test, and evaluate your AI Agents.
AgentMark makes it easy for developers to develop, test, and evaluate AI agents.
| Feature | Description |
|---|---|
| Multimodal Generation | Generate text, objects, images, and speech from a single prompt file, supporting a wide range of model capabilities. |
| Datasets | Create a collection of inputs and expected outputs to test your prompts and agents in readable JSONL format. |
| Evals | Assess the quality of your prompts' outputs with built-in eval support. |
| CLI | Run prompts and experiments directly from the command line or your editor for rapid iteration. |
| Tools and Agents | Extend prompts with custom tools and agentic workflows. |
| JSON Output | AgentMark supports structured Object/JSON output through JSON Schema definitions. |
| File Attachments | Attach images and files to prompts for tasks like image analysis, document processing, and more. |
| Type Safety | Ensure reliable, type-checked inputs and outputs for prompts using JSON Schema and auto-generated TypeScript types. |
| Reusable Components | Import and reuse components across your prompts. |
| Conditionals, Loops, Props, Filter Functions | Add logic, dynamic data, and transformations to your prompts with powerful JSX-like syntax. |
| MCP Servers | AgentMark supports calling Model Context Protocol (MCP) tools directly from your prompts. |
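The JSON Output and Type Safety features above both build on JSON Schema definitions. As a rough illustration only (this is not AgentMark's actual API; the schema, type, and `isSentiment` helper below are invented for this sketch), pairing a JSON Schema with a matching TypeScript type might look like:

```typescript
// Hypothetical sketch: a JSON Schema describing a structured output,
// plus a TypeScript type mirroring it. AgentMark auto-generates such
// types from your schemas; this hand-written pair just shows the idea.
const sentimentSchema = {
  type: "object",
  properties: {
    label: { type: "string", enum: ["positive", "negative", "neutral"] },
    confidence: { type: "number" },
  },
  required: ["label", "confidence"],
} as const;

type Sentiment = {
  label: "positive" | "negative" | "neutral";
  confidence: number;
};

// Minimal runtime check that a model's JSON output satisfies the schema's
// required fields before it is treated as a typed Sentiment value.
function isSentiment(value: unknown): value is Sentiment {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.label === "string" &&
    ["positive", "negative", "neutral"].includes(v.label) &&
    typeof v.confidence === "number"
  );
}

console.log(isSentiment({ label: "positive", confidence: 0.92 })); // true
console.log(isSentiment({ label: "great" })); // false
```

The benefit is that a malformed model response fails the check at the boundary, rather than surfacing as a type error deep inside your agent code.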
Get started by first initializing your AgentMark app.
Install:

```shell
npm create agentmark@latest
```
We offer a few ways to run prompts with AgentMark.
- Use our AgentMark CLI:
Run .prompt.mdx files directly from the command line using our CLI. This is the quickest way to test and execute your prompts.
```shell
# Run a prompt with test props (default)
agentmark run-prompt your-prompt.prompt.mdx

# Run a prompt with a dataset
agentmark run-experiment your-prompt.prompt.mdx
```

- Run AgentMark files with your favorite SDK
AgentMark doesn't call models or LLM providers itself. Instead, it formats your prompt's input through an adapter to match the input of the SDK you're using.
| Adapter | Description |
|---|---|
| Vercel | The Vercel AI SDK. |
| Mastra | The Mastra SDK. |
| LlamaIndex | The LlamaIndex SDK. |
| Default | Turns prompts into raw JSON; adapt it manually to your needs. |
Want to add support for another adapter? Open an issue.
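To make the adapter idea concrete, here is a hypothetical sketch (not AgentMark's real adapter interface; `CompiledPrompt`, `defaultAdapter`, and `vercelStyleAdapter` are names invented for illustration): each adapter takes the same compiled prompt and reshapes it into whatever its target SDK expects.

```typescript
// Hypothetical shape of a compiled prompt after AgentMark processes
// a .prompt.mdx file (invented for this sketch).
type CompiledPrompt = {
  model: string;
  messages: { role: "system" | "user" | "assistant"; content: string }[];
};

// A "Default"-style adapter: emit the prompt as raw JSON, leaving any
// further mapping to the caller.
function defaultAdapter(prompt: CompiledPrompt): string {
  return JSON.stringify(prompt);
}

// An SDK-specific adapter reshapes the same prompt into the argument
// object a particular SDK's call would accept.
function vercelStyleAdapter(prompt: CompiledPrompt) {
  return { model: prompt.model, messages: prompt.messages };
}

const prompt: CompiledPrompt = {
  model: "example-model",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Summarize this document." },
  ],
};

console.log(defaultAdapter(prompt).startsWith("{")); // true
console.log(vercelStyleAdapter(prompt).messages.length); // 2
```

The point of the pattern is that the prompt file stays SDK-agnostic: switching providers means swapping the adapter, not rewriting prompts.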
We plan to provide support for AgentMark across a variety of languages.
| Language | Support Status |
|---|---|
| TypeScript | ✅ Supported |
| JavaScript | ✅ Supported |
| Python | |
| Others | Need something else? Open an issue |
AgentMark Studio supports type safety out of the box. Read more about it here.
We welcome contributions! Please check out our contribution guidelines for more information.
AgentMark Cloud extends this OSS project and lets you:
- Collaborate with teammates on prompts and datasets
- Run experiments
- Persist your telemetry data
- Annotate and evaluate your data
- Set up alerts for quality, latency, cost, and more
- View high-level metrics for your agents
- Set up two-way syncing with your Git repo
Join our community to collaborate, ask questions, and stay updated:
This project is licensed under the MIT License.
