Python reference implementation of the experience.md standard.
experience.md is an open standard for packaging, storing, retrieving, and adapting AI agent operational experience as structured, transferable artifacts.
"Knowledge is taught. Skill is earned."
LLMs have knowledge. RAG retrieves knowledge.
experience.md transfers earned experience from real-world task execution (recovery paths, failure signatures, validated skills) in a form agents can retrieve, adapt, and apply.
When Agent A fixes a broken OAuth redirect URI in Azure AD, it doesn't just succeed — it learns something. That learning is currently lost. Next time a different agent hits the same problem class in Keycloak, it starts from zero.
experience.md captures what Agent A learned as a structured Experience Packet, stores it, and makes it retrievable and adaptable for Agent B — even when the environment is different.
Flow: Agent A (Azure AD) → Quantum Experience Mesh → Agent B (Keycloak)
- Agent A's task succeeds; pack.from_episode() produces an ExperiencePacket, which is stored in the mesh.
- Agent B's task starts; it retrieves the matching packet from the mesh.
- adapt() rewrites the environment-specific step for Agent B: "Open Keycloak Admin → Clients → Redirect URIs" (was: Azure App Reg).
If you've used skills.md files to give agents structured instructions before a task, experience.md is the natural complement:
skills.md = instructions written before execution → tells an agent HOW to act
experience.md = artifacts generated after execution → captures what actually worked
They form a complete agent learning loop. skills.md defines the procedure. experience.md records what happened in the real world and what generalised across environments. You are not choosing between them — you need both.
No model changes, no fine-tuning, no special infrastructure required. experience.md sits above the model layer — it structures what gets passed into context, not how the model works. Drop it into any agent built on OpenAI, Anthropic, Gemini, local models, or any framework (LangChain, LangGraph, CrewAI, AutoGen).
experience.md brings the same discipline Git brought to code — applied to agent learning:
skills versioned → revert an agent to any prior skill state
baselines pinned → reproducible experiments, comparable results
provenance tracked → every packet traces back to its source execution
trust scored → reputation updates with every use, like commit history
Enterprises running critical agent workflows can audit exactly what an agent knew, when it knew it, and where that knowledge came from.
git clone https://github.com/quantum-agents/experience-md
cd experience-md
python examples/oauth_example.py
Expected output:
[1] Packing Azure AD experience...
Packed: ExperiencePacket(domain=oauth, family=redirect-uri-mismatch, trust=0.50)
[2] Saving to store...
Store stats: {'total_packets': 1, 'by_domain': {'oauth': 1}}
[3] Retrieving for Keycloak redirect problem...
Found 1 result — score: 0.528
[4] Adapting to Keycloak + k8s-secret environment...
Substitutions applied: 15
Step 2: Open Keycloak Admin → Clients → {client_id} → Redirect URIs [adapted]
Step 4: Update k8s secret KEYCLOAK_REDIRECT_URI to match exactly [adapted]
Confidence: 0.41
[5] Trust updated: 0.50 → 0.926 after successful transfer
[6] Validation status: corroborated
pip install experiencemd # PyPI (coming soon)
# or from source:
git clone https://github.com/quantum-agents/experience-md
cd experience-md
pip install -e .
The core library has no dependencies beyond the Python standard library. Optional: pyyaml for YAML I/O, and numpy/sentence-transformers for embedding-based retrieval (drop-in replacements for the default token similarity).
from experiencemd import pack

packet = pack.from_episode(
    agent_id="my-agent-prod",
    domain="oauth",
    task_family="redirect-uri-mismatch",
    task_goal="Fix OAuth login — users authenticated but callback rejected",
    environment={
        "protocol": "OIDC",
        "flow": "auth-code",
        "provider": "azure-ad",
        "config_surface": "env-file",
        "app_type": "spa",
    },
    observable_symptoms=[
        "Token generation succeeds",
        "Callback URL returns 401 or redirect loop",
    ],
    failure_signature="callback_uri_rejected_post_auth",
    confirmed_root_cause="Redirect URI mismatch between app registration and runtime config",
    misleading_signals=["Token generation appeared successful"],
    failed_attempts=["Checked token scopes", "Verified client ID"],
    steps=[
        {"action": "Retrieve exact callback URL from runtime config",
         "is_adaptation_point": False},
        {"action": "Open Azure App Registration → Authentication → Redirect URIs",
         "is_adaptation_point": True},  # ← will be adapted per environment
        {"action": "Compare URIs character-by-character including trailing slash",
         "is_adaptation_point": False},
        {"action": "Update env-file AZURE_REDIRECT_URI to match registration exactly",
         "is_adaptation_point": True},
        {"action": "Re-test auth flow end-to-end", "is_adaptation_point": False},
    ],
    why_it_worked="OIDC requires exact URI equality — even trailing slash differences cause rejection",
    skill_statement=(
        "When auth partially succeeds but callback fails, verify exact redirect URI "
        "equality (scheme, host, path, trailing slash) between provider registration "
        "and runtime config. These are independent surfaces that must match exactly."
    ),
    applicable_when=["OIDC/OAuth2 auth-code flow", "Callback-based auth"],
    not_applicable_when=["Client credentials flow", "Token scope errors"],
    adaptation_required=[
        "Provider-specific registration console path",
        "Environment config surface and variable names",
    ],
    outcome="success",
    time_to_resolve=840,
    retry_count=2,
)
print(packet)
# ExperiencePacket(id=4f224b48..., domain=oauth, family=redirect-uri-mismatch,
#   status=unvalidated, trust=0.50)
PII scrubbing is on by default. Emails, IPs, UUIDs, secrets, and URLs are redacted before the packet is stored. Agent IDs are one-way hashed.
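To make the scrubbing step concrete, here is a minimal standard-library sketch of regex redaction plus one-way hashing. It is not the library's implementation: the pattern list, the placeholder tokens, and the names `scrub` and `hash_agent_id` are all assumptions made for illustration, and secret detection is left out because it is usually heuristic.

```python
import hashlib
import re

# Hypothetical patterns and placeholders; the real scrubber may use different rules.
# URLs come first so that an IP or email embedded in a URL is redacted as one unit.
_PATTERNS = [
    ("<URL>", re.compile(r"https?://\S+")),
    ("<EMAIL>", re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")),
    ("<UUID>", re.compile(
        r"\b[0-9a-fA-F]{8}-(?:[0-9a-fA-F]{4}-){3}[0-9a-fA-F]{12}\b")),
    ("<IP>", re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")),
]


def scrub(text: str) -> str:
    """Replace each match of each pattern with its placeholder."""
    for placeholder, pattern in _PATTERNS:
        text = pattern.sub(placeholder, text)
    return text


def hash_agent_id(agent_id: str) -> str:
    """One-way hash of an agent ID (SHA-256, truncated to 16 hex characters)."""
    return hashlib.sha256(agent_id.encode("utf-8")).hexdigest()[:16]


print(scrub("Contact admin@example.com at 10.0.0.12 via https://login.example.com/cb"))
# Contact <EMAIL> at <IP> via <URL>
```

Hashing keeps packets from the same agent linkable without storing the raw ID; an unsalted hash like this one can still be reversed by guessing likely IDs, so a salted or keyed hash may be preferable in practice.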
from experiencemd import ExperienceStore
store = ExperienceStore("./experience_db")
store.save(packet)
print(store.stats())
# {'total_packets': 1, 'by_domain': {'oauth': 1},
#  'by_status': {'unvalidated': 1}, 'avg_trust_score': 0.5}
results = store.retrieve(
    query_env={
        "protocol": "OIDC",
        "flow": "auth-code",
        "provider": "keycloak",  # different provider
        "config_surface": "k8s-secret",
        "app_type": "server",
    },
    task_context="keycloak redirect uri rejected after login oauth callback",
    domain="oauth",
    task_family="redirect-uri-mismatch",  # problem classification
    top_k=5,
)
print(results[0])
# RetrievalResult(score=0.528, packet=4f224b48..., family=redirect-uri-mismatch)
Retrieval uses a multi-factor scoring function across task family, failure signature, environment similarity, tool overlap, trust score, and recency. All weights are configurable.
from experiencemd import ExperienceAdapter
adapter = ExperienceAdapter()
adapted = adapter.adapt(
    results[0].packet,
    target_env={"provider": "keycloak", "config_surface": "k8s-secret"},
)
print(adapted.summary())
# Adaptation of 4f224b48...
# Substitutions applied: 15
# Unmapped fields: 0
# Steps needing review: []
# Confidence: 0.41
for step in adapted.adapted_steps:
    print(f" {step['step_id']}. {step['action']}")
# 1. Retrieve exact callback URL from runtime config
# 2. Open Keycloak Admin → Clients → {client_id} → Authentication → Redirect URIs [adapted]
# 3. Compare URIs character-by-character: scheme, host, path, trailing slash
# 4. Update k8s secret KEYCLOAK_REDIRECT_URI to match registration exactly [adapted]
# 5. Re-test auth flow end-to-end
Adaptation is 70% deterministic (mapping tables) and 30% LLM gap-fill (enabled by overriding ExperienceAdapter._llm_fill_gaps in a subclass).
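The deterministic part can be pictured as string substitution driven by a mapping table. The sketch below is an assumption-laden illustration, not the code in adapt.py: the table contents, the `adapt_steps` helper, and the rule that only steps flagged `is_adaptation_point` are rewritten are all hypothetical. The table shape follows the field → (source, target) → {old: new} layout used by `extra_mappings`.

```python
from typing import Dict, List, Tuple

Mappings = Dict[str, Dict[Tuple[str, str], Dict[str, str]]]

# Hypothetical table entries, for illustration only.
EXAMPLE_MAPPINGS: Mappings = {
    "provider": {
        ("azure-ad", "keycloak"): {
            "Azure App Registration": "Keycloak Admin → Clients",
            "AZURE_REDIRECT_URI": "KEYCLOAK_REDIRECT_URI",
        },
    },
    "config_surface": {
        ("env-file", "k8s-secret"): {"env-file": "k8s secret"},
    },
}


def adapt_steps(steps: List[dict], source_env: dict, target_env: dict,
                mappings: Mappings) -> Tuple[List[dict], int]:
    """Rewrite flagged steps; return the new steps and the substitution count."""
    adapted, count = [], 0
    for step in steps:
        action = step["action"]
        if step.get("is_adaptation_point"):
            for env_field, pairs in mappings.items():
                # Look up the table for this (source, target) pair, if any.
                key = (source_env.get(env_field), target_env.get(env_field))
                for old, new in pairs.get(key, {}).items():
                    if old in action:
                        action = action.replace(old, new)
                        count += 1
        adapted.append({**step, "action": action})
    return adapted, count


steps = [
    {"action": "Update env-file AZURE_REDIRECT_URI to match registration exactly",
     "is_adaptation_point": True},
    {"action": "Re-test auth flow end-to-end", "is_adaptation_point": False},
]
new_steps, n = adapt_steps(
    steps,
    source_env={"provider": "azure-ad", "config_surface": "env-file"},
    target_env={"provider": "keycloak", "config_surface": "k8s-secret"},
    mappings=EXAMPLE_MAPPINGS,
)
print(new_steps[0]["action"])
# Update k8s secret KEYCLOAK_REDIRECT_URI to match registration exactly
```

Anything the table cannot map stays unchanged, which is the gap the optional LLM pass is meant to fill.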
store.update_trust(
    packet_id=results[0].packet.experience_id,
    transfer_succeeded=True,
    attribution_score=0.75,
    retrieving_agent_id="agent-consumer-b",
    task_context="keycloak redirect failure in staging",
)
# Trust score rises: 0.50 → 0.926
packet_id = results[0].packet.experience_id
store.corroborate(packet_id, "agent-validator-001")
store.corroborate(packet_id, "agent-validator-002")
packet = store.get(packet_id)
print(packet.validation_status)
# ValidationStatus.CORROBORATED (requires 2+ independent agents)
Subclass ExperienceAdapter and override _llm_fill_gaps:
import json

from experiencemd import ExperienceAdapter


class LLMAdapter(ExperienceAdapter):
    def __init__(self, llm_client, **kwargs):
        super().__init__(**kwargs)
        self.llm = llm_client

    def _llm_fill_gaps(self, result, packet, unmapped):
        # Nothing left unmapped: the deterministic result is already complete.
        if not unmapped:
            return result
        prompt = self.build_llm_prompt(result, packet, unmapped)
        response = self.llm.complete(prompt)  # your LLM call here
        try:
            suggestions = json.loads(response)
            step_map = {s["step_id"]: s for s in suggestions}
            for step in result.adapted_steps:
                if step["step_id"] in step_map:
                    suggestion = step_map[step["step_id"]]
                    step["action"] = suggestion["adapted_action"]
                    step["needs_review"] = suggestion.get("needs_human_review", False)
        except (ValueError, KeyError, TypeError):
            pass  # keep deterministic result on malformed or unparseable output
        return result
The built-in build_llm_prompt() produces a structured, bounded prompt that constrains the LLM to substitution tasks only, not free-form generation.
An Experience Packet has three layers:
| Layer | Contents | Purpose |
|---|---|---|
| Scenario | task goal, environment signature, symptoms | What was the situation |
| Failure + Solution | failure signature, steps, recovery path | What happened and what was done |
| Transferable Skill | abstracted pattern, applicability, adaptation hooks | What generalises |
Layer 3 is a separate abstraction pass — not a summary of Layers 1 and 2. It deliberately removes context-specific detail to preserve the transferable core.
Full schema: see SCHEMA.md or the paper experience_md_paper.md.
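As a rough mental model of the three layers, the sketch below groups some of the `pack.from_episode()` arguments into one dataclass per layer. The class names and the grouping are assumptions made for illustration; the authoritative field list is the one in SCHEMA.md.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Scenario:
    """Layer 1 (hypothetical grouping): what the situation was."""
    task_goal: str
    environment: Dict[str, str]
    observable_symptoms: List[str] = field(default_factory=list)


@dataclass
class FailureAndSolution:
    """Layer 2 (hypothetical grouping): what happened and what was done."""
    failure_signature: str
    confirmed_root_cause: str
    steps: List[dict] = field(default_factory=list)


@dataclass
class TransferableSkill:
    """Layer 3 (hypothetical grouping): what generalises."""
    skill_statement: str
    applicable_when: List[str] = field(default_factory=list)
    not_applicable_when: List[str] = field(default_factory=list)
    adaptation_required: List[str] = field(default_factory=list)


@dataclass
class PacketSketch:
    """Illustrative container only; not the library's ExperiencePacket."""
    scenario: Scenario
    failure_and_solution: FailureAndSolution
    transferable_skill: TransferableSkill


sketch = PacketSketch(
    scenario=Scenario(
        task_goal="Fix OAuth login",
        environment={"provider": "azure-ad", "config_surface": "env-file"},
    ),
    failure_and_solution=FailureAndSolution(
        failure_signature="callback_uri_rejected_post_auth",
        confirmed_root_cause="Redirect URI mismatch",
    ),
    transferable_skill=TransferableSkill(
        skill_statement="Verify exact redirect URI equality",
        applicable_when=["OIDC/OAuth2 auth-code flow"],
    ),
)
```

Keeping the skill layer free of provider names and variable names is what lets it apply unchanged when the environment differs.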
S(packet, query) = 0.30 × task_family_similarity
+ 0.25 × failure_signature_similarity
+ 0.20 × environment_similarity
+ 0.10 × tool_similarity
+ 0.10 × trust_score
+ 0.05 × recency
All weights are configurable. The default token-similarity functions are drop-in replaceable with embedding-based similarity for production use.
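The weighted sum above can be sketched in a few lines of standard-library Python. The weights come from the formula; everything else is an assumption for illustration, including the function names, the Jaccard token overlap, and the exact-match environment comparison, none of which is claimed to be what store.py does.

```python
from typing import Dict, Optional

# Default weights, taken from the scoring formula.
DEFAULT_WEIGHTS: Dict[str, float] = {
    "task_family": 0.30,
    "failure_signature": 0.25,
    "environment": 0.20,
    "tool": 0.10,
    "trust": 0.10,
    "recency": 0.05,
}


def token_similarity(a: str, b: str) -> float:
    """Jaccard overlap of lower-cased whitespace tokens (an assumed measure)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def env_similarity(a: Dict[str, str], b: Dict[str, str]) -> float:
    """Fraction of environment keys whose values match exactly (assumed)."""
    keys = set(a) | set(b)
    if not keys:
        return 0.0
    return sum(1 for k in keys if a.get(k) == b.get(k)) / len(keys)


def score(factors: Dict[str, float],
          weights: Optional[Dict[str, float]] = None) -> float:
    """Weighted sum of per-factor similarities, each expected in [0, 1]."""
    w = weights or DEFAULT_WEIGHTS
    return sum(w[name] * factors.get(name, 0.0) for name in w)


azure = {"protocol": "OIDC", "flow": "auth-code", "provider": "azure-ad",
         "config_surface": "env-file", "app_type": "spa"}
keycloak = {"protocol": "OIDC", "flow": "auth-code", "provider": "keycloak",
            "config_surface": "k8s-secret", "app_type": "server"}
print(env_similarity(azure, keycloak))  # 2 of 5 keys match
```

Swapping `token_similarity` for an embedding-based function only changes how each factor is computed; the weighted sum stays the same.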
trust_score = base_confidence × 0.4
+ reuse_success_rate × 0.4
+ avg_attribution_score × 0.2
True attribution corrects for tasks that would have succeeded without the transferred experience:
true_attribution = raw_attribution × (1 - base_success_rate_on_similar_tasks)
This prevents false positives where the agent succeeded despite the transfer rather than because of it.
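A small worked example may help with the two formulas. The functions below transcribe them directly; the function names and the input values are illustrative assumptions, and the numbers are not meant to reproduce the figures in the walkthrough output.

```python
def true_attribution(raw_attribution: float, base_success_rate: float) -> float:
    """Discount raw attribution by how often similar tasks succeed anyway."""
    return raw_attribution * (1.0 - base_success_rate)


def trust_score(base_confidence: float, reuse_success_rate: float,
                avg_attribution_score: float) -> float:
    """Weighted combination from the trust formula (0.4 / 0.4 / 0.2)."""
    return (0.4 * base_confidence
            + 0.4 * reuse_success_rate
            + 0.2 * avg_attribution_score)


# Illustrative inputs: similar tasks succeed 60% of the time without help,
# so a raw attribution of 0.75 is discounted to 0.75 * 0.4 = 0.30.
attribution = true_attribution(raw_attribution=0.75, base_success_rate=0.6)

# 0.4 * 0.5 + 0.4 * 1.0 + 0.2 * 0.30 = 0.66
trust = trust_score(base_confidence=0.5, reuse_success_rate=1.0,
                    avg_attribution_score=attribution)
print(round(attribution, 2), round(trust, 2))
```

The higher the baseline success rate on similar tasks, the less credit a successful transfer earns, which is the correction the paragraph above describes.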
Packets use semantic versioning (MAJOR.MINOR.PATCH). Baselines snapshot store state for reproducible experiments:
# All versions retained — pin to any prior state
store.get(packet_id) # latest
store.get(packet_id, version="0.1.0")  # pinned (coming in v0.2.0)
Add domain-specific substitutions at init time:
adapter = ExperienceAdapter(extra_mappings={
    "provider": {
        ("okta", "auth0"): {
            "Okta Admin Console": "Auth0 Dashboard",
            "OKTA_CLIENT_ID": "AUTH0_CLIENT_ID",
        }
    }
})
Or contribute to the built-in tables in adapt.py.
experiencemd/
├── schema.py — ExperiencePacket and all dataclasses
├── pack.py — from_episode() factory, PII scrubbing, quality gate
├── store.py — ExperienceStore, multi-factor retrieval scoring
├── adapt.py — ExperienceAdapter, env diff, substitution engine
└── __init__.py — public API
tests/
└── test_experiencemd.py — 22 tests, all passing
examples/
└── oauth_example.py — full end-to-end OAuth walkthrough
- v0.2.0 — YAML/JSON import/export for packets and baselines
- v0.2.0 — Packet version pinning in store
- v0.3.0 — Embedding-based similarity (drop-in for token similarity)
- v0.3.0 — Vector DB backend (Chroma, Qdrant, pgvector)
- v0.4.0 — Quantum Experience Mesh (distributed multi-agent store)
- v0.4.0 — Counterfactual attribution (automated baseline comparison)
- v1.0.0 — Stable schema, community-ratified mapping tables
The most valuable contributions right now:
- Domain mapping tables: extend BUILTIN_MAPPINGS in adapt.py for new provider pairs
- Task family definitions: standard names for common problem classes
- Embedding backends: drop-in replacements for _text_token_similarity
- Experiment results: run the benchmark and share your findings
If you use experience.md in research, please cite:
@misc{experiencemd2026,
title = {experience.md: A Standard for Transferable Agent Experience},
author = {Quantum Agents Project},
year = {2026},
url = {https://github.com/quantum-agents/experience-md}
}
MIT. The standard itself (experience.md) is CC0 — use it freely, build on it, contribute back.