Did you check the docs?
Is your feature request related to a problem? Please describe.
NeMo Guardrails enforces conversational policies at runtime. However, guardrail decisions (allow, block, modify) are primarily observable through logs or tracing systems.
For audit and compliance workflows, teams need portable and verifiable records of these decisions that can be:
- Shared across systems
- Reviewed independently
- Preserved in a tamper-evident format
Why Current Approaches Are Insufficient
Logs and traces are useful for debugging, but they are not sufficient for audit workflows because they are:
- Not standardized
- Not portable
- Difficult to verify independently
- Reviewable only with access to the original infrastructure
Real Use Case
A lending bot denies a loan based on a fairness guardrail. The customer requests proof that the decision was fair and auditable. The company can show OpenTelemetry traces, which are technical and complex, but cannot easily hand a regulator or auditor a portable, independently verifiable artifact proving:
"On this date, for this customer, this policy was evaluated, this decision was made, and no one modified it afterward."
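As an illustration of what "tamper-evident" could mean in practice (a minimal sketch only, not a proposed design; the record fields are hypothetical), each exported record could carry a hash chained to the previous record's hash, so that any later modification of a record is detectable:

```python
import hashlib
import json

def chain_record(record: dict, prev_hash: str) -> dict:
    """Attach a hash covering this record plus the previous record's hash."""
    payload = json.dumps(record, sort_keys=True) + prev_hash
    return {**record, "prev_hash": prev_hash,
            "hash": hashlib.sha256(payload.encode()).hexdigest()}

def verify_chain(records: list[dict]) -> bool:
    """Recompute every hash; any edited record breaks the chain."""
    prev = ""
    for rec in records:
        body = {k: v for k, v in rec.items() if k not in ("hash", "prev_hash")}
        payload = json.dumps(body, sort_keys=True) + prev
        if rec["prev_hash"] != prev:
            return False
        if rec["hash"] != hashlib.sha256(payload.encode()).hexdigest():
            return False
        prev = rec["hash"]
    return True

r1 = chain_record({"decision": "block", "policy": "no-jailbreak"}, "")
r2 = chain_record({"decision": "allow", "policy": "stay-on-topic"}, r1["hash"])
print(verify_chain([r1, r2]))  # True
r1["decision"] = "allow"       # tamper with the first record after the fact
print(verify_chain([r1, r2]))  # False
```

An auditor holding only the exported records can run the verification offline, with no access to the original infrastructure.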
Describe the solution you'd like
Introduce an optional evidence export layer that exposes guardrail decisions as structured data for downstream processing.
Key Principles
- Does not change runtime behavior
- Optional and non-invasive
- Can be implemented as an extension point
- Teams choose their own destination for evidence
Implementation Options
Option A: Post-Execution Hook
```python
def on_guardrail_decision(
    decision: str,        # "allow" | "block" | "modify"
    rail_id: str,         # "content-safety", "topic-control", etc.
    policy_name: str,     # "no-jailbreak", "stay-on-topic"
    reason: str,          # "matched rule: 'politics'"
    confidence: float,    # 0.95
    input_context: dict,  # user message, conversation state
    output_action: str,   # what happened (allow/block/rephrase)
) -> None: ...
```
Option B: Export API Endpoint
```
GET /api/guardrails/decisions/{session_id}
```
Returns structured JSON with all decisions in a session.
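For illustration only (these field names are hypothetical, and no such endpoint exists today), a response might look like:

```json
{
  "session_id": "abc-123",
  "decisions": [
    {
      "timestamp": "2024-05-01T12:00:00Z",
      "rail_id": "content-safety",
      "policy_name": "no-jailbreak",
      "decision": "block",
      "reason": "matched rule: 'politics'",
      "confidence": 0.95
    }
  ]
}
```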
Option C: Plugin/Middleware Interface
```python
from abc import ABC, abstractmethod

class EvidenceExporter(ABC):
    @abstractmethod
    def export(self, decision: "GuardrailDecision") -> None:
        """Called for each guardrail decision.

        Implementers decide where to send the evidence.
        """
```
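To make the extension point concrete, here is a sketch of one possible implementer: an exporter that appends each decision to a JSONL file. Everything here is illustrative (`GuardrailDecision` is modeled as a plain dataclass so the snippet runs standalone; it is not an existing NeMo Guardrails type):

```python
import json
from abc import ABC, abstractmethod
from dataclasses import dataclass, asdict

@dataclass
class GuardrailDecision:  # hypothetical record type, for illustration only
    rail_id: str
    policy_name: str
    decision: str         # "allow" | "block" | "modify"
    reason: str
    confidence: float

class EvidenceExporter(ABC):
    @abstractmethod
    def export(self, decision: GuardrailDecision) -> None: ...

class JsonlFileExporter(EvidenceExporter):
    """Appends one JSON object per decision: portable and reviewable offline."""

    def __init__(self, path: str):
        self.path = path

    def export(self, decision: GuardrailDecision) -> None:
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(asdict(decision)) + "\n")

exporter = JsonlFileExporter("evidence.jsonl")
exporter.export(GuardrailDecision("content-safety", "no-jailbreak", "block",
                                  "matched rule: 'politics'", 0.95))
```

Other implementers could post to a compliance dashboard or sign records before storage; the interface stays the same.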
Decision Record Structure
Each exported decision record should include:
- Policy/rule applied — which guardrail evaluated this input
- Decision outcome — allow, block, or modify
- Timestamp and session identifier — when and in which conversation
- Input/output context — relevant message content and conversation state
- Reason and confidence score — why the decision was made, with confidence
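The fields above could map onto a record type along these lines (a sketch only; the names are not proposed as final and no such type exists in NeMo Guardrails today):

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class DecisionRecord:  # illustrative only
    rail_id: str       # which guardrail evaluated this input
    decision: str      # "allow" | "block" | "modify"
    session_id: str    # which conversation
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat())
    input_context: dict = field(default_factory=dict)  # relevant message/state
    output_action: str = ""   # what actually happened
    reason: str = ""          # why the decision was made
    confidence: float = 1.0

rec = DecisionRecord(rail_id="topic-control", decision="block",
                     session_id="s-42",
                     reason="matched rule: 'politics'", confidence=0.95)
print(rec.decision)  # block
```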
Describe alternatives you've considered
Current Approaches
Teams currently use:
- Application logs — useful for debugging, not suitable for audit
- Tracing systems (e.g., OpenTelemetry) — provide observability, not portability
- Custom audit pipelines — inconsistent, per-team implementation
Why These Are Insufficient
| Approach | Portable? | Standardized? | Independently Verifiable? |
|----------|-----------|---------------|---------------------------|
| Logs | ❌ | ❌ | ❌ |
| OpenTelemetry traces | ❌ | ⚠️ (vendor-specific) | ❌ |
| Custom pipelines | ❌ | ❌ | ❌ |
All require additional infrastructure and access to the original system to review evidence.
Additional context
Broader Pattern in AI Infrastructure
This reflects a gap in AI systems:
- Runtime enforcement (what NeMo does well) is well-supported
- Portable audit artifacts are not
Rationale for Non-Invasive Design
This request is for a downstream extension point, not a core change:
- Runtime behavior stays unchanged
- Zero latency impact
- Teams decide whether to use it
- Teams choose their own destination format
Related Work
EPI Recorder is an example of this pattern: it captures AI execution into portable, verifiable artifacts for debugging, review, and verification.
This request does NOT propose adopting any specific format.
Instead, it asks for a structured export interface. Teams can:
- Use a portable evidence format (like EPI)
- Send data to compliance dashboards
- Store in internal databases
- Integrate with custom workflows
Expected Outcome
By adding an evidence export layer, NeMo Guardrails becomes:
- The runtime gold standard for policy enforcement ✅ (already true)
- Compliance-workflow-friendly ✅ (enabled by this feature)
This positions NeMo as the choice for regulated industries where audit trails are mandatory.
Implementation Notes
- No changes to core runtime
- Optional feature (teams opt-in)
- Can be added as a post-execution stage
- Should support async/streaming scenarios
- Consider traceability (request ID, session ID)
- Should be compatible with existing LLM providers (OpenAI, Azure, NIM, etc.)
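If exposed through configuration, opt-in might look something like the following (purely hypothetical keys, shown only to make the discussion concrete):

```yaml
# config.yml -- hypothetical keys, not an existing NeMo Guardrails option
evidence_export:
  enabled: true
  exporter: jsonl_file          # or a custom EvidenceExporter implementation
  include_input_context: false  # allow redacting message content if policy requires
```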
Questions for Maintainers
- Does this align with NeMo's direction toward enterprise/compliance workflows?
- Which implementation option (A, B, or C) would be most compatible with the current architecture?
- Should evidence export be enabled per-rail, per-configuration, or globally?
- How should this interact with existing tracing/logging systems?
Closing
Happy to help with implementation, examples, or a PR if this direction is of interest.
Thank you for considering this feature request.