Description
Did you check the docs?
- I have read all the NeMo-Guardrails docs
Is your feature request related to a problem? Please describe.
Add support for Clavata, which provides content moderation capabilities to detect and filter inappropriate content.
Describe the solution you'd like
A user can customize the content moderation behavior by:
- Configuring different policies for input and output flows
- Specifying which labels must match within a policy
- Setting the label match logic to either "ALL" (all specified labels must match) or "ANY" (at least one label must match)
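The "ALL" versus "ANY" behavior described above could be sketched as follows. This is only an illustration of the intended semantics, not Clavata's or NeMo Guardrails' implementation: the helper name `labels_match`, the idea that an evaluation returns a set of triggered label names, the "ANY" default, and the behavior when no labels are configured are all assumptions.

```python
from typing import Iterable, Optional


def labels_match(
    matched_labels: Iterable[str],
    required_labels: Optional[Iterable[str]] = None,
    logic: str = "ANY",  # assumption: the request does not state a default
) -> bool:
    """Decide whether a policy evaluation should count as a match.

    Hypothetical helper: `matched_labels` stands in for whatever labels the
    moderation service reports as triggered for a piece of text.
    """
    matched = set(matched_labels)
    required = set(required_labels or [])

    # Assumption: with no labels configured, any triggered label counts.
    if not required:
        return bool(matched)

    if logic == "ALL":
        # Every configured label must be among the triggered labels.
        return required <= matched
    if logic == "ANY":
        # At least one configured label must be among the triggered labels.
        return bool(required & matched)
    raise ValueError(f"label_match_logic must be 'ALL' or 'ANY', got {logic!r}")
```

With the labels "Violence", "Weapons", and "Drugs" configured, a text that triggers only "Violence" and "Weapons" would match under "ANY" but not under "ALL".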
Example config
rails:
  config:
    clavata:
      # Only provide this if you've been told to by Clavata.ai
      server_endpoint: "https://some-alt-endpoint.com"
      policies:
        - alias: "Violence"
          id: "00000000-0000-0000-0000-000000000000"
        - alias: "Weapons"
          id: "00000000-0000-0000-0000-000000000000"
      input:
        policy: "Violence"
        # Optional: Specify labels to require specific matches
        labels:
          - "Violence"
          - "Weapons"
          - "Drugs"
        label_match_logic: ALL # Can be "ALL" or "ANY"
      output:
        policy: "Weapons"
  input:
    flows:
      - clavata check input
  output:
    flows:
      - clavata check output
Details
- server_endpoint: The Clavata API endpoint (only if provided by Clavata.ai)
- policies: List of policy configurations with aliases and IDs
- input/output: Flow-specific configurations
  - policy: The policy alias to use for this flow
  - labels: (Optional) List of specific labels to check for
  - label_match_logic: (Optional) "ALL" requires all specified labels to match, "ANY" requires at least one match
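To make the field relationships concrete, here is a rough sketch of how the proposed `clavata` section might be validated once parsed into a plain dictionary. The class names `FlowConfig` and `ClavataConfig`, the function `parse_clavata_config`, the "ANY" default, and the nesting of `input`/`output` under `clavata` are all assumptions for illustration, not an existing NeMo Guardrails or Clavata API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class FlowConfig:
    """Hypothetical settings for one flow (input or output)."""
    policy: str                                   # policy alias, not the ID
    labels: List[str] = field(default_factory=list)
    label_match_logic: str = "ANY"                # assumed default


@dataclass
class ClavataConfig:
    """Hypothetical container for the proposed `clavata` section."""
    policies: Dict[str, str]                      # alias -> policy ID
    input: Optional[FlowConfig] = None
    output: Optional[FlowConfig] = None
    server_endpoint: Optional[str] = None

    def policy_id(self, flow_name: str) -> str:
        """Resolve the policy ID configured for 'input' or 'output'."""
        flow = {"input": self.input, "output": self.output}[flow_name]
        if flow is None:
            raise ValueError(f"no Clavata policy configured for {flow_name}")
        return self.policies[flow.policy]


def parse_clavata_config(raw: dict) -> ClavataConfig:
    """Build and validate a ClavataConfig from a parsed config mapping."""
    policies = {p["alias"]: p["id"] for p in raw.get("policies", [])}

    def parse_flow(name: str) -> Optional[FlowConfig]:
        section = raw.get(name)
        if section is None:
            return None
        flow = FlowConfig(
            policy=section["policy"],
            labels=list(section.get("labels", [])),
            label_match_logic=section.get("label_match_logic", "ANY"),
        )
        # Each flow must point at an alias declared under `policies`.
        if flow.policy not in policies:
            raise ValueError(f"{name}: unknown policy alias {flow.policy!r}")
        if flow.label_match_logic not in ("ALL", "ANY"):
            raise ValueError(f"{name}: label_match_logic must be 'ALL' or 'ANY'")
        return flow

    return ClavataConfig(
        policies=policies,
        input=parse_flow("input"),
        output=parse_flow("output"),
        server_endpoint=raw.get("server_endpoint"),
    )


# Usage with a mapping that mirrors the example config in this request.
raw = {
    "policies": [
        {"alias": "Violence", "id": "00000000-0000-0000-0000-000000000000"},
        {"alias": "Weapons", "id": "00000000-0000-0000-0000-000000000000"},
    ],
    "input": {
        "policy": "Violence",
        "labels": ["Violence", "Weapons", "Drugs"],
        "label_match_logic": "ALL",
    },
    "output": {"policy": "Weapons"},
}
config = parse_clavata_config(raw)
```

Validating aliases at load time means a typo in `policy` would fail when the configuration is read rather than on the first moderated message.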
Describe alternatives you've considered
N/A
Additional context
No response