bug: triggering Azure OpenAI's content management policy #986

Open · 3 of 4 tasks
Rohit036 opened this issue Feb 7, 2025 · 2 comments
Labels
duplicate This issue or pull request already exists

Comments

Rohit036 commented Feb 7, 2025

Did you check docs and existing issues?

- [x] I have read all the NeMo-Guardrails docs
- [x] I have updated the package to the latest version before submitting this issue
- [ ] (optional) I have used the develop branch
- [x] I have searched the existing issues of NeMo-Guardrails

Python version (python --version)

Python 3.11.9

Operating system/version

Windows

NeMo-Guardrails version (if you must use a specific version and not the latest)

0.11.0

Describe the bug

I am using input rails to test the safety of the user question, with AzureOpenAI set up in the config.

I set up the LLM and ask a question to test whether it is SAFE or UNSAFE.

Today, I started seeing an error about triggering Azure OpenAI's content management policy in the response.

```python
import os
import time
import asyncio

from nemoguardrails import LLMRails, RailsConfig
from langchain_openai import AzureChatOpenAI, AzureOpenAIEmbeddings

# Reading environment variables (standard Azure OpenAI variable names)
azure_openai_key = os.environ["AZURE_OPENAI_API_KEY"]
azure_openai_endpoint = os.environ["AZURE_OPENAI_ENDPOINT"]

yaml_content = """
models:
  - type: main
    engine: azure
    model: "gpt-4o"
    parameters:
      deployment_name: johndoe-chat-model
      api_version: "2024-08-01-preview"

core:
  embedding_search_provider:
    name: default
    parameters:
      embedding_engine: azure
      embedding_model: text-embedding-3-large

rails:
  input:
    flows:
      - self check input

prompts:
  - task: self_check_input
    content: |
      Check if the following user message contains any inappropriate content
      (such as adult content, hate speech, violence, profanity, or harmful content):

      User message: "{{ user_input }}"

      Respond with only "SAFE" or "UNSAFE".
"""


async def setup_azure_llm():
    llm = AzureChatOpenAI(
        openai_api_version="2024-08-01-preview",
        azure_endpoint=azure_openai_endpoint,
        api_key=azure_openai_key,
        azure_deployment="johndoe-chat-model",
    )
    return llm


async def check_safety(prompt: str, llm) -> tuple[str, float]:
    # Initialize rails config
    config = RailsConfig.from_content(yaml_content=yaml_content)

    # Configure rails with the Azure LLM
    rails = LLMRails(config, llm=llm)

    # Start timing
    start_time = time.time()

    # Run only the input rails and get the explanation
    await rails.generate_async(prompt=prompt, options={"rails": ["input"]})
    info = rails.explain()

    # End timing
    response_time = time.time() - start_time

    # Get the safety check result from the first LLM call (the self check)
    result = "UNSAFE"  # Default
    if info.llm_calls and len(info.llm_calls) > 0:
        result = info.llm_calls[0].completion.strip()

    return result, response_time


async def main():
    llm = await setup_azure_llm()
    prompt = "Your user input here"
    result, response_time = await check_safety(prompt, llm)
    print(f"Result: {result}, Response Time: {response_time}")


# Run the main function
asyncio.run(main())
```

I would also like to know whether I am using the Azure OpenAI embedding and chat models correctly.
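For reference, the alternative I considered is declaring the embeddings model under `models:` instead of `core.embedding_search_provider`. This is only a sketch based on my reading of the docs, assuming an `embeddings` model type and the `azure` embedding engine are supported by the installed version:

```yaml
models:
  - type: main
    engine: azure
    model: "gpt-4o"
    parameters:
      deployment_name: johndoe-chat-model
      api_version: "2024-08-01-preview"

  # Assumption: declaring the embeddings model here is equivalent to the
  # core.embedding_search_provider block used in the script above.
  - type: embeddings
    engine: azure
    model: text-embedding-3-large
```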

Steps To Reproduce

  1. Create an Azure OpenAI endpoint and key.
  2. Run the script above with the self check input rail enabled.

Expected Behavior

It should return SAFE or UNSAFE as expected.

Actual Behavior

```
openai.BadRequestError: Error code: 400 - {'error': {'message': "The response was filtered due to the prompt triggering Azure OpenAI's content management policy. Please modify your prompt and retry. To learn more about our content filtering policies please read our documentation: https://go.microsoft.com/fwlink/?linkid=2198766", 'type': None, 'param': 'prompt', 'code': 'content_filter', 'status': 400, 'innererror': {'code': 'ResponsibleAIPolicyViolation', 'content_filter_result': {'hate': {'filtered': False, 'severity': 'safe'}, 'jailbreak': {'filtered': True, 'detected': True}, 'self_harm': {'filtered': False, 'severity': 'safe'}, 'sexual': {'filtered': False, 'severity': 'safe'}, 'violence': {'filtered': False, 'severity': 'safe'}}}}}
```
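For now, a possible workaround is to catch the filter error and treat it as an UNSAFE verdict. This is a minimal sketch wrapping the `check_safety` function from the script above, assuming the 400 surfaces as `openai.BadRequestError` with `code == "content_filter"`:

```python
import time

import openai  # assumption: the Azure 400 propagates as openai.BadRequestError


async def check_safety_guarded(prompt: str, llm) -> tuple[str, float]:
    """Treat an Azure content-filter rejection as an UNSAFE verdict."""
    start_time = time.time()
    try:
        # check_safety is the function defined in the script above
        return await check_safety(prompt, llm)
    except openai.BadRequestError as e:
        # Azure returns HTTP 400 with code "content_filter" when the prompt
        # itself trips the Responsible AI policy (here, the jailbreak filter).
        if e.code == "content_filter":
            return "UNSAFE", time.time() - start_time
        raise  # any other 400 is a genuine error
```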

Rohit036 added the `bug` (Something isn't working) and `status: needs triage` (New issues that have not yet been reviewed or categorized) labels on Feb 7, 2025
Pouyanpi (Collaborator) commented Feb 7, 2025

duplicate of #914

Pouyanpi removed the `status: needs triage` and `bug` labels on Feb 7, 2025
Pouyanpi (Collaborator) commented Feb 8, 2025

@Rohit036 please have a look at #914 and close this issue if it resolves your issue. Thanks!

Pouyanpi added the `duplicate` (This issue or pull request already exists) label on Feb 10, 2025