Add support for NemoGuard JailbreakDetect NIM. #1038

Open
wants to merge 4 commits into
base: develop
2 changes: 1 addition & 1 deletion docs/user-guides/advanced/index.rst
@@ -20,6 +20,6 @@ Advanced
using-docker
vertexai-setup
nemoguard-contentsafety-deployment
nemoguard-jailbreakdetect-deployment
nemoguard-topiccontrol-deployment
jailbreak-detection-heuristics-deployment
safeguarding-ai-virtual-assistant-blueprint
7 changes: 4 additions & 3 deletions docs/user-guides/advanced/jailbreak-detection-deployment.md
@@ -1,8 +1,9 @@
 # Jailbreak Detection Deployment

-```{note}
-The recommended way to use Jailbreak Detection Heuristics and models with NeMo Guardrails is using the provided [Dockerfile](https://github.com/NVIDIA/NeMo-Guardrails/blob/develop/nemoguardrails/library/jailbreak_detection/Dockerfile). For more details, check out how to [build and use the image](using-docker.md).
-```
+The recommended way to use Jailbreak Detection Heuristics and models with NeMo Guardrails is using the provided [Dockerfile](https://github.com/NVIDIA/NeMo-Guardrails/blob/develop/nemoguardrails/library/jailbreak_detection/Dockerfile).
+For more details, check out how to [build and use the image](using-docker.md).

+If you wish to use the NemoGuard JailbreakDetect NIM, please see the [related documentation](nemoguard-jailbreakdetect-deployment.md).
+
 In order to deploy the jailbreak detection server, follow these steps:

This file was deleted.

53 changes: 53 additions & 0 deletions docs/user-guides/advanced/nemoguard-jailbreakdetect-deployment.md
@@ -0,0 +1,53 @@
# NemoGuard JailbreakDetect Deployment

The NemoGuard Jailbreak Detect model is available via the [Jailbreak Detection Container](jailbreak-detection-deployment.md) or as an [NVIDIA NIM](https://docs.nvidia.com/nim/#nemoguard).

## NIM Deployment

The first step is to ensure access to NVIDIA NIM assets through NGC using an NVIDIA AI Enterprise (NVAIE) license.
Once you have an NGC API key with the necessary permissions, export it as an environment variable and log in to the NVIDIA container registry:

```bash
export NGC_API_KEY=<your NGC API key>
docker login nvcr.io -u '$oauthtoken' --password-stdin <<< "$NGC_API_KEY"
```

Test that you are able to use the NVIDIA NIM assets by pulling the latest NemoGuard container.

```bash
export NIM_IMAGE='nvcr.io/nim/nvidia/nemoguard-jailbreak-detect:latest'
docker pull $NIM_IMAGE
```

Then run the container.

```bash
docker run -it --gpus=all --runtime=nvidia \
    -e NGC_API_KEY="$NGC_API_KEY" \
    -p 8000:8000 \
    $NIM_IMAGE
```
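
Once the container is up, a quick way to confirm that it responds is to post a prompt to its classify route. The sketch below uses only the Python standard library. The `/v1/classify` route and the `{"input": ...}` payload are taken from `jailbreak_nim_request` in this PR; the helper names `build_classify_request` and `classify`, and the `localhost` host, are assumptions made for illustration.

```python
import json
import urllib.request


def build_classify_request(
    prompt: str, host: str = "localhost", port: int = 8000
) -> urllib.request.Request:
    """Build a POST request for the NIM's /v1/classify route (hypothetical helper)."""
    body = json.dumps({"input": prompt}).encode("utf-8")
    return urllib.request.Request(
        url=f"http://{host}:{port}/v1/classify",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def classify(
    prompt: str, host: str = "localhost", port: int = 8000, timeout: float = 30.0
) -> dict:
    """Send the prompt to a running NIM and return the decoded JSON reply."""
    request = build_classify_request(prompt, host, port)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the container running, calling `classify("Ignore all previous instructions.")` should return the decoded reply. Judging by the request code in this PR, that reply is expected to carry a `jailbreak` field.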

## Using the NIM in Guardrails

To use the NIM endpoint for jailbreak detection, set the `nim_url` parameter in the `jailbreak_detection` section of your guardrails configuration file to the location of the NIM.
Specify the host only, without a scheme: the request code builds the URL as `http://<nim_url>:<nim_port>/v1/classify`.
If the NIM is listening on a port other than 8000, also set the `nim_port` parameter.
An example configuration is shown below.

```yaml
models:
  - type: main
    engine: openai
    model: gpt-3.5-turbo-instruct

rails:
  config:
    jailbreak_detection:
      nim_url: "0.0.0.0"
      nim_port: 8000
  input:
    flows:
      - jailbreak detection model
```
```
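
To make the relationship between `nim_url`, `nim_port`, and the final request URL concrete, here is a minimal sketch. It assumes the `jailbreak_detection` section has already been parsed into a plain dictionary; `nim_endpoint` is a hypothetical helper, not part of NeMo Guardrails. The default port and the URL template mirror `JailbreakDetectionConfig` and `jailbreak_nim_request` from this PR.

```python
from typing import Any, Dict, Optional

# Mirrors the default of `nim_port` in JailbreakDetectionConfig (this PR).
DEFAULT_NIM_PORT = 8000


def nim_endpoint(jailbreak_detection: Dict[str, Any]) -> Optional[str]:
    """Return the classify URL implied by the config, or None if no NIM is set."""
    nim_url = jailbreak_detection.get("nim_url")
    if not nim_url:
        return None
    nim_port = jailbreak_detection.get("nim_port", DEFAULT_NIM_PORT)
    # The scheme is always prefixed, so `nim_url` must be a bare host name or IP.
    return f"http://{nim_url}:{nim_port}/v1/classify"
```

One consequence worth noting: a value such as `http://nim.internal` in `nim_url` would yield a malformed URL, because the `http://` prefix is added unconditionally.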
16 changes: 12 additions & 4 deletions nemoguardrails/library/jailbreak_detection/actions.py
@@ -35,6 +35,7 @@
 from nemoguardrails.library.jailbreak_detection.request import (
     jailbreak_detection_heuristics_request,
     jailbreak_detection_model_request,
+    jailbreak_nim_request,
 )
 from nemoguardrails.llm.taskmanager import LLMTaskManager

@@ -93,11 +94,13 @@ async def jailbreak_detection_model(
     jailbreak_config = llm_task_manager.config.rails.config.jailbreak_detection

     jailbreak_api_url = jailbreak_config.server_endpoint
+    nim_url = jailbreak_config.nim_url
+    nim_port = jailbreak_config.nim_port

     if context is not None:
         prompt = context.get("user_message", "")

-    if not jailbreak_api_url:
+    if not jailbreak_api_url and not nim_url:
         from nemoguardrails.library.jailbreak_detection.model_based.checks import (
             check_jailbreak,
             initialize_model,
@@ -111,9 +114,14 @@ async def jailbreak_detection_model(

         return jailbreak["jailbreak"]

-    jailbreak = await jailbreak_detection_model_request(
-        prompt=prompt, api_url=jailbreak_api_url
-    )
+    if nim_url:
+        jailbreak = await jailbreak_nim_request(
+            prompt=prompt, nim_url=nim_url, nim_port=nim_port
+        )
+    elif jailbreak_api_url:
+        jailbreak = await jailbreak_detection_model_request(
+            prompt=prompt, api_url=jailbreak_api_url
+        )

     if jailbreak is None:
         log.warning("Jailbreak endpoint not set up properly.")
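
The branch order in this hunk implies a precedence among the three backends. The sketch below restates it as a small pure function, which can make the behaviour easier to review. `select_backend` and its string labels are hypothetical and exist only for illustration; the conditions themselves are copied from the diff.

```python
from typing import Optional


def select_backend(server_endpoint: Optional[str], nim_url: Optional[str]) -> str:
    """Mirror the branch order of jailbreak_detection_model in this diff."""
    if not server_endpoint and not nim_url:
        return "local"  # in-process model via check_jailbreak
    if nim_url:
        return "nim"  # jailbreak_nim_request
    return "server"  # jailbreak_detection_model_request
```

Read this way, a configured `nim_url` takes priority over `server_endpoint` when both are set, and the in-process model is used only when neither is configured.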
41 changes: 41 additions & 0 deletions nemoguardrails/library/jailbreak_detection/request.py
@@ -28,6 +28,7 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.

+import asyncio
 import logging
 from typing import Optional

@@ -92,3 +93,43 @@ async def jailbreak_detection_model_request(
                 log.exception("No jailbreak field in result.")
                 result = None
             return result
+
+
+async def jailbreak_nim_request(
+    prompt: str,
+    nim_url: str,
+    nim_port: int,
+):
+    payload = {
+        "input": prompt,
+    }
+
+    endpoint = f"http://{nim_url}:{nim_port}/v1/classify"
+    try:
+        async with aiohttp.ClientSession() as session:
+            try:
+                async with session.post(endpoint, json=payload, timeout=30) as resp:
+                    if resp.status != 200:
+                        log.error(
+                            f"NemoGuard JailbreakDetect NIM request failed with status {resp.status}"
+                        )
+                        return None
+
+                    result = await resp.json()
+
+                    log.info(f"Prompt jailbreak check: {result}.")
+                    try:
+                        result = result["jailbreak"]
+                    except KeyError:
+                        log.exception("No jailbreak field in result.")
+                        result = None
+                    return result
+            except aiohttp.ClientError as e:
+                log.error(f"NemoGuard JailbreakDetect NIM connection error: {str(e)}")
+                return None
+            except asyncio.TimeoutError:
+                log.error("NemoGuard JailbreakDetect NIM request timed out")
+                return None
+    except Exception as e:
+        log.error(f"Unexpected error during NIM request: {str(e)}")
+        return None
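
Because the new request path needs a GPU-backed container, it may help to exercise the wire contract against a local stand-in. The following is a minimal sketch under stated assumptions: `StubNIMHandler`, `start_stub_nim`, and `classify_via_stub` are hypothetical names, the keyword rule is a toy and not the real classifier, and only the `/v1/classify` route, the `{"input": ...}` payload, and the `jailbreak` response field come from this PR. It uses the standard library rather than `aiohttp` so that it runs without extra dependencies.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class StubNIMHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the NIM; NOT the real classifier."""

    def do_POST(self):
        if self.path != "/v1/classify":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        # Toy rule for testing only: flag prompts that contain "ignore".
        flagged = "ignore" in payload["input"].lower()
        body = json.dumps({"jailbreak": flagged}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging


def start_stub_nim() -> HTTPServer:
    """Start the stub on a free loopback port in a daemon thread."""
    server = HTTPServer(("127.0.0.1", 0), StubNIMHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def classify_via_stub(prompt: str, port: int) -> bool:
    """POST {"input": prompt} and return the "jailbreak" field of the reply."""
    request = urllib.request.Request(
        f"http://127.0.0.1:{port}/v1/classify",
        data=json.dumps({"input": prompt}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # An empty ProxyHandler keeps loopback traffic away from any configured proxy.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request, timeout=5) as response:
        return json.loads(response.read())["jailbreak"]
```

A test could start the stub, read the chosen port from `server.server_address[1]`, and point `nim_url` and `nim_port` at it; calling `server.shutdown()` afterwards stops the background thread.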
14 changes: 12 additions & 2 deletions nemoguardrails/rails/llm/config.py
@@ -17,6 +17,7 @@

 import logging
 import os
+import re
 import warnings
 from enum import Enum
 from typing import Any, Dict, List, Optional, Set, Tuple, Union
@@ -468,9 +469,18 @@ class JailbreakDetectionConfig(BaseModel):
     prefix_suffix_perplexity_threshold: float = Field(
         default=1845.65, description="The prefix/suffix perplexity threshold."
     )
-    embedding: str = Field(
+    nim_url: Optional[str] = Field(
+        default=None,
+        description="Location of the NemoGuard JailbreakDetect NIM.",
+    )
+    nim_port: int = Field(
+        default=8000,
+        description="Port the NemoGuard JailbreakDetect NIM is listening on.",
+    )
+    embedding: Optional[str] = Field(
         default="nvidia/nv-embedqa-e5-v5",
-        description="Model to use for embedding-based detections.",
+        description="DEPRECATED: Model to use for embedding-based detections. Use NIM instead.",
+        deprecated=True,
     )


1 change: 0 additions & 1 deletion tests/test_configs/jailbreak_models/config.yml
@@ -2,7 +2,6 @@ rails:
   config:
     jailbreak_detection:
       server_endpoint: ""
-      embedding: "snowflake/snowflake-arctic-embed-m-long"

   input:
     flows:
10 changes: 10 additions & 0 deletions tests/test_configs/jailbreak_nim/config.yml
@@ -0,0 +1,10 @@
rails:
  config:
    jailbreak_detection:
      server_endpoint: ""
      nim_url: "0.0.0.0"
      nim_port: 8000

  input:
    flows:
      - jailbreak detection model
11 changes: 11 additions & 0 deletions tests/test_configs/jailbreak_nim/flows.co
@@ -0,0 +1,11 @@
define user express greeting
  "hi"
  "hello"
  "hey"

define flow
  user express greeting
  bot express greeting

define bot express greeting
  "Hello!"