docs: update AI API models and providers #9252

180 changes: 137 additions & 43 deletions docs/pages/product/apis-integrations/ai-api.mdx
Specifically, you can send the AI API a message (or conversation of messages) and receive a Cube query in response.

See [AI API reference][ref-ref-ai-api] for the list of supported API endpoints.
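For orientation, here is a minimal request sketch. The endpoint path, payload shape (`messages`, `runQuery`), and header names below are illustrative assumptions; consult the AI API reference linked above for the actual contract.

```python
import json
import urllib.request


def build_payload(message: str, run_query: bool = False) -> dict:
    # The `messages` / `runQuery` shape is an assumption for illustration.
    return {
        "messages": [{"role": "user", "content": message}],
        "runQuery": run_query,
    }


def ask(base_url: str, api_token: str, message: str) -> dict:
    # The endpoint path and header names here are illustrative, not authoritative.
    req = urllib.request.Request(
        f"{base_url}/ai/v1/completion",
        data=json.dumps(build_payload(message, run_query=True)).encode("utf-8"),
        headers={"Authorization": api_token, "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```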

<YouTubeVideo url="https://www.youtube.com/embed/Qpg4RxqndnE" />

## Configuration

When using `"runQuery": true`, you might sometimes receive a query result containing an error.
## Advanced Usage

<InfoBox>
The advanced features discussed here are available on Cube version 1.1.7 and above.
</InfoBox>

### Custom prompts
You can give the AI API custom instructions to tailor how it generates queries,
for example if it should usually prefer a particular view.
To use a custom prompt, set the `CUBE_CLOUD_AI_API_PROMPT` environment variable in your deployment.
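For example, in a dotenv-style configuration (the prompt text below is an illustrative assumption):

```
# .env — the prompt text is illustrative
CUBE_CLOUD_AI_API_PROMPT="Prefer the orders view when a question mentions orders."
```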

<InfoBox>
Custom prompts add to, rather than overwrite, the AI API's existing prompting,
so you do not need to re-write instructions around how to generate the query
itself.
</InfoBox>

### Meta tags

The AI API can read [meta tags](/reference/data-model/view#meta) on your dimensions, measures,
segments, and views.

Use the `ai` meta tag to give context that is specific to AI and goes beyond what is
included in the description. This can have any keys that you want. For example, you can use it
to give the AI context on possible values in a categorical dimension:

```yaml
- name: status
  sql: status
  type: string
  meta:
    ai:
      values:
        - shipped
        - processing
        - completed
```

### Value search
The LLM will select dimensions from among those you have based on the question and
generate possible values dynamically.

<InfoBox>
When running value search queries, the AI API passes through the security
context used for the AI API request, so security is maintained and only
dimensions the end user has access to can be searched.
</InfoBox>

To enable value search on a dimension, set the `searchable` field to true under the `ai`
meta tag, as shown below:

```yaml
- name: order_status
  sql: order_status
  type: string
  meta:
    ai:
      searchable: true
```

Note that enabling Value Search may lead to slightly longer AI API response times when it
is used. Value Search can only be used on string dimensions.
### Other LLM providers

<InfoBox>
These environment variables also apply to the [AI
Assistant](/product/workspace/ai-assistant), if it is enabled on your
deployment.
</InfoBox>

If desired, you may "bring your own" LLM model by providing a model and API credentials
for a supported model provider. Do this by setting environment variables in your Cube
deployment.

- `CUBE_CLOUD_AI_COMPLETION_MODEL` - The AI model name to use (varies based on provider). For example `gpt-4o`.
- `CUBE_CLOUD_AI_COMPLETION_PROVIDER` - The provider. Must be one of the following:
- `amazon-bedrock`
- `anthropic`
- `azure`
- `cohere`
- `deepseek`
- `fireworks`
- `google-generative-ai`
- `google-vertex-ai`
- `google-vertex-ai-anthropic`
- `groq`
- `mistral`
- `openai`
- `openai-compatible` (any provider with an OpenAI-compatible API; support may vary)
- `together-ai`
- `x-ai`

See below for required variables by provider (required unless noted):
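For instance, a minimal Anthropic configuration might look like the following (the model name and key value are illustrative placeholders; the API key variable is covered in the Anthropic section below):

```
CUBE_CLOUD_AI_COMPLETION_PROVIDER=anthropic
CUBE_CLOUD_AI_COMPLETION_MODEL=claude-3-5-sonnet-latest
CUBE_CLOUD_AI_ANTHROPIC_API_KEY=sk-ant-...
```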

#### AWS Bedrock

<WarningBox>
The AI API currently supports only Anthropic Claude models on AWS Bedrock.
Other models may work but are not fully supported.
</WarningBox>

- `CUBE_CLOUD_AI_AWS_ACCESS_KEY_ID` - An access key for an IAM user with `InvokeModelWithResponseStream` permissions on the desired region/model.
- `CUBE_CLOUD_AI_AWS_SECRET_ACCESS_KEY` - The corresponding access secret
- `CUBE_CLOUD_AI_AWS_REGION` - A supported AWS Bedrock region, for example `us-west-2`
- `CUBE_CLOUD_AI_AWS_SESSION_TOKEN` - The session token (optional)
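Putting it together, a Bedrock configuration might look like this sketch (credential values are placeholders):

```
CUBE_CLOUD_AI_COMPLETION_PROVIDER=amazon-bedrock
CUBE_CLOUD_AI_COMPLETION_MODEL=anthropic.claude-3-5-sonnet-20241022-v2:0
CUBE_CLOUD_AI_AWS_ACCESS_KEY_ID=AKIA...
CUBE_CLOUD_AI_AWS_SECRET_ACCESS_KEY=...
CUBE_CLOUD_AI_AWS_REGION=us-west-2
```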

#### Anthropic

- `CUBE_CLOUD_AI_ANTHROPIC_API_KEY`
- `CUBE_CLOUD_AI_ANTHROPIC_BASE_URL` - use a different URL prefix for API calls, for example when behind a proxy (optional)

#### Microsoft Azure OpenAI

- `CUBE_CLOUD_AI_AZURE_RESOURCE_NAME`
- `CUBE_CLOUD_AI_AZURE_API_KEY`
- `CUBE_CLOUD_AI_AZURE_API_VERSION` (optional)
- `CUBE_CLOUD_AI_AZURE_BASE_URL` (optional)

#### Cohere

- `CUBE_CLOUD_AI_COHERE_API_KEY`
- `CUBE_CLOUD_AI_COHERE_BASE_URL` - use a different URL prefix for API calls, for example when behind a proxy (optional)

#### DeepSeek

- `CUBE_CLOUD_AI_DEEPSEEK_API_KEY`
- `CUBE_CLOUD_AI_DEEPSEEK_BASE_URL` - use a different URL prefix for API calls, for example when behind a proxy (optional)

#### Fireworks

- `CUBE_CLOUD_AI_FIREWORKS_API_KEY`
- `CUBE_CLOUD_AI_FIREWORKS_BASE_URL` - use a different URL prefix for API calls, for example when behind a proxy (optional)

#### Google Generative AI

- `CUBE_CLOUD_AI_GOOGLE_GENERATIVE_AI_API_KEY`
- `CUBE_CLOUD_AI_GOOGLE_GENERATIVE_AI_BASE_URL` - use a different URL prefix for API calls, for example when behind a proxy (optional)

#### GCP Vertex AI

<WarningBox>
See <Btn>Google Vertex AI (Anthropic)</Btn> below if using Anthropic models.
</WarningBox>

- `CUBE_CLOUD_AI_GOOGLE_VERTEX_PROJECT`
- `CUBE_CLOUD_AI_GOOGLE_VERTEX_LOCATION`
- `CUBE_CLOUD_AI_GOOGLE_VERTEX_CREDENTIALS`
- `CUBE_CLOUD_AI_GOOGLE_VERTEX_PUBLISHER` - defaults to `google`; change if using another publisher (optional)

#### GCP Vertex AI (Anthropic)

- `CUBE_CLOUD_AI_GOOGLE_VERTEX_ANTHROPIC_PROJECT`
- `CUBE_CLOUD_AI_GOOGLE_VERTEX_ANTHROPIC_LOCATION`
- `CUBE_CLOUD_AI_GOOGLE_VERTEX_ANTHROPIC_CREDENTIALS`
- `CUBE_CLOUD_AI_GOOGLE_VERTEX_ANTHROPIC_PUBLISHER` - defaults to `anthropic`; change if using another publisher (optional)

#### Groq

- `CUBE_CLOUD_AI_GROQ_API_KEY`
- `CUBE_CLOUD_AI_GROQ_BASE_URL` - use a different URL prefix for API calls, for example when behind a proxy (optional)

#### Mistral

- `CUBE_CLOUD_AI_MISTRAL_API_KEY`
- `CUBE_CLOUD_AI_MISTRAL_BASE_URL` - use a different URL prefix for API calls, for example when behind a proxy (optional)

#### OpenAI

- `CUBE_CLOUD_AI_OPENAI_API_KEY`
- `CUBE_CLOUD_AI_OPENAI_ORGANIZATION` (optional)
- `CUBE_CLOUD_AI_OPENAI_PROJECT` (optional)
- `CUBE_CLOUD_AI_OPENAI_BASE_URL` - use a different URL prefix for API calls, for example when behind a proxy (optional)
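For example, an OpenAI configuration sketch (the key value is a placeholder):

```
CUBE_CLOUD_AI_COMPLETION_PROVIDER=openai
CUBE_CLOUD_AI_COMPLETION_MODEL=gpt-4o
CUBE_CLOUD_AI_OPENAI_API_KEY=sk-...
```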

#### OpenAI Compatible Providers

<InfoBox>
Use this option if your provider is not listed on this page but offers an
OpenAI-compatible endpoint. Not all providers/models are supported.
</InfoBox>

- `CUBE_CLOUD_AI_OPENAI_COMPATIBLE_API_KEY`
- `CUBE_CLOUD_AI_OPENAI_COMPATIBLE_BASE_URL`

#### Together AI

- `CUBE_CLOUD_AI_TOGETHER_API_KEY`
- `CUBE_CLOUD_AI_TOGETHER_BASE_URL` - use a different URL prefix for API calls, for example when behind a proxy (optional)

#### xAI (Grok)

- `CUBE_CLOUD_AI_X_AI_API_KEY`
- `CUBE_CLOUD_AI_X_AI_BASE_URL` - use a different URL prefix for API calls, for example when behind a proxy (optional)

[ref-ref-ai-api]: /product/apis-integrations/ai-api/reference