refactor(env): modernize model configuration environment variables (#1375)
* refactor(env): modernize model configuration environment variables
This PR refactors the model configuration system with improved naming conventions
and better type safety while maintaining backward compatibility.
Key Changes:
1. Environment Variable Naming Convention Updates:
- Renamed OPENAI_* → MODEL_* for public API variables
* OPENAI_API_KEY → MODEL_API_KEY (deprecated, backward compatible)
* OPENAI_BASE_URL → MODEL_BASE_URL (deprecated, backward compatible)
- Renamed MIDSCENE_*_VL_MODE → MIDSCENE_*_LOCATOR_MODE across all intents
* MIDSCENE_VL_MODE → MIDSCENE_LOCATOR_MODE
* MIDSCENE_VQA_VL_MODE → MIDSCENE_VQA_LOCATOR_MODE
* MIDSCENE_PLANNING_VL_MODE → MIDSCENE_PLANNING_LOCATOR_MODE
* MIDSCENE_GROUNDING_VL_MODE → MIDSCENE_GROUNDING_LOCATOR_MODE
- Updated all internal MIDSCENE_*_OPENAI_* → MIDSCENE_*_MODEL_*
* MIDSCENE_VQA_OPENAI_API_KEY → MIDSCENE_VQA_MODEL_API_KEY
* MIDSCENE_PLANNING_OPENAI_API_KEY → MIDSCENE_PLANNING_MODEL_API_KEY
* MIDSCENE_GROUNDING_OPENAI_API_KEY → MIDSCENE_GROUNDING_MODEL_API_KEY
* (and corresponding BASE_URL variables)
2. Type System Improvements:
- Split TModelConfigFn into public and internal types
- Public API (TModelConfigFn) no longer exposes 'intent' parameter
- Internal type (TModelConfigFnInternal) maintains intent parameter
- Users can still optionally use intent parameter via type casting
3. Backward Compatibility:
- Maintained compatibility for documented public variables (OPENAI_API_KEY, OPENAI_BASE_URL)
- New variables take precedence; legacy names are used as a fallback when the new ones are not set (see the sketch after the file list below)
- Only the public documented variables are deprecated; internal variables are renamed directly
4. Updated Files:
- packages/shared/src/env/types.ts - Type definitions and constants
- packages/shared/src/env/constants.ts - Config key mappings
- packages/shared/src/env/decide-model-config.ts - Compatibility logic
- packages/shared/src/env/model-config-manager.ts - Type casting implementation
- packages/shared/src/env/init-debug.ts - Debug variable updates
- All test files updated to use new variable names
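Below is a minimal sketch of the type split and the variable-precedence rule described above, assuming simplified shapes; the real definitions live in `packages/shared/src/env/types.ts` and `decide-model-config.ts` and may differ:
```typescript
// Sketch only: IModelConfig is simplified here, and resolveApiKey/resolveBaseURL
// are illustrative helpers, not the actual implementation.
type TIntent = 'VQA' | 'planning' | 'grounding' | 'default';

interface IModelConfig {
  modelName: string;
  openaiApiKey?: string;
  openaiBaseURL?: string;
}

// Public API: the `intent` parameter is no longer exposed.
type TModelConfigFn = () => IModelConfig;

// Internal type: keeps the `intent` parameter; users can still opt in via type casting.
type TModelConfigFnInternal = (params: { intent: TIntent }) => IModelConfig;

// Precedence: new MODEL_* variables win; legacy OPENAI_* names remain as a fallback.
const resolveApiKey = (env: Record<string, string | undefined>) =>
  env.MODEL_API_KEY ?? env.OPENAI_API_KEY;

const resolveBaseURL = (env: Record<string, string | undefined>) =>
  env.MODEL_BASE_URL ?? env.OPENAI_BASE_URL;
```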
Testing:
- All 24 model-config-manager tests passing
- Overall test suite: 241 tests passing
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
* Update packages/shared/src/env/constants.ts
Co-authored-by: Copilot <[email protected]>
* test(env): add comprehensive backward compatibility tests for OPENAI_* variables
- Added test suite to verify MODEL_API_KEY/MODEL_BASE_URL take precedence
- Added test to ensure OPENAI_API_KEY/OPENAI_BASE_URL still work as a fallback
- Fixed compatibility logic to prioritize new variables over legacy ones
- All 13 tests passing, including 5 new backward compatibility tests (a sketch of one such test follows the coverage list below)
Test coverage:
✓ Using only legacy variables (OPENAI_API_KEY)
✓ Using only new variables (MODEL_API_KEY)
✓ Mixing new and legacy variables (new takes precedence)
✓ Individual precedence for API_KEY and BASE_URL
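A minimal sketch of one of these precedence tests, assuming vitest and an illustrative resolver; the actual test file, helpers, and assertions may differ:
```typescript
import { beforeEach, describe, expect, it } from 'vitest';

// Illustrative resolver; the real compatibility logic lives in
// packages/shared/src/env/decide-model-config.ts.
const resolveApiKey = (env: Record<string, string | undefined>) =>
  env.MODEL_API_KEY ?? env.OPENAI_API_KEY;

describe('OPENAI_* backward compatibility', () => {
  let env: Record<string, string | undefined>;

  beforeEach(() => {
    // Start each test from a clean environment object.
    env = {};
  });

  it('falls back to OPENAI_API_KEY when MODEL_API_KEY is unset', () => {
    env.OPENAI_API_KEY = 'legacy-key';
    expect(resolveApiKey(env)).toBe('legacy-key');
  });

  it('prefers MODEL_API_KEY when both are set', () => {
    env.MODEL_API_KEY = 'new-key';
    env.OPENAI_API_KEY = 'legacy-key';
    expect(resolveApiKey(env)).toBe('new-key');
  });
});
```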
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
* fix(test): reset MIDSCENE_CACHE in beforeEach to avoid .env interference
The test 'should return the correct value from override' was failing because the
.env file sets MIDSCENE_CACHE=1. This polluted the test environment and caused
the test to expect false but receive true.
Fixed by explicitly resetting MIDSCENE_CACHE to an empty string in beforeEach.
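A minimal sketch of that reset, assuming vitest; the surrounding test file is not shown here:
```typescript
import { beforeEach } from 'vitest';

beforeEach(() => {
  // The repo's .env sets MIDSCENE_CACHE=1, which leaks into the test
  // environment; reset it explicitly so the override test sees a clean value.
  process.env.MIDSCENE_CACHE = '';
});
```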
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
* docs(site): update environment variable names and add advanced configuration examples for agents
---------
Co-authored-by: Claude <[email protected]>
Co-authored-by: Copilot <[email protected]>
apps/site/docs/en/api.mdx (56 additions, 2 deletions)
@@ -25,6 +25,58 @@ In Playwright and Puppeteer, there are some common parameters:
 - `forceSameTabNavigation: boolean`: If true, page navigation is restricted to the current tab. (Default: true)
 - `waitForNavigationTimeout: number`: The timeout for waiting for navigation finished. (Default: 5000ms, set to 0 means not waiting for navigation finished)
 
+These Agents also support the following advanced configuration parameters:
+
+- `modelConfig: () => IModelConfig`: Optional. Custom model configuration function. Allows you to dynamically configure different models through code instead of environment variables. This is particularly useful when you need to use different models for different AI tasks (such as VQA, planning, grounding, etc.).
+- `createOpenAIClient: (config) => OpenAI`: Optional. Custom OpenAI client factory function. Allows you to create custom OpenAI client instances for integrating observability tools (such as LangSmith, LangFuse) or using custom OpenAI-compatible clients.
+
+**Parameter Description:**
+- `config.modelName: string` - Model name
+- `config.openaiApiKey?: string` - API key
+- `config.openaiBaseURL?: string` - API endpoint URL
+- `config.intent: string` - AI task type ('VQA' | 'planning' | 'grounding' | 'default')
+- `config.vlMode?: string` - Visual language model mode
+- Other configuration parameters...
+
+**Example (LangSmith Integration):**
+```typescript
+import OpenAI from 'openai';
+import { wrapOpenAI } from 'langsmith/wrappers';
+
+const agent = new PuppeteerAgent(page, {
+  createOpenAIClient: (config) => {
+    const openai = new OpenAI({
+      apiKey: config.openaiApiKey,
+      baseURL: config.openaiBaseURL,
+    });
+
+    // Wrap with LangSmith for planning tasks
+    if (config.intent === 'planning') {
+      return wrapOpenAI(openai, {
+        metadata: { task: 'planning' }
+      });
+    }
+
+    return openai;
+  }
+});
+```
+
+**Note:** `createOpenAIClient` overrides the behavior of the `MIDSCENE_LANGSMITH_DEBUG` environment variable. If you provide a custom client factory function, you need to handle the integration of LangSmith or other observability tools yourself.
+
 In Puppeteer, there is also a parameter:
 
 - `waitForNetworkIdleTimeout: number`: The timeout for waiting for network idle between each action. (Default: 2000ms, set to 0 means not waiting for network idle)
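For reference, a hedged sketch of how the `modelConfig` option documented in this hunk might be used; the option name and signature come from the diff, but the returned fields, the import path, and the `page` setup are assumptions:
```typescript
// Hypothetical usage sketch, not part of the diff above.
// import { PuppeteerAgent } from '@midscene/web/puppeteer'; // assumed import path

const agent = new PuppeteerAgent(page, {
  // Return a model configuration from code instead of environment variables.
  modelConfig: () => ({
    modelName: 'gpt-4o-2024-11-20',
    openaiApiKey: process.env.MODEL_API_KEY,
    openaiBaseURL: process.env.MODEL_BASE_URL,
  }),
});
```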
@@ -854,9 +906,11 @@ You can override environment variables at runtime by calling the `overrideAIConf
 import { overrideAIConfig } from '@midscene/web/puppeteer'; // or another Agent
 
 overrideAIConfig({
-  OPENAI_BASE_URL: '...',
-  OPENAI_API_KEY: '...',
   MIDSCENE_MODEL_NAME: '...',
+  MODEL_BASE_URL: '...', // recommended, use new variable name
+  MODEL_API_KEY: '...', // recommended, use new variable name
+  // OPENAI_BASE_URL: '...', // deprecated but still compatible
+  // OPENAI_API_KEY: '...', // deprecated but still compatible
apps/site/docs/en/choose-a-model.mdx (31 additions, 15 deletions)
@@ -4,6 +4,22 @@ import TroubleshootingLLMConnectivity from './common/troubleshooting-llm-connect
 
 Choose one of the following models, obtain the API key, complete the configuration, and you are ready to go. Choose the model that is easiest to obtain if you are a beginner.
 
+## Environment Variable Configuration
+
+Starting from version 1.0, Midscene.js recommends using the following new environment variable names:
+
+- `MODEL_API_KEY` - API key (recommended)
+- `MODEL_BASE_URL` - API endpoint URL (recommended)
+
+For backward compatibility, the following legacy variable names are still supported:
+
+- `OPENAI_API_KEY` - API key (deprecated but still compatible)
+- `OPENAI_BASE_URL` - API endpoint URL (deprecated but still compatible)
+
+When both new and old variables are set, the new variables (`MODEL_*`) will take precedence.
+
+In the configuration examples throughout this document, we will use the new variable names. If you are currently using the old variable names, there's no need to change them immediately - they will continue to work.
+
 ## Adapted models for using Midscene.js
 
 Midscene.js supports two types of models, visual-language models and LLM models.
@@ -46,8 +62,8 @@ We recommend the Qwen3-VL series, which clearly outperforms Qwen2.5-VL. Qwen3-VL
 Using the Alibaba Cloud `qwen3-vl-plus` model as an example:
 MIDSCENE_MODEL_NAME="ep-2025..." # Inference endpoint ID or model name from Volcano Engine
 MIDSCENE_USE_VLM_UI_TARS=DOUBAO
 ```
@@ -164,8 +180,8 @@ The token cost of GPT-4o is relatively high because Midscene sends DOM informati
 **Config**
 
 ```bash
-OPENAI_API_KEY="......"
-OPENAI_BASE_URL="https://custom-endpoint.com/compatible-mode/v1" # Optional, if you want an endpoint other than the default OpenAI one.
+MODEL_API_KEY="......"
+MODEL_BASE_URL="https://custom-endpoint.com/compatible-mode/v1" # Optional, if you want an endpoint other than the default OpenAI one.
 MIDSCENE_MODEL_NAME="gpt-4o-2024-11-20" # Optional. The default is "gpt-4o".
 ```
 
@@ -176,7 +192,7 @@ Other models are also supported by Midscene.js. Midscene will use the same promp
 
 1. A multimodal model is required, which means it must support image input.
 1. The larger the model, the better it works. However, it needs more GPU or money.
-1. Find out how to call it with an OpenAI SDK compatible endpoint. Usually you should set the `OPENAI_BASE_URL`, `OPENAI_API_KEY` and `MIDSCENE_MODEL_NAME`. Configs are described in [Config Model and Provider](./model-provider).
+1. Find out how to call it with an OpenAI SDK compatible endpoint. Usually you should set the `MODEL_BASE_URL`, `MODEL_API_KEY` and `MIDSCENE_MODEL_NAME`. Configs are described in [Config Model and Provider](./model-provider).
 1. If you find it not working well after changing the model, you can try using some short and clear prompt, or roll back to the previous model. See more details in [Prompting Tips](./prompting-tips).
 1. Remember to follow the terms of use of each model and provider.
 1. Don't include the `MIDSCENE_USE_VLM_UI_TARS` and `MIDSCENE_USE_QWEN_VL` config unless you know what you are doing.
@@ -185,8 +201,8 @@ Other models are also supported by Midscene.js. Midscene will use the same promp
 
 ```bash
 MIDSCENE_MODEL_NAME="....."
-OPENAI_BASE_URL="......"
-OPENAI_API_KEY="......"
+MODEL_BASE_URL="......"
+MODEL_API_KEY="......"
 ```
 
 For more details and sample config, see [Config Model and Provider](./model-provider).
apps/site/docs/en/model-provider.mdx (19 additions, 15 deletions)
@@ -9,12 +9,14 @@ In this article, we will show you how to config AI service provider and how to c
 ## Configs
 
 ### Common configs
-These are the most common configs, in which `OPENAI_API_KEY` is required.
+These are the most common configs, in which `MODEL_API_KEY` or `OPENAI_API_KEY` is required.
 
 | Name | Description |
 |------|-------------|
-| `OPENAI_API_KEY` | Required. Your OpenAI API key (e.g. "sk-abcdefghijklmnopqrstuvwxyz") |
-| `OPENAI_BASE_URL` | Optional. Custom endpoint URL for API endpoint. Use it to switch to a provider other than OpenAI (e.g. "https://some_service_name.com/v1") |
+| `MODEL_API_KEY` | Required (recommended). Your API key (e.g. "sk-abcdefghijklmnopqrstuvwxyz") |
+| `MODEL_BASE_URL` | Optional (recommended). Custom endpoint URL for API endpoint. Use it to switch to a provider other than OpenAI (e.g. "https://some_service_name.com/v1") |
+| `OPENAI_API_KEY` | Deprecated but still compatible. Recommended to use `MODEL_API_KEY` |
+| `OPENAI_BASE_URL` | Deprecated but still compatible. Recommended to use `MODEL_BASE_URL` |
 | `MIDSCENE_MODEL_NAME` | Optional. Specify a different model name other than `gpt-4o` |
 
 Extra configs to use `Qwen 2.5 VL` model:

@@ -69,7 +71,7 @@ Pick one of the following ways to config environment variables.