web-infra-dev
diff --git a/apps/chrome-extension/static/manifest.json b/apps/chrome-extension/static/manifest.json
Lines changed: 1 addition & 1 deletion
diff --git a/apps/site/docs/en/changelog.mdx b/apps/site/docs/en/changelog.mdx
Lines changed: 54 additions & 0 deletions
diff --git a/apps/site/docs/en/integrate-with-android.mdx b/apps/site/docs/en/integrate-with-android.mdx
Lines changed: 1 addition & 0 deletions
diff --git a/apps/site/docs/en/mcp-android.mdx b/apps/site/docs/en/mcp-android.mdx
Lines changed: 1 addition & 1 deletion
diff --git a/apps/site/docs/zh/changelog.mdx b/apps/site/docs/zh/changelog.mdx
Lines changed: 54 additions & 0 deletions
diff --git a/apps/site/docs/zh/integrate-with-android.mdx b/apps/site/docs/zh/integrate-with-android.mdx
Lines changed: 1 addition & 0 deletions
diff --git a/apps/site/docs/zh/mcp-android.mdx b/apps/site/docs/zh/mcp-android.mdx
Lines changed: 1 addition & 1 deletion
diff --git a/packages/android-playground/src/bin.ts b/packages/android-playground/src/bin.ts
Lines changed: 10 additions & 4 deletions
diff --git a/packages/android/src/agent.ts b/packages/android/src/agent.ts
Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 {
   "name": "Midscene.js",
   "description": "Open-source SDK for automating web pages using natural language through AI.",
-  "version": "0.136",
+  "version": "0.137",
   "manifest_version": 3,
   "permissions": [
     "activeTab",
 
@@ -2,6 +2,60 @@
 
 > For the complete changelog, please refer to: [Midscene Releases](https://github.com/web-infra-dev/midscene/releases)
 
+## v0.30 - 🎯 Cache management upgrade and mobile experience optimization
+
+### 🎯 More flexible cache strategy
+
+v0.30 improves the cache system, allowing you to control cache behavior based on actual needs:
+
+- **Multiple cache modes available**: Supports read-only, write-only, and read-write strategies. For example, use read-only mode in CI environments to reuse cache, and use write-only mode in local development to update cache
+- **Automatic cleanup of unused cache**: Agent can automatically clean up unused cache records when destroyed, preventing cache files from accumulating
+- **Simplified unified configuration**: Cache configuration parameters for CLI and Agent are now unified, no need to remember different configurations
+
+### 📊 Report management convenience
+
+- **Support for merging multiple reports**: Report merging is no longer limited to Playwright scenarios. Every scenario can now merge multiple automation execution reports into a single file for centralized viewing and sharing of test results
+
+### 📱 Mobile automation optimization
+
+#### iOS platform improvements
+- **Real device support improvement**: Removed simctl check restriction, making iOS real device automation smoother
+- **Auto-adapt device display**: Implemented automatic device pixel ratio detection, ensuring accurate element positioning on different iOS devices
+
+#### Android platform enhancements
+- **Flexible screenshot optimization**: Added `screenshotResizeScale` option, allowing you to customize screenshot size while ensuring visual recognition accuracy, reducing network transmission and storage overhead
+- **Screen info cache control**: Use `alwaysRefreshScreenInfo` option to control whether to fetch screen information each time, allowing cache reuse in stable environments to improve performance
+- **Direct ADB command execution**: AndroidAgent added `runAdbCommand` method for convenient execution of custom device control commands
+
+#### Cross-platform consistency
+- **ClearInput support on all platforms**: Solves the problem of AI being unable to accurately plan clear input operations across platforms
+
+### 🔧 Feature enhancements
+
+- **Failure classification**: CLI execution results can now distinguish between "skipped failures" and "actual failures", helping locate issue causes
+- **aiInput append mode**: Added `append` option to append input while preserving existing content, suitable for editing scenarios
+- **Chrome extension improvements**:
+  - Popup mode preference saved to localStorage, remembering your choice on next open
+  - Bridge mode supports auto-connect, reducing manual operations
+  - Support for GPT-4o and non-visual language models
+
+### 🛡️ Type safety improvements
+
+- **Zod schema validation**: Introduced type checking for action parameters, detecting parameter errors during development to avoid runtime issues
+- **Number type support**: Fixed `aiInput` support for number type values, making type handling more robust
+
+### 🐞 Bug fixes
+
+- Fixed potential issues caused by Playwright circular dependencies
+- Fixed issue where `aiWaitFor` as the first statement could not generate reports
+- Improved video recorder delay logic to ensure the last frame is captured
+- Optimized report display logic so that error information and element positioning information can be viewed at the same time
+- Fixed issue where `cacheable` option in `aiAction` subtasks was not properly passed
+
+### 📚 Community
+
+- Added the [midscene-java](./awesome-midscene.md) community project to the Awesome Midscene section
+
 ## v0.29 - 📱 iOS platform support added
 
 ### 🚀 iOS platform support added
 
@@ -129,6 +129,7 @@ The AndroidDevice constructor supports the following parameters:
   - `imeStrategy?: 'always-yadb' | 'yadb-for-non-ascii'` - Optional, when should Midscene invoke [yadb](https://github.com/ysbing/YADB) to input texts. `'yadb-for-non-ascii'` uses yadb only when handling non-ASCII words, while `'always-yadb'` forces yadb for every input task. Try switching between these strategies if the default configuration fails to input texts. (Default: 'yadb-for-non-ascii')
   - `displayId?: number` - Optional, the display id to use. (Default: undefined, means use the current display)
   - `screenshotResizeScale?: number` - Optional, controls the size of the screenshot Midscene sends to the AI model. Default is `1 / devicePixelRatio`, so a 1200×800 display with a device pixel ratio of 3 sends an image of roughly 400×267 to the model. Adjusting this value manually is not recommended.
+  - `alwaysRefreshScreenInfo?: boolean` - Optional, whether to re-fetch screen size and orientation information every time. Default is false (uses cache for better performance). Set to true if the device may rotate or you need real-time screen information.
 
 ### Additional Android Agent Interfaces
 
 
@@ -76,7 +76,7 @@ Midscene MCP provides the following Android device automation tools:
   Parameters:
   - deviceId: (Optional) Device ID to connect to. If not provided, uses the first available device.
   - displayId: (Optional) Display ID for multi-display Android devices (e.g., 0, 1, 2). When specified, all ADB input operations will target this specific display.
-  - alwaysFetchScreenInfo: (Optional) Whether to always fetch screen size and orientation from the device on each call. Defaults to false (uses cache for better performance). Set to true if the device may rotate or you need real-time screen information.
+  - alwaysRefreshScreenInfo: (Optional) Whether to always fetch screen size and orientation from the device on each call. Defaults to false (uses cache for better performance). Set to true if the device may rotate or you need real-time screen information.
   ```
 
 ### App control
 
@@ -2,6 +2,60 @@
 
 > 完整更新日志请参考：[Midscene Releases](https://github.com/web-infra-dev/midscene/releases)
 
+## v0.30 - 🎯 缓存管理升级与移动端体验优化
+
+### 🎯 更灵活的缓存策略
+
+v0.30 版本改进了缓存系统，让你可以根据实际需求控制缓存行为:
+
+- **多种缓存模式可选**: 支持只读(read-only)、只写(write-only)、读写(read-write)等策略。例如在 CI 环境中使用只读模式复用缓存，在本地开发时使用只写模式更新缓存
+- **自动清理无用缓存**: Agent 销毁时可自动清理未使用的缓存记录，避免缓存文件越积越多
+- **配置更简洁统一**: CLI 和 Agent 的缓存配置参数已统一，无需记忆不同的配置方式
+
+### 📊 报告管理更便捷
+
+- **支持合并多个报告**: 除了 playwright 场景，现在任意场景均支持将多次自动化执行的报告合并为单个文件，方便集中查看和分享测试结果
+
+### 📱 移动端自动化优化
+
+#### iOS 平台改进
+- **真机支持改进**: 移除了 simctl 检查限制，iOS 真机设备的自动化更流畅
+- **自动适配设备显示**: 实现设备像素比自动检测，确保在不同 iOS 设备上元素定位准确
+
+#### Android 平台增强
+- **灵活的截图优化**: 新增 `screenshotResizeScale` 选项，你可以在保证视觉识别准确性的前提下自定义截图尺寸，减少网络传输和存储开销
+- **屏幕信息缓存控制**: 通过 `alwaysRefreshScreenInfo` 选项控制是否每次都获取屏幕信息，在稳定环境下可复用缓存提升性能
+- **直接执行 ADB 命令**: AndroidAgent 新增 `runAdbCommand` 方法，方便执行自定义的设备控制命令
+
+#### 跨平台一致性
+- **ClearInput 全平台支持**: 解决 AI 无法准确规划各平台清空输入的操作问题
+
+### 🔧 功能增强
+
+- **失败分类**: CLI 执行结果现在可以区分「跳过的失败」和「真正的失败」，帮助定位问题原因
+- **aiInput 追加输入**: 新增 `append` 选项，在保留现有内容的基础上追加输入，适用于编辑场景
+- **Chrome 扩展改进**:
+  - 弹窗模式偏好会保存到 localStorage，下次打开记住你的选择
+  - Bridge 模式支持自动连接，减少手动操作
+  - 支持 GPT-4o 和非视觉语言模型
+
+### 🛡️ 类型安全改进
+
+- **Zod 模式验证**: 为 action 参数引入类型检查，在开发阶段发现参数错误，避免运行时问题
+- **数字类型支持**: 修复了 `aiInput` 对 number 类型值的支持，类型处理更健壮
+
+### 🐞 问题修复
+
+- 修复了 Playwright 循环依赖导致的潜在问题
+- 修复了 `aiWaitFor` 作为首个语句时无法生成报告的问题
+- 改进视频录制器延迟逻辑，确保最后的画面帧也能被捕获
+- 优化报告展示逻辑，现在可以同时查看错误信息和元素定位信息
+- 修复了 `aiAction` 子任务中 `cacheable` 选项未正确传递的问题
+
+### 📚 社区
+
+- Awesome Midscene 板块新增 [midscene-java](./awesome-midscene.md) 社区项目
+
 ## v0.29 - 📱 新增 iOS 平台支持
 
 ### 🚀 新增 iOS 平台支持
 
@@ -128,6 +128,7 @@ AndroidDevice 的构造函数支持以下参数：
   - `imeStrategy?: 'always-yadb' | 'yadb-for-non-ascii'` - 可选参数，控制 Midscene 何时调用 [yadb](https://github.com/ysbing/YADB) 来输入文本。`'yadb-for-non-ascii'` 仅在输入非 ASCII 文本时启用 yadb，而 `'always-yadb'` 会在所有输入任务中都使用 yadb。如果默认配置无法正确输入文本，可尝试在这两种策略之间切换。默认值为 'yadb-for-non-ascii'。
   - `displayId?: number` - 可选参数，用于指定要使用的显示器 ID。默认值为 undefined，表示使用当前显示器。
   - `screenshotResizeScale?: number` - 可选参数，控制发送给 AI 模型的截图尺寸。默认值为 `1 / devicePixelRatio`，因此对于分辨率 1200×800、设备像素比（DPR）为 3 的界面，发送到模型的图片约为 400×267。不建议手动修改该参数。
+  - `alwaysRefreshScreenInfo?: boolean` - 可选参数，是否每次都重新获取屏幕尺寸和方向信息。默认为 false（使用缓存以提高性能）。如果设备可能会旋转或需要实时屏幕信息，设置为 true。
 
 ### Android Agent 上的更多接口
 
 
@@ -76,7 +76,7 @@ Midscene MCP 提供以下 Android 设备自动化工具：
   参数：
   - deviceId：（可选）要连接的设备 ID。如果未提供，使用第一个可用设备
   - displayId：（可选）多屏 Android 设备的显示屏 ID（如 0、1、2），当指定时，所有 ADB 输入操作将针对此特定显示屏
-  - alwaysFetchScreenInfo：（可选）是否每次都重新获取屏幕尺寸和方向信息。默认为 false（使用缓存以提高性能）。如果设备可能会旋转或需要实时屏幕信息，设置为 true
+  - alwaysRefreshScreenInfo：（可选）是否每次都重新获取屏幕尺寸和方向信息。默认为 false（使用缓存以提高性能）。如果设备可能会旋转或需要实时屏幕信息，设置为 true
   ```
 
 ### 应用控制
 
@@ -119,11 +119,17 @@ const main = async () => {
     const selectedDeviceId = await selectDevice();
     console.log(`✅ Selected device: ${selectedDeviceId}`);
 
-    // Create device and agent instances with selected device
-    const device = new AndroidDevice(selectedDeviceId);
-    const agent = new AndroidAgent(device);
+    // Create PlaygroundServer with agent factory
+    const playgroundServer = new PlaygroundServer(
+      // Agent factory - creates new agent with device each time
+      async () => {
+        const device = new AndroidDevice(selectedDeviceId);
+        await device.connect();
+        return new AndroidAgent(device);
+      },
+      staticDir,
+    );
 
-    const playgroundServer = new PlaygroundServer(device, agent, staticDir);
     const scrcpyServer = new ScrcpyServer();
 
     // Set the selected device in scrcpy server
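
Review note: the hunk above stops handing `PlaygroundServer` a shared device and agent and passes it a factory instead. The sketch below illustrates that pattern in isolation. `FakeDevice`, `FakeAgent`, `MiniServer`, `startSession`, and the device id are hypothetical stand-ins, not Midscene classes, and when the real server invokes the factory is an assumption here.

```typescript
// Hypothetical stand-ins for illustration only; these are not Midscene APIs.
class FakeDevice {
  connected = false;
  deviceId: string;
  constructor(deviceId: string) {
    this.deviceId = deviceId;
  }
  async connect(): Promise<void> {
    this.connected = true;
  }
}

class FakeAgent {
  device: FakeDevice;
  constructor(device: FakeDevice) {
    this.device = device;
  }
}

type AgentFactory = () => Promise<FakeAgent>;

class MiniServer {
  private readonly createAgent: AgentFactory;
  constructor(createAgent: AgentFactory) {
    this.createAgent = createAgent;
  }
  // Assumed behavior: every session asks the factory for a fresh agent.
  async startSession(): Promise<FakeAgent> {
    return this.createAgent();
  }
}

// The factory builds and connects a new device on each call,
// mirroring the shape of the callback in the hunk above.
const server = new MiniServer(async () => {
  const device = new FakeDevice('emulator-5554'); // hypothetical device id
  await device.connect();
  return new FakeAgent(device);
});

async function demo(): Promise<boolean[]> {
  const first = await server.startSession();
  const second = await server.startSession();
  return [
    first !== second, // distinct agents
    first.device !== second.device, // distinct devices
    first.device.connected && second.device.connected, // both connected
  ];
}

demo().then((checks) => console.log(checks));
```

With a factory, no device or agent state is shared between invocations, which is presumably the motivation for the change. The cost is one extra `connect()` per created agent.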
 
@@ -45,7 +45,7 @@ export async function agentFromAdbDevice(
     usePhysicalDisplayIdForDisplayLookup:
       opts?.usePhysicalDisplayIdForDisplayLookup,
     screenshotResizeScale: opts?.screenshotResizeScale,
-    alwaysFetchScreenInfo: opts?.alwaysFetchScreenInfo,
+    alwaysRefreshScreenInfo: opts?.alwaysRefreshScreenInfo,
   });
 
   await device.connect();
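
Review note: the renamed `alwaysRefreshScreenInfo` option is documented above as "Default is false (uses cache for better performance)". The following is a minimal sketch of that trade-off, assuming a simple cache-on-first-read strategy. `ScreenInfoCache`, `ScreenInfo`, and `query` are hypothetical names, and this is not the actual `AndroidDevice` implementation.

```typescript
// Hypothetical model of the option's documented behavior; not Midscene code.
interface ScreenInfo {
  width: number;
  height: number;
  orientation: number;
}

class ScreenInfoCache {
  private cached: ScreenInfo | undefined;
  private readonly fetchFromDevice: () => ScreenInfo;
  private readonly alwaysRefreshScreenInfo: boolean;
  fetchCount = 0;

  constructor(fetchFromDevice: () => ScreenInfo, alwaysRefreshScreenInfo = false) {
    this.fetchFromDevice = fetchFromDevice;
    this.alwaysRefreshScreenInfo = alwaysRefreshScreenInfo;
  }

  get(): ScreenInfo {
    let info = this.cached;
    // Query the device when forced to, or when nothing is cached yet.
    if (this.alwaysRefreshScreenInfo || info === undefined) {
      info = this.fetchFromDevice();
      this.fetchCount += 1;
      this.cached = info;
    }
    return info;
  }
}

// Simulated device that can rotate between portrait and landscape.
let rotated = false;
const query = (): ScreenInfo =>
  rotated
    ? { width: 1920, height: 1080, orientation: 1 }
    : { width: 1080, height: 1920, orientation: 0 };

const cachedView = new ScreenInfoCache(query); // default: reuse cached info
const liveView = new ScreenInfoCache(query, true); // re-fetch on every read

cachedView.get();
liveView.get();
rotated = true; // the device rotates after the first read

const staleOrientation = cachedView.get().orientation; // still the cached value
const freshOrientation = liveView.get().orientation; // reflects the rotation
console.log({ staleOrientation, freshOrientation });
```

In this model the cached instance queries the device once and misses the rotation, while the refreshing instance pays for a query on every read. That matches the documented advice to set the option to true when the device may rotate.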