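Action Space
The action space defines the set of human-like operations an agent can perform inside a sandbox, and therefore bounds what the agent is able to do. This document describes the action types supported by Lybic sandboxes and how to use them.
Overview
Lybic provides a unified action execution interface that supports two main usage modes:
- Computer Use: for Windows and Linux sandboxes; supports mouse and keyboard operations
- Mobile Use: for Android sandboxes; supports touch, swipe, app management, and other operations
We recommend using the action execution interface wrapped by the SDK, which simplifies the call flow and standardizes the call parameters. See the SDK documentation for details.
API
Execute an action
Endpoint: POST /api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute
Common parameters:
- action (object, required): the action object to execute (see the action types below)
- includeScreenShot (boolean, optional): whether the response includes the URL of a screenshot taken after the action; defaults to true
- includeCursorPosition (boolean, optional): whether the response includes the cursor/touch position after the action; defaults to true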
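As a rough illustration of how the common parameters fit together, the sketch below builds (but does not send) a request for the endpoint above using only the Python standard library. The helper name build_execute_request is hypothetical, and the host is simply the one that appears in this document's cURL examples; the SDK may expose a different interface.

```python
import json
import urllib.request

# Host as used in this document's cURL examples.
BASE_URL = "https://api.lybic.cn"


def build_execute_request(org_id, sandbox_id, api_key, action,
                          include_screenshot=True, include_cursor_position=True):
    """Build, without sending, a POST request for the actions/execute endpoint.

    Hypothetical helper: only the path and the JSON field names come from the text above.
    """
    url = f"{BASE_URL}/api/orgs/{org_id}/sandboxes/{sandbox_id}/actions/execute"
    body = {
        "action": action,
        "includeScreenShot": include_screenshot,
        "includeCursorPosition": include_cursor_position,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# Placeholder identifiers; replace them with real values before sending.
req = build_execute_request("ORG_ID", "SANDBOX_ID", "YOUR_API_KEY",
                            {"type": "screenshot"},
                            include_cursor_position=False)
print(req.full_url)
print(req.data.decode("utf-8"))
```

Passing the resulting object to urllib.request.urlopen would perform the call; it is left out here so the sketch runs without network access.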
Response format:
{
"screenShot": "https://...",
"cursorPosition": {
"x": 500,
"y": 300,
"screenWidth": 1920,
"screenHeight": 1080,
"screenIndex": 0
},
"actionResult": {}
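}
Computer Use Action Space
Desktop operations for Windows and Linux sandboxes.
1. Mouse operations
Click (mouse:click)
Clicks the mouse at the specified coordinates.
Parameters:
- type: "mouse:click"
- x: X coordinate (pixel or fraction)
- y: Y coordinate (pixel or fraction)
- button: mouse button flag (1 = left, 2 = right, 4 = middle, 8 = back, 16 = forward)
- holdKey (optional): modifier key(s) held during the click, e.g. "ctrl", "alt", "alt+shift"
- relative (optional): whether the coordinates are relative to the current mouse position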
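The click parameters can be assembled programmatically. The sketch below is one possible way to do that in Python; MouseButton, px, frac, and mouse_click are hypothetical names, and the coordinate object shapes follow the coordinate formats described right after this list.

```python
from enum import IntFlag


class MouseButton(IntFlag):
    """Button flag values as listed in the parameter list above."""
    LEFT = 1
    RIGHT = 2
    MIDDLE = 4
    BACK = 8
    FORWARD = 16


def px(value):
    """Pixel coordinate object."""
    return {"type": "px", "value": value}


def frac(numerator, denominator):
    """Fractional coordinate object, e.g. frac(1, 2) for 50% of the screen."""
    if denominator == 0:
        raise ValueError("denominator must be non-zero")
    return {"type": "/", "numerator": numerator, "denominator": denominator}


def mouse_click(x, y, button=MouseButton.LEFT, hold_key=None, relative=None):
    """Build a mouse:click action; optional fields are omitted when not set."""
    action = {"type": "mouse:click", "x": x, "y": y, "button": int(button)}
    if hold_key is not None:
        action["holdKey"] = hold_key
    if relative is not None:
        action["relative"] = relative
    return action


center_click = mouse_click(frac(1, 2), frac(1, 2))
ctrl_right_click = mouse_click(px(500), px(300), MouseButton.RIGHT, hold_key="ctrl")
print(center_click)
print(ctrl_right_click)
```

Leaving optional fields out of the payload, rather than sending null values, mirrors the cURL examples in this document.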
Coordinate formats:
Pixel coordinates:
{
"type": "px",
"value": 500
}
Fractional coordinates (recommended; they adapt to different resolutions):
{
"type": "/",
"numerator": 1,
"denominator": 2
}
cURL request examples:
# Left-click at the center of the screen (50%, 50%)
curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"action": {
"type": "mouse:click",
"x": {"type": "/", "numerator": 1, "denominator": 2},
"y": {"type": "/", "numerator": 1, "denominator": 2},
"button": 1
}
}'
# Right-click at the absolute position (500, 300)
curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"action": {
"type": "mouse:click",
"x": {"type": "px", "value": 500},
"y": {"type": "px", "value": 300},
"button": 2
}
}'
# Ctrl + left-click
curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"action": {
"type": "mouse:click",
"x": {"type": "px", "value": 500},
"y": {"type": "px", "value": 300},
"button": 1,
"holdKey": "ctrl"
}
}'
Double click (mouse:doubleClick)
Double-clicks the mouse at the specified coordinates.
Parameters: same as click
cURL request example:
curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"action": {
"type": "mouse:doubleClick",
"x": {"type": "px", "value": 400},
"y": {"type": "px", "value": 200},
"button": 1
}
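}'
Triple click (mouse:tripleClick)
Triple-clicks the mouse at the specified coordinates (typically used to select a whole line of text).
Parameters: same as click
cURL request example: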
curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"action": {
"type": "mouse:tripleClick",
"x": {"type": "px", "value": 400},
"y": {"type": "px", "value": 200},
"button": 1
}
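}'
Move (mouse:move)
Moves the mouse to the specified coordinates.
Parameters:
- type: "mouse:move"
- x: X coordinate
- y: Y coordinate
- holdKey (optional): modifier key(s) held during the move
- relative (optional): whether the move is relative
cURL request example: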
curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"action": {
"type": "mouse:move",
"x": {"type": "px", "value": 800},
"y": {"type": "px", "value": 600}
}
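}'
Drag (mouse:drag)
Drags from the start coordinates to the end coordinates.
Parameters:
- type: "mouse:drag"
- startX: start X coordinate
- startY: start Y coordinate
- endX: end X coordinate
- endY: end Y coordinate
- button (optional): mouse button; defaults to 1 (left)
- holdKey (optional): modifier key(s) held during the drag
- startRelative (optional): whether the start coordinates are relative
- endRelative (optional): whether the end coordinates are relative
cURL request example: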
curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"action": {
"type": "mouse:drag",
"startX": {"type": "px", "value": 100},
"startY": {"type": "px", "value": 100},
"endX": {"type": "px", "value": 500},
"endY": {"type": "px", "value": 500}
}
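}'
Scroll (mouse:scroll)
Scrolls the mouse wheel at the specified position.
Parameters:
- type: "mouse:scroll"
- x: X coordinate
- y: Y coordinate
- stepVertical: number of vertical scroll steps (positive scrolls up, negative scrolls down)
- stepHorizontal: number of horizontal scroll steps (positive scrolls right, negative scrolls left)
- holdKey (optional): modifier key(s) held while scrolling
- relative (optional): whether the coordinates are relative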
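Because the sign convention for stepVertical is easy to get backwards, a small wrapper with direction-named arguments may help. This is only a sketch; mouse_scroll and its steps_down and steps_right parameters are hypothetical names, not part of the API.

```python
def mouse_scroll(x, y, steps_down=0, steps_right=0):
    """Build a mouse:scroll action from direction-named step counts.

    Per the parameter list above, positive stepVertical scrolls up,
    so a downward scroll is sent as a negative value.
    """
    return {
        "type": "mouse:scroll",
        "x": x,
        "y": y,
        "stepVertical": -steps_down,
        "stepHorizontal": steps_right,
    }


center = {"type": "/", "numerator": 1, "denominator": 2}
scroll_down = mouse_scroll(center, center, steps_down=5)
print(scroll_down)
```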
cURL request example:
# Scroll down 5 steps
curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"action": {
"type": "mouse:scroll",
"x": {"type": "/", "numerator": 1, "denominator": 2},
"y": {"type": "/", "numerator": 1, "denominator": 2},
"stepVertical": -5,
"stepHorizontal": 0
}
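}'
2. Keyboard operations
Type text (keyboard:type)
Types the given text.
Parameters:
- type: "keyboard:type"
- content: the text to type
- treatNewLineAsEnter: whether to treat the newline character \n as the Enter key; defaults to false
cURL request examples:
# Type plain text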
curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"action": {
"type": "keyboard:type",
"content": "Hello, Lybic!",
"treatNewLineAsEnter": false
}
}'
# Type multi-line text (newlines sent as Enter)
curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"action": {
"type": "keyboard:type",
"content": "First line\nSecond line\nThird line",
"treatNewLineAsEnter": true
}
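}'
Hotkey (keyboard:hotkey)
Presses a keyboard shortcut combination.
Parameters:
- type: "keyboard:hotkey"
- keys: the key combination, in xdotool key syntax, e.g. "Return", "ctrl+c", "alt+Tab", "ctrl+shift+s"
- duration (optional): how long to hold the keys, in milliseconds; range 1-5000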
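The keys string and the duration range can be checked on the client before a request is made. The following is a minimal sketch under that assumption; the hotkey helper and its duration_ms parameter are hypothetical, and the server remains the authority on which key names are valid.

```python
def hotkey(*keys, duration_ms=None):
    """Build a keyboard:hotkey action from individual key names.

    Key names are joined with "+" as in the examples above ("ctrl+shift+s").
    The duration check mirrors the documented 1-5000 ms range.
    """
    if not keys:
        raise ValueError("at least one key is required")
    action = {"type": "keyboard:hotkey", "keys": "+".join(keys)}
    if duration_ms is not None:
        if not 1 <= duration_ms <= 5000:
            raise ValueError("duration must be between 1 and 5000 milliseconds")
        action["duration"] = duration_ms
    return action


save_as = hotkey("ctrl", "shift", "s")
hold_ctrl = hotkey("ctrl", duration_ms=1000)
print(save_as)
print(hold_ctrl)
```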
Common key names:
- Letter keys: "a", "b", "A", "B", etc.
- Number keys: "1", "2", "KP_0" (numpad 0), etc.
- Function keys: "F1", "F2", ..., "F12"
- Special keys: "Return" (Enter), "Escape", "Tab", "space", "BackSpace", "Delete"
- Arrow keys: "Up", "Down", "Left", "Right"
- Modifier keys: "ctrl", "alt", "shift", "super" (Windows/Command key)
cURL request examples:
# Press Enter
curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"action": {
"type": "keyboard:hotkey",
"keys": "Return"
}
}'
# Ctrl+C (copy)
curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"action": {
"type": "keyboard:hotkey",
"keys": "ctrl+c"
}
}'
# Ctrl+Shift+S (save as)
curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"action": {
"type": "keyboard:hotkey",
"keys": "ctrl+shift+s"
}
}'
# Hold Ctrl for 1 second
curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"action": {
"type": "keyboard:hotkey",
"keys": "ctrl",
"duration": 1000
}
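}'
Key down (key:down)
Presses a single key without releasing it. Use this only when keyboard:hotkey cannot meet your needs.
Parameters:
- type: "key:down"
- key: name of the key to press (xdotool syntax)
cURL request example: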
curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"action": {
"type": "key:down",
"key": "shift"
}
}'
Key up (key:up)
Releases a single key. Use this only after a matching key:down.
Parameters:
- type: "key:up"
- key: name of the key to release (xdotool syntax)
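Since key:up is meant to follow key:down, it can be convenient to generate the pair together so the key is never left pressed. The sketch below only builds the list of action objects; hold_key_around is a hypothetical helper, and it assumes that a key pressed with key:down stays held across the intermediate API calls.

```python
def hold_key_around(key, actions):
    """Return a sequence that holds `key` while `actions` run, then releases it."""
    return [
        {"type": "key:down", "key": key},
        *actions,
        {"type": "key:up", "key": key},
    ]


# Two illustrative clicks; the coordinates are arbitrary example values.
click_first = {"type": "mouse:click",
               "x": {"type": "px", "value": 400},
               "y": {"type": "px", "value": 200},
               "button": 1}
click_last = {"type": "mouse:click",
              "x": {"type": "px", "value": 400},
              "y": {"type": "px", "value": 500},
              "button": 1}

sequence = hold_key_around("shift", [click_first, click_last])
for step in sequence:
    print(step["type"])
```

Each element of the sequence would be sent as its own request to the execute endpoint, in order.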
cURL request example:
curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"action": {
"type": "key:up",
"key": "shift"
}
}'
3. General operations
Screenshot (screenshot)
Captures a screenshot of the current screen.
Parameters:
- type: "screenshot"
cURL request example:
curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"action": {
"type": "screenshot"
}
}'
Wait (wait)
Pauses for the specified time.
Parameters:
- type: "wait"
- duration: how long to wait, in milliseconds
cURL request example:
# Wait 3 seconds
curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"action": {
"type": "wait",
"duration": 3000
}
}'
Task finished (finished)
Indicates that the task completed successfully.
Parameters:
- type: "finished"
- message (optional): completion message
cURL request example:
curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"action": {
"type": "finished",
"message": "Task completed successfully"
}
}'
Task failed (failed)
Indicates that the task failed.
Parameters:
- type: "failed"
- message (optional): failure message
cURL request example:
curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"action": {
"type": "failed",
"message": "Unable to find the target element"
}
}'
User takeover (client:user-takeover)
Indicates that a human needs to take over control.
Parameters:
- type: "client:user-takeover"
cURL request example:
curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"action": {
"type": "client:user-takeover"
}
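}'
Mobile Use Action Space
Mobile operations for Android sandboxes.
1. Touch operations
Tap (touch:tap)
Taps the screen at the specified coordinates.
Parameters:
- type: "touch:tap"
- x: X coordinate (pixel or fraction)
- y: Y coordinate (pixel or fraction)
cURL request example:
# Tap the center of the screen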
curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"action": {
"type": "touch:tap",
"x": {"type": "/", "numerator": 1, "denominator": 2},
"y": {"type": "/", "numerator": 1, "denominator": 2}
}
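}'
Long press (touch:longPress)
Long-presses the screen at the specified coordinates.
Parameters:
- type: "touch:longPress"
- x: X coordinate
- y: Y coordinate
- duration: how long to hold, in milliseconds
cURL request example:
# Long press for 2 seconds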
curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"action": {
"type": "touch:longPress",
"x": {"type": "px", "value": 500},
"y": {"type": "px", "value": 800},
"duration": 2000
}
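}'
Drag (touch:drag)
Drags from the start coordinates to the end coordinates.
Parameters:
- type: "touch:drag"
- startX: start X coordinate
- startY: start Y coordinate
- endX: end X coordinate
- endY: end Y coordinate
cURL request example: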
curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"action": {
"type": "touch:drag",
"startX": {"type": "px", "value": 200},
"startY": {"type": "px", "value": 500},
"endX": {"type": "px", "value": 800},
"endY": {"type": "px", "value": 500}
}
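}'
Swipe (touch:swipe)
Swipes in the given direction from the specified position.
Parameters:
- type: "touch:swipe"
- x: X coordinate
- y: Y coordinate
- direction: swipe direction ("up", "down", "left", "right")
- distance: swipe distance (pixel or fraction)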
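A swipe payload combines a coordinate pair, a direction from a fixed set, and a distance in the same coordinate object format. The sketch below validates the direction on the client before building the action; touch_swipe and SWIPE_DIRECTIONS are hypothetical names.

```python
SWIPE_DIRECTIONS = ("up", "down", "left", "right")


def touch_swipe(x, y, direction, distance):
    """Build a touch:swipe action, rejecting directions outside the documented set."""
    if direction not in SWIPE_DIRECTIONS:
        raise ValueError(f"direction must be one of {SWIPE_DIRECTIONS}, got {direction!r}")
    return {
        "type": "touch:swipe",
        "x": x,
        "y": y,
        "direction": direction,
        "distance": distance,
    }


center = {"type": "/", "numerator": 1, "denominator": 2}
# Distance given as a fraction (one quarter) instead of pixels.
swipe_up = touch_swipe(center, center, "up", {"type": "/", "numerator": 1, "denominator": 4})
print(swipe_up)
```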
cURL request example:
# Swipe up from the center of the screen
curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"action": {
"type": "touch:swipe",
"x": {"type": "/", "numerator": 1, "denominator": 2},
"y": {"type": "/", "numerator": 1, "denominator": 2},
"direction": "up",
"distance": {"type": "px", "value": 300}
}
}'
2. Android system keys
Back key (android:back)
Presses the Android Back key.
Parameters:
- type: "android:back"
cURL request example:
curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"action": {
"type": "android:back"
}
}'
Home key (android:home)
Presses the Android Home key.
Parameters:
- type: "android:home"
cURL request example:
curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"action": {
"type": "android:home"
}
}'
3. App management
Start app (by package name) (os:startApp)
Starts an app by its package name.
Parameters:
- type: "os:startApp"
- packageName: the app's package name
cURL request example:
curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"action": {
"type": "os:startApp",
"packageName": "com.android.chrome"
}
}'
Start app (by name) (os:startAppByName)
Starts an app by its display name.
Parameters:
- type: "os:startAppByName"
- name: the app's name
cURL request example:
curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"action": {
"type": "os:startAppByName",
"name": "Chrome"
}
}'
Close app (by package name) (os:closeApp)
Closes an app by its package name.
Parameters:
- type: "os:closeApp"
- packageName: the app's package name
cURL request example:
curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"action": {
"type": "os:closeApp",
"packageName": "com.android.chrome"
}
}'
Close app (by name) (os:closeAppByName)
Closes an app by its display name.
Parameters:
- type: "os:closeAppByName"
- name: the app's name
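The four start/close actions differ only in the type string and in whether they take packageName or name. One way to keep that mapping in a single place is sketched below; app_action is a hypothetical helper, not an SDK function.

```python
def app_action(operation, *, package_name=None, name=None):
    """Build an os:startApp / os:closeApp action, or the ByName variant.

    Exactly one of package_name or name must be given.
    """
    if operation not in ("start", "close"):
        raise ValueError("operation must be 'start' or 'close'")
    if (package_name is None) == (name is None):
        raise ValueError("provide exactly one of package_name or name")
    if package_name is not None:
        return {"type": f"os:{operation}App", "packageName": package_name}
    return {"type": f"os:{operation}AppByName", "name": name}


start_chrome = app_action("start", package_name="com.android.chrome")
close_chrome = app_action("close", name="Chrome")
print(start_chrome)
print(close_chrome)
```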
cURL request example:
curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"action": {
"type": "os:closeAppByName",
"name": "Chrome"
}
}'
List all apps (os:listApps)
Returns the list of all apps installed on the device.
Parameters:
- type: "os:listApps"
cURL request example:
curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"action": {
"type": "os:listApps"
}
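}'
4. Keyboard operations (Mobile)
Mobile Use also supports keyboard operations, with the same parameters as Computer Use:
- keyboard:type - type text
- keyboard:hotkey - press a hotkey (temporarily unsupported for security reasons; use keyboard:type to enter text instead)
See the Computer Use keyboard operations section.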
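Because keyboard:hotkey is currently unavailable on Mobile Use, a client-side guard can catch it before a request is sent. The sketch below is an assumption about how a caller might enforce this; type_text and check_mobile_action are hypothetical names, and whether a trailing newline with treatNewLineAsEnter submits a field on Android is not confirmed by this document.

```python
def type_text(content, treat_newline_as_enter=False):
    """Build a keyboard:type action."""
    return {
        "type": "keyboard:type",
        "content": content,
        "treatNewLineAsEnter": treat_newline_as_enter,
    }


def check_mobile_action(action):
    """Reject actions this document marks as unavailable on Mobile Use."""
    if action.get("type") == "keyboard:hotkey":
        raise ValueError(
            "keyboard:hotkey is currently unsupported on Mobile Use; "
            "use keyboard:type instead"
        )
    return action


search = check_mobile_action(type_text("lybic sandbox\n", treat_newline_as_enter=True))
print(search)
```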
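5. General operations (Mobile)
Mobile Use also supports the following general operations:
- screenshot - take a screenshot
- wait - wait
- finished - task finished
- failed - task failed
- client:user-takeover - user takeover
See the Computer Use general operations section.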
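Usage notes
- Replace placeholders: in real use, replace {orgId}, {sandboxId}, YOUR_API_KEY, and similar placeholders in the examples with actual values
- Coordinate system:
  - Fractional coordinates {"type": "/", "numerator": x, "denominator": y} are recommended because they adapt automatically to different resolutions
  - Pixel coordinates {"type": "px", "value": x} suit fixed-resolution scenarios
- Combining actions: complex operations can be built from multiple API calls
- Error handling: check the actionResult field in the response for details of the execution result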
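The last two notes, combining actions and checking actionResult, can be sketched as a small loop. Here send is a hypothetical callable that performs one API call and returns the decoded JSON response; fake_send stands in for it so the example runs offline. The structure of actionResult is not specified in this document, so the loop only collects it.

```python
def run_actions(send, actions):
    """Send actions one by one and collect each response's actionResult."""
    results = []
    for action in actions:
        response = send(action)
        results.append((action["type"], response.get("actionResult")))
    return results


def fake_send(action):
    """Offline stand-in that returns a response shaped like the documented format."""
    return {"screenShot": "https://example.invalid/shot.png", "actionResult": {}}


# Select all, type a replacement, then press Enter: three separate API calls.
steps = [
    {"type": "keyboard:hotkey", "keys": "ctrl+a"},
    {"type": "keyboard:type", "content": "Hello, Lybic!", "treatNewLineAsEnter": False},
    {"type": "keyboard:hotkey", "keys": "Return"},
]

for action_type, result in run_actions(fake_send, steps):
    print(action_type, result)
```

Injecting send keeps the sequencing logic independent of the HTTP client or SDK that actually performs the call.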