Lybic Docs

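Action Space

The action space defines the set of human-like operations that an agent can perform in a sandbox, which bounds the range of the agent's behavior. This page describes the action types supported by the Lybic sandbox and how to use them.

Overview

Lybic provides a unified action execution interface that supports two main usage modes:

  • Computer Use: for Windows and Linux sandboxes; supports mouse and keyboard operations
  • Mobile Use: for Android sandboxes; supports touch, swipe, app management, and similar operations

We recommend using the action execution interface wrapped by the SDK, which simplifies the call flow and standardizes the call parameters. See the SDK documentation for details.

API

Execute an action

Endpoint: POST /api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute

Common parameters:

  • action (object, required): the action object to execute (see the action types below)
  • includeScreenShot (boolean, optional): whether the response includes the URL of a screenshot taken after the action executes; defaults to true
  • includeCursorPosition (boolean, optional): whether the response includes the cursor/touch position after the action executes; defaults to true

Response format: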

{
  "screenShot": "https://...",
  "cursorPosition": {
    "x": 500,
    "y": 300,
    "screenWidth": 1920,
    "screenHeight": 1080,
    "screenIndex": 0
  },
  "actionResult": {}
}
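
For readers who are not using the SDK, the request described above can be sketched in Python with only the standard library. This is a minimal sketch, not the SDK's implementation: the function names are hypothetical, and the base URL and Bearer-token header are taken from the cURL examples on this page.

```python
import json
import urllib.request

# Base URL as it appears in the cURL examples on this page.
BASE_URL = "https://api.lybic.cn"


def build_execute_request(org_id, sandbox_id, api_key, action,
                          include_screenshot=True, include_cursor_position=True):
    """Build (but do not send) a POST request for the actions/execute endpoint."""
    url = f"{BASE_URL}/api/orgs/{org_id}/sandboxes/{sandbox_id}/actions/execute"
    body = {
        "action": action,
        "includeScreenShot": include_screenshot,
        "includeCursorPosition": include_cursor_position,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def execute_action(request, timeout=30):
    """Send the request and return the decoded JSON response as a dict."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Building the request separately from sending it makes the payload easy to inspect or log before any network call is made.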

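Computer Use action space

For desktop operations in Windows and Linux sandboxes.

1. Mouse operations

Click (mouse:click)

Clicks the mouse at the specified coordinates.

Parameters:

  • type: "mouse:click"
  • x: X coordinate (pixel or fractional)
  • y: Y coordinate (pixel or fractional)
  • button: mouse button flag (1=left, 2=right, 4=middle, 8=back, 16=forward)
  • holdKey (optional): modifier key(s) held during the click, such as "ctrl", "alt", "alt+shift"
  • relative (optional): whether the coordinates are relative to the current mouse position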
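
The button values listed above are powers of two, so they can be modeled as a flag enum. The sketch below is one way to build a click action in Python; the names `MouseButton` and `click_action` are hypothetical, and whether the API accepts combined button values is not stated on this page.

```python
from enum import IntFlag


class MouseButton(IntFlag):
    """Button values as listed in the parameter table above."""
    LEFT = 1
    RIGHT = 2
    MIDDLE = 4
    BACK = 8
    FORWARD = 16


def click_action(x, y, button=MouseButton.LEFT, hold_key=None, relative=None):
    """Return a mouse:click action dict; optional fields are omitted when unset."""
    action = {"type": "mouse:click", "x": x, "y": y, "button": int(button)}
    if hold_key is not None:
        action["holdKey"] = hold_key
    if relative is not None:
        action["relative"] = relative
    return action


# Ctrl + right-click at pixel position (500, 300)
action = click_action(
    {"type": "px", "value": 500},
    {"type": "px", "value": 300},
    button=MouseButton.RIGHT,
    hold_key="ctrl",
)
```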

Coordinate formats:

Pixel coordinates:

{
  "type": "px",
  "value": 500
}

Fractional coordinates (recommended; they adapt to different resolutions):

{
  "type": "/",
  "numerator": 1,
  "denominator": 2
}
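
The two coordinate formats can be produced by small helpers, sketched below in Python. The helper names are hypothetical. `to_pixels` only illustrates how a fraction relates to a screen dimension; the rounding the server actually applies is an assumption.

```python
def px(value):
    """Pixel coordinate, e.g. px(500)."""
    return {"type": "px", "value": value}


def frac(numerator, denominator):
    """Fractional coordinate, e.g. frac(1, 2) for the middle of an axis."""
    if denominator == 0:
        raise ValueError("denominator must not be zero")
    return {"type": "/", "numerator": numerator, "denominator": denominator}


def to_pixels(coord, screen_size):
    """Illustrative client-side conversion of a coordinate to pixels.

    Assumption: a fraction is multiplied by the screen dimension and rounded.
    """
    if coord["type"] == "px":
        return coord["value"]
    return round(screen_size * coord["numerator"] / coord["denominator"])


center_x = to_pixels(frac(1, 2), 1920)  # 960 on a 1920-pixel-wide screen
center_y = to_pixels(frac(1, 2), 1080)  # 540 on a 1080-pixel-high screen
```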

cURL request examples:

# Left-click at the center of the screen (50%, 50%)
curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "action": {
      "type": "mouse:click",
      "x": {"type": "/", "numerator": 1, "denominator": 2},
      "y": {"type": "/", "numerator": 1, "denominator": 2},
      "button": 1
    }
  }'

# Right-click at the absolute position (500, 300)
curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "action": {
      "type": "mouse:click",
      "x": {"type": "px", "value": 500},
      "y": {"type": "px", "value": 300},
      "button": 2
    }
  }'

# Ctrl + left-click
curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "action": {
      "type": "mouse:click",
      "x": {"type": "px", "value": 500},
      "y": {"type": "px", "value": 300},
      "button": 1,
      "holdKey": "ctrl"
    }
  }'

Double-click (mouse:doubleClick)

Double-clicks the mouse at the specified coordinates.

Parameters: same as click

cURL request example:

curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "action": {
      "type": "mouse:doubleClick",
      "x": {"type": "px", "value": 400},
      "y": {"type": "px", "value": 200},
      "button": 1
    }
  }'

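Triple-click (mouse:tripleClick)

Triple-clicks the mouse at the specified coordinates (typically used to select a whole line of text).

Parameters: same as click

cURL request example: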

curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "action": {
      "type": "mouse:tripleClick",
      "x": {"type": "px", "value": 400},
      "y": {"type": "px", "value": 200},
      "button": 1
    }
  }'

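Move (mouse:move)

Moves the mouse to the specified coordinates.

Parameters:

  • type: "mouse:move"
  • x: X coordinate
  • y: Y coordinate
  • holdKey (optional): modifier key(s) held during the move
  • relative (optional): whether the move is relative

cURL request example: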

curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "action": {
      "type": "mouse:move",
      "x": {"type": "px", "value": 800},
      "y": {"type": "px", "value": 600}
    }
  }'

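Drag (mouse:drag)

Drags from the start coordinates to the end coordinates.

Parameters:

  • type: "mouse:drag"
  • startX: start X coordinate
  • startY: start Y coordinate
  • endX: end X coordinate
  • endY: end Y coordinate
  • button (optional): mouse button; defaults to 1 (left)
  • holdKey (optional): modifier key(s) held during the drag
  • startRelative (optional): whether the start coordinates are relative
  • endRelative (optional): whether the end coordinates are relative

cURL request example: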

curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "action": {
      "type": "mouse:drag",
      "startX": {"type": "px", "value": 100},
      "startY": {"type": "px", "value": 100},
      "endX": {"type": "px", "value": 500},
      "endY": {"type": "px", "value": 500}
    }
  }'

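Scroll (mouse:scroll)

Scrolls the mouse wheel at the specified position.

Parameters:

  • type: "mouse:scroll"
  • x: X coordinate
  • y: Y coordinate
  • stepVertical: number of vertical scroll steps (positive scrolls up, negative scrolls down)
  • stepHorizontal: number of horizontal scroll steps (positive scrolls right, negative scrolls left)
  • holdKey (optional): modifier key(s) held during the scroll
  • relative (optional): whether the coordinates are relative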
cURL request example:

# Scroll down 5 steps
curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "action": {
      "type": "mouse:scroll",
      "x": {"type": "/", "numerator": 1, "denominator": 2},
      "y": {"type": "/", "numerator": 1, "denominator": 2},
      "stepVertical": -5,
      "stepHorizontal": 0
    }
  }'

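2. Keyboard operations

Type text (keyboard:type)

Types text.

Parameters:

  • type: "keyboard:type"
  • content: the text to type
  • treatNewLineAsEnter: whether to treat the newline character \n as the Enter key; defaults to false

cURL request examples:

# Type plain text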
curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "action": {
      "type": "keyboard:type",
      "content": "Hello, Lybic!",
      "treatNewLineAsEnter": false
    }
  }'

# Type multi-line text (newlines are sent as Enter)
curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "action": {
      "type": "keyboard:type",
      "content": "First line\nSecond line\nThird line",
      "treatNewLineAsEnter": true
    }
  }'

Hotkey (keyboard:hotkey)

Presses a keyboard hotkey combination.

Parameters:

  • type: "keyboard:hotkey"
  • keys: the hotkey combination, in xdotool key syntax. For example: "Return", "ctrl+c", "alt+Tab", "ctrl+shift+s"
  • duration (optional): how long to hold the keys, in milliseconds; range 1-5000

Common key names:

  • Letter keys: "a", "b", "A", "B"
  • Number keys: "1", "2", "KP_0" (numpad 0), etc.
  • Function keys: "F1", "F2", ... "F12"
  • Special keys: "Return" (Enter), "Escape", "Tab", "space", "BackSpace", "Delete"
  • Arrow keys: "Up", "Down", "Left", "Right"
  • Modifier keys: "ctrl", "alt", "shift", "super" (Windows/Command key)
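
A hotkey string is the key names joined with "+", and the duration has a documented range. The Python sketch below assembles the action and checks that range before any request is sent; `hotkey_action` is a hypothetical helper, and it does not validate key names against xdotool.

```python
def hotkey_action(*keys, duration=None):
    """Return a keyboard:hotkey action dict from one or more key names.

    Key names are joined with '+', e.g. ("ctrl", "shift", "s") -> "ctrl+shift+s".
    duration, if given, must be within the documented 1-5000 ms range.
    """
    if not keys:
        raise ValueError("at least one key is required")
    action = {"type": "keyboard:hotkey", "keys": "+".join(keys)}
    if duration is not None:
        if not 1 <= duration <= 5000:
            raise ValueError("duration must be between 1 and 5000 milliseconds")
        action["duration"] = duration
    return action


save_as = hotkey_action("ctrl", "shift", "s")
hold_ctrl = hotkey_action("ctrl", duration=1000)
```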

cURL request examples:

# Press Enter
curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "action": {
      "type": "keyboard:hotkey",
      "keys": "Return"
    }
  }'

# Ctrl+C (copy)
curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "action": {
      "type": "keyboard:hotkey",
      "keys": "ctrl+c"
    }
  }'

# Ctrl+Shift+S (save as)
curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "action": {
      "type": "keyboard:hotkey",
      "keys": "ctrl+shift+s"
    }
  }'

# Hold Ctrl for 1 second
curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "action": {
      "type": "keyboard:hotkey",
      "keys": "ctrl",
      "duration": 1000
    }
  }'

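Key down (key:down)

Presses a single key without releasing it. Use only when keyboard:hotkey cannot meet your needs.

Parameters:

  • type: "key:down"
  • key: name of the key to press (xdotool syntax)

cURL request example: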

curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "action": {
      "type": "key:down",
      "key": "shift"
    }
  }'

Key up (key:up)

Releases a single key. Use only after key:down.

Parameters:

  • type: "key:up"
  • key: name of the key to release (xdotool syntax)

cURL request example:

curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "action": {
      "type": "key:up",
      "key": "shift"
    }
  }'
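
Since key:up is meant to follow key:down, it can help to generate the pair together so a key is never left pressed. The Python sketch below wraps a list of actions between a matching key:down and key:up; `with_key_held` is a hypothetical helper, and each action in the resulting list would still be sent as its own API call.

```python
def with_key_held(key, actions):
    """Return a new action list: key:down, then the given actions, then key:up."""
    return [
        {"type": "key:down", "key": key},
        *actions,
        {"type": "key:up", "key": key},
    ]


# Hold shift while clicking at two positions.
clicks = [
    {"type": "mouse:click", "x": {"type": "px", "value": 100},
     "y": {"type": "px", "value": 100}, "button": 1},
    {"type": "mouse:click", "x": {"type": "px", "value": 300},
     "y": {"type": "px", "value": 100}, "button": 1},
]
sequence = with_key_held("shift", clicks)
```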

3. General operations

Screenshot (screenshot)

Captures the current screen.

Parameters:

  • type: "screenshot"

cURL request example:

curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "action": {
      "type": "screenshot"
    }
  }'

Wait (wait)

Pauses for the specified duration.

Parameters:

  • type: "wait"
  • duration: how long to wait, in milliseconds

cURL request example:

# Wait 3 seconds
curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "action": {
      "type": "wait",
      "duration": 3000
    }
  }'

Task finished (finished)

Indicates that the task completed successfully.

Parameters:

  • type: "finished"
  • message (optional): completion message

cURL request example:

curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "action": {
      "type": "finished",
      "message": "Task completed successfully"
    }
  }'

Task failed (failed)

Indicates that the task failed.

Parameters:

  • type: "failed"
  • message (optional): failure message

cURL request example:

curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "action": {
      "type": "failed",
      "message": "Unable to find the target element"
    }
  }'

User takeover (client:user-takeover)

Indicates that a human needs to take over control.

Parameters:

  • type: "client:user-takeover"

cURL request example:

curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "action": {
      "type": "client:user-takeover"
    }
  }'

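Mobile Use action space

For mobile operations in Android sandboxes.

1. Touch operations

Tap (touch:tap)

Taps the screen at the specified coordinates.

Parameters:

  • type: "touch:tap"
  • x: X coordinate (pixel or fractional)
  • y: Y coordinate (pixel or fractional)

cURL request example:

# Tap the center of the screen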
curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "action": {
      "type": "touch:tap",
      "x": {"type": "/", "numerator": 1, "denominator": 2},
      "y": {"type": "/", "numerator": 1, "denominator": 2}
    }
  }'

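Long press (touch:longPress)

Long-presses the screen at the specified coordinates.

Parameters:

  • type: "touch:longPress"
  • x: X coordinate
  • y: Y coordinate
  • duration: how long to hold, in milliseconds

cURL request example:

# Long-press for 2 seconds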
curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "action": {
      "type": "touch:longPress",
      "x": {"type": "px", "value": 500},
      "y": {"type": "px", "value": 800},
      "duration": 2000
    }
  }'

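Drag (touch:drag)

Drags from the start coordinates to the end coordinates.

Parameters:

  • type: "touch:drag"
  • startX: start X coordinate
  • startY: start Y coordinate
  • endX: end X coordinate
  • endY: end Y coordinate

cURL request example: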

curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "action": {
      "type": "touch:drag",
      "startX": {"type": "px", "value": 200},
      "startY": {"type": "px", "value": 500},
      "endX": {"type": "px", "value": 800},
      "endY": {"type": "px", "value": 500}
    }
  }'

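Swipe (touch:swipe)

Swipes in the specified direction from the specified position.

Parameters:

  • type: "touch:swipe"
  • x: X coordinate
  • y: Y coordinate
  • direction: swipe direction ("up", "down", "left", "right")
  • distance: swipe distance (pixel or fractional)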
cURL request example:

# Swipe up from the center of the screen
curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "action": {
      "type": "touch:swipe",
      "x": {"type": "/", "numerator": 1, "denominator": 2},
      "y": {"type": "/", "numerator": 1, "denominator": 2},
      "direction": "up",
      "distance": {"type": "px", "value": 300}
    }
  }'

2. Android system keys

Back (android:back)

Presses the Android Back key.

Parameters:

  • type: "android:back"

cURL request example:

curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "action": {
      "type": "android:back"
    }
  }'

Home (android:home)

Presses the Android Home key.

Parameters:

  • type: "android:home"

cURL request example:

curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "action": {
      "type": "android:home"
    }
  }'

3. App management

Start app by package name (os:startApp)

Starts an app by its package name.

Parameters:

  • type: "os:startApp"
  • packageName: the app's package name

cURL request example:

curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "action": {
      "type": "os:startApp",
      "packageName": "com.android.chrome"
    }
  }'

Start app by name (os:startAppByName)

Starts an app by its display name.

Parameters:

  • type: "os:startAppByName"
  • name: the app's name

cURL request example:

curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "action": {
      "type": "os:startAppByName",
      "name": "Chrome"
    }
  }'

Close app by package name (os:closeApp)

Closes an app by its package name.

Parameters:

  • type: "os:closeApp"
  • packageName: the app's package name

cURL request example:

curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "action": {
      "type": "os:closeApp",
      "packageName": "com.android.chrome"
    }
  }'

Close app by name (os:closeAppByName)

Closes an app by its display name.

Parameters:

  • type: "os:closeAppByName"
  • name: the app's name

cURL request example:

curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "action": {
      "type": "os:closeAppByName",
      "name": "Chrome"
    }
  }'

List all apps (os:listApps)

Returns the list of all apps installed on the device.

Parameters:

  • type: "os:listApps"

cURL request example:

curl -X POST "https://api.lybic.cn/api/orgs/{orgId}/sandboxes/{sandboxId}/actions/execute" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "action": {
      "type": "os:listApps"
    }
  }'

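4. Keyboard operations (Mobile)

Mobile Use also supports keyboard operations, which work the same way as in Computer Use:

  • keyboard:type - type text
  • keyboard:hotkey - press a hotkey (support is suspended for security reasons; use keyboard:type to enter text instead)

See the Computer Use keyboard operations section.

5. General operations (Mobile)

Mobile Use also supports the following general operations:

  • screenshot - screenshot
  • wait - wait
  • finished - task finished
  • failed - task failed
  • client:user-takeover - user takeover

See the Computer Use general operations section.

Usage notes

  1. Replace placeholders: in real use, replace {orgId}, {sandboxId}, and YOUR_API_KEY in the examples with actual values
  2. Coordinate system:
    • Fractional coordinates {"type": "/", "numerator": x, "denominator": y} are recommended because they adapt automatically to different resolutions
    • Pixel coordinates {"type": "px", "value": x} suit fixed-resolution scenarios
  3. Combining actions: complex operations can be built from multiple API calls
  4. Error handling: check the actionResult field in the response for details of the execution result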
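
Notes 3 and 4 can be sketched together in Python: a complex operation is a list of actions sent one call at a time, with each response's actionResult collected for inspection. The function names are hypothetical, `send` stands in for whatever function actually calls the API, and the structure of actionResult is not specified on this page, so the sketch only collects it.

```python
def replace_line_actions(x, y, text):
    """Actions that select a line by triple-click, type over it, and press Enter."""
    return [
        {"type": "mouse:tripleClick", "x": x, "y": y, "button": 1},
        {"type": "keyboard:type", "content": text, "treatNewLineAsEnter": False},
        {"type": "keyboard:hotkey", "keys": "Return"},
    ]


def run_sequence(actions, send):
    """Send each action in order via send(action) and collect actionResult values.

    send is any callable that takes an action dict and returns the decoded
    JSON response as a dict.
    """
    results = []
    for action in actions:
        response = send(action)
        results.append(response.get("actionResult"))
    return results


# A stand-in sender that records actions instead of calling the API.
sent = []


def fake_send(action):
    sent.append(action["type"])
    return {"actionResult": {}}


center = {"type": "/", "numerator": 1, "denominator": 2}
results = run_sequence(replace_line_actions(center, center, "Hello, Lybic!"), fake_send)
```

Passing the sender in as a parameter keeps the sequencing logic testable without network access.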
