Lybic GUI Agent Overview

Lybic GUI Agent is an open-source framework and service for creating computer-use and mobile-use agents that can see, reason, and act on real Windows, macOS, and Linux desktops and on Android (sandbox) devices. It integrates multi-provider LLM orchestration, visual grounding (UI-TARS), action planning, memory, observability, and optional persistence.

Key capabilities:

  • Cross-platform GUI control: local backends (pyautogui, pyautogui_vmware) or the remote Lybic sandbox
  • Multiple model providers (OpenAI, Anthropic, Gemini, Doubao, etc.) via a configurable tool graph
  • Fast vs Normal execution modes, takeover (human-in-the-loop), benchmarking (OSWorld) and Prometheus metrics
  • gRPC service mode with optional PostgreSQL task storage

Installation

From PyPI:

pip install lybic-guiagents

From source (uv):

git clone git@github.com:lybic/agent.git
cd agent
curl -LsSf https://astral.sh/uv/0.8.5/install.sh | sh
uv sync
source .venv/bin/activate
uv pip install -e .

If encoding errors occur on Windows, set PYTHONUTF8=1 before installing.

Explanation of optional dependencies in pyproject.toml

Agentic Lybic's default dependencies only include core functionality (the Lybic backend and interactive CLI agent). The following optional dependencies enable additional backend support or specific features:

[project.optional-dependencies]
dev=[]
all=[]
pyautogui=[]
vmware-and-evaluation=[]
restful=[]
mcp=[]
grpc=[]
prometheus=[]
postgres=[]

Dependency              Description
dev                     Dependencies related to local development and testing.
pyautogui               (Backend) Enables the local desktop automation backend.
vmware-and-evaluation   (Backend) Enables the VMware backend and the OSWorld evaluation tool.
restful                 (Service) Enables Agentic Lybic RESTful API server support.
mcp                     (Service) Enables Agentic Lybic MCP (Model Context Protocol) server support.
grpc                    (Service) Enables Agentic Lybic gRPC server support.
prometheus              (Feature) Enables Prometheus metrics support for the Agent server.
postgres                (Feature) Enables PostgreSQL memory and persistent data storage support for the Agent server.

Basic Configuration

Model Credential Mapping

The API keys you must place in .env depend on which providers your tools_config.json references.

  • Each tool entry has a provider and a model_name. For every provider that appears (e.g. doubao, gemini, anthropic, openai), set the corresponding API key from .env.example in your .env.
  • If you switch to the English tool config (tools_config_en.json), make sure the keys for its providers (e.g. GEMINI_API_KEY, ARK_API_KEY, ANTHROPIC_API_KEY, OPENAI_API_KEY) are present as needed.
  • The Chinese config (tools_config_cn.json) may require only ARK_API_KEY when all tools use Doubao.
  • Unused provider keys can be left blank or removed; a key is required only when a tool stage references its provider.

Refer to gui_agents/tools/model.md for the full list of providers and supported models. After editing tools_config.json, verify that every provider value in it has a matching credential in your environment.
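
The verification step can be sketched in Python. This is a minimal sketch, not part of the project: the provider-to-key mapping is an assumption inferred from the key names mentioned above, and because the exact layout of tools_config.json is not described here, the script simply walks the whole JSON document looking for "provider" fields. The helper names (collect_providers, check_credentials) are hypothetical.

```python
import json
import os
from pathlib import Path

# Assumed mapping from provider name to .env key, inferred from the key
# names mentioned above; adjust it to match gui_agents/.env.example.
PROVIDER_ENV_KEYS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "gemini": "GEMINI_API_KEY",
    "doubao": "ARK_API_KEY",
}


def collect_providers(node):
    """Return every string value stored under a "provider" key, at any depth."""
    found = set()
    if isinstance(node, dict):
        provider = node.get("provider")
        if isinstance(provider, str):
            found.add(provider.lower())
        for value in node.values():
            found |= collect_providers(value)
    elif isinstance(node, list):
        for item in node:
            found |= collect_providers(item)
    return found


def check_credentials(config, env):
    """Return (missing_env_keys, unmapped_providers) for a parsed tool config."""
    missing, unmapped = [], []
    for provider in sorted(collect_providers(config)):
        key = PROVIDER_ENV_KEYS.get(provider)
        if key is None:
            unmapped.append(provider)
        elif not env.get(key):
            missing.append(key)
    return missing, unmapped


if __name__ == "__main__":
    config_path = Path("gui_agents/tools/tools_config.json")
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
        # os.environ only sees keys that were exported or loaded from .env.
        missing, unmapped = check_credentials(config, os.environ)
        print("missing keys:", missing or "none")
        print("providers without a known key:", unmapped or "none")
```

Run it from the repository root after exporting the variables from .env; an empty "missing keys" list suggests every referenced provider has a credential, under the mapping assumed above.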

Copy env example then add keys:

cp gui_agents/.env.example gui_agents/.env
# edit .env and add provider keys

Choose tool config (English vs Chinese):

cp gui_agents/tools/tools_config_en.json gui_agents/tools/tools_config.json
# or
cp gui_agents/tools/tools_config_cn.json gui_agents/tools/tools_config.json

Recommended (visual grounding + action):

  • grounding / fast_action_generator: doubao-1-5-ui-tars-250428
  • action_generator: claude-sonnet-4-20250514 or doubao-seed-1-6-250615

CLI Usage

Interactive (Lybic sandbox backend):

python gui_agents/cli_app.py --backend lybic

Local fast mode:

python gui_agents/cli_app.py --backend pyautogui --mode fast

Single query capped at 20 steps:

python gui_agents/cli_app.py --backend pyautogui --query "Find the result of 8 × 7 on a calculator" --max-steps 20

Important options:

  • --backend [lybic|lybic_mobile|pyautogui|pyautogui_vmware]
  • --max-steps N
  • --mode [normal|fast]
  • --enable-takeover
  • --disable-search

Warning: pyautogui backends directly control your local computer.
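
When the CLI is driven from another script, the options above can be assembled programmatically. The following is a hedged sketch that uses only the flags listed in this section; build_cli_command and run_agent are hypothetical helper names, and the script path assumes the working directory is the repository root.

```python
import subprocess
import sys

# Values taken from the option list above.
BACKENDS = {"lybic", "lybic_mobile", "pyautogui", "pyautogui_vmware"}
MODES = {"normal", "fast"}


def build_cli_command(backend, query=None, mode=None, max_steps=None,
                      enable_takeover=False, disable_search=False):
    """Build an argv list for gui_agents/cli_app.py from the documented options."""
    if backend not in BACKENDS:
        raise ValueError(f"unknown backend: {backend!r}")
    if mode is not None and mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    cmd = [sys.executable, "gui_agents/cli_app.py", "--backend", backend]
    if mode is not None:
        cmd += ["--mode", mode]
    if query is not None:
        cmd += ["--query", query]
    if max_steps is not None:
        cmd += ["--max-steps", str(max_steps)]
    if enable_takeover:
        cmd.append("--enable-takeover")
    if disable_search:
        cmd.append("--disable-search")
    return cmd


def run_agent(**options):
    """Run the CLI as a child process and return its exit code."""
    return subprocess.run(build_cli_command(**options)).returncode
```

For example, run_agent(backend="lybic", query="Take a screenshot", max_steps=20) would launch a single capped query against the sandbox backend. Passing the arguments as a list avoids shell quoting problems with queries that contain spaces or symbols.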

Docker

docker run --rm -it --env-file gui_agents/.env agenticlybic/guiagent --backend lybic

Sandbox Variables

LYBIC_API_KEY=your_key
LYBIC_ORG_ID=your_org
LYBIC_MAX_LIFE_SECONDS=3600
# optional precreated sandbox
LYBIC_PRECREATE_SID=SBX-XXXXXXXXXXXXXXX
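
A quick pre-flight check of these variables can catch typos before a sandbox is requested. This is only a sketch under two assumptions that this page does not state explicitly: that LYBIC_API_KEY and LYBIC_ORG_ID are both required for the Lybic backend, and that LYBIC_MAX_LIFE_SECONDS must be a positive whole number. The function name check_sandbox_env is hypothetical.

```python
import os


def check_sandbox_env(env):
    """Return a list of human-readable problems with the sandbox variables."""
    problems = []
    # Assumption: both credentials are required for the Lybic backend.
    for name in ("LYBIC_API_KEY", "LYBIC_ORG_ID"):
        if not env.get(name):
            problems.append(f"{name} is not set")
    lifetime = env.get("LYBIC_MAX_LIFE_SECONDS")
    if lifetime is not None:
        # Assumption: the lifetime is a positive whole number of seconds.
        try:
            valid = int(lifetime) > 0
        except ValueError:
            valid = False
        if not valid:
            problems.append("LYBIC_MAX_LIFE_SECONDS must be a positive integer")
    return problems


if __name__ == "__main__":
    for problem in check_sandbox_env(os.environ):
        print("sandbox config:", problem)
```

LYBIC_PRECREATE_SID is left unchecked because it is optional.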

Python Library Quick Start

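from gui_agents import AgentService

# Create the high-level service with its default configuration.
service = AgentService()
# Run a single natural-language task and print the reported status.
result = service.execute_task("Take a screenshot")
print(result.status)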

Core components: AgentService (high-level), AgentS2 / AgentSFast (implementations), HardwareInterface, ServiceConfig.

gRPC Server

For complete documentation and production deployment instructions, please refer to the gRPC server documentation.

Run server (port 50051):

docker run --rm -it -p 50051:50051 --env-file gui_agents/.env agenticlybic/guiagent /app/.venv/bin/lybic-guiagent-grpc
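
Scripts that start the container and then connect to it may want to wait until port 50051 accepts connections. The helper below is a standard-library sketch (wait_for_port is a hypothetical name, not part of the project). A successful TCP connect only shows that the port is open; it does not prove that the gRPC service behind it has finished starting.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Each attempt is bounded by `interval` seconds.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

Calling wait_for_port("localhost", 50051) before creating a client channel avoids racing the container startup.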

Enable Prometheus metrics:

docker run --rm -it \
  -p 50051:50051 -p 8000:8000 \
  -e ENABLE_PROMETHEUS=true -e PROMETHEUS_PORT=8000 \
  --env-file gui_agents/.env \
  agenticlybic/guiagent /app/.venv/bin/lybic-guiagent-grpc
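
To confirm that metrics are being exported, the endpoint can be read with the standard library. This sketch assumes the metrics are served in the Prometheus text exposition format at http://localhost:8000/metrics; the /metrics path is a common convention and an assumption here, not something this page states. The parser is deliberately small: it keeps the sample name with its labels as a single string and ignores timestamps.

```python
import urllib.request


def parse_metrics(text):
    """Parse Prometheus text-format samples into {"name{labels}": value}."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        if "}" in line:
            head, _, rest = line.rpartition("}")
            series = head + "}"
        else:
            parts = line.split(None, 1)
            series, rest = parts[0], (parts[1] if len(parts) > 1 else "")
        fields = rest.split()
        if not fields:
            continue
        try:
            samples[series] = float(fields[0])
        except ValueError:
            continue  # ignore lines whose value is not numeric
    return samples


def fetch_metrics(url="http://localhost:8000/metrics", timeout=5.0):
    """Download and parse metrics; the default URL is an assumption."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_metrics(response.read().decode("utf-8"))
```

With the container from the command above running, fetch_metrics() should return a non-empty dictionary if the exporter is enabled. For real monitoring, point a Prometheus server at the port instead.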

Python async client:

import asyncio

import grpc

from gui_agents.proto.pb import agent_pb2, agent_pb2_grpc


async def main():
    # Plaintext channel to a locally running server; suitable for local testing only.
    async with grpc.aio.insecure_channel('localhost:50051') as channel:
        stub = agent_pb2_grpc.AgentStub(channel)
        req = agent_pb2.RunAgentInstructionRequest(instruction="Open a calculator and compute 1 + 1")
        # The call streams responses, so progress is printed as each message arrives.
        async for resp in stub.RunAgentInstruction(req):
            print(f"[{resp.stage}] {resp.message}")


asyncio.run(main())

Generate protobuf stubs (if developing locally):

python -m grpc_tools.protoc -Igui_agents/proto \
  --python_out=gui_agents/proto/pb \
  --grpc_python_out=gui_agents/proto/pb \
  --pyi_out=gui_agents/proto/pb gui_agents/proto/agent.proto

Persistence (Optional)

pip install "lybic-guiagents[postgres]"
# or, from a source checkout: uv pip install ".[postgres]"

Then set the storage variables in your environment or .env file:

TASK_STORAGE_BACKEND=postgres
POSTGRES_CONNECTION_STRING=postgresql://user:password@host:port/dbname
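
A connection string that still contains placeholders such as "port" is an easy mistake to make. The following sketch checks the value with the standard library before the server starts; check_postgres_dsn is a hypothetical helper, and accepting both the postgresql:// and postgres:// schemes is an assumption based on common PostgreSQL URI usage, not on this project's code.

```python
from urllib.parse import urlsplit


def check_postgres_dsn(dsn):
    """Return a list of problems found in a postgresql:// connection string."""
    problems = []
    parts = urlsplit(dsn)
    if parts.scheme not in ("postgresql", "postgres"):
        problems.append(f"unexpected scheme: {parts.scheme!r}")
    if not parts.hostname:
        problems.append("missing host")
    try:
        parts.port  # raises ValueError for placeholders such as "port"
    except ValueError:
        problems.append("port is not a number")
    if not parts.path.strip("/"):
        problems.append("missing database name")
    return problems
```

An empty list means the string is well formed; it does not confirm that the database is reachable or that the credentials are valid.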

VMware Backend (Optional)

Set up VMware, download the OS images, and extract them into vmware_vm_data/Windows-x86 and vmware_vm_data/Ubuntu-x86. Then set:

USE_PRECREATE_VM=Ubuntu

Troubleshooting Highlights

  • Missing API keys: verify .env and environment vars.
  • Encoding errors (Windows): set PYTHONUTF8=1.
  • Slow grounding: use recommended models; try --mode fast.
  • Sandbox issues: check LYBIC_API_KEY, LYBIC_ORG_ID, and LYBIC_MAX_LIFE_SECONDS.

Next Steps

  • Try the playground (Chinese mainland only)
  • Attach GUI Agent to MCP server session
  • Integrate with Lybic SDK for automated sandbox lifecycle
