GUI Agent
Lybic GUI Agent Overview
Lybic GUI Agent is an open-source framework and service for creating computer-use and mobile-use agents that can see, reason, and act on real Windows, macOS, Linux and Android (sandbox) desktops. It integrates multi-provider LLM orchestration, visual grounding (UI-TARS), action planning, memory, observability, and optional persistence.
Key capabilities:
- Cross-platform GUI control: local backends (pyautogui*) or remote Lybic sandbox
- Multiple model providers (OpenAI, Anthropic, Gemini, Doubao, etc.) via a configurable tool graph
- Fast vs Normal execution modes, takeover (human-in-the-loop), benchmarking (OSWorld) and Prometheus metrics
- gRPC service mode with optional PostgreSQL task storage
Installation
From PyPI:
pip install lybic-guiagents
From source (UV):
git clone git@github.com:lybic/agent.git
cd agent
curl -LsSf https://astral.sh/uv/0.8.5/install.sh | sh
uv sync
source .venv/bin/activate
uv pip install -e .
If Windows encoding errors occur: set PYTHONUTF8=1 before install.
Explanation of optional dependencies in pyproject.toml
Agentic Lybic's default dependencies only include core functionality (the Lybic backend and interactive CLI agent). The following optional dependencies enable additional backend support or specific features:
[project.optional-dependencies]
dev=[]
all=[]
pyautogui=[]
vmware-and-evaluation=[]
restful=[]
mcp=[]
grpc=[]
prometheus=[]
postgres=[]
| Dependencies | Description |
|---|---|
| dev | Dependencies related to local development and testing. |
| pyautogui | (Backend) Enable local desktop automation backend. |
| vmware-and-evaluation | (Backend) Enable VMware backend and OSWorld evaluation tool. |
| restful | (Service) Enable Agentic Lybic RESTful API server support. |
| mcp | (Service) Enable Agentic Lybic MCP (Model Context Protocol) server support. |
| grpc | (Service) Enable Agentic Lybic gRPC server support. |
| prometheus | (Feature) Enable Prometheus metrics support for the Agent server. |
| postgres | (Feature) Enable PostgreSQL memory and persistent data storage support for the Agent server. |
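When more than one extra is needed, they can be combined in a single requirement string. The sketch below builds such a string and rejects names that are not in the list above. `install_requirement` is a hypothetical helper written for illustration, not part of the package.

```python
# Extras declared under [project.optional-dependencies], as listed above.
KNOWN_EXTRAS = {
    "dev", "all", "pyautogui", "vmware-and-evaluation",
    "restful", "mcp", "grpc", "prometheus", "postgres",
}


def install_requirement(extras, package="lybic-guiagents"):
    """Build a pip requirement string such as 'lybic-guiagents[grpc,postgres]'.

    Hypothetical helper for illustration; it only formats a string.
    """
    unknown = sorted(set(extras) - KNOWN_EXTRAS)
    if unknown:
        raise ValueError(f"unknown extras: {', '.join(unknown)}")
    if not extras:
        return package
    # Sort and de-duplicate so the output is stable.
    return f"{package}[{','.join(sorted(set(extras)))}]"


if __name__ == "__main__":
    # repr() adds quotes, which keeps the brackets safe in most shells.
    print("pip install", repr(install_requirement(["grpc", "postgres", "prometheus"])))
```

Quoting the requirement matters in shells such as zsh, where unquoted square brackets are treated as glob patterns.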
Basic Configuration
Model Credential Mapping
The API keys you must place in .env depend on which providers your tools_config.json references.
- Each tool entry has provider and model_name; if a provider appears (e.g. doubao, gemini, anthropic, openai), the corresponding API key listed in .env.example must be filled in.
- If you switch to the English tool config (tools_config_en.json), ensure keys for those providers (e.g. GEMINI_API_KEY, ARK_API_KEY, ANTHROPIC_API_KEY, OPENAI_API_KEY) are present as needed.
- The Chinese config (tools_config_cn.json) may only require ARK_API_KEY when all tools use Doubao.
- Unused provider keys can be left blank or removed; they are not required unless referenced by a tool stage.
Refer to gui_agents/tools/model.md for the full provider list and supported models. After editing tools_config.json, verify that all its provider values have matching credentials in your environment.
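The provider-to-key rule above can be checked mechanically. The following is a minimal sketch under two assumptions that should be verified against the repository: that tools_config.json holds a top-level "tools" list whose entries carry a "provider" field, and that providers pair with environment variables as shown in `PROVIDER_ENV_KEYS` (the key names come from the text above, the pairing is inferred). `missing_credentials` is a hypothetical helper, not part of the package.

```python
import json
import os

# Assumed mapping from provider name to the environment variable holding its key.
# Check it against gui_agents/.env.example before relying on it.
PROVIDER_ENV_KEYS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "gemini": "GEMINI_API_KEY",
    "doubao": "ARK_API_KEY",
}


def missing_credentials(config, env):
    """Return {provider: what_is_missing} for providers used in config.

    Assumes config looks like {"tools": [{"provider": ..., "model_name": ...}]};
    the top-level "tools" key is an assumption about the file layout.
    """
    providers = {t["provider"] for t in config.get("tools", []) if t.get("provider")}
    missing = {}
    for provider in sorted(providers):
        env_var = PROVIDER_ENV_KEYS.get(provider)
        if env_var is None:
            missing[provider] = "(unknown provider, see model.md)"
        elif not env.get(env_var):
            missing[provider] = env_var
    return missing


if __name__ == "__main__":
    path = "gui_agents/tools/tools_config.json"
    if os.path.exists(path):
        with open(path, encoding="utf-8") as fh:
            report = missing_credentials(json.load(fh), os.environ)
        for provider, env_var in report.items():
            print(f"{provider}: missing {env_var}")
```

The script reads `os.environ`, so keys that exist only in the .env file need to be loaded into the environment first.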
Copy env example then add keys:
cp gui_agents/.env.example gui_agents/.env
# edit .env and add provider keys
Choose tool config (English vs Chinese):
cp gui_agents/tools/tools_config_en.json gui_agents/tools/tools_config.json
# or
cp gui_agents/tools/tools_config_cn.json gui_agents/tools/tools_config.json
Recommended (visual grounding + action):
- grounding / fast_action_generator: doubao-1-5-ui-tars-250428
- action_generator: claude-sonnet-4-20250514 or doubao-seed-1-6-250615
CLI Usage
Interactive (Lybic sandbox backend):
python gui_agents/cli_app.py --backend lybic
Local fast mode:
python gui_agents/cli_app.py --backend pyautogui --mode fast
Single query capped at 20 steps:
python gui_agents/cli_app.py --backend pyautogui --query "Find the result of 8 × 7 on a calculator" --max-steps 20
Important options:
- --backend [lybic|lybic_mobile|pyautogui|pyautogui_vmware]
- --max-steps N
- --mode [normal|fast]
- --enable-takeover
- --disable-search
Warning: pyautogui backends directly control your local computer.
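When the CLI is driven from a script, it can help to assemble the argument list in one place and validate it before launching. The sketch below uses only the flags and values listed above; `build_cli_args` is a hypothetical helper, and it prints the command rather than running it.

```python
import shlex

BACKENDS = ("lybic", "lybic_mobile", "pyautogui", "pyautogui_vmware")
MODES = ("normal", "fast")


def build_cli_args(backend, mode=None, query=None, max_steps=None,
                   enable_takeover=False, disable_search=False):
    """Build an argv list for gui_agents/cli_app.py from the options above."""
    if backend not in BACKENDS:
        raise ValueError(f"unsupported backend: {backend}")
    if mode is not None and mode not in MODES:
        raise ValueError(f"unsupported mode: {mode}")
    args = ["python", "gui_agents/cli_app.py", "--backend", backend]
    if mode is not None:
        args += ["--mode", mode]
    if query is not None:
        args += ["--query", query]
    if max_steps is not None:
        args += ["--max-steps", str(int(max_steps))]
    if enable_takeover:
        args.append("--enable-takeover")
    if disable_search:
        args.append("--disable-search")
    return args


if __name__ == "__main__":
    argv = build_cli_args(
        "pyautogui",
        query="Find the result of 8 × 7 on a calculator",
        max_steps=20,
    )
    # Print a copy-pasteable command; pass argv to subprocess.run to execute it.
    print(shlex.join(argv))
```

Passing the list to `subprocess.run` instead of a shell string avoids quoting problems in the query text.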
Docker
docker run --rm -it --env-file gui_agents/.env agenticlybic/guiagent --backend lybic
Sandbox Variables
LYBIC_API_KEY=your_key
LYBIC_ORG_ID=your_org
LYBIC_MAX_LIFE_SECONDS=3600
# optional precreated sandbox
LYBIC_PRECREATE_SID=SBX-XXXXXXXXXXXXXXX
Python Library Quick Start
from gui_agents import AgentService
service = AgentService()
result = service.execute_task("Take a screenshot")
print(result.status)
Core components: AgentService (high-level), AgentS2 / AgentSFast (implementations), HardwareInterface, ServiceConfig.
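The quick start runs a single instruction. A sketch for running several in sequence is shown below; it relies only on `execute_task` and `.status` from the snippet above. `FakeService`, `FakeResult`, and `run_all` are hypothetical stand-ins so the example runs without the package installed, and the status strings they return are invented for the demonstration.

```python
from dataclasses import dataclass


@dataclass
class FakeResult:
    # Stand-in for the object returned by execute_task; only .status is assumed.
    status: str


class FakeService:
    """Hypothetical stand-in so the sketch runs without the package installed."""

    def execute_task(self, instruction):
        return FakeResult(status="ok" if instruction.strip() else "empty")


def run_all(service, instructions):
    """Run instructions one by one and collect (instruction, status) pairs."""
    results = []
    for instruction in instructions:
        result = service.execute_task(instruction)
        results.append((instruction, result.status))
    return results


if __name__ == "__main__":
    # Replace FakeService() with gui_agents.AgentService() for a real run.
    tasks = ["Take a screenshot", "Open a calculator"]
    for instruction, status in run_all(FakeService(), tasks):
        print(f"{status}: {instruction}")
```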
gRPC Server
For complete documentation and production deployment instructions, please refer to the gRPC server documentation.
Run server (port 50051):
docker run --rm -it -p 50051:50051 --env-file gui_agents/.env agenticlybic/guiagent /app/.venv/bin/lybic-guiagent-grpc
Enable Prometheus metrics:
docker run --rm -it \
-p 50051:50051 -p 8000:8000 \
-e ENABLE_PROMETHEUS=true -e PROMETHEUS_PORT=8000 \
--env-file gui_agents/.env \
agenticlybic/guiagent /app/.venv/bin/lybic-guiagent-grpc
Python async client:
import asyncio

import grpc
from gui_agents.proto.pb import agent_pb2, agent_pb2_grpc

async def main():
    # Open an unencrypted channel to the local gRPC server.
    async with grpc.aio.insecure_channel('localhost:50051') as channel:
        stub = agent_pb2_grpc.AgentStub(channel)
        req = agent_pb2.RunAgentInstructionRequest(instruction="Open a calculator and compute 1 + 1")
        # The call returns a stream of responses; print each one as it arrives.
        async for resp in stub.RunAgentInstruction(req):
            print(f"[{resp.stage}] {resp.message}")

asyncio.run(main())
Generate protobuf stubs (if developing locally):
python -m grpc_tools.protoc -Igui_agents/proto \
--python_out=gui_agents/proto/pb \
--grpc_python_out=gui_agents/proto/pb \
--pyi_out=gui_agents/proto/pb gui_agents/proto/agent.proto
Persistence (Optional)
pip install "lybic-guiagents[postgres]"
# or: uv pip install ".[postgres]"
TASK_STORAGE_BACKEND=postgres
POSTGRES_CONNECTION_STRING=postgresql://user:password@host:port/dbname
VMware Backend (Optional)
Set up VMware, download the OS images, and extract them into vmware_vm_data/Windows-x86 and vmware_vm_data/Ubuntu-x86, then set:
USE_PRECREATE_VM=Ubuntu
Troubleshooting Highlights
- Missing API keys: verify .env and environment vars.
- Encoding errors (Windows): set PYTHONUTF8=1.
- Slow grounding: use recommended models; try --mode fast.
- Sandbox issues: check LYBIC_API_KEY/ORG_ID and lifetime vars.
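The sandbox and encoding checks above can be run as a quick preflight before starting the agent. This is a minimal sketch: `preflight` is a hypothetical helper, and the rule that LYBIC_MAX_LIFE_SECONDS must be a positive integer is an assumption based on the example value shown earlier. Provider API keys are not covered here.

```python
import os
import sys


def preflight(env, platform=sys.platform):
    """Return a list of human-readable problems found in the environment."""
    problems = []
    # Sandbox credentials used by the Lybic backend.
    for name in ("LYBIC_API_KEY", "LYBIC_ORG_ID"):
        if not env.get(name):
            problems.append(f"{name} is not set")
    # Assumption: the lifetime, when present, should be a positive integer.
    lifetime = env.get("LYBIC_MAX_LIFE_SECONDS")
    if lifetime is not None and not (lifetime.isdigit() and int(lifetime) > 0):
        problems.append("LYBIC_MAX_LIFE_SECONDS should be a positive integer")
    # On Windows, PYTHONUTF8=1 avoids the encoding errors mentioned above.
    if platform.startswith("win") and env.get("PYTHONUTF8") != "1":
        problems.append("PYTHONUTF8=1 is not set (Windows)")
    return problems


if __name__ == "__main__":
    issues = preflight(os.environ)
    for issue in issues:
        print("problem:", issue)
    if not issues:
        print("environment looks fine")
```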
Next Steps
- Try the playground (mainland China only)
- Attach the GUI Agent to an MCP server session
- Integrate with Lybic SDK for automated sandbox lifecycle