CLI Reference

Lybic GUI Agent CLI (`cli_app.py`)

The CLI provides an interactive or one-shot way to run a GUI agent. It wraps model/tool orchestration, environment (sandbox/local), reasoning mode, and safety controls.

Basic Invocation

From source or installed package:

python gui_agents/cli_app.py [OPTIONS]   # from source
# or
python -m gui_agents.cli_app [OPTIONS]
# or
lybic-guiagents-cli [OPTIONS]            # if installed via pip

If --query is omitted, the CLI starts an interactive prompt that accepts single-line instructions. Each instruction runs one agent session bounded by --max-steps.

Core Options

Option	Values	Default	Purpose
`--backend`	`lybic` `lybic_mobile` `pyautogui` `pyautogui_vmware`	`lybic`	Select execution environment (cloud sandbox or local control).
`--query`	string	(none)	Run a single instruction non-interactively then exit.
`--max-steps`	integer >0	`50`	Upper bound on action steps for a task. Prevents runaway loops.
`--mode`	`normal` `fast`	`normal`	Trade-off between richer chain-of-thought and memory (`normal`) and faster, minimal reasoning (`fast`).
`--enable-takeover`	flag	off	Allow agent to pause and request human intervention mid-task.
`--disable-search`	flag	off	Turn off web/search tool calls for offline or deterministic runs.
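The options table can be mirrored in a small argparse sketch, which is handy for validating arguments in a wrapper script. This is an illustration built only from the table above, not the actual contents of `cli_app.py`; the helper names `positive_int` and `build_parser` are hypothetical.

```python
import argparse

# Backend names and defaults are taken from the options table above.
BACKENDS = ["lybic", "lybic_mobile", "pyautogui", "pyautogui_vmware"]


def positive_int(value: str) -> int:
    """Accept only integers greater than zero, as the table requires for --max-steps."""
    number = int(value)
    if number <= 0:
        raise argparse.ArgumentTypeError("must be an integer > 0")
    return number


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented options (not the real implementation)."""
    parser = argparse.ArgumentParser(prog="lybic-guiagents-cli")
    parser.add_argument("--backend", choices=BACKENDS, default="lybic")
    parser.add_argument("--query", default=None)
    parser.add_argument("--max-steps", type=positive_int, default=50)
    parser.add_argument("--mode", choices=["normal", "fast"], default="normal")
    parser.add_argument("--enable-takeover", action="store_true")
    parser.add_argument("--disable-search", action="store_true")
    return parser


args = build_parser().parse_args(["--query", "Open calculator", "--max-steps", "25"])
print(args.backend, args.max_steps, args.mode, args.enable_takeover)
# prints: lybic 25 normal False
```

Options that are not passed fall back to the defaults listed in the table, so only `--max-steps` differs from the default here.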

Backend Selection

lybic: Uses remote Lybic sandbox (recommended). Requires LYBIC_API_KEY / LYBIC_ORG_ID.
lybic_mobile: Android sandbox via Lybic. Requires LYBIC_API_KEY / LYBIC_ORG_ID.
pyautogui: Directly controls your local desktop (USE WITH CAUTION). Requires the additional pyautogui-related dependencies.
pyautogui_vmware: Controls a pre-created VMware VM (set USE_PRECREATE_VM). Requires the additional vmware-and-evaluation dependencies.
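A quick pre-flight check can catch a missing variable before a session starts. The sketch below encodes only the requirements listed above; the `REQUIRED_ENV` mapping and the `missing_env` helper are hypothetical names, and the CLI itself may validate its environment differently.

```python
import os

# Variables each backend needs, as listed in the backend descriptions above.
REQUIRED_ENV = {
    "lybic": ("LYBIC_API_KEY", "LYBIC_ORG_ID"),
    "lybic_mobile": ("LYBIC_API_KEY", "LYBIC_ORG_ID"),
    "pyautogui": (),
    "pyautogui_vmware": ("USE_PRECREATE_VM",),
}


def missing_env(backend, env=None):
    """Return the required variables that are unset or empty for this backend."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV[backend] if not env.get(name)]


# Example with an explicit mapping instead of the real process environment.
print(missing_env("lybic", {"LYBIC_API_KEY": "xxx"}))
# prints: ['LYBIC_ORG_ID']
```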

Environment Variables (Common)

# Sandbox
LYBIC_API_KEY=xxx
LYBIC_ORG_ID=org_yyy
LYBIC_PRECREATE_SID=BOX-...        # optional existing sandbox
LYBIC_MAX_LIFE_SECONDS=3600         # lifetime override

# Models (example English config)
GEMINI_ENDPOINT_URL=https://generativelanguage.googleapis.com/v1beta/openai/
GEMINI_API_KEY=sk-...
ARK_API_KEY=ark-...

Chinese model config may only need ARK_API_KEY when using tools_config_cn.json.
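If these variables live in a `.env`-style file, they can be inspected with a few lines of standard-library Python. This is a minimal sketch that assumes plain `KEY=VALUE` lines with optional trailing `#` comments, as in the listing above; `parse_env` is a hypothetical helper, and the project may load its environment with a different mechanism.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and full-line comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing comment that is separated from the value by a space.
        value = value.split(" #", 1)[0].strip()
        values[key.strip()] = value
    return values


sample = """
# Sandbox
LYBIC_API_KEY=xxx
LYBIC_MAX_LIFE_SECONDS=3600         # lifetime override
"""
print(parse_env(sample))
# prints: {'LYBIC_API_KEY': 'xxx', 'LYBIC_MAX_LIFE_SECONDS': '3600'}
```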

Tool Configuration Workflow

# English tooling
cp gui_agents/tools/tools_config_en.json gui_agents/tools/tools_config.json
# Chinese tooling
cp gui_agents/tools/tools_config_cn.json gui_agents/tools/tools_config.json
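On systems without `cp`, or inside a setup script, the same copy can be done from Python. The sketch below assumes the file layout shown in the commands above; `activate_tools_config` is a hypothetical helper and is not part of the package.

```python
import shutil
from pathlib import Path


def activate_tools_config(language, tools_dir="gui_agents/tools"):
    """Copy tools_config_<language>.json over tools_config.json and return the target path."""
    if language not in ("en", "cn"):
        raise ValueError("language must be 'en' or 'cn'")
    tools_dir = Path(tools_dir)
    source = tools_dir / f"tools_config_{language}.json"
    target = tools_dir / "tools_config.json"
    # copyfile overwrites an existing target, matching the behavior of plain cp.
    shutil.copyfile(source, target)
    return target
```

Calling `activate_tools_config("en")` from the repository root corresponds to the first `cp` command, and `activate_tools_config("cn")` to the second.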

Recommended models in tools_config: grounding / fast_action_generator -> doubao-1-5-ui-tars-250428; action_generator -> claude-sonnet-4-20250514 or doubao-seed-1-6-250615.

Execution Modes

normal: Full reasoning, memory updates, more model tokens.
fast: Skips verbose reasoning chains; suitable for high-frequency small tasks.

Takeover Flow (`--enable-takeover`)

The agent may emit a takeover request when it is uncertain. You can then manually adjust the UI on your local desktop or in the sandbox, and press Enter to resume. Without the flag, these pauses are suppressed.
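The pause-and-resume behavior can be pictured with a toy loop. Everything here is hypothetical: the `"TAKEOVER"` marker, the `run_steps` function, and the step names stand in for the agent's real internals, which this page does not describe.

```python
def run_steps(steps, enable_takeover, wait_for_human=input):
    """Walk through a list of step names, pausing on takeover requests if enabled."""
    executed = []
    for step in steps:
        if step == "TAKEOVER":
            if not enable_takeover:
                # Without the flag, the pause is suppressed and the run continues.
                continue
            wait_for_human("Takeover requested. Adjust the UI, then press Enter to resume: ")
            executed.append("resumed")
            continue
        executed.append(step)
    return executed


# Simulate the human pressing Enter immediately.
print(run_steps(["click", "TAKEOVER", "type"], True, wait_for_human=lambda prompt: ""))
# prints: ['click', 'resumed', 'type']
```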

Example Scenarios

Run interactive in sandbox:

python gui_agents/cli_app.py --backend lybic

Single deterministic task locally (limit steps):

python gui_agents/cli_app.py --backend pyautogui --query "Open calculator and compute 123*45" --max-steps 25 --disable-search

Fast batch style:

python gui_agents/cli_app.py --backend lybic --query "Download today's stock data to a CSV" --mode fast

Human-in-the-loop:

python gui_agents/cli_app.py --backend lybic --enable-takeover

VMware controlled environment:

uv pip install ".[vmware-and-evaluation]"   # quoted so shells such as zsh do not expand the brackets
export USE_PRECREATE_VM=Ubuntu
python gui_agents/cli_app.py --backend pyautogui_vmware --query "Update system packages" --max-steps 40
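When these scenarios are launched from another script, building the argument list programmatically avoids quoting mistakes in the query string. This is a sketch using only the flags documented on this page; `build_command` is a hypothetical helper, and the script path assumes you run from the repository root.

```python
import shlex


def build_command(backend="lybic", query=None, max_steps=None, mode=None,
                  enable_takeover=False, disable_search=False):
    """Assemble the CLI argument list from documented options only."""
    cmd = ["python", "gui_agents/cli_app.py", "--backend", backend]
    if query is not None:
        cmd += ["--query", query]
    if max_steps is not None:
        cmd += ["--max-steps", str(max_steps)]
    if mode is not None:
        cmd += ["--mode", mode]
    if enable_takeover:
        cmd.append("--enable-takeover")
    if disable_search:
        cmd.append("--disable-search")
    return cmd


cmd = build_command(backend="pyautogui",
                    query="Open calculator and compute 123*45",
                    max_steps=25, disable_search=True)
# shlex.join produces a shell-safe string for logging or copy-paste.
print(shlex.join(cmd))
```

The resulting list can be passed directly to `subprocess.run` without going through a shell.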

Safety Tips

Always close sensitive apps before using the pyautogui backend.
Use --max-steps to bound cost and risk.
Prefer the sandbox backend (lybic) for reproducibility and isolation.

Troubleshooting Quick Checks

Symptom	Fix
`KeyError` provider	Verify `.env` and tool config file copied correctly.
Slow grounding	Use recommended UI-TARS model; switch to `fast` if acceptable.
Sandbox not created	Ensure `LYBIC_API_KEY` / `LYBIC_ORG_ID` valid and quota available.
Unicode install error (Win)	`set PYTHONUTF8=1` then reinstall.

Exit Behavior

Successful single --query: exits with code 0.
Max steps reached without success: non-zero (implementation dependent) – monitor logs.
Keyboard interrupt (Ctrl+C): the CLI attempts a graceful shutdown.
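A wrapper script can act on these outcomes by inspecting the process return code. The sketch below relies only on the convention stated above (0 for success, non-zero otherwise); `run_and_report` is a hypothetical helper, and the exact non-zero codes are not specified here.

```python
import subprocess
import sys


def run_and_report(cmd):
    """Run a command and summarize the outcome from its exit code."""
    try:
        completed = subprocess.run(cmd, check=False)
    except KeyboardInterrupt:
        # Ctrl+C reaches the wrapper as well as the child process.
        return "interrupted"
    if completed.returncode == 0:
        return "success"
    return f"failed (exit code {completed.returncode})"


# Stand-in child processes so the example runs without the agent installed.
print(run_and_report([sys.executable, "-c", "import sys; sys.exit(0)"]))
print(run_and_report([sys.executable, "-c", "import sys; sys.exit(3)"]))
# prints: success
# prints: failed (exit code 3)
```

In practice the command would be one of the `--query` invocations from the example scenarios, and a failure would be a cue to check the logs.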

Logging & Observability

Enable Prometheus when running the gRPC server variant (see the GUI Agent main doc). The CLI inherits the internal metrics but does not expose /metrics directly.

Updating / Upgrading

pip install -U lybic-guiagents
# or, if installed from source
uv sync && uv pip install -e .

Minimal Testing Script

python gui_agents/cli_app.py --backend lybic --query "Take a screenshot and save it to desktop" --max-steps 15 --mode fast

When To Use CLI vs gRPC

Use CLI	Use gRPC Service
Manual experiments	Multi-language integration
Local debugging	Horizontal scaling / persistence
One-off automation	Long-running orchestrators

For deeper architecture see the GUI Agent overview page.
