MCP Server Reference

Agentic Lybic MCP Server Overview

This document describes the MCP (Model Context Protocol) server for GUI Agent automation.

The MCP server provides a standardized interface for GUI automation using the Lybic sandbox infrastructure. It exposes three main tools:

create_sandbox - Create a new sandbox environment
get_sandbox_screenshot - Capture screenshots from sandboxes
execute_instruction - Execute natural language instructions with real-time streaming

Installation

The MCP server is included in the lybic-guiagents package. Install with MCP support (the quotes keep shells such as zsh from expanding the brackets):

pip install "lybic-guiagents[mcp]"

Or install from source:

git clone https://github.com/lybic/agent
cd agent
pip install -e ".[mcp]"

Configuration

Environment Variables

Create a .env file in the gui_agents/ directory or set these environment variables:

# Lybic Cloud Configuration (required)
LYBIC_API_KEY=your_lybic_api_key
LYBIC_ORG_ID=your_lybic_org_id
LYBIC_API_ENDPOINT=https://api.lybic.cn/  # optional, defaults to this value

# LLM Provider Configuration (optional, can be passed per request)
OPENAI_API_KEY=your_openai_key
ANTHROPIC_API_KEY=your_anthropic_key
GOOGLE_API_KEY=your_google_key

# Server Configuration (optional)
MCP_PORT=8000  # default port
MCP_HOST=0.0.0.0  # default host
LOG_LEVEL=INFO  # logging level
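
As a rough illustration of how the variables above might be resolved, the sketch below reads them from a mapping and applies the documented defaults. `ServerConfig` and `load_server_config` are hypothetical names chosen for this example, not part of gui_agents, and the server's actual loading logic may differ (for instance, the Lybic credentials can also be passed per request).

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class ServerConfig:
    """Hypothetical container for the settings documented above."""
    lybic_api_key: str
    lybic_org_id: str
    lybic_api_endpoint: str
    host: str
    port: int
    log_level: str


def load_server_config(env: Optional[Mapping[str, str]] = None) -> ServerConfig:
    """Resolve settings from env, falling back to the documented defaults."""
    env = os.environ if env is None else env
    # The two Lybic credentials are marked as required in the listing above.
    missing = [name for name in ("LYBIC_API_KEY", "LYBIC_ORG_ID") if not env.get(name)]
    if missing:
        raise ValueError("missing required variables: " + ", ".join(missing))
    return ServerConfig(
        lybic_api_key=env["LYBIC_API_KEY"],
        lybic_org_id=env["LYBIC_ORG_ID"],
        lybic_api_endpoint=env.get("LYBIC_API_ENDPOINT", "https://api.lybic.cn/"),
        host=env.get("MCP_HOST", "0.0.0.0"),
        port=int(env.get("MCP_PORT", "8000")),
        log_level=env.get("LOG_LEVEL", "INFO"),
    )
```

Calling `load_server_config()` with no argument reads from the process environment, so it can be used after `load_dotenv()` to confirm that a .env file was picked up.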

Access Tokens

Create an access_tokens.txt file in the gui_agents/ directory with valid Bearer tokens (one per line):

token_abc123xyz
another_token_456
# Lines starting with # are comments
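
The file format above (one token per line, `#` for comments) is simple enough to parse in a few lines. The following is a minimal sketch of one way to read it; `parse_access_tokens` is a hypothetical helper, and skipping blank lines is an assumption rather than documented behavior.

```python
from typing import Iterable, Set


def parse_access_tokens(lines: Iterable[str]) -> Set[str]:
    """Collect tokens, skipping blank lines and lines starting with '#'."""
    tokens = set()
    for line in lines:
        token = line.strip()
        if not token or token.startswith("#"):
            continue
        tokens.add(token)
    return tokens


sample = """token_abc123xyz
another_token_456
# Lines starting with # are comments
"""
print(sorted(parse_access_tokens(sample.splitlines())))
# prints ['another_token_456', 'token_abc123xyz']
```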

Usage

Starting the Server

# Using the entry point
lybic-guiagent-mcp

# Or directly with Python
python -m gui_agents.mcp_app

# With custom port
MCP_PORT=8080 lybic-guiagent-mcp

By default the server binds to 0.0.0.0:8000, so local clients can reach it at http://localhost:8000.

API Endpoints

POST /mcp - MCP Streamable HTTP endpoint (requires Bearer token authentication)
GET /health - Health check endpoint
GET / - Server information and available tools

Authentication

All requests to the MCP endpoints require Bearer token authentication:

curl -H "Authorization: Bearer your_token" http://localhost:8000/health
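
To make the header format concrete, here is a sketch of how an `Authorization` header could be checked against the set of tokens from access_tokens.txt. `is_authorized` is a hypothetical function for illustration; how the server actually validates tokens is not described in this document.

```python
import hmac
from typing import Iterable, Optional


def is_authorized(authorization: Optional[str], valid_tokens: Iterable[str]) -> bool:
    """Return True if the header is 'Bearer <token>' and the token is known."""
    if not authorization:
        return False
    scheme, _, token = authorization.partition(" ")
    token = token.strip()
    if scheme.lower() != "bearer" or not token:
        return False
    # compare_digest avoids leaking how many leading characters matched.
    return any(
        hmac.compare_digest(token.encode("utf-8"), valid.encode("utf-8"))
        for valid in valid_tokens
    )
```

For example, `is_authorized("Bearer token_abc123xyz", {"token_abc123xyz"})` returns True, while a missing header or a different scheme such as `Basic` returns False.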

MCP Tools

1. create_sandbox

Create a new sandbox environment for GUI automation.

Parameters:

apikey (string, optional) - Lybic API key (uses LYBIC_API_KEY env var if not provided)
orgid (string, optional) - Lybic Organization ID (uses LYBIC_ORG_ID env var if not provided)
shape (string, optional) - Sandbox configuration, default: "beijing-2c-4g-cpu"

Returns:

Sandbox ID and metadata

Example:

{
  "tool": "create_sandbox",
  "arguments": {
    "shape": "beijing-2c-4g-cpu"
  }
}

2. get_sandbox_screenshot

Capture a screenshot from an existing sandbox.

Parameters:

sandbox_id (string, required) - Sandbox ID from create_sandbox
apikey (string, optional) - Lybic API key
orgid (string, optional) - Lybic Organization ID

Returns:

Screenshot file path and dimensions

Example:

{
  "tool": "get_sandbox_screenshot",
  "arguments": {
    "sandbox_id": "SBX-01234567890"
  }
}

3. execute_instruction

Execute a natural language instruction in a sandbox with real-time streaming.

Parameters:

instruction (string, required) - Natural language task description
sandbox_id (string, optional) - ID of an existing sandbox to reuse; if omitted, a new sandbox is created
apikey (string, optional) - Lybic API key
orgid (string, optional) - Lybic Organization ID
mode (string, optional) - Agent mode: "normal" or "fast" (default: "fast")
max_steps (integer, optional) - Maximum execution steps (default: 50)
llm_provider (string, optional) - LLM provider (e.g., "openai", "anthropic")
llm_model (string, optional) - LLM model name (e.g., "gpt-4", "claude-3-sonnet")
llm_api_key (string, optional) - API key for LLM provider
llm_endpoint (string, optional) - Custom LLM endpoint URL

Returns:

Execution results with statistics (steps, tokens, cost, duration)

Example:

{
  "tool": "execute_instruction",
  "arguments": {
    "instruction": "Open calculator and compute 123 + 456",
    "mode": "fast",
    "max_steps": 50,
    "llm_provider": "openai",
    "llm_model": "gpt-4"
  }
}
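
With this many optional parameters, it can help to assemble the arguments in one place and catch obvious mistakes before sending the request. The helper below is a client-side sketch only: `build_execute_arguments` is a hypothetical name, the defaults mirror the parameter list above, and the checks (for example, rejecting `max_steps` below 1) are assumptions rather than documented server behavior.

```python
from typing import Any, Dict, Optional

VALID_MODES = ("normal", "fast")
OPTIONAL_KEYS = ("apikey", "orgid", "llm_provider", "llm_model", "llm_api_key", "llm_endpoint")


def build_execute_arguments(
    instruction: str,
    sandbox_id: Optional[str] = None,
    mode: str = "fast",
    max_steps: int = 50,
    **optional: Optional[str],
) -> Dict[str, Any]:
    """Build the arguments object for execute_instruction."""
    if not instruction or not instruction.strip():
        raise ValueError("instruction must be a non-empty string")
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    if max_steps < 1:
        raise ValueError("max_steps must be at least 1")
    unknown = sorted(set(optional) - set(OPTIONAL_KEYS))
    if unknown:
        raise ValueError(f"unknown parameters: {unknown}")

    arguments: Dict[str, Any] = {
        "instruction": instruction,
        "mode": mode,
        "max_steps": max_steps,
    }
    if sandbox_id is not None:
        arguments["sandbox_id"] = sandbox_id
    # Only include optional values that were actually supplied.
    arguments.update({key: value for key, value in optional.items() if value is not None})
    return arguments
```

The returned dictionary can be passed as the second argument of `session.call_tool("execute_instruction", ...)` in the MCP SDK client.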

Example Client Usage

Using MCP SDK

import asyncio
import os

from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client

# Bearer token for this MCP server; it must match an entry in access_tokens.txt
LYBIC_MCP_SERVER_API_KEY = os.environ.get("LYBIC_MCP_SERVER_API_KEY", "default_token_for_testing")


async def main():
    # Connect to a streamable HTTP server
    async with streamablehttp_client(
        "http://localhost:8000/mcp",
        headers={"Authorization": f"Bearer {LYBIC_MCP_SERVER_API_KEY}"},
    ) as (
        read_stream,
        write_stream,
        _,
    ):
        # Create a session using the client streams
        async with ClientSession(read_stream, write_stream) as session:
            # Initialize the connection
            print("Initializing connection")
            await session.initialize()
            print(await session.list_tools())
            # Call a tool
            print("Calling tool")
            tool_result = await session.call_tool(
                "execute_instruction",
                {"sandbox_id": "BOX-01KADMDC6TAE8NAJX82HMHSAQT", "instruction": "Open the browser"},
            )
            print(tool_result)

if __name__ == "__main__":
    asyncio.run(main())

Using HTTP Directly

import httpx

BASE_URL = "http://localhost:8000"
TOKEN = "your_bearer_token"

headers = {
    "Authorization": f"Bearer {TOKEN}",
    "Content-Type": "application/json"
}

# Health check
response = httpx.get(f"{BASE_URL}/health", headers=headers)
print(response.json())

# MCP communication via Streamable HTTP
# (Requires a client that can handle streaming responses)
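
For readers who want to see what travels over the wire, the sketch below wraps one of the tool examples from this document in a JSON-RPC 2.0 `tools/call` request, which is the message shape the MCP specification defines for invoking a tool. It assumes the `{"tool": ..., "arguments": ...}` examples here are shorthand for that message; `to_jsonrpc_call` is a hypothetical helper. A real exchange over Streamable HTTP also needs an `initialize` handshake first and an `Accept` header that allows both `application/json` and `text/event-stream`, which is why an MCP client library is usually the easier route.

```python
import json
from typing import Any, Dict


def to_jsonrpc_call(tool_call: Dict[str, Any], request_id: int) -> Dict[str, Any]:
    """Wrap a {"tool", "arguments"} example in a JSON-RPC tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": tool_call["tool"],
            "arguments": tool_call.get("arguments", {}),
        },
    }


body = to_jsonrpc_call(
    {"tool": "create_sandbox", "arguments": {"shape": "beijing-2c-4g-cpu"}},
    request_id=1,
)
print(json.dumps(body, indent=2))
```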

Agent Modes

Fast Mode (Recommended)

Faster execution with direct action generation
Lower token consumption
Best for straightforward tasks
Example: "mode": "fast"

Normal Mode

Full reasoning with hierarchical planning
DAG modeling and memory systems
Better for complex multi-step tasks
Example: "mode": "normal"

LLM Configuration

You can customize the LLM provider per request:

{
  "tool": "execute_instruction",
  "arguments": {
    "instruction": "Your task here",
    "llm_provider": "anthropic",
    "llm_model": "claude-3-sonnet-20240229",
    "llm_api_key": "your_anthropic_key"
  }
}

Supported providers:

openai - GPT-4, GPT-3.5, etc.
anthropic - Claude models
google - Gemini models
doubao - Doubao models
And others configured in your tools_config
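
Since the provider keys can come either from the request or from the environment, a small client-side helper can fill in `llm_api_key` automatically. This is a sketch under assumptions: `llm_arguments` is a hypothetical function, and the provider-to-variable mapping covers only the three variables listed under Configuration (no variable name for doubao is given in this document, so none is guessed).

```python
import os
from typing import Dict, Mapping, Optional

# Environment variable names taken from the Configuration section.
PROVIDER_KEY_VARS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "google": "GOOGLE_API_KEY",
}


def llm_arguments(
    provider: str,
    model: str,
    env: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Build the llm_* arguments, adding llm_api_key when one is available."""
    env = os.environ if env is None else env
    arguments = {"llm_provider": provider, "llm_model": model}
    key_var = PROVIDER_KEY_VARS.get(provider)
    api_key = env.get(key_var) if key_var else None
    if api_key:
        arguments["llm_api_key"] = api_key
    return arguments
```

The result can be merged into the `arguments` object of an `execute_instruction` call; when no key is found, the field is simply left out so the server can fall back to its own configuration.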

Security

Bearer Token Authentication: All requests require a valid Bearer token from access_tokens.txt
Environment Isolation: Each task runs in a separate sandbox environment
API Key Management: API keys can be provided per-request or via environment variables

Troubleshooting

Connection Issues

# Check if server is running
curl http://localhost:8000/health

# Check authentication
curl -H "Authorization: Bearer your_token" http://localhost:8000/health

Missing Dependencies

# Install all dependencies
pip install -e ".[mcp]"

# Or install specific packages
pip install mcp fastapi uvicorn

Environment Variables

Verify your .env file is properly loaded:

python -c "import os; from dotenv import load_dotenv; load_dotenv(); print(os.getenv('LYBIC_API_KEY'))"

Development

Running in Development Mode

# With auto-reload
uvicorn gui_agents.mcp_app:app --reload --port 8000

# With debug logging
LOG_LEVEL=DEBUG lybic-guiagent-mcp

Adding New Tools

Edit gui_agents/mcp_app.py:

Add tool definition in @mcp_server.list_tools()
Implement handler function
Register in @mcp_server.call_tool()
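
The three steps can be pictured with a dependency-free sketch: a tool definition, a handler, and a dispatch entry. Everything here is hypothetical, including the `echo_text` tool and the `HANDLERS` table; in gui_agents/mcp_app.py the equivalent pieces would live in the functions decorated with `@mcp_server.list_tools()` and `@mcp_server.call_tool()`, whose exact structure is not shown in this document.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List

# Step 1: describe the tool (name, description, JSON Schema for its input).
TOOL_DEFINITIONS: List[Dict[str, Any]] = [
    {
        "name": "echo_text",
        "description": "Return the given text unchanged (illustrative only)",
        "inputSchema": {
            "type": "object",
            "properties": {"text": {"type": "string"}},
            "required": ["text"],
        },
    },
]


# Step 2: implement the handler.
async def handle_echo_text(arguments: Dict[str, Any]) -> Dict[str, Any]:
    return {"text": arguments["text"]}


# Step 3: register the handler so calls are routed by tool name.
HANDLERS: Dict[str, Callable[[Dict[str, Any]], Awaitable[Dict[str, Any]]]] = {
    "echo_text": handle_echo_text,
}


async def list_tools() -> List[Dict[str, Any]]:
    return TOOL_DEFINITIONS


async def call_tool(name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    handler = HANDLERS.get(name)
    if handler is None:
        raise ValueError(f"unknown tool: {name}")
    return await handler(arguments)
```

Keeping the definition and the handler table side by side makes it harder to advertise a tool that has no handler, or the reverse.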

Performance Considerations

Sandbox Creation: Creating a new sandbox takes ~30-60 seconds
Reuse Sandboxes: Pass sandbox_id to reuse existing sandboxes
Fast Mode: Use fast mode for better performance (roughly 50-70% faster than normal mode)
Concurrent Tasks: Multiple tasks can run in parallel with different sandboxes
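
The last two points can be combined: create sandboxes once, then run one instruction per sandbox in parallel. The sketch below shows the idea with `asyncio.gather`. `run_in_parallel` and `fake_call_tool` are hypothetical; the stand-in exists only so the example runs without a server. Passing `session.call_tool` from the MCP SDK client in its place is the intended use, though whether a single session copes well with concurrent calls is an assumption, and one session per task is the more conservative choice.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict

CallTool = Callable[[str, Dict[str, Any]], Awaitable[Any]]


async def run_in_parallel(call_tool: CallTool, jobs: Dict[str, str]) -> Dict[str, Any]:
    """Run one instruction per sandbox concurrently.

    jobs maps an existing sandbox_id to the instruction to run in it.
    """
    sandbox_ids = list(jobs)
    results = await asyncio.gather(
        *(
            call_tool(
                "execute_instruction",
                {"sandbox_id": sandbox_id, "instruction": jobs[sandbox_id]},
            )
            for sandbox_id in sandbox_ids
        )
    )
    # gather preserves input order, so results line up with sandbox_ids.
    return dict(zip(sandbox_ids, results))


async def fake_call_tool(name: str, arguments: Dict[str, Any]) -> str:
    """Stand-in for session.call_tool, used only to make this sketch runnable."""
    await asyncio.sleep(0)
    return f"{name} ran in {arguments['sandbox_id']}"


if __name__ == "__main__":
    jobs = {"SBX-1": "Open the browser", "SBX-2": "Open calculator"}
    print(asyncio.run(run_in_parallel(fake_call_tool, jobs)))
```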

Monitoring

The server provides metrics through the /health endpoint:

{
  "status": "healthy",
  "server": "gui-agent-mcp-server",
  "active_sandboxes": 2,
  "active_tasks": 1
}
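
A monitoring script might reduce this payload to a one-line status. The helpers below are a sketch that assumes the response always has the shape shown above; `is_healthy` and `summarize_health` are hypothetical names, and missing fields fall back to neutral defaults.

```python
from typing import Any, Dict


def is_healthy(payload: Dict[str, Any]) -> bool:
    """True when the server reports the 'healthy' status shown above."""
    return payload.get("status") == "healthy"


def summarize_health(payload: Dict[str, Any]) -> str:
    """Format the /health payload as a single log-friendly line."""
    status = payload.get("status", "unknown")
    sandboxes = payload.get("active_sandboxes", 0)
    tasks = payload.get("active_tasks", 0)
    return f"{status}: {sandboxes} active sandbox(es), {tasks} active task(s)"


example = {
    "status": "healthy",
    "server": "gui-agent-mcp-server",
    "active_sandboxes": 2,
    "active_tasks": 1,
}
print(summarize_health(example))
# prints healthy: 2 active sandbox(es), 1 active task(s)
```

The payload itself can be fetched with `httpx.get(f"{BASE_URL}/health", headers=headers).json()`, as in the HTTP example earlier.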

Main README - General GUI Agent documentation
Lybic Documentation - Sandbox platform docs
