Cremote MCP Server

This is a Model Context Protocol (MCP) server that exposes cremote's web automation capabilities to LLMs and AI agents. Rather than issuing individual CLI commands, clients call a structured API that tracks state between calls and offers higher-level abstractions.

Features

  • State Management: Automatically tracks current tab, tab history, and iframe context
  • Intelligent Abstractions: High-level tools that combine multiple cremote operations
  • Automatic Screenshots: Optional screenshot capture for debugging and documentation
  • Error Recovery: Error responses include context that helps an LLM diagnose failures
  • Resource Management: Automatic cleanup and connection management

Quick Start for LLMs

For LLM agents: See the comprehensive LLM MCP Guide for detailed usage instructions, examples, and best practices.

Available Tools

1. web_navigate

Navigate to URLs with optional screenshot capture.

{
  "name": "web_navigate",
  "arguments": {
    "url": "https://example.com",
    "screenshot": true,
    "timeout": 10
  }
}

2. web_interact

Interact with web elements (click, fill, submit, upload).

{
  "name": "web_interact",
  "arguments": {
    "action": "fill",
    "selector": "#username",
    "value": "testuser",
    "timeout": 5
  }
}

3. web_extract

Extract data from pages (source, element HTML, JavaScript execution).

{
  "name": "web_extract",
  "arguments": {
    "type": "javascript",
    "code": "document.title",
    "timeout": 5
  }
}

4. web_screenshot

Take screenshots of the current page.

{
  "name": "web_screenshot",
  "arguments": {
    "output": "/tmp/page.png",
    "full_page": true,
    "timeout": 5
  }
}

5. web_manage_tabs

Manage browser tabs (open, close, list, switch).

{
  "name": "web_manage_tabs",
  "arguments": {
    "action": "open",
    "timeout": 5
  }
}

6. web_iframe

Switch iframe context for subsequent operations.

{
  "name": "web_iframe",
  "arguments": {
    "action": "enter",
    "selector": "iframe#payment-form"
  }
}
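Each JSON object above has the shape of the `params` member of an MCP `tools/call` request. The following Go sketch shows one way a client could wrap such an object in a JSON-RPC 2.0 envelope. The type and function names (`rpcRequest`, `toolParams`, `newToolCall`) are illustrative and not part of cremote.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toolParams mirrors the {"name": ..., "arguments": ...} objects shown above.
type toolParams struct {
	Name      string         `json:"name"`
	Arguments map[string]any `json:"arguments"`
}

// rpcRequest is a JSON-RPC 2.0 request envelope.
type rpcRequest struct {
	JSONRPC string     `json:"jsonrpc"`
	ID      int        `json:"id"`
	Method  string     `json:"method"`
	Params  toolParams `json:"params"`
}

// newToolCall (illustrative helper) encodes a tools/call request for one tool.
func newToolCall(id int, name string, args map[string]any) ([]byte, error) {
	return json.Marshal(rpcRequest{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "tools/call",
		Params:  toolParams{Name: name, Arguments: args},
	})
}

func main() {
	req, err := newToolCall(1, "web_navigate", map[string]any{
		"url":        "https://example.com",
		"screenshot": true,
		"timeout":    10,
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(req))
}
```

The request `id` lets the client match the reply to the call, so a real client would normally increment it for each request.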

Installation & Usage

Prerequisites

  1. Cremote daemon must be running:

    cremotedaemon
    
  2. Chrome/Chromium with remote debugging:

    chromium --remote-debugging-port=9222 --user-data-dir=/tmp/chromium-debug
    

Build the MCP Server

cd mcp/
go build -o cremote-mcp .

Configuration

Set environment variables to configure the cremote connection:

export CREMOTE_HOST=localhost
export CREMOTE_PORT=8989
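As a rough illustration of how a Go program might consume these two variables, the sketch below reads them and assembles a host:port address. The fallback values and the helper names (`envOr`, `cremoteAddr`) are assumptions made for the example. The server's actual defaults are not stated here.

```go
package main

import (
	"fmt"
	"net"
	"os"
)

// envOr returns the value of the environment variable key,
// or fallback when the variable is unset or empty.
func envOr(key, fallback string) string {
	if v := os.Getenv(key); v != "" {
		return v
	}
	return fallback
}

// cremoteAddr builds the daemon address from CREMOTE_HOST and CREMOTE_PORT.
// The fallbacks below are assumed for illustration only.
func cremoteAddr() string {
	host := envOr("CREMOTE_HOST", "localhost")
	port := envOr("CREMOTE_PORT", "8989")
	return net.JoinHostPort(host, port)
}

func main() {
	fmt.Println("cremote daemon address:", cremoteAddr())
}
```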

Running with Claude Desktop

Add to your Claude Desktop configuration (~/Library/Application Support/Claude/claude_desktop_config.json on macOS):

{
  "mcpServers": {
    "cremote": {
      "command": "/path/to/cremote-mcp",
      "env": {
        "CREMOTE_HOST": "localhost",
        "CREMOTE_PORT": "8989"
      }
    }
  }
}

Running with Other MCP Clients

The server communicates via JSON-RPC over stdio, so it can be used with any MCP-compatible client:

echo '{"jsonrpc":"2.0","method":"tools/list","params":{},"id":1}' | ./cremote-mcp
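A programmatic client would do the same thing over a pair of pipes. The sketch below assumes newline-delimited JSON-RPC 2.0 messages, which is the usual MCP stdio framing but is not confirmed here for this server. `exchange` and `demoExchange` are illustrative names, and the canned reply is a placeholder rather than real server output. In real use, the writer and reader would come from `StdinPipe` and `StdoutPipe` on an `exec.Cmd` running `./cremote-mcp`.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"fmt"
	"io"
)

// exchange writes one newline-terminated message to w and reads one
// newline-terminated reply from r. Newline framing is an assumption.
func exchange(w io.Writer, r *bufio.Reader, request []byte) ([]byte, error) {
	if _, err := w.Write(request); err != nil {
		return nil, err
	}
	if _, err := w.Write([]byte("\n")); err != nil {
		return nil, err
	}
	line, err := r.ReadBytes('\n')
	if err != nil && len(line) == 0 {
		return nil, err
	}
	return bytes.TrimSpace(line), nil
}

// demoExchange runs exchange against in-memory buffers that stand in for
// the server's stdin and stdout, and returns the id found in the reply.
func demoExchange() (int, error) {
	var stdin bytes.Buffer // stand-in for the server's stdin
	// Placeholder reply, not captured from a real server.
	canned := `{"jsonrpc":"2.0","id":1,"result":{"tools":[]}}` + "\n"
	stdout := bufio.NewReader(bytes.NewBufferString(canned))

	request := []byte(`{"jsonrpc":"2.0","method":"tools/list","params":{},"id":1}`)
	reply, err := exchange(&stdin, stdout, request)
	if err != nil {
		return 0, err
	}
	var msg struct {
		ID int `json:"id"`
	}
	if err := json.Unmarshal(reply, &msg); err != nil {
		return 0, err
	}
	return msg.ID, nil
}

func main() {
	id, err := demoExchange()
	if err != nil {
		panic(err)
	}
	fmt.Println("reply id:", id)
}
```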

Response Format

All tool responses include:

{
  "success": true,
  "data": "...",
  "screenshot": "/tmp/screenshot.png",
  "current_tab": "tab-id-123",
  "tab_history": ["tab-id-123", "tab-id-456"],
  "iframe_mode": false,
  "error": null,
  "metadata": {}
}
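A client written in Go could decode this structure into a typed value. The sketch below is one possible mapping. The field types are inferred from the sample values above, and treating `error` as a nullable string is an assumption. `ToolResponse` and `parseToolResponse` are illustrative names.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ToolResponse mirrors the fields listed above.
// Types are inferred from the sample and may not match the server exactly.
type ToolResponse struct {
	Success    bool            `json:"success"`
	Data       json.RawMessage `json:"data"` // kept raw: its shape depends on the tool
	Screenshot string          `json:"screenshot"`
	CurrentTab string          `json:"current_tab"`
	TabHistory []string        `json:"tab_history"`
	IframeMode bool            `json:"iframe_mode"`
	Error      *string         `json:"error"` // nil when the JSON value is null
	Metadata   map[string]any  `json:"metadata"`
}

// parseToolResponse decodes raw and converts success=false into a Go error.
func parseToolResponse(raw []byte) (*ToolResponse, error) {
	var resp ToolResponse
	if err := json.Unmarshal(raw, &resp); err != nil {
		return nil, fmt.Errorf("decode tool response: %w", err)
	}
	if !resp.Success {
		msg := "unknown error"
		if resp.Error != nil {
			msg = *resp.Error
		}
		return &resp, fmt.Errorf("tool call failed: %s", msg)
	}
	return &resp, nil
}

func main() {
	raw := []byte(`{
	  "success": true,
	  "data": "...",
	  "screenshot": "/tmp/screenshot.png",
	  "current_tab": "tab-id-123",
	  "tab_history": ["tab-id-123", "tab-id-456"],
	  "iframe_mode": false,
	  "error": null,
	  "metadata": {}
	}`)
	resp, err := parseToolResponse(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(resp.CurrentTab, len(resp.TabHistory), resp.IframeMode)
}
```

Checking `success` in one place keeps callers from acting on `data` after a failed call.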

Example Workflow

// 1. Navigate to a page
{
  "name": "web_navigate",
  "arguments": {
    "url": "https://example.com/login",
    "screenshot": true
  }
}

// 2. Fill login form
{
  "name": "web_interact",
  "arguments": {
    "action": "fill",
    "selector": "#username",
    "value": "testuser"
  }
}

{
  "name": "web_interact",
  "arguments": {
    "action": "fill",
    "selector": "#password",
    "value": "password123"
  }
}

// 3. Submit form
{
  "name": "web_interact",
  "arguments": {
    "action": "click",
    "selector": "#login-button"
  }
}

// 4. Extract result
{
  "name": "web_extract",
  "arguments": {
    "type": "javascript",
    "code": "document.querySelector('.welcome-message')?.textContent"
  }
}

// 5. Take final screenshot
{
  "name": "web_screenshot",
  "arguments": {
    "output": "/tmp/login-success.png",
    "full_page": true
  }
}
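The same workflow can be driven from code as an ordered list of tool calls that stops at the first failure. The sketch below is a minimal illustration: `step`, `caller`, `runWorkflow`, and `failOnClick` are hypothetical names, and `caller` stands in for whatever MCP client actually sends the request.

```go
package main

import (
	"errors"
	"fmt"
)

// step is one tool call: the tool name plus its arguments.
type step struct {
	Name string
	Args map[string]any
}

// caller sends a single tool call. It is a hypothetical stand-in
// for a real MCP client.
type caller func(s step) error

// runWorkflow executes steps in order and stops at the first failure.
// It returns the number of steps that completed successfully.
func runWorkflow(call caller, steps []step) (int, error) {
	for i, s := range steps {
		if err := call(s); err != nil {
			return i, fmt.Errorf("step %d (%s): %w", i+1, s.Name, err)
		}
	}
	return len(steps), nil
}

// loginWorkflow returns the six calls from the example above.
func loginWorkflow() []step {
	return []step{
		{"web_navigate", map[string]any{"url": "https://example.com/login", "screenshot": true}},
		{"web_interact", map[string]any{"action": "fill", "selector": "#username", "value": "testuser"}},
		{"web_interact", map[string]any{"action": "fill", "selector": "#password", "value": "password123"}},
		{"web_interact", map[string]any{"action": "click", "selector": "#login-button"}},
		{"web_extract", map[string]any{"type": "javascript", "code": "document.querySelector('.welcome-message')?.textContent"}},
		{"web_screenshot", map[string]any{"output": "/tmp/login-success.png", "full_page": true}},
	}
}

var errNotFound = errors.New("element not found")

// failOnClick is a fake caller that simulates a failing click.
func failOnClick(s step) error {
	if s.Args["action"] == "click" {
		return errNotFound
	}
	return nil
}

func main() {
	done, err := runWorkflow(failOnClick, loginWorkflow())
	fmt.Println("completed steps:", done)
	fmt.Println("error:", err)
}
```

Stopping early matters here because later steps, such as extracting the welcome message, are meaningless if the login click did not succeed.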

Benefits Over CLI

  • State Management: No need to manually track tab IDs
  • Better Error Context: Rich error information for debugging
  • Automatic Screenshots: Built-in screenshot capture for documentation
  • Intelligent Defaults: Smart parameter handling and fallbacks
  • Resource Cleanup: Automatic management of tabs and files
  • Structured Responses: Consistent, parseable response format

Development

To extend the MCP server with new tools:

  1. Add the tool definition to handleToolsList()
  2. Add a case in handleToolCall()
  3. Implement the handler function following the pattern of existing handlers
  4. Update this documentation

The server is designed to be easily extensible while maintaining consistency with the cremote client library.
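The first three steps can be sketched as follows. This is not the server's actual code: the signatures of `handleToolsList()` and `handleToolCall()` are not shown in this document, so the shapes below are assumptions, and `web_scroll` is a hypothetical tool invented for the example.

```go
package main

import "fmt"

// Tool is an assumed, simplified tool definition.
type Tool struct {
	Name        string
	Description string
}

// handleToolsList returns the advertised tools.
// Step 1: add the new tool's definition here.
func handleToolsList() []Tool {
	return []Tool{
		{Name: "web_navigate", Description: "Navigate to URLs with optional screenshot capture."},
		{Name: "web_scroll", Description: "Hypothetical new tool: scroll the current page."},
	}
}

// handleToolCall dispatches a call by tool name.
// Step 2: add a case for the new tool. Existing cases are omitted here.
func handleToolCall(name string, args map[string]any) (string, error) {
	switch name {
	case "web_scroll":
		return handleWebScroll(args)
	default:
		return "", fmt.Errorf("unknown tool: %s", name)
	}
}

// handleWebScroll is step 3: validate arguments, then do the work.
func handleWebScroll(args map[string]any) (string, error) {
	dir, ok := args["direction"].(string)
	if !ok || dir == "" {
		return "", fmt.Errorf("web_scroll: missing required argument %q", "direction")
	}
	// A real handler would call the cremote client at this point.
	return "scrolled " + dir, nil
}

func main() {
	out, err := handleToolCall("web_scroll", map[string]any{"direction": "down"})
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

Validating arguments before touching the browser keeps error messages specific, which fits the goal of giving LLM clients useful error context.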