# Cremote MCP Server

This is a Model Context Protocol (MCP) server that exposes cremote's web automation capabilities to LLMs and AI agents. Instead of requiring CLI commands, the server offers a structured API that maintains state between calls and provides higher-level abstractions.

## 🎉 Complete Web Automation Platform

**30 comprehensive tools** across 6 enhancement phases, providing a complete web automation toolkit for LLM agents:

- **Phase 1**: Element state checking and conditional logic (2 tools)
- **Phase 2**: Enhanced data extraction and batch operations (4 tools)
- **Phase 3**: Form analysis and bulk operations (3 tools)
- **Phase 4**: Page state and metadata tools (4 tools)
- **Phase 5**: Enhanced screenshots and file management (4 tools)
- **Phase 6**: Accessibility tree support (3 tools)
- **Core Tools**: Essential web automation capabilities (10 tools)

### 🚀 **NEW: Multi-Client Support**

The Cremote MCP server now supports **multiple concurrent clients** with isolated browser sessions:

- **Concurrent Agents**: Multiple AI agents can use the same browser simultaneously
- **Session Isolation**: Each client maintains independent browser state (tabs, history, iframe context)
- **Transport Flexibility**: Choose between stdio (single client) and HTTP (multiple clients)
- **Backward Compatible**: Existing stdio clients continue to work unchanged

See the [Multi-Client Guide](MULTI_CLIENT_GUIDE.md) for detailed setup and usage instructions.

## Features

- **State Management**: Automatically tracks current tab, tab history, and iframe context
- **Intelligent Abstractions**: High-level tools that combine multiple cremote operations
- **Batch Operations**: Reduce round trips with bulk operations and multi-selector extraction
- **Form Intelligence**: Complete form analysis and bulk filling capabilities
- **Rich Context**: Page metadata, performance metrics, and content verification
- **Enhanced Screenshots**: Element-specific and metadata-rich screenshot capture
- **File Management**: Bulk file operations and automated cleanup
- **Accessibility Tree**: Chrome accessibility tree interface for semantic understanding
- **Automatic Screenshots**: Optional screenshot capture for debugging and documentation
- **Error Recovery**: Better error handling and context for LLMs
- **Resource Management**: Automatic cleanup and connection management

## Quick Start for LLMs

**For LLM agents**: See the comprehensive [LLM Usage Guide](LLM_USAGE_GUIDE.md) for detailed usage instructions, examples, and best practices.

## Available Tools (30 Total)

### Version Information

#### `version_cremotemcp`
Get version information for MCP server and daemon.

```json
{
  "name": "version_cremotemcp",
  "arguments": {}
}
```

Returns version information for both the MCP server and the connected daemon.

### Core Web Automation Tools (10 tools)

#### 1. `web_navigate_cremotemcp`
Navigate to URLs with optional screenshot capture.

```json
{
  "name": "web_navigate_cremotemcp",
  "arguments": {
    "url": "https://example.com",
    "screenshot": true,
    "timeout": 10
  }
}
```

#### 2. `web_interact_cremotemcp`
Interact with web elements (click, fill, submit, upload, select).

```json
{
  "name": "web_interact_cremotemcp",
  "arguments": {
    "action": "fill",
    "selector": "#username",
    "value": "testuser",
    "timeout": 5
  }
}
```

For select dropdowns:
```json
{
  "name": "web_interact_cremotemcp",
  "arguments": {
    "action": "select",
    "selector": "#country",
    "value": "United States",
    "timeout": 5
  }
}
```

#### 3. `web_extract_cremotemcp`
Extract data from pages (source, element HTML, JavaScript execution).

```json
{
  "name": "web_extract_cremotemcp",
  "arguments": {
    "type": "javascript",
    "code": "document.title",
    "timeout": 5
  }
}
```

#### 4. `web_screenshot_cremotemcp`
Take screenshots of the current page.

```json
{
  "name": "web_screenshot_cremotemcp",
  "arguments": {
    "output": "/tmp/page.png",
    "full_page": true,
    "timeout": 5
  }
}
```

#### 5. `web_manage_tabs_cremotemcp`
Manage browser tabs (open, close, list, switch).

```json
{
  "name": "web_manage_tabs_cremotemcp",
  "arguments": {
    "action": "open",
    "timeout": 5
  }
}
```

#### 6. `web_iframe_cremotemcp`
Switch iframe context for subsequent operations.

```json
{
  "name": "web_iframe_cremotemcp",
  "arguments": {
    "action": "enter",
    "selector": "iframe#payment-form"
  }
}
```

#### 7. `file_upload_cremotemcp`
Upload files from client to container for use in form uploads.

```json
{
  "name": "file_upload_cremotemcp",
  "arguments": {
    "local_path": "/local/file.txt",
    "container_path": "/tmp/file.txt"
  }
}
```

**Note**: The CLI `cremote upload-file` command now automatically transfers files to the daemon container first, making file uploads seamless even when the daemon runs in a container.

#### 8. `file_download_cremotemcp`
Download files from container to client (e.g., downloaded files from browser).

```json
{
  "name": "file_download_cremotemcp",
  "arguments": {
    "container_path": "/tmp/downloaded-file.pdf",
    "local_path": "/local/downloaded-file.pdf"
  }
}
```

#### 9. `console_logs_cremotemcp`
Get console logs from the browser tab.

```json
{
  "name": "console_logs_cremotemcp",
  "arguments": {
    "tab": "tab-123",
    "timeout": 5
  }
}
```

#### 10. `console_command_cremotemcp`
Execute commands in the browser console.

```json
{
  "name": "console_command_cremotemcp",
  "arguments": {
    "command": "document.getElementById('test').innerHTML = 'Hello World'",
    "tab": "tab-123",
    "timeout": 5
  }
}
```

### Phase 1: Element State and Checking Tools (2 tools)

#### 11. `web_element_check_cremotemcp`
Check element existence, visibility, enabled state, and other properties without interaction.

```json
{
  "name": "web_element_check_cremotemcp",
  "arguments": {
    "selector": "#submit-button",
    "check_type": "all",
    "timeout": 5
  }
}
```

**Check Types:**
- `exists`: Check if element exists in DOM
- `visible`: Check if element is visible (not hidden)
- `enabled`: Check if element is enabled (not disabled)
- `focused`: Check if element has focus
- `selected`: Check if element is selected (checkboxes, radio buttons)
- `all`: Check all states above

**Response includes:**
```json
{
  "exists": true,
  "visible": true,
  "enabled": false,
  "focused": false,
  "selected": true,
  "count": 1
}
```
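
A client might gate an interaction on these flags. The sketch below assumes the response has already been parsed into a dict with the keys shown above; `is_actionable` is a hypothetical helper name, not part of cremote.

```python
def is_actionable(check: dict) -> bool:
    """Return True when the element exists, is visible, and is enabled."""
    # Missing keys count as False, so a partial response
    # (for example from check_type "exists") never passes by accident.
    return all(check.get(key, False) for key in ("exists", "visible", "enabled"))


# The example response above: present and visible, but disabled.
response = {
    "exists": True, "visible": True, "enabled": False,
    "focused": False, "selected": True, "count": 1,
}
print(is_actionable(response))  # False
```

Treating absent keys as `False` is a deliberately conservative choice: the caller only proceeds when all three states were reported and true.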

#### 12. `web_element_attributes_cremotemcp`
Get element attributes, properties, and computed styles.

```json
{
  "name": "web_element_attributes_cremotemcp",
  "arguments": {
    "selector": "#user-profile",
    "attributes": "all",
    "timeout": 5
  }
}
```

**Attribute Options:**
- `all`: Get common attributes, properties, and styles
- `"id,class,href"`: Comma-separated list of specific attributes
- `"style_display,style_color"`: Computed styles (prefix with `style_`)
- `"prop_textContent,prop_value"`: JavaScript properties (prefix with `prop_`)

**Example Response:**
```json
{
  "id": "user-profile",
  "class": "profile-card active",
  "data-user-id": "12345",
  "textContent": "John Doe",
  "style_display": "block",
  "style_color": "rgb(0, 0, 0)"
}
```
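
The prefix conventions above can be applied mechanically when building the `attributes` string. This is a minimal sketch: `build_attribute_spec` is a hypothetical helper, and it assumes that plain attributes, `style_` entries, and `prop_` entries may be mixed in one comma-separated list.

```python
def build_attribute_spec(attributes=(), styles=(), props=()):
    """Join names into the comma-separated string passed as `attributes`."""
    parts = list(attributes)
    parts += [f"style_{name}" for name in styles]  # computed styles
    parts += [f"prop_{name}" for name in props]    # JavaScript properties
    return ",".join(parts)


spec = build_attribute_spec(["id", "class"], styles=["display"], props=["textContent"])
print(spec)  # id,class,style_display,prop_textContent
```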

### Phase 2: Enhanced Data Extraction Tools (4 tools)

#### 13. `web_extract_multiple_cremotemcp`
Extract data from multiple selectors in a single call for improved efficiency.

```json
{
  "name": "web_extract_multiple_cremotemcp",
  "arguments": {
    "selectors": {
      "title": "h1",
      "price": ".price",
      "description": ".product-description"
    },
    "timeout": 5
  }
}
```

#### 14. `web_extract_links_cremotemcp`
Extract all links from a page with powerful filtering options.

```json
{
  "name": "web_extract_links_cremotemcp",
  "arguments": {
    "container_selector": "nav",
    "href_pattern": "https://.*",
    "text_pattern": ".*Download.*",
    "timeout": 5
  }
}
```

#### 15. `web_extract_table_cremotemcp`
Extract table data as structured JSON with optional header processing.

```json
{
  "name": "web_extract_table_cremotemcp",
  "arguments": {
    "selector": "#data-table",
    "include_headers": true,
    "timeout": 5
  }
}
```

#### 16. `web_extract_text_cremotemcp`
Extract text content with optional pattern matching and different extraction types.

```json
{
  "name": "web_extract_text_cremotemcp",
  "arguments": {
    "selector": ".content",
    "pattern": "\\d{3}-\\d{3}-\\d{4}",
    "extract_type": "textContent",
    "timeout": 5
  }
}
```
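
The doubled backslashes in `pattern` are JSON escaping, not part of the regular expression. The sketch below decodes the example arguments and tries the pattern against a made-up sample string. It uses Python's `re` module purely for illustration; the regex dialect the server actually applies is not specified here and may differ.

```python
import json
import re

# The `arguments` object from the example above, as raw JSON text.
raw = r'{"selector": ".content", "pattern": "\\d{3}-\\d{3}-\\d{4}", "extract_type": "textContent"}'
args = json.loads(raw)

# JSON decoding turns each doubled backslash into a single one.
print(args["pattern"])  # \d{3}-\d{3}-\d{4}

# Hypothetical page text, used only to show what the pattern matches.
sample_text = "Sales: 555-123-4567, support: 555-987-6543"
matches = re.findall(args["pattern"], sample_text)
print(matches)  # ['555-123-4567', '555-987-6543']
```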

### Phase 3: Form Analysis and Bulk Operations (3 tools)

#### 17. `web_form_analyze_cremotemcp`
Analyze forms completely to understand their structure, fields, and submission requirements.

```json
{
  "name": "web_form_analyze_cremotemcp",
  "arguments": {
    "selector": "#registration-form",
    "timeout": 10
  }
}
```

#### 18. `web_interact_multiple_cremotemcp`
Perform multiple interactions in a single call for efficient batch operations.

```json
{
  "name": "web_interact_multiple_cremotemcp",
  "arguments": {
    "interactions": [
      {"selector": "#username", "action": "fill", "value": "testuser"},
      {"selector": "#password", "action": "fill", "value": "testpass"},
      {"selector": "#remember-me", "action": "check"},
      {"selector": "#login-btn", "action": "click"}
    ],
    "timeout": 10
  }
}
```
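
Because a batch is sent in one call, it can help to sanity-check it on the client first. The following sketch is an assumption-laden illustration: `validate_interactions` is a hypothetical helper, and the rule that `fill` and `select` need a `value` is inferred from the examples in this README rather than from the server's schema.

```python
# Actions assumed to need a "value" field, based on the examples in this README.
ACTIONS_WITH_VALUE = {"fill", "select"}


def validate_interactions(interactions):
    """Return a list of problems; an empty list means the batch looks well-formed."""
    problems = []
    for index, step in enumerate(interactions):
        for key in ("selector", "action"):
            if not step.get(key):
                problems.append(f"step {index}: missing {key}")
        if step.get("action") in ACTIONS_WITH_VALUE and "value" not in step:
            problems.append(f"step {index}: action {step['action']!r} needs a value")
    return problems


batch = [
    {"selector": "#username", "action": "fill", "value": "testuser"},
    {"selector": "#password", "action": "fill"},  # value forgotten
    {"selector": "#login-btn", "action": "click"},
]
print(validate_interactions(batch))  # ["step 1: action 'fill' needs a value"]
```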

#### 19. `web_form_fill_bulk_cremotemcp`
Fill entire forms with key-value pairs in a single operation.

```json
{
  "name": "web_form_fill_bulk_cremotemcp",
  "arguments": {
    "form_selector": "#contact-form",
    "fields": {
      "name": "John Doe",
      "email": "john@example.com",
      "message": "Hello, this is a test message."
    },
    "timeout": 10
  }
}
```

### Phase 4: Page State and Metadata Tools (4 tools)

#### 20. `web_page_info_cremotemcp`
Get comprehensive page metadata and state information.

```json
{
  "name": "web_page_info_cremotemcp",
  "arguments": {
    "tab": "tab-123",
    "timeout": 5
  }
}
```

Returns detailed page information including title, URL, loading state, domain, protocol, and browser status.

#### 21. `web_viewport_info_cremotemcp`
Get viewport and scroll information.

```json
{
  "name": "web_viewport_info_cremotemcp",
  "arguments": {
    "tab": "tab-123",
    "timeout": 5
  }
}
```

Returns viewport dimensions, scroll position, device pixel ratio, and orientation.

#### 22. `web_performance_metrics_cremotemcp`
Get page performance metrics.

```json
{
  "name": "web_performance_metrics_cremotemcp",
  "arguments": {
    "tab": "tab-123",
    "timeout": 5
  }
}
```

Returns performance data including load times, resource counts, and memory usage.

#### 23. `web_content_check_cremotemcp`
Check for specific content types and loading states.

```json
{
  "name": "web_content_check_cremotemcp",
  "arguments": {
    "type": "images",
    "tab": "tab-123",
    "timeout": 5
  }
}
```

Supported content types: `images`, `scripts`, `styles`, `forms`, `links`, `iframes`, `errors`.

### Phase 5: Enhanced Screenshot and File Management (4 tools)

#### 24. `web_screenshot_element_cremotemcp`
Take a screenshot of a specific element on the page.

```json
{
  "name": "web_screenshot_element_cremotemcp",
  "arguments": {
    "selector": "#main-content",
    "output": "/tmp/element-screenshot.png",
    "tab": "tab-123",
    "timeout": 5
  }
}
```

Automatically scrolls the element into view and captures a screenshot of just that element.

#### 25. `web_screenshot_enhanced_cremotemcp`
Take an enhanced screenshot with metadata.

```json
{
  "name": "web_screenshot_enhanced_cremotemcp",
  "arguments": {
    "output": "/tmp/enhanced-screenshot.png",
    "full_page": true,
    "tab": "tab-123",
    "timeout": 5
  }
}
```

Returns screenshot metadata including timestamp, URL, title, viewport size, and file information.

#### 26. `file_operations_bulk_cremotemcp`
Perform bulk file operations (upload/download multiple files).

```json
{
  "name": "file_operations_bulk_cremotemcp",
  "arguments": {
    "operation": "upload",
    "files": [
      {
        "local_path": "/local/file1.txt",
        "container_path": "/tmp/file1.txt"
      },
      {
        "local_path": "/local/file2.txt",
        "container_path": "/tmp/file2.txt"
      }
    ],
    "timeout": 30
  }
}
```

Supports both "upload" and "download" operations with detailed success/failure reporting.
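
The `files` array can be generated from a list of local paths. This is a minimal sketch: `build_upload_request` is a hypothetical helper, and placing every file directly under `/tmp` in the container is an assumption taken from the example above.

```python
import posixpath


def build_upload_request(local_paths, container_dir="/tmp", timeout=30):
    """Build a bulk upload call that keeps each file's base name in the container."""
    files = [
        {
            "local_path": path,
            "container_path": posixpath.join(container_dir, posixpath.basename(path)),
        }
        for path in local_paths
    ]
    return {
        "name": "file_operations_bulk_cremotemcp",
        "arguments": {"operation": "upload", "files": files, "timeout": timeout},
    }


request = build_upload_request(["/local/file1.txt", "/local/file2.txt"])
print(request["arguments"]["files"][0]["container_path"])  # /tmp/file1.txt
```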

#### 27. `file_management_cremotemcp`
Manage files (cleanup, list, get info).

```json
{
  "name": "file_management_cremotemcp",
  "arguments": {
    "operation": "cleanup",
    "pattern": "/tmp/cremote-*",
    "max_age": "24"
  }
}
```

Operations: `cleanup` (remove old files), `list` (list files), `info` (get file details).

## 🎉 Complete Enhancement Summary

All 5 phases of the original MCP enhancement plan have been implemented. Together with the 10 core tools they provide **27 tools**; the three Phase 6 accessibility tree tools bring the total to 30. The five phases cover the following capabilities:

### ✅ Phase 1: Element State and Checking (2 tools)
**Enables conditional logic without timing issues**
- `web_element_check_cremotemcp`: Check existence, visibility, and enabled state; count matching elements
- `web_element_attributes_cremotemcp`: Get attributes, properties, computed styles

**Benefits**: LLMs can make decisions based on page state, prevent errors from trying to interact with non-existent elements, enable conditional workflows.

### ✅ Phase 2: Enhanced Data Extraction (4 tools)
**Dramatically improves data gathering efficiency**
- `web_extract_multiple_cremotemcp`: Extract from multiple selectors in one call
- `web_extract_links_cremotemcp`: Extract all links with filtering options
- `web_extract_table_cremotemcp`: Extract table data as structured JSON
- `web_extract_text_cremotemcp`: Extract text with pattern matching

**Benefits**: Reduces multiple round trips to single calls, provides structured data ready for LLM processing, enables comprehensive page analysis.

### ✅ Phase 3: Form Analysis and Bulk Operations (3 tools)
**Streamlines form handling by replacing many single-field calls with one or two bulk calls**
- `web_form_analyze_cremotemcp`: Analyze forms completely
- `web_interact_multiple_cremotemcp`: Batch interactions
- `web_form_fill_bulk_cremotemcp`: Fill entire forms with key-value pairs

**Benefits**: Complete forms in 1-2 calls instead of 10+, form intelligence provides complete understanding before interaction, error prevention through field validation.

### ✅ Phase 4: Page State and Metadata Tools (4 tools)
**Provides rich context about page state for better debugging and monitoring**
- `web_page_info_cremotemcp`: Get page metadata and loading state
- `web_viewport_info_cremotemcp`: Get viewport and scroll information
- `web_performance_metrics_cremotemcp`: Get performance data
- `web_content_check_cremotemcp`: Check for specific content types

**Benefits**: Better debugging and monitoring capabilities, performance optimization insights, content loading verification, rich page state context for LLM decision making.

### ✅ Phase 5: Enhanced Screenshot and File Management (4 tools)
**Improves debugging and file handling**
- `web_screenshot_element_cremotemcp`: Screenshot specific elements
- `web_screenshot_enhanced_cremotemcp`: Screenshots with metadata
- `file_operations_bulk_cremotemcp`: Bulk file operations
- `file_management_cremotemcp`: Temporary file cleanup

**Benefits**: Better debugging with targeted screenshots, improved file handling workflows, automatic resource management, enhanced visual debugging capabilities.

## Key Benefits for LLM Agents

### 🚀 **Efficiency Gains**
- **Fewer Form Calls**: Complete forms in 1-2 calls instead of 10+ individual interactions
- **Batch Operations**: Multiple data extractions and interactions in single calls
- **Reduced Round Trips**: Comprehensive tools minimize API call overhead

### 🧠 **Intelligence & Context**
- **Conditional Logic**: Element checking enables smart decision making without timing issues
- **Rich Page Context**: Complete page state, performance metrics, and content verification
- **Form Intelligence**: Complete form analysis before interaction prevents errors

### 🛠 **Enhanced Capabilities**
- **Visual Debugging**: Element-specific screenshots and enhanced metadata
- **File Management**: Bulk operations and automated cleanup
- **Error Prevention**: State checking and validation before actions
- **Resource Management**: Automatic cleanup and connection handling

## Installation & Usage

### Prerequisites

1. **Cremote daemon must be running**:
   ```bash
   cremotedaemon
   ```

2. **Chrome/Chromium with remote debugging**:
   ```bash
   chromium --remote-debugging-port=9222 --user-data-dir=/tmp/chromium-debug
   ```

### Build the MCP Server

```bash
cd mcp/
go build -o cremote-mcp .
```

### Configuration

#### Basic Configuration (Single Client - stdio)

Set environment variables to configure the cremote connection:

```bash
export CREMOTE_HOST=localhost
export CREMOTE_PORT=8989
export CREMOTE_TRANSPORT=stdio  # Default
```

#### Multi-Client Configuration (HTTP Transport)

For multiple concurrent clients:

```bash
export CREMOTE_HOST=localhost
export CREMOTE_PORT=8989
export CREMOTE_TRANSPORT=http
export CREMOTE_HTTP_HOST=localhost
export CREMOTE_HTTP_PORT=8990
```

#### Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `CREMOTE_TRANSPORT` | `stdio` | Transport mode: `stdio` or `http` |
| `CREMOTE_HOST` | `localhost` | Cremote daemon host |
| `CREMOTE_PORT` | `8989` | Cremote daemon port |
| `CREMOTE_HTTP_HOST` | `localhost` | HTTP server host (HTTP mode only) |
| `CREMOTE_HTTP_PORT` | `8990` | HTTP server port (HTTP mode only) |
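
A launcher script might want to resolve the same settings before starting the server. The sketch below only mirrors the defaults documented in the table; it is not the server's own parsing code, and `resolve_config` is a hypothetical helper.

```python
import os

# Defaults copied from the table above.
DEFAULTS = {
    "CREMOTE_TRANSPORT": "stdio",
    "CREMOTE_HOST": "localhost",
    "CREMOTE_PORT": "8989",
    "CREMOTE_HTTP_HOST": "localhost",
    "CREMOTE_HTTP_PORT": "8990",
}


def resolve_config(environ=None):
    """Return the effective settings, falling back to the documented defaults."""
    environ = os.environ if environ is None else environ
    config = {name: environ.get(name, default) for name, default in DEFAULTS.items()}
    if config["CREMOTE_TRANSPORT"] not in ("stdio", "http"):
        raise ValueError(f"unsupported transport: {config['CREMOTE_TRANSPORT']}")
    return config


# Passing an explicit mapping keeps the example independent of the real environment.
print(resolve_config({"CREMOTE_TRANSPORT": "http"})["CREMOTE_HTTP_PORT"])  # 8990
```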

### Running with Claude Desktop

Add to your Claude Desktop configuration (`~/Library/Application Support/Claude/claude_desktop_config.json` on macOS):

```json
{
  "mcpServers": {
    "cremote": {
      "command": "/path/to/cremote-mcp",
      "env": {
        "CREMOTE_HOST": "localhost",
        "CREMOTE_PORT": "8989"
      }
    }
  }
}
```

### Running with Other MCP Clients

The server communicates via JSON-RPC over stdio, so it can be used with any MCP-compatible client:

```bash
echo '{"jsonrpc":"2.0","method":"tools/list","params":{},"id":1}' | ./cremote-mcp
```
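
The `name`/`arguments` objects shown throughout this README are the `params` of an MCP `tools/call` request. As a rough sketch, a client could wrap them in a JSON-RPC 2.0 envelope like this; `build_tool_call` is a hypothetical helper name, and a spec-compliant MCP session normally begins with an `initialize` exchange that is not shown here.

```python
import json


def build_tool_call(name, arguments, request_id=1):
    """Wrap a tool name and its arguments in a `tools/call` JSON-RPC request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


request = build_tool_call(
    "web_navigate_cremotemcp",
    {"url": "https://example.com", "screenshot": True},
)
# json.dumps emits no newlines by default, so the message fits on one line.
line = json.dumps(request)
print(line)
```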

## Response Format

All tool responses include:

```json
{
  "success": true,
  "data": "...",
  "screenshot": "/tmp/screenshot.png",
  "current_tab": "tab-id-123",
  "tab_history": ["tab-id-123", "tab-id-456"],
  "iframe_mode": false,
  "error": null,
  "metadata": {}
}
```
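
A thin wrapper can turn this envelope into either a value or an exception. The sketch assumes the response has been parsed into a dict with the fields shown above; `ToolError` and `unwrap` are hypothetical names.

```python
class ToolError(Exception):
    """Raised when a tool response reports failure."""


def unwrap(response: dict):
    """Return the `data` field, or raise ToolError carrying the reported error."""
    if not response.get("success"):
        raise ToolError(response.get("error") or "unknown error")
    return response.get("data")


ok = {"success": True, "data": "Example Domain", "error": None}
print(unwrap(ok))  # Example Domain

failed = {"success": False, "data": None, "error": "element not found"}
try:
    unwrap(failed)
except ToolError as exc:
    print(f"tool failed: {exc}")  # tool failed: element not found
```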

## Example Workflows

### Basic Login Workflow
```json
// 1. Navigate to a page
{
  "name": "web_navigate_cremotemcp",
  "arguments": {
    "url": "https://example.com/login",
    "screenshot": true
  }
}

// 2. Check if login form exists
{
  "name": "web_element_check_cremotemcp",
  "arguments": {
    "selector": "#login-form",
    "check_type": "exists"
  }
}

// 3. Fill login form using bulk operations
{
  "name": "web_form_fill_bulk_cremotemcp",
  "arguments": {
    "form_selector": "#login-form",
    "fields": {
      "username": "testuser",
      "password": "password123"
    }
  }
}

// 4. Submit and verify
{
  "name": "web_interact_cremotemcp",
  "arguments": {
    "action": "click",
    "selector": "#login-button"
  }
}

// 5. Extract multiple results at once
{
  "name": "web_extract_multiple_cremotemcp",
  "arguments": {
    "selectors": {
      "welcome_message": ".welcome-message",
      "user_name": ".user-profile .name",
      "last_login": ".user-info .last-login"
    }
  }
}

// 6. Take enhanced screenshot with metadata
{
  "name": "web_screenshot_enhanced_cremotemcp",
  "arguments": {
    "output": "/tmp/login-success.png",
    "full_page": true
  }
}
```
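
The same steps can be driven from client code, with the element check deciding whether to continue. This is only a sketch: `call_tool(name, arguments)` is a hypothetical callable supplied by whatever MCP client is in use, and it is assumed to return the parsed result of each call.

```python
def login(call_tool, url, username, password):
    """Run the login steps, stopping early if the form is missing."""
    call_tool("web_navigate_cremotemcp", {"url": url, "screenshot": True})

    check = call_tool(
        "web_element_check_cremotemcp",
        {"selector": "#login-form", "check_type": "exists"},
    )
    if not check.get("exists"):
        return False  # nothing to fill; the caller decides what to do next

    call_tool(
        "web_form_fill_bulk_cremotemcp",
        {"form_selector": "#login-form",
         "fields": {"username": username, "password": password}},
    )
    call_tool("web_interact_cremotemcp", {"action": "click", "selector": "#login-button"})
    return True


# A stub stands in for a real client so the control flow can be exercised offline.
calls = []

def fake_call_tool(name, arguments):
    calls.append(name)
    return {"exists": False}

print(login(fake_call_tool, "https://example.com/login", "testuser", "password123"))  # False
```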

### Advanced E-commerce Data Extraction Workflow
```json
// 1. Navigate and check page state
{
  "name": "web_navigate_cremotemcp",
  "arguments": {
    "url": "https://shop.example.com/products",
    "screenshot": true
  }
}

// 2. Get page performance metrics
{
  "name": "web_performance_metrics_cremotemcp",
  "arguments": {}
}

// 3. Extract all product data in one call
{
  "name": "web_extract_multiple_cremotemcp",
  "arguments": {
    "selectors": {
      "product_titles": ".product-card h3",
      "prices": ".product-card .price",
      "ratings": ".product-card .rating",
      "availability": ".product-card .stock-status"
    }
  }
}

// 4. Extract all product links with filtering
{
  "name": "web_extract_links_cremotemcp",
  "arguments": {
    "container_selector": ".product-grid",
    "href_pattern": ".*/product/.*",
    "text_pattern": ".*"
  }
}

// 5. Check if more products are loading
{
  "name": "web_content_check_cremotemcp",
  "arguments": {
    "type": "scripts"
  }
}
```

### Phase 6: Accessibility Tree Support (3 tools)

#### 28. `get_accessibility_tree_cremotemcp`
Get the accessibility tree for a page, either in full or limited to a given depth.

```json
{
  "name": "get_accessibility_tree_cremotemcp",
  "arguments": {
    "tab": "optional-tab-id",
    "depth": 3,
    "timeout": 10
  }
}
```

#### 29. `get_partial_accessibility_tree_cremotemcp`
Get accessibility tree for a specific element and its relatives.

```json
{
  "name": "get_partial_accessibility_tree_cremotemcp",
  "arguments": {
    "selector": "form",
    "tab": "optional-tab-id",
    "fetch_relatives": true,
    "timeout": 10
  }
}
```

#### 30. `query_accessibility_tree_cremotemcp`
Query accessibility tree for nodes matching specific criteria.

```json
{
  "name": "query_accessibility_tree_cremotemcp",
  "arguments": {
    "tab": "optional-tab-id",
    "selector": "form",
    "accessible_name": "Submit",
    "role": "button",
    "timeout": 10
  }
}
```
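
A query usually sets only some of these criteria. The sketch below builds the call while leaving out anything not supplied. It assumes that `tab`, `selector`, `accessible_name`, and `role` are each optional, which the `optional-tab-id` placeholder suggests for `tab` but which is not confirmed for the others; `build_ax_query` is a hypothetical helper.

```python
def build_ax_query(selector=None, accessible_name=None, role=None, tab=None, timeout=10):
    """Build a query call containing only the criteria that were provided."""
    candidates = {
        "tab": tab,
        "selector": selector,
        "accessible_name": accessible_name,
        "role": role,
    }
    arguments = {key: value for key, value in candidates.items() if value is not None}
    arguments["timeout"] = timeout
    return {"name": "query_accessibility_tree_cremotemcp", "arguments": arguments}


query = build_ax_query(accessible_name="Submit", role="button")
print(query["arguments"])  # {'accessible_name': 'Submit', 'role': 'button', 'timeout': 10}
```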

## Benefits Over CLI

### 🎯 **Enhanced Efficiency**
- **State Management**: No need to manually track tab IDs
- **Batch Operations**: Bulk form filling and multi-selector extraction replace many individual calls
- **Intelligent Defaults**: Smart parameter handling and fallbacks
- **Resource Cleanup**: Automatic management of tabs and files

### 🔍 **Better Intelligence**
- **Conditional Logic**: Element checking enables smart decision making
- **Rich Context**: Page state, performance metrics, and content verification
- **Form Intelligence**: Complete form analysis before interaction
- **Error Prevention**: State validation before actions

### 🛠 **Advanced Capabilities**
- **Enhanced Screenshots**: Element-specific and metadata-rich capture
- **File Management**: Bulk operations and automated cleanup
- **Better Error Context**: Rich error information for debugging
- **Structured Responses**: Consistent, parseable response format

## 🎉 Production Ready

This comprehensive web automation platform is **production ready** with:

- **30 Tools**: Complete coverage of web automation needs
- **6 Enhancement Phases**: Systematic capability building from basic to advanced
- **Extensive Testing**: All tools validated and documented
- **LLM Optimized**: Designed specifically for AI agent workflows
- **Backward Compatible**: All existing tools continue to work unchanged

### 📊 **Capability Matrix**
| Category | Tools | Key Benefits |
|----------|-------|--------------|
| **Core Web Automation** | 10 tools | Navigation, interaction, extraction, screenshots, tabs, iframes, files, console |
| **Element Intelligence** | 2 tools | Conditional logic, state checking, attribute inspection |
| **Data Extraction** | 4 tools | Batch extraction, structured data, pattern matching, table processing |
| **Form Automation** | 3 tools | Form analysis, bulk filling, batch interactions |
| **Page Intelligence** | 4 tools | Page state, performance metrics, content verification, viewport info |
| **Enhanced Capabilities** | 4 tools | Element screenshots, enhanced metadata, bulk file ops, file management |
| **Accessibility Tree** | 3 tools | Semantic understanding, accessibility testing, screen reader simulation |

## Development

To extend the MCP server with new tools:

1. Add the tool definition to `handleToolsList()`
2. Add a case in `handleToolCall()`
3. Implement the handler function following the pattern of existing handlers
4. Update this documentation

The server is designed to be easily extensible while maintaining consistency with the cremote client library.

---

**🚀 Ready for Production**: Complete web automation platform with 30 tools across 6 enhancement phases, optimized for LLM agents and production workflows.