# Cremote MCP Server

This is a Model Context Protocol (MCP) server that exposes cremote's web automation capabilities to LLMs and AI agents. Instead of using CLI commands, this server provides a structured API that maintains state and provides intelligent abstractions.

## 🎉 Complete Web Automation Platform

**30 comprehensive tools** across 6 enhancement phases, providing a complete web automation toolkit for LLM agents.

### 🚀 **NEW: Multi-Client Support**

The Cremote MCP server now supports **multiple concurrent clients** with isolated browser sessions:

- **Concurrent Agents**: Multiple AI agents can use the same browser simultaneously
- **Session Isolation**: Each client maintains independent browser state (tabs, history, iframe context)
- **Transport Flexibility**: Choose between stdio (single client) or HTTP (multiple clients)
- **Backward Compatible**: Existing stdio clients continue to work unchanged

See the [Multi-Client Guide](MULTI_CLIENT_GUIDE.md) for detailed setup and usage instructions.

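As a rough sketch of how a launcher script might choose between the two transports, the Python snippet below resolves the `CREMOTE_*` variables from the Environment Variables table in this README against their documented defaults. The helper name `resolve_config` and the shape of the returned dictionary are assumptions made for illustration; they are not part of cremote.

```python
import os

# Defaults copied from the Environment Variables table in this README.
DEFAULTS = {
    "CREMOTE_TRANSPORT": "stdio",
    "CREMOTE_HOST": "localhost",
    "CREMOTE_PORT": "8989",
    "CREMOTE_HTTP_HOST": "localhost",
    "CREMOTE_HTTP_PORT": "8990",
}


def resolve_config(env=None):
    """Hypothetical helper: overlay environment values on the documented defaults."""
    env = os.environ if env is None else env
    cfg = {key: env.get(key, default) for key, default in DEFAULTS.items()}

    transport = cfg["CREMOTE_TRANSPORT"]
    if transport not in ("stdio", "http"):
        raise ValueError(f"unsupported CREMOTE_TRANSPORT: {transport!r}")

    result = {
        "transport": transport,
        "daemon": (cfg["CREMOTE_HOST"], int(cfg["CREMOTE_PORT"])),
    }
    # The HTTP listener settings only matter in HTTP (multi-client) mode.
    if transport == "http":
        result["http"] = (cfg["CREMOTE_HTTP_HOST"], int(cfg["CREMOTE_HTTP_PORT"]))
    return result
```

Calling `resolve_config({"CREMOTE_TRANSPORT": "http"})` yields both the daemon address and the HTTP listener address, while an empty environment falls back to single-client stdio mode.
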

The 30 tools are organized into the following groups:

- **Phase 1**: Element state checking and conditional logic (2 tools)
- **Phase 2**: Enhanced data extraction and batch operations (4 tools)
- **Phase 3**: Form analysis and bulk operations (3 tools)
- **Phase 4**: Page state and metadata tools (4 tools)
- **Phase 5**: Enhanced screenshots and file management (4 tools)
- **Phase 6**: Accessibility tree support (3 tools)
- **Core Tools**: Essential web automation capabilities (10 tools)

## Features

- **State Management**: Automatically tracks current tab, tab history, and iframe context
- **Intelligent Abstractions**: High-level tools that combine multiple cremote operations
- **Batch Operations**: Reduce round trips with bulk operations and multi-selector extraction
- **Form Intelligence**: Complete form analysis and bulk filling capabilities
- **Rich Context**: Page metadata, performance metrics, and content verification
- **Enhanced Screenshots**: Element-specific and metadata-rich screenshot capture
- **File Management**: Bulk file operations and automated cleanup
- **Accessibility Tree**: Chrome accessibility tree interface for semantic understanding
- **Automatic Screenshots**: Optional screenshot capture for debugging and documentation
- **Error Recovery**: Better error handling and context for LLMs
- **Resource Management**: Automatic cleanup and connection management

## Quick Start for LLMs

**For LLM agents**: See the comprehensive [LLM Usage Guide](LLM_USAGE_GUIDE.md) for detailed usage instructions, examples, and best practices.

## Available Tools (30 Total)

### Version Information

#### `version_cremotemcp`

Get version information for MCP server and daemon.

```json
{
  "name": "version_cremotemcp",
  "arguments": {}
}
```

Returns version information for both the MCP server and the connected daemon.

### Core Web Automation Tools (10 tools)

#### 1. `web_navigate_cremotemcp`

Navigate to URLs with optional screenshot capture.

```json
{
  "name": "web_navigate_cremotemcp",
  "arguments": {
    "url": "https://example.com",
    "screenshot": true,
    "timeout": 10
  }
}
```

#### 2. `web_interact_cremotemcp`

Interact with web elements (click, fill, submit, upload, select).

```json
{
  "name": "web_interact_cremotemcp",
  "arguments": {
    "action": "fill",
    "selector": "#username",
    "value": "testuser",
    "timeout": 5
  }
}
```

For select dropdowns:

```json
{
  "name": "web_interact_cremotemcp",
  "arguments": {
    "action": "select",
    "selector": "#country",
    "value": "United States",
    "timeout": 5
  }
}
```

#### 3. `web_extract_cremotemcp`

Extract data from pages (source, element HTML, JavaScript execution).

```json
{
  "name": "web_extract_cremotemcp",
  "arguments": {
    "type": "javascript",
    "code": "document.title",
    "timeout": 5
  }
}
```

#### 4. `web_screenshot_cremotemcp`

Take screenshots of the current page.

```json
{
  "name": "web_screenshot_cremotemcp",
  "arguments": {
    "output": "/tmp/page.png",
    "full_page": true,
    "timeout": 5
  }
}
```

#### 5. `web_manage_tabs_cremotemcp`

Manage browser tabs (open, close, list, switch).

```json
{
  "name": "web_manage_tabs_cremotemcp",
  "arguments": {
    "action": "open",
    "timeout": 5
  }
}
```

#### 6. `web_iframe_cremotemcp`

Switch iframe context for subsequent operations.

```json
{
  "name": "web_iframe_cremotemcp",
  "arguments": {
    "action": "enter",
    "selector": "iframe#payment-form"
  }
}
```

#### 7. `file_upload_cremotemcp`

Upload files from client to container for use in form uploads.

```json
{
  "name": "file_upload_cremotemcp",
  "arguments": {
    "local_path": "/local/file.txt",
    "container_path": "/tmp/file.txt"
  }
}
```

**Note**: The CLI `cremote upload-file` command now automatically transfers files to the daemon container first, making file uploads seamless even when the daemon runs in a container.

#### 8. `file_download_cremotemcp`

Download files from container to client (e.g., downloaded files from browser).

```json
{
  "name": "file_download_cremotemcp",
  "arguments": {
    "container_path": "/tmp/downloaded-file.pdf",
    "local_path": "/local/downloaded-file.pdf"
  }
}
```

#### 9. `console_logs_cremotemcp`

Get console logs from the browser tab.

```json
{
  "name": "console_logs_cremotemcp",
  "arguments": {
    "tab": "tab-123",
    "timeout": 5
  }
}
```

#### 10. `console_command_cremotemcp`

Execute commands in the browser console.

```json
{
  "name": "console_command_cremotemcp",
  "arguments": {
    "command": "document.getElementById('test').innerHTML = 'Hello World'",
    "tab": "tab-123",
    "timeout": 5
  }
}
```

### Phase 1: Element State and Checking Tools (2 tools)

#### 11. `web_element_check_cremotemcp`

Check element existence, visibility, enabled state, and other properties without interaction.

```json
{
  "name": "web_element_check_cremotemcp",
  "arguments": {
    "selector": "#submit-button",
    "check_type": "all",
    "timeout": 5
  }
}
```

**Check Types:**

- `exists`: Check if element exists in DOM
- `visible`: Check if element is visible (not hidden)
- `enabled`: Check if element is enabled (not disabled)
- `focused`: Check if element has focus
- `selected`: Check if element is selected (checkboxes, radio buttons)
- `all`: Check all states above

**Response includes:**

```json
{
  "exists": true,
  "visible": true,
  "enabled": false,
  "focused": false,
  "selected": true,
  "count": 1
}
```

#### 12. `web_element_attributes_cremotemcp`

Get element attributes, properties, and computed styles.

```json
{
  "name": "web_element_attributes_cremotemcp",
  "arguments": {
    "selector": "#user-profile",
    "attributes": "all",
    "timeout": 5
  }
}
```

**Attribute Options:**

- `all`: Get common attributes, properties, and styles
- `"id,class,href"`: Comma-separated list of specific attributes
- `"style_display,style_color"`: Computed styles (prefix with `style_`)
- `"prop_textContent,prop_value"`: JavaScript properties (prefix with `prop_`)

**Example Response:**

```json
{
  "id": "user-profile",
  "class": "profile-card active",
  "data-user-id": "12345",
  "textContent": "John Doe",
  "style_display": "block",
  "style_color": "rgb(0, 0, 0)"
}
```

### Phase 2: Enhanced Data Extraction Tools (4 tools)

#### 13. `web_extract_multiple_cremotemcp`

Extract data from multiple selectors in a single call for improved efficiency.

```json
{
  "name": "web_extract_multiple_cremotemcp",
  "arguments": {
    "selectors": {
      "title": "h1",
      "price": ".price",
      "description": ".product-description"
    },
    "timeout": 5
  }
}
```

#### 14. `web_extract_links_cremotemcp`

Extract all links from a page with powerful filtering options.

```json
{
  "name": "web_extract_links_cremotemcp",
  "arguments": {
    "container_selector": "nav",
    "href_pattern": "https://.*",
    "text_pattern": ".*Download.*",
    "timeout": 5
  }
}
```

#### 15. `web_extract_table_cremotemcp`

Extract table data as structured JSON with optional header processing.

```json
{
  "name": "web_extract_table_cremotemcp",
  "arguments": {
    "selector": "#data-table",
    "include_headers": true,
    "timeout": 5
  }
}
```

#### 16. `web_extract_text_cremotemcp`

Extract text content with optional pattern matching and different extraction types.

```json
{
  "name": "web_extract_text_cremotemcp",
  "arguments": {
    "selector": ".content",
    "pattern": "\\d{3}-\\d{3}-\\d{4}",
    "extract_type": "textContent",
    "timeout": 5
  }
}
```

### Phase 3: Form Analysis and Bulk Operations (3 tools)

#### 17. `web_form_analyze_cremotemcp`

Analyze forms completely to understand their structure, fields, and submission requirements.

```json
{
  "name": "web_form_analyze_cremotemcp",
  "arguments": {
    "selector": "#registration-form",
    "timeout": 10
  }
}
```

#### 18. `web_interact_multiple_cremotemcp`

Perform multiple interactions in a single call for efficient batch operations.

```json
{
  "name": "web_interact_multiple_cremotemcp",
  "arguments": {
    "interactions": [
      {"selector": "#username", "action": "fill", "value": "testuser"},
      {"selector": "#password", "action": "fill", "value": "testpass"},
      {"selector": "#remember-me", "action": "check"},
      {"selector": "#login-btn", "action": "click"}
    ],
    "timeout": 10
  }
}
```

#### 19. `web_form_fill_bulk_cremotemcp`

Fill entire forms with key-value pairs in a single operation.

```json
{
  "name": "web_form_fill_bulk_cremotemcp",
  "arguments": {
    "form_selector": "#contact-form",
    "fields": {
      "name": "John Doe",
      "email": "john@example.com",
      "message": "Hello, this is a test message."
    },
    "timeout": 10
  }
}
```

### Phase 4: Page State and Metadata Tools (4 tools)

#### 20. `web_page_info_cremotemcp`

Get comprehensive page metadata and state information.

```json
{
  "name": "web_page_info_cremotemcp",
  "arguments": {
    "tab": "tab-123",
    "timeout": 5
  }
}
```

Returns detailed page information including title, URL, loading state, domain, protocol, and browser status.

#### 21. `web_viewport_info_cremotemcp`

Get viewport and scroll information.

```json
{
  "name": "web_viewport_info_cremotemcp",
  "arguments": {
    "tab": "tab-123",
    "timeout": 5
  }
}
```

Returns viewport dimensions, scroll position, device pixel ratio, and orientation.

#### 22. `web_performance_metrics_cremotemcp`

Get page performance metrics.

```json
{
  "name": "web_performance_metrics_cremotemcp",
  "arguments": {
    "tab": "tab-123",
    "timeout": 5
  }
}
```

Returns performance data including load times, resource counts, and memory usage.

#### 23. `web_content_check_cremotemcp`

Check for specific content types and loading states.

```json
{
  "name": "web_content_check_cremotemcp",
  "arguments": {
    "type": "images",
    "tab": "tab-123",
    "timeout": 5
  }
}
```

Supported content types: `images`, `scripts`, `styles`, `forms`, `links`, `iframes`, `errors`.

### Phase 5: Enhanced Screenshot and File Management (4 tools)

#### 24. `web_screenshot_element_cremotemcp`

Take a screenshot of a specific element on the page.

```json
{
  "name": "web_screenshot_element_cremotemcp",
  "arguments": {
    "selector": "#main-content",
    "output": "/tmp/element-screenshot.png",
    "tab": "tab-123",
    "timeout": 5
  }
}
```

Automatically scrolls the element into view and captures a screenshot of just that element.

#### 25. `web_screenshot_enhanced_cremotemcp`

Take an enhanced screenshot with metadata.

```json
{
  "name": "web_screenshot_enhanced_cremotemcp",
  "arguments": {
    "output": "/tmp/enhanced-screenshot.png",
    "full_page": true,
    "tab": "tab-123",
    "timeout": 5
  }
}
```

Returns screenshot metadata including timestamp, URL, title, viewport size, and file information.

#### 26. `file_operations_bulk_cremotemcp`

Perform bulk file operations (upload/download multiple files).

```json
{
  "name": "file_operations_bulk_cremotemcp",
  "arguments": {
    "operation": "upload",
    "files": [
      {
        "local_path": "/local/file1.txt",
        "container_path": "/tmp/file1.txt"
      },
      {
        "local_path": "/local/file2.txt",
        "container_path": "/tmp/file2.txt"
      }
    ],
    "timeout": 30
  }
}
```

Supports both "upload" and "download" operations with detailed success/failure reporting.

#### 27. `file_management_cremotemcp`

Manage files (cleanup, list, get info).

```json
{
  "name": "file_management_cremotemcp",
  "arguments": {
    "operation": "cleanup",
    "pattern": "/tmp/cremote-*",
    "max_age": "24"
  }
}
```

Operations: `cleanup` (remove old files), `list` (list files), `info` (get file details).

## 🎉 Complete Enhancement Summary

Phases 1 through 5 of the MCP enhancement plan have been implemented. Together with the 10 core tools they provide **27 tools**, and the 3 accessibility tree tools from Phase 6 bring the total to 30. Phases 1 through 5 add the following capabilities:

### ✅ Phase 1: Element State and Checking (2 tools)

**Enables conditional logic without timing issues**

- `web_element_check_cremotemcp`: Check existence, visibility, enabled state, count elements
- `web_element_attributes_cremotemcp`: Get attributes, properties, computed styles

**Benefits**: LLMs can make decisions based on page state, avoid errors from interacting with non-existent elements, and build conditional workflows.

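The conditional pattern that Phase 1 enables can be sketched in Python as follows. The snippet only builds and inspects plain dictionaries shaped like the `web_element_check_cremotemcp` request and response documented in this README; the helper names `build_check_call` and `next_action` are hypothetical, and sending the request to the server is left out.

```python
def build_check_call(selector, check_type="all", timeout=5):
    """Build a web_element_check_cremotemcp tool call as a plain dict."""
    return {
        "name": "web_element_check_cremotemcp",
        "arguments": {
            "selector": selector,
            "check_type": check_type,
            "timeout": timeout,
        },
    }


def next_action(check_result, selector):
    """Hypothetical policy: click only if the element exists, is visible, and is enabled.

    `check_result` is assumed to have the shape of the documented check response.
    Returns a web_interact_cremotemcp call, or None when clicking would fail.
    """
    ready = all(check_result.get(key) for key in ("exists", "visible", "enabled"))
    if not ready:
        return None
    return {
        "name": "web_interact_cremotemcp",
        "arguments": {"action": "click", "selector": selector},
    }
```

With the sample response from the tool reference (`"enabled": false`), `next_action` returns `None`, so the agent can wait or take another path instead of issuing a click that would fail.
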
### ✅ Phase 2: Enhanced Data Extraction (4 tools)

**Dramatically improves data gathering efficiency**

- `web_extract_multiple_cremotemcp`: Extract from multiple selectors in one call
- `web_extract_links_cremotemcp`: Extract all links with filtering options
- `web_extract_table_cremotemcp`: Extract table data as structured JSON
- `web_extract_text_cremotemcp`: Extract text with pattern matching

**Benefits**: Reduces multiple round trips to single calls, provides structured data ready for LLM processing, enables comprehensive page analysis.

### ✅ Phase 3: Form Analysis and Bulk Operations (3 tools)

**Streamlines form handling workflows, with up to 10x fewer calls**

- `web_form_analyze_cremotemcp`: Analyze forms completely
- `web_interact_multiple_cremotemcp`: Batch interactions
- `web_form_fill_bulk_cremotemcp`: Fill entire forms with key-value pairs

**Benefits**: Complete forms in 1-2 calls instead of 10+, understand a form's structure before interacting with it, and prevent errors through field validation.

### ✅ Phase 4: Page State and Metadata Tools (4 tools)

**Provides rich context about page state for better debugging and monitoring**

- `web_page_info_cremotemcp`: Get page metadata and loading state
- `web_viewport_info_cremotemcp`: Get viewport and scroll information
- `web_performance_metrics_cremotemcp`: Get performance data
- `web_content_check_cremotemcp`: Check for specific content types

**Benefits**: Better debugging and monitoring capabilities, performance optimization insights, content loading verification, rich page state context for LLM decision making.

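As a rough illustration of the batching described for Phases 2 and 3 above, the Python sketch below collapses several fill actions and an optional final click into one `web_interact_multiple_cremotemcp` call. The request shape follows the example in the tool reference; the helper name `batch_fill` and its parameters are assumptions for illustration only.

```python
def batch_fill(selector_values, submit_selector=None, timeout=10):
    """Hypothetical helper: one batched call instead of one call per field.

    `selector_values` maps CSS selectors to the values to fill in.
    """
    interactions = [
        {"selector": selector, "action": "fill", "value": value}
        for selector, value in selector_values.items()
    ]
    # Optionally finish the batch by clicking a submit control.
    if submit_selector is not None:
        interactions.append({"selector": submit_selector, "action": "click"})
    return {
        "name": "web_interact_multiple_cremotemcp",
        "arguments": {"interactions": interactions, "timeout": timeout},
    }
```

For a login form with two fields and a button, `batch_fill({"#username": "testuser", "#password": "testpass"}, "#login-btn")` produces a single request carrying three interactions, where the single-action tool would need three separate calls.
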
### ✅ Phase 5: Enhanced Screenshot and File Management (4 tools)

**Improves debugging and file handling**

- `web_screenshot_element_cremotemcp`: Screenshot specific elements
- `web_screenshot_enhanced_cremotemcp`: Screenshots with metadata
- `file_operations_bulk_cremotemcp`: Bulk file operations
- `file_management_cremotemcp`: Temporary file cleanup

**Benefits**: Better debugging with targeted screenshots, improved file handling workflows, automatic resource management, enhanced visual debugging capabilities.

## Key Benefits for LLM Agents

### 🚀 **Efficiency Gains**

- **10x Form Efficiency**: Complete forms in 1-2 calls instead of 10+ individual interactions
- **Batch Operations**: Multiple data extractions and interactions in single calls
- **Reduced Round Trips**: Comprehensive tools minimize API call overhead

### 🧠 **Intelligence & Context**

- **Conditional Logic**: Element checking enables smart decision making without timing issues
- **Rich Page Context**: Complete page state, performance metrics, and content verification
- **Form Intelligence**: Complete form analysis before interaction prevents errors

### 🛠 **Enhanced Capabilities**

- **Visual Debugging**: Element-specific screenshots and enhanced metadata
- **File Management**: Bulk operations and automated cleanup
- **Error Prevention**: State checking and validation before actions
- **Resource Management**: Automatic cleanup and connection handling

## Installation & Usage

### Prerequisites

1. **Cremote daemon must be running**:

   ```bash
   cremotedaemon
   ```

2. **Chrome/Chromium with remote debugging**:

   ```bash
   chromium --remote-debugging-port=9222 --user-data-dir=/tmp/chromium-debug
   ```

### Build the MCP Server

```bash
cd mcp/
go build -o cremote-mcp .
```

### Configuration

#### Basic Configuration (Single Client - stdio)

Set environment variables to configure the cremote connection:

```bash
export CREMOTE_HOST=localhost
export CREMOTE_PORT=8989
export CREMOTE_TRANSPORT=stdio  # Default
```

#### Multi-Client Configuration (HTTP Transport)

For multiple concurrent clients:

```bash
export CREMOTE_HOST=localhost
export CREMOTE_PORT=8989
export CREMOTE_TRANSPORT=http
export CREMOTE_HTTP_HOST=localhost
export CREMOTE_HTTP_PORT=8990
```

#### Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `CREMOTE_TRANSPORT` | `stdio` | Transport mode: `stdio` or `http` |
| `CREMOTE_HOST` | `localhost` | Cremote daemon host |
| `CREMOTE_PORT` | `8989` | Cremote daemon port |
| `CREMOTE_HTTP_HOST` | `localhost` | HTTP server host (HTTP mode only) |
| `CREMOTE_HTTP_PORT` | `8990` | HTTP server port (HTTP mode only) |

### Running with Claude Desktop

Add to your Claude Desktop configuration (`~/Library/Application Support/Claude/claude_desktop_config.json` on macOS):

```json
{
  "mcpServers": {
    "cremote": {
      "command": "/path/to/cremote-mcp",
      "env": {
        "CREMOTE_HOST": "localhost",
        "CREMOTE_PORT": "8989"
      }
    }
  }
}
```

### Running with Other MCP Clients

The server communicates via JSON-RPC over stdio, so it can be used with any MCP-compatible client:

```bash
echo '{"jsonrpc":"2.0","method":"tools/list","params":{},"id":1}' | ./cremote-mcp
```

## Response Format

All tool responses include:

```json
{
  "success": true,
  "data": "...",
  "screenshot": "/tmp/screenshot.png",
  "current_tab": "tab-id-123",
  "tab_history": ["tab-id-123", "tab-id-456"],
  "iframe_mode": false,
  "error": null,
  "metadata": {}
}
```

## Example Workflows

### Basic Login Workflow (Traditional Approach)

```json
// 1. Navigate to a page
{
  "name": "web_navigate_cremotemcp",
  "arguments": {
    "url": "https://example.com/login",
    "screenshot": true
  }
}

// 2. Check if login form exists
{
  "name": "web_element_check_cremotemcp",
  "arguments": {
    "selector": "#login-form",
    "check_type": "exists"
  }
}

// 3. Fill login form using bulk operations
{
  "name": "web_form_fill_bulk_cremotemcp",
  "arguments": {
    "form_selector": "#login-form",
    "fields": {
      "username": "testuser",
      "password": "password123"
    }
  }
}

// 4. Submit and verify
{
  "name": "web_interact_cremotemcp",
  "arguments": {
    "action": "click",
    "selector": "#login-button"
  }
}

// 5. Extract multiple results at once
{
  "name": "web_extract_multiple_cremotemcp",
  "arguments": {
    "selectors": {
      "welcome_message": ".welcome-message",
      "user_name": ".user-profile .name",
      "last_login": ".user-info .last-login"
    }
  }
}

// 6. Take enhanced screenshot with metadata
{
  "name": "web_screenshot_enhanced_cremotemcp",
  "arguments": {
    "output": "/tmp/login-success.png",
    "full_page": true
  }
}
```

### Advanced E-commerce Data Extraction Workflow

```json
// 1. Navigate and check page state
{
  "name": "web_navigate_cremotemcp",
  "arguments": {
    "url": "https://shop.example.com/products",
    "screenshot": true
  }
}

// 2. Get page performance metrics
{
  "name": "web_performance_metrics_cremotemcp",
  "arguments": {}
}

// 3. Extract all product data in one call
{
  "name": "web_extract_multiple_cremotemcp",
  "arguments": {
    "selectors": {
      "product_titles": ".product-card h3",
      "prices": ".product-card .price",
      "ratings": ".product-card .rating",
      "availability": ".product-card .stock-status"
    }
  }
}

// 4. Extract all product links with filtering
{
  "name": "web_extract_links_cremotemcp",
  "arguments": {
    "container_selector": ".product-grid",
    "href_pattern": ".*/product/.*",
    "text_pattern": ".*"
  }
}

// 5. Check if more products are loading
{
  "name": "web_content_check_cremotemcp",
  "arguments": {
    "type": "scripts"
  }
}
```

### Phase 6: Accessibility Tree Support (3 tools)

#### `get_accessibility_tree_cremotemcp`

Get the accessibility tree for a page, either in full or limited to a given depth.

```json
{
  "name": "get_accessibility_tree_cremotemcp",
  "arguments": {
    "tab": "optional-tab-id",
    "depth": 3,
    "timeout": 10
  }
}
```

#### `get_partial_accessibility_tree_cremotemcp`

Get accessibility tree for a specific element and its relatives.

```json
{
  "name": "get_partial_accessibility_tree_cremotemcp",
  "arguments": {
    "selector": "form",
    "tab": "optional-tab-id",
    "fetch_relatives": true,
    "timeout": 10
  }
}
```

#### `query_accessibility_tree_cremotemcp`

Query accessibility tree for nodes matching specific criteria.

```json
{
  "name": "query_accessibility_tree_cremotemcp",
  "arguments": {
    "tab": "optional-tab-id",
    "selector": "form",
    "accessible_name": "Submit",
    "role": "button",
    "timeout": 10
  }
}
```

## Benefits Over CLI

### 🎯 **Enhanced Efficiency**

- **State Management**: No need to manually track tab IDs
- **Batch Operations**: 10x efficiency with bulk form filling and multi-selector extraction
- **Intelligent Defaults**: Smart parameter handling and fallbacks
- **Resource Cleanup**: Automatic management of tabs and files

### 🔍 **Better Intelligence**

- **Conditional Logic**: Element checking enables smart decision making
- **Rich Context**: Page state, performance metrics, and content verification
- **Form Intelligence**: Complete form analysis before interaction
- **Error Prevention**: State validation before actions

### 🛠 **Advanced Capabilities**

- **Enhanced Screenshots**: Element-specific and metadata-rich capture
- **File Management**: Bulk operations and automated cleanup
- **Better Error Context**: Rich error information for debugging
- **Structured Responses**: Consistent, parseable response format

## 🎉 Production Ready

This comprehensive web automation platform is **production ready** with:

- **30 Tools**: Complete coverage of web automation needs
- **6 Enhancement Phases**: Systematic capability building from basic to advanced
- **Extensive Testing**: All tools validated and documented
- **LLM Optimized**: Designed specifically for AI agent workflows
- **Backward Compatible**: All existing tools continue to work unchanged

### 📊 **Capability Matrix**

| Category | Tools | Key Benefits |
|----------|-------|--------------|
| **Core Web Automation** | 10 tools | Navigation, interaction, extraction, screenshots, tabs, iframes, files, console |
| **Element Intelligence** | 2 tools | Conditional logic, state checking, attribute inspection |
| **Data Extraction** | 4 tools | Batch extraction, structured data, pattern matching, table processing |
| **Form Automation** | 3 tools | Form analysis, bulk filling, batch interactions |
| **Page Intelligence** | 4 tools | Page state, performance metrics, content verification, viewport info |
| **Enhanced Capabilities** | 4 tools | Element screenshots, enhanced metadata, bulk file ops, file management |
| **Accessibility Tree** | 3 tools | Semantic understanding, accessibility testing, screen reader simulation |

## Development

To extend the MCP server with new tools:

1. Add the tool definition to `handleToolsList()`
2. Add a case in `handleToolCall()`
3. Implement the handler function following the pattern of existing handlers
4. Update this documentation

The server is designed to be easily extensible while maintaining consistency with the cremote client library.

---

**🚀 Ready for Production**: Complete web automation platform with 30 tools across 6 enhancement phases, optimized for LLM agents and production workflows.