Cremote MCP Server
This is a Model Context Protocol (MCP) server that exposes cremote's web automation capabilities to LLMs and AI agents. Instead of issuing CLI commands, agents call a structured API that maintains state and provides higher-level abstractions.
🎉 Complete Web Automation Platform
31 tools across 6 enhancement phases, providing a complete web automation toolkit for LLM agents:
- Phase 1: Element state checking and conditional logic (2 tools)
- Phase 2: Enhanced data extraction and batch operations (4 tools)
- Phase 3: Form analysis and bulk operations (3 tools)
- Phase 4: Page state and metadata tools (4 tools)
- Phase 5: Enhanced screenshots and file management (4 tools)
- Phase 6: Accessibility tree support (3 tools)
- Core Tools: Essential web automation capabilities (10 tools)
- Version Information: MCP server and daemon version reporting (1 tool)
🚀 NEW: Multi-Client Support
The Cremote MCP server now supports multiple concurrent clients with isolated browser sessions:
- Concurrent Agents: Multiple AI agents can use the same browser simultaneously
- Session Isolation: Each client maintains independent browser state (tabs, history, iframe context)
- Transport Flexibility: Choose between stdio (single client) or HTTP (multiple clients)
- Backward Compatible: Existing stdio clients continue to work unchanged
See the Multi-Client Guide for detailed setup and usage instructions.
Features
- State Management: Automatically tracks current tab, tab history, and iframe context
- Intelligent Abstractions: High-level tools that combine multiple cremote operations
- Batch Operations: Reduce round trips with bulk operations and multi-selector extraction
- Form Intelligence: Complete form analysis and bulk filling capabilities
- Rich Context: Page metadata, performance metrics, and content verification
- Enhanced Screenshots: Element-specific and metadata-rich screenshot capture
- File Management: Bulk file operations and automated cleanup
- Accessibility Tree: Chrome accessibility tree interface for semantic understanding
- Automatic Screenshots: Optional screenshot capture for debugging and documentation
- Error Recovery: Better error handling and context for LLMs
- Resource Management: Automatic cleanup and connection management
Quick Start for LLMs
For LLM agents: See the comprehensive LLM Usage Guide for detailed usage instructions, examples, and best practices.
Available Tools (31 Total)
Version Information
version_cremotemcp
Get version information for MCP server and daemon.
{
"name": "version_cremotemcp",
"arguments": {}
}
Returns version information for both the MCP server and the connected daemon.
Core Web Automation Tools (10 tools)
1. web_navigate_cremotemcp
Navigate to URLs with optional screenshot capture.
{
"name": "web_navigate_cremotemcp",
"arguments": {
"url": "https://example.com",
"screenshot": true,
"timeout": 10
}
}
2. web_interact_cremotemcp
Interact with web elements (click, fill, submit, upload, select).
{
"name": "web_interact_cremotemcp",
"arguments": {
"action": "fill",
"selector": "#username",
"value": "testuser",
"timeout": 5
}
}
For select dropdowns:
{
"name": "web_interact_cremotemcp",
"arguments": {
"action": "select",
"selector": "#country",
"value": "United States",
"timeout": 5
}
}
3. web_extract_cremotemcp
Extract data from pages (source, element HTML, JavaScript execution).
{
"name": "web_extract_cremotemcp",
"arguments": {
"type": "javascript",
"code": "document.title",
"timeout": 5
}
}
4. web_screenshot_cremotemcp
Take screenshots of the current page.
{
"name": "web_screenshot_cremotemcp",
"arguments": {
"output": "/tmp/page.png",
"full_page": true,
"timeout": 5
}
}
5. web_manage_tabs_cremotemcp
Manage browser tabs (open, close, list, switch).
{
"name": "web_manage_tabs_cremotemcp",
"arguments": {
"action": "open",
"timeout": 5
}
}
6. web_iframe_cremotemcp
Switch iframe context for subsequent operations.
{
"name": "web_iframe_cremotemcp",
"arguments": {
"action": "enter",
"selector": "iframe#payment-form"
}
}
7. file_upload_cremotemcp
Upload files from client to container for use in form uploads.
{
"name": "file_upload_cremotemcp",
"arguments": {
"local_path": "/local/file.txt",
"container_path": "/tmp/file.txt"
}
}
Note: The CLI cremote upload-file command now automatically transfers files to the daemon container first, making file uploads seamless even when the daemon runs in a container.
8. file_download_cremotemcp
Download files from container to client (e.g., downloaded files from browser).
{
"name": "file_download_cremotemcp",
"arguments": {
"container_path": "/tmp/downloaded-file.pdf",
"local_path": "/local/downloaded-file.pdf"
}
}
9. console_logs_cremotemcp
Get console logs from the browser tab.
{
"name": "console_logs_cremotemcp",
"arguments": {
"tab": "tab-123",
"timeout": 5
}
}
10. console_command_cremotemcp
Execute commands in the browser console.
{
"name": "console_command_cremotemcp",
"arguments": {
"command": "document.getElementById('test').innerHTML = 'Hello World'",
"tab": "tab-123",
"timeout": 5
}
}
Phase 1: Element State and Checking Tools (2 tools)
11. web_element_check_cremotemcp
Check element existence, visibility, enabled state, and other properties without interaction.
{
"name": "web_element_check_cremotemcp",
"arguments": {
"selector": "#submit-button",
"check_type": "all",
"timeout": 5
}
}
Check Types:
- exists: Check if element exists in DOM
- visible: Check if element is visible (not hidden)
- enabled: Check if element is enabled (not disabled)
- focused: Check if element has focus
- selected: Check if element is selected (checkboxes, radio buttons)
- all: Check all states above
Response includes:
{
"exists": true,
"visible": true,
"enabled": false,
"focused": false,
"selected": true,
"count": 1
}
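The response fields lend themselves to simple guard logic before an interaction. The following is a minimal Python sketch, assuming the response has already been parsed into a dict with the fields shown above. `ready_to_click` is a hypothetical client-side helper, not part of cremote.

```python
def ready_to_click(check):
    """Decide whether an element can be clicked, given a check_type="all" response.

    Returns (ok, reason). Missing fields are treated as False.
    """
    if not check.get("exists"):
        return False, "element not found"
    if not check.get("visible"):
        return False, "element is hidden"
    if not check.get("enabled"):
        return False, "element is disabled"
    return True, "ok"


# The sample response from this section: present and visible, but disabled.
sample = {
    "exists": True,
    "visible": True,
    "enabled": False,
    "focused": False,
    "selected": True,
    "count": 1,
}
ok, reason = ready_to_click(sample)
```

An agent could skip the click and report `reason` when `ok` is false, rather than waiting for an interaction timeout.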
12. web_element_attributes_cremotemcp
Get element attributes, properties, and computed styles.
{
"name": "web_element_attributes_cremotemcp",
"arguments": {
"selector": "#user-profile",
"attributes": "all",
"timeout": 5
}
}
Attribute Options:
- all: Get common attributes, properties, and styles
- "id,class,href": Comma-separated list of specific attributes
- "style_display,style_color": Computed styles (prefix with style_)
- "prop_textContent,prop_value": JavaScript properties (prefix with prop_)
Example Response:
{
"id": "user-profile",
"class": "profile-card active",
"data-user-id": "12345",
"textContent": "John Doe",
"style_display": "block",
"style_color": "rgb(0, 0, 0)"
}
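The prefix conventions above can be assembled programmatically. This Python sketch builds the `attributes` string from separate lists. `attribute_spec` is a hypothetical helper, and combining plain attributes, `style_` entries, and `prop_` entries in one call is an assumption based on the mixed keys in the example response.

```python
def attribute_spec(attributes=(), styles=(), props=()):
    """Build the comma-separated `attributes` argument.

    Computed styles get the `style_` prefix and JavaScript properties
    get the `prop_` prefix. With no input, fall back to "all".
    """
    parts = list(attributes)
    parts += [f"style_{name}" for name in styles]
    parts += [f"prop_{name}" for name in props]
    return ",".join(parts) if parts else "all"


call = {
    "name": "web_element_attributes_cremotemcp",
    "arguments": {
        "selector": "#user-profile",
        "attributes": attribute_spec(["id", "class"], styles=["display"], props=["textContent"]),
        "timeout": 5,
    },
}
```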
Phase 2: Enhanced Data Extraction Tools (4 tools)
13. web_extract_multiple_cremotemcp
Extract data from multiple selectors in a single call for improved efficiency.
{
"name": "web_extract_multiple_cremotemcp",
"arguments": {
"selectors": {
"title": "h1",
"price": ".price",
"description": ".product-description"
},
"timeout": 5
}
}
14. web_extract_links_cremotemcp
Extract all links from a page with powerful filtering options.
{
"name": "web_extract_links_cremotemcp",
"arguments": {
"container_selector": "nav",
"href_pattern": "https://.*",
"text_pattern": ".*Download.*",
"timeout": 5
}
}
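As a rough mental model of the two filters, the sketch below applies both patterns as regular expressions in Python. It assumes that a link must satisfy every pattern supplied and that a pattern may match anywhere in the string; the server's actual matching rules are not specified in this document. `filter_links` and the sample links are hypothetical.

```python
import re


def filter_links(links, href_pattern=None, text_pattern=None):
    """Keep links whose href and text match the given regex patterns.

    A pattern of None means "no filter" for that field.
    """
    href_re = re.compile(href_pattern) if href_pattern else None
    text_re = re.compile(text_pattern) if text_pattern else None
    return [
        link
        for link in links
        if (href_re is None or href_re.search(link["href"]))
        and (text_re is None or text_re.search(link["text"]))
    ]


links = [
    {"href": "https://example.com/a.zip", "text": "Download A"},
    {"href": "http://example.com/b.zip", "text": "Download B"},
    {"href": "https://example.com/about", "text": "About"},
]
secure_downloads = filter_links(links, "https://.*", ".*Download.*")
```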
15. web_extract_table_cremotemcp
Extract table data as structured JSON with optional header processing.
{
"name": "web_extract_table_cremotemcp",
"arguments": {
"selector": "#data-table",
"include_headers": true,
"timeout": 5
}
}
16. web_extract_text_cremotemcp
Extract text content with optional pattern matching and different extraction types.
{
"name": "web_extract_text_cremotemcp",
"arguments": {
"selector": ".content",
"pattern": "\\d{3}-\\d{3}-\\d{4}",
"extract_type": "textContent",
"timeout": 5
}
}
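Note the doubled backslashes in `pattern`: the regex itself is `\d{3}-\d{3}-\d{4}`, and each backslash must be escaped in JSON. The Python sketch below shows the escaping round trip and what the pattern matches on sample text. Python's `re` module stands in for the server's regex engine here, which is an assumption; the sample text is made up.

```python
import json
import re

# The regex as it should reach the server (single backslashes).
pattern = r"\d{3}-\d{3}-\d{4}"

arguments = {
    "selector": ".content",
    "pattern": pattern,
    "extract_type": "textContent",
    "timeout": 5,
}

# json.dumps doubles each backslash, producing the form shown in this README.
encoded = json.dumps(arguments)
decoded_pattern = json.loads(encoded)["pattern"]

# Local illustration of what the pattern captures.
sample_text = "Call 555-123-4567 or 555-987-6543 today."
matches = re.findall(pattern, sample_text)
```

Building the arguments as a dict and serializing with a JSON library avoids hand-escaping mistakes.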
Phase 3: Form Analysis and Bulk Operations (3 tools)
17. web_form_analyze_cremotemcp
Analyze forms completely to understand their structure, fields, and submission requirements.
{
"name": "web_form_analyze_cremotemcp",
"arguments": {
"selector": "#registration-form",
"timeout": 10
}
}
18. web_interact_multiple_cremotemcp
Perform multiple interactions in a single call for efficient batch operations.
{
"name": "web_interact_multiple_cremotemcp",
"arguments": {
"interactions": [
{"selector": "#username", "action": "fill", "value": "testuser"},
{"selector": "#password", "action": "fill", "value": "testpass"},
{"selector": "#remember-me", "action": "check"},
{"selector": "#login-btn", "action": "click"}
],
"timeout": 10
}
}
19. web_form_fill_bulk_cremotemcp
Fill entire forms with key-value pairs in a single operation.
{
"name": "web_form_fill_bulk_cremotemcp",
"arguments": {
"form_selector": "#contact-form",
"fields": {
"name": "John Doe",
"email": "john@example.com",
"message": "Hello, this is a test message."
},
"timeout": 10
}
}
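To make the round-trip savings concrete, this Python sketch builds the single bulk call next to the equivalent sequence of individual `fill` calls. It assumes that the keys in `fields` correspond to each input's `name` attribute, which this document does not confirm. Both helper functions are hypothetical.

```python
def bulk_fill_call(form_selector, fields, timeout=10):
    """One web_form_fill_bulk_cremotemcp call covering every field."""
    return {
        "name": "web_form_fill_bulk_cremotemcp",
        "arguments": {
            "form_selector": form_selector,
            "fields": dict(fields),
            "timeout": timeout,
        },
    }


def individual_fill_calls(form_selector, fields, timeout=5):
    """The same work as one web_interact_cremotemcp call per field.

    Assumption: each key is the field's name attribute.
    """
    return [
        {
            "name": "web_interact_cremotemcp",
            "arguments": {
                "action": "fill",
                "selector": f'{form_selector} [name="{key}"]',
                "value": value,
                "timeout": timeout,
            },
        }
        for key, value in fields.items()
    ]


fields = {
    "name": "John Doe",
    "email": "john@example.com",
    "message": "Hello, this is a test message.",
}
bulk = bulk_fill_call("#contact-form", fields)
singles = individual_fill_calls("#contact-form", fields)
```

Here three calls collapse into one; the gap grows with the number of fields.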
Phase 4: Page State and Metadata Tools (4 tools)
20. web_page_info_cremotemcp
Get comprehensive page metadata and state information.
{
"name": "web_page_info_cremotemcp",
"arguments": {
"tab": "tab-123",
"timeout": 5
}
}
Returns detailed page information including title, URL, loading state, domain, protocol, and browser status.
21. web_viewport_info_cremotemcp
Get viewport and scroll information.
{
"name": "web_viewport_info_cremotemcp",
"arguments": {
"tab": "tab-123",
"timeout": 5
}
}
Returns viewport dimensions, scroll position, device pixel ratio, and orientation.
22. web_performance_metrics_cremotemcp
Get page performance metrics.
{
"name": "web_performance_metrics_cremotemcp",
"arguments": {
"tab": "tab-123",
"timeout": 5
}
}
Returns performance data including load times, resource counts, and memory usage.
23. web_content_check_cremotemcp
Check for specific content types and loading states.
{
"name": "web_content_check_cremotemcp",
"arguments": {
"type": "images",
"tab": "tab-123",
"timeout": 5
}
}
Supported content types: images, scripts, styles, forms, links, iframes, errors.
Phase 5: Enhanced Screenshot and File Management (4 tools)
24. web_screenshot_element_cremotemcp
Take a screenshot of a specific element on the page.
{
"name": "web_screenshot_element_cremotemcp",
"arguments": {
"selector": "#main-content",
"output": "/tmp/element-screenshot.png",
"tab": "tab-123",
"timeout": 5
}
}
Automatically scrolls the element into view and captures a screenshot of just that element.
25. web_screenshot_enhanced_cremotemcp
Take an enhanced screenshot with metadata.
{
"name": "web_screenshot_enhanced_cremotemcp",
"arguments": {
"output": "/tmp/enhanced-screenshot.png",
"full_page": true,
"tab": "tab-123",
"timeout": 5
}
}
Returns screenshot metadata including timestamp, URL, title, viewport size, and file information.
26. file_operations_bulk_cremotemcp
Perform bulk file operations (upload/download multiple files).
{
"name": "file_operations_bulk_cremotemcp",
"arguments": {
"operation": "upload",
"files": [
{
"local_path": "/local/file1.txt",
"container_path": "/tmp/file1.txt"
},
{
"local_path": "/local/file2.txt",
"container_path": "/tmp/file2.txt"
}
],
"timeout": 30
}
}
Supports both "upload" and "download" operations with detailed success/failure reporting.
27. file_management_cremotemcp
Manage files (cleanup, list, get info).
{
"name": "file_management_cremotemcp",
"arguments": {
"operation": "cleanup",
"pattern": "/tmp/cremote-*",
"max_age": "24"
}
}
Operations: cleanup (remove old files), list (list files), info (get file details).
🎉 Complete Enhancement Summary
Phases 1 through 5 of the MCP enhancement plan are implemented. Together with the 10 core tools they account for 27 of the 31 tools; the remaining four are the version tool and the three Phase 6 accessibility tree tools. The phase tools are organized as follows:
✅ Phase 1: Element State and Checking (2 tools)
Enables conditional logic without timing issues
- web_element_check_cremotemcp: Check existence, visibility, enabled state, count elements
- web_element_attributes_cremotemcp: Get attributes, properties, computed styles
Benefits: LLMs can make decisions based on page state, prevent errors from trying to interact with non-existent elements, enable conditional workflows.
✅ Phase 2: Enhanced Data Extraction (4 tools)
Dramatically improves data gathering efficiency
- web_extract_multiple_cremotemcp: Extract from multiple selectors in one call
- web_extract_links_cremotemcp: Extract all links with filtering options
- web_extract_table_cremotemcp: Extract table data as structured JSON
- web_extract_text_cremotemcp: Extract text with pattern matching
Benefits: Reduces multiple round trips to single calls, provides structured data ready for LLM processing, enables comprehensive page analysis.
✅ Phase 3: Form Analysis and Bulk Operations (3 tools)
Streamlines form handling workflows with 10x efficiency
- web_form_analyze_cremotemcp: Analyze forms completely
- web_interact_multiple_cremotemcp: Batch interactions
- web_form_fill_bulk_cremotemcp: Fill entire forms with key-value pairs
Benefits: Complete forms in 1-2 calls instead of 10+, form intelligence provides complete understanding before interaction, error prevention through field validation.
✅ Phase 4: Page State and Metadata Tools (4 tools)
Provides rich context about page state for better debugging and monitoring
- web_page_info_cremotemcp: Get page metadata and loading state
- web_viewport_info_cremotemcp: Get viewport and scroll information
- web_performance_metrics_cremotemcp: Get performance data
- web_content_check_cremotemcp: Check for specific content types
Benefits: Better debugging and monitoring capabilities, performance optimization insights, content loading verification, rich page state context for LLM decision making.
✅ Phase 5: Enhanced Screenshot and File Management (4 tools)
Improves debugging and file handling
- web_screenshot_element_cremotemcp: Screenshot specific elements
- web_screenshot_enhanced_cremotemcp: Screenshots with metadata
- file_operations_bulk_cremotemcp: Bulk file operations
- file_management_cremotemcp: Temporary file cleanup
Benefits: Better debugging with targeted screenshots, improved file handling workflows, automatic resource management, enhanced visual debugging capabilities.
Key Benefits for LLM Agents
🚀 Efficiency Gains
- 10x Form Efficiency: Complete forms in 1-2 calls instead of 10+ individual interactions
- Batch Operations: Multiple data extractions and interactions in single calls
- Reduced Round Trips: Comprehensive tools minimize API call overhead
🧠 Intelligence & Context
- Conditional Logic: Element checking enables smart decision making without timing issues
- Rich Page Context: Complete page state, performance metrics, and content verification
- Form Intelligence: Complete form analysis before interaction prevents errors
🛠 Enhanced Capabilities
- Visual Debugging: Element-specific screenshots and enhanced metadata
- File Management: Bulk operations and automated cleanup
- Error Prevention: State checking and validation before actions
- Resource Management: Automatic cleanup and connection handling
Installation & Usage
Prerequisites
- Cremote daemon must be running:
  cremotedaemon
- Chrome/Chromium with remote debugging:
  chromium --remote-debugging-port=9222 --user-data-dir=/tmp/chromium-debug
Build the MCP Server
cd mcp/
go build -o cremote-mcp .
Configuration
Basic Configuration (Single Client - stdio)
Set environment variables to configure the cremote connection:
export CREMOTE_HOST=localhost
export CREMOTE_PORT=8989
export CREMOTE_TRANSPORT=stdio # Default
Multi-Client Configuration (HTTP Transport)
For multiple concurrent clients:
export CREMOTE_HOST=localhost
export CREMOTE_PORT=8989
export CREMOTE_TRANSPORT=http
export CREMOTE_HTTP_HOST=localhost
export CREMOTE_HTTP_PORT=8990
Environment Variables
Variable | Default | Description |
---|---|---|
CREMOTE_TRANSPORT | stdio | Transport mode: stdio or http |
CREMOTE_HOST | localhost | Cremote daemon host |
CREMOTE_PORT | 8989 | Cremote daemon port |
CREMOTE_HTTP_HOST | localhost | HTTP server host (HTTP mode only) |
CREMOTE_HTTP_PORT | 8990 | HTTP server port (HTTP mode only) |
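A launcher or test script may want to resolve these variables the same way before starting the server. The Python sketch below mirrors the defaults from the table. `CremoteConfig` and `load_config` are hypothetical client-side names, and rejecting unknown transport values is an assumption about sensible behaviour, not a description of what the server does.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class CremoteConfig:
    transport: str
    host: str
    port: int
    http_host: str
    http_port: int


def load_config(env=None):
    """Resolve the CREMOTE_* variables, applying the documented defaults."""
    env = os.environ if env is None else env
    transport = env.get("CREMOTE_TRANSPORT", "stdio")
    if transport not in ("stdio", "http"):
        raise ValueError(f"unsupported CREMOTE_TRANSPORT: {transport!r}")
    return CremoteConfig(
        transport=transport,
        host=env.get("CREMOTE_HOST", "localhost"),
        port=int(env.get("CREMOTE_PORT", "8989")),
        http_host=env.get("CREMOTE_HTTP_HOST", "localhost"),
        http_port=int(env.get("CREMOTE_HTTP_PORT", "8990")),
    )


defaults = load_config({})  # empty mapping: every default applies
multi_client = load_config({"CREMOTE_TRANSPORT": "http", "CREMOTE_HTTP_PORT": "9000"})
```

Passing a plain dict instead of `os.environ` keeps the function easy to test.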
Running with Claude Desktop
Add to your Claude Desktop configuration (~/Library/Application Support/Claude/claude_desktop_config.json on macOS):
{
"mcpServers": {
"cremote": {
"command": "/path/to/cremote-mcp",
"env": {
"CREMOTE_HOST": "localhost",
"CREMOTE_PORT": "8989"
}
}
}
}
Running with Other MCP Clients
The server communicates via JSON-RPC over stdio, so it can be used with any MCP-compatible client:
echo '{"jsonrpc":"2.0","method":"tools/list","params":{},"id":1}' | ./cremote-mcp
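The tool examples in this document show only the `name` and `arguments` of each call. In MCP these form the `params` of a JSON-RPC 2.0 `tools/call` request. The Python sketch below builds such a request and serializes it as one newline-terminated line for stdio. The helper names are hypothetical, and whether cremote-mcp expects an `initialize` handshake before the first call is not covered in this document.

```python
import json
from itertools import count

_request_ids = count(1)  # monotonically increasing request ids


def tool_call_request(name, arguments=None):
    """Wrap a tool name and its arguments in a JSON-RPC 2.0 tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


def to_stdio_line(request):
    """Serialize a request as compact JSON on a single line."""
    return json.dumps(request, separators=(",", ":")) + "\n"


request = tool_call_request(
    "web_navigate_cremotemcp",
    {"url": "https://example.com", "screenshot": True},
)
line = to_stdio_line(request)
```

The resulting `line` could be written to the server's standard input, for example through `subprocess.Popen` with `stdin=subprocess.PIPE`.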
Response Format
All tool responses include:
{
"success": true,
"data": "...",
"screenshot": "/tmp/screenshot.png",
"current_tab": "tab-id-123",
"tab_history": ["tab-id-123", "tab-id-456"],
"iframe_mode": false,
"error": null,
"metadata": {}
}
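Because every response shares this shape, a client can handle success and failure in one place. This is a minimal Python sketch, assuming the response has been parsed into a dict. `ToolError` and `unwrap` are hypothetical names, and treating a non-null `error` as failure even when `success` is true is a defensive assumption.

```python
class ToolError(Exception):
    """Raised when a tool response reports failure."""


def unwrap(response):
    """Return (data, state) from a tool response, or raise ToolError."""
    if not response.get("success") or response.get("error"):
        raise ToolError(response.get("error") or "tool call failed")
    state = {
        "current_tab": response.get("current_tab"),
        "iframe_mode": response.get("iframe_mode", False),
    }
    return response.get("data"), state


sample = {
    "success": True,
    "data": "...",
    "screenshot": "/tmp/screenshot.png",
    "current_tab": "tab-id-123",
    "tab_history": ["tab-id-123", "tab-id-456"],
    "iframe_mode": False,
    "error": None,
    "metadata": {},
}
data, state = unwrap(sample)
```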
Example Workflows
Basic Login Workflow (Traditional Approach)
// 1. Navigate to a page
{
"name": "web_navigate_cremotemcp",
"arguments": {
"url": "https://example.com/login",
"screenshot": true
}
}
// 2. Check if login form exists
{
"name": "web_element_check_cremotemcp",
"arguments": {
"selector": "#login-form",
"check_type": "exists"
}
}
// 3. Fill login form using bulk operations
{
"name": "web_form_fill_bulk_cremotemcp",
"arguments": {
"form_selector": "#login-form",
"fields": {
"username": "testuser",
"password": "password123"
}
}
}
// 4. Submit and verify
{
"name": "web_interact_cremotemcp",
"arguments": {
"action": "click",
"selector": "#login-button"
}
}
// 5. Extract multiple results at once
{
"name": "web_extract_multiple_cremotemcp",
"arguments": {
"selectors": {
"welcome_message": ".welcome-message",
"user_name": ".user-profile .name",
"last_login": ".user-info .last-login"
}
}
}
// 6. Take enhanced screenshot with metadata
{
"name": "web_screenshot_enhanced_cremotemcp",
"arguments": {
"output": "/tmp/login-success.png",
"full_page": true
}
}
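A workflow like the one above can be driven by a small loop that stops at the first failed step, so later steps do not run against an unexpected page state. In this Python sketch, `call_tool` is a hypothetical callable that sends one tool call and returns the parsed response; the fake implementation exists only to make the example self-contained.

```python
def run_workflow(steps, call_tool):
    """Run steps in order; stop after the first response with success == False."""
    results = []
    for step in steps:
        response = call_tool(step["name"], step.get("arguments", {}))
        results.append(response)
        if not response.get("success"):
            break
    return results


steps = [
    {"name": "web_navigate_cremotemcp", "arguments": {"url": "https://example.com/login"}},
    {"name": "web_element_check_cremotemcp", "arguments": {"selector": "#login-form", "check_type": "exists"}},
    {"name": "web_form_fill_bulk_cremotemcp", "arguments": {"form_selector": "#login-form", "fields": {}}},
]


def fake_call_tool(name, arguments):
    """Stand-in transport: pretend the element check fails."""
    if name == "web_element_check_cremotemcp":
        return {"success": False, "error": "element not found"}
    return {"success": True, "data": name}


results = run_workflow(steps, fake_call_tool)
```

With the fake transport, the form fill step is never attempted because the check before it fails.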
Advanced E-commerce Data Extraction Workflow
// 1. Navigate and check page state
{
"name": "web_navigate_cremotemcp",
"arguments": {
"url": "https://shop.example.com/products",
"screenshot": true
}
}
// 2. Get page performance metrics
{
"name": "web_performance_metrics_cremotemcp",
"arguments": {}
}
// 3. Extract all product data in one call
{
"name": "web_extract_multiple_cremotemcp",
"arguments": {
"selectors": {
"product_titles": ".product-card h3",
"prices": ".product-card .price",
"ratings": ".product-card .rating",
"availability": ".product-card .stock-status"
}
}
}
// 4. Extract all product links with filtering
{
"name": "web_extract_links_cremotemcp",
"arguments": {
"container_selector": ".product-grid",
"href_pattern": ".*/product/.*",
"text_pattern": ".*"
}
}
// 5. Check if more products are loading
{
"name": "web_content_check_cremotemcp",
"arguments": {
"type": "scripts"
}
}
Phase 6: Accessibility Tree Support (3 tools)
get_accessibility_tree_cremotemcp
Get the full accessibility tree for a page or with limited depth.
{
"name": "get_accessibility_tree_cremotemcp",
"arguments": {
"tab": "optional-tab-id",
"depth": 3,
"timeout": 10
}
}
get_partial_accessibility_tree_cremotemcp
Get accessibility tree for a specific element and its relatives.
{
"name": "get_partial_accessibility_tree_cremotemcp",
"arguments": {
"selector": "form",
"tab": "optional-tab-id",
"fetch_relatives": true,
"timeout": 10
}
}
query_accessibility_tree_cremotemcp
Query accessibility tree for nodes matching specific criteria.
{
"name": "query_accessibility_tree_cremotemcp",
"arguments": {
"tab": "optional-tab-id",
"selector": "form",
"accessible_name": "Submit",
"role": "button",
"timeout": 10
}
}
Benefits Over CLI
🎯 Enhanced Efficiency
- State Management: No need to manually track tab IDs
- Batch Operations: 10x efficiency with bulk form filling and multi-selector extraction
- Intelligent Defaults: Smart parameter handling and fallbacks
- Resource Cleanup: Automatic management of tabs and files
🔍 Better Intelligence
- Conditional Logic: Element checking enables smart decision making
- Rich Context: Page state, performance metrics, and content verification
- Form Intelligence: Complete form analysis before interaction
- Error Prevention: State validation before actions
🛠 Advanced Capabilities
- Enhanced Screenshots: Element-specific and metadata-rich capture
- File Management: Bulk operations and automated cleanup
- Better Error Context: Rich error information for debugging
- Structured Responses: Consistent, parseable response format
🎉 Production Ready
This comprehensive web automation platform is production ready with:
- 31 Tools: Complete coverage of web automation needs
- 6 Enhancement Phases: Systematic capability building from basic to advanced
- Extensive Testing: All tools validated and documented
- LLM Optimized: Designed specifically for AI agent workflows
- Backward Compatible: All existing tools continue to work unchanged
📊 Capability Matrix
Category | Tools | Key Benefits |
---|---|---|
Core Web Automation | 10 tools | Navigation, interaction, extraction, screenshots, tabs, iframes, files, console |
Element Intelligence | 2 tools | Conditional logic, state checking, attribute inspection |
Data Extraction | 4 tools | Batch extraction, structured data, pattern matching, table processing |
Form Automation | 3 tools | Form analysis, bulk filling, batch interactions |
Page Intelligence | 4 tools | Page state, performance metrics, content verification, viewport info |
Enhanced Capabilities | 4 tools | Element screenshots, enhanced metadata, bulk file ops, file management |
Accessibility Tree | 3 tools | Semantic understanding, accessibility testing, screen reader simulation |
Development
To extend the MCP server with new tools:
- Add the tool definition to handleToolsList()
- Add a case in handleToolCall()
- Implement the handler function following the pattern of existing handlers
- Update this documentation
The server is designed to be easily extensible while maintaining consistency with the cremote client library.
🚀 Ready for Production: Complete web automation platform with 31 tools across 6 enhancement phases, optimized for LLM agents and production workflows.