# Cremote MCP Server

This is a Model Context Protocol (MCP) server that exposes cremote's web automation capabilities to LLMs and AI agents. Instead of requiring CLI commands, the server provides a structured API that maintains state and offers higher-level abstractions.

## Features

- **State Management**: Automatically tracks current tab, tab history, and iframe context
- **Intelligent Abstractions**: High-level tools that combine multiple cremote operations
- **Automatic Screenshots**: Optional screenshot capture for debugging and documentation
- **Error Recovery**: Better error handling and context for LLMs
- **Resource Management**: Automatic cleanup and connection management

## Quick Start for LLMs

**For LLM agents**: See the comprehensive [LLM MCP Guide](LLM_MCP_GUIDE.md) for detailed usage instructions, examples, and best practices.

## Available Tools

### 1. `web_navigate`

Navigate to URLs with optional screenshot capture.

```json
{
  "name": "web_navigate",
  "arguments": {
    "url": "https://example.com",
    "screenshot": true,
    "timeout": 10
  }
}
```

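The `name`/`arguments` object above is the payload of a tool call. In MCP, a client sends that payload as the `params` of a JSON-RPC `tools/call` request. The sketch below shows one way to build the request, assuming this server follows the standard MCP framing; `tool_call` and the id counter are hypothetical helpers, not part of cremote.

```python
import json
from itertools import count

_request_ids = count(1)  # hypothetical id source; any unique id works


def tool_call(name, arguments):
    """Wrap a tool name and its arguments in a JSON-RPC 2.0 `tools/call` request."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


request = tool_call(
    "web_navigate",
    {"url": "https://example.com", "screenshot": True, "timeout": 10},
)
print(json.dumps(request, indent=2))
```

The same helper works for every tool in this section, since only `name` and `arguments` change.
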
### 2. `web_interact`

Interact with web elements (click, fill, submit, upload).

```json
{
  "name": "web_interact",
  "arguments": {
    "action": "fill",
    "selector": "#username",
    "value": "testuser",
    "timeout": 5
  }
}
```

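Forms usually have several fields, and each one needs its own `fill` call. As a rough sketch, a client could generate those calls from a mapping of selectors to values. `fill_calls` is a hypothetical helper; it only uses the `action`, `selector`, `value`, and `timeout` arguments shown above.

```python
def fill_calls(fields, timeout=5):
    """Return one `web_interact` fill payload per (selector, value) pair, in order."""
    return [
        {
            "name": "web_interact",
            "arguments": {
                "action": "fill",
                "selector": selector,
                "value": value,
                "timeout": timeout,
            },
        }
        for selector, value in fields.items()
    ]


calls = fill_calls({"#username": "testuser", "#password": "password123"})
for call in calls:
    print(call["arguments"]["selector"])
```

Python dictionaries preserve insertion order, so the fields are filled in the order they are written.
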
### 3. `web_extract`

Extract data from pages (source, element HTML, JavaScript execution).

```json
{
  "name": "web_extract",
  "arguments": {
    "type": "javascript",
    "code": "document.title",
    "timeout": 5
  }
}
```

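When the JavaScript in `code` is assembled from a selector, quotes inside the selector can break the snippet. One possible approach, sketched here with a hypothetical `text_content_call` helper, is to let `json.dumps` produce the string literal, since a JSON string is also a valid JavaScript string literal.

```python
import json


def text_content_call(selector, timeout=5):
    """Build a `web_extract` payload that reads the text of the first matching element."""
    # json.dumps returns a double-quoted, escaped string that JavaScript can parse as-is.
    code = f"document.querySelector({json.dumps(selector)})?.textContent"
    return {
        "name": "web_extract",
        "arguments": {"type": "javascript", "code": code, "timeout": timeout},
    }


call = text_content_call(".welcome-message")
print(call["arguments"]["code"])
```
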
### 4. `web_screenshot`

Take screenshots of the current page.

```json
{
  "name": "web_screenshot",
  "arguments": {
    "output": "/tmp/page.png",
    "full_page": true,
    "timeout": 5
  }
}
```

### 5. `web_manage_tabs`

Manage browser tabs (open, close, list, switch).

```json
{
  "name": "web_manage_tabs",
  "arguments": {
    "action": "open",
    "timeout": 5
  }
}
```

### 6. `web_iframe`

Switch iframe context for subsequent operations.

```json
{
  "name": "web_iframe",
  "arguments": {
    "action": "enter",
    "selector": "iframe#payment-form"
  }
}
```

## Installation & Usage

### Prerequisites

1. **Cremote daemon must be running**:
   ```bash
   cremotedaemon
   ```

2. **Chrome/Chromium with remote debugging**:
   ```bash
   chromium --remote-debugging-port=9222 --user-data-dir=/tmp/chromium-debug
   ```

### Build the MCP Server

```bash
cd mcp/
go build -o cremote-mcp .
```

### Configuration

Set environment variables to configure the cremote connection:

```bash
export CREMOTE_HOST=localhost
export CREMOTE_PORT=8989
```

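A launcher script may want to validate these two variables before starting the server. The following is a minimal sketch of such a check. The `cremote_address` helper is hypothetical, and its defaults (`localhost`, `8989`) simply mirror the example values above; they are an assumption, not documented server defaults.

```python
import os


def cremote_address(env=None):
    """Read CREMOTE_HOST and CREMOTE_PORT, returning a (host, port) tuple."""
    env = os.environ if env is None else env
    host = env.get("CREMOTE_HOST", "localhost")  # assumed default
    port = int(env.get("CREMOTE_PORT", "8989"))  # assumed default
    if not 0 < port < 65536:
        raise ValueError(f"CREMOTE_PORT out of range: {port}")
    return host, port


host, port = cremote_address({"CREMOTE_HOST": "localhost", "CREMOTE_PORT": "8989"})
print(f"cremote daemon expected at {host}:{port}")
```
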
### Running with Claude Desktop

Add to your Claude Desktop configuration (`~/Library/Application Support/Claude/claude_desktop_config.json` on macOS):

```json
{
  "mcpServers": {
    "cremote": {
      "command": "/path/to/cremote-mcp",
      "env": {
        "CREMOTE_HOST": "localhost",
        "CREMOTE_PORT": "8989"
      }
    }
  }
}
```

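If the configuration file already lists other MCP servers, the `cremote` entry has to be merged in rather than pasted over them. A small script can do that. This is only a sketch: `add_cremote_server` and `update_config_file` are hypothetical helpers, and the path passed to them is whatever your installation uses.

```python
import json
from pathlib import Path


def add_cremote_server(config, command, host="localhost", port="8989"):
    """Add or replace the `cremote` entry in place, keeping other servers untouched."""
    servers = config.setdefault("mcpServers", {})
    servers["cremote"] = {
        "command": command,
        "env": {"CREMOTE_HOST": host, "CREMOTE_PORT": port},
    }
    return config


def update_config_file(path, command):
    """Load the JSON config at `path` (if present), merge the entry, and write it back."""
    path = Path(path)
    config = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(add_cremote_server(config, command), indent=2) + "\n")


config = add_cremote_server({}, "/path/to/cremote-mcp")
print(json.dumps(config, indent=2))
```
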
### Running with Other MCP Clients

The server communicates via JSON-RPC over stdio, so it can be used with any MCP-compatible client:

```bash
echo '{"jsonrpc":"2.0","method":"tools/list","params":{},"id":1}' | ./cremote-mcp
```

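The same one-shot exchange can be scripted. The sketch below mirrors the shell pipe above: it writes a single request to the server's stdin, closes it, and reads the first line of output. It assumes the server answers `tools/list` without a prior handshake (as the shell example implies) and prints one JSON object per line. `encode_request` and `list_tools` are hypothetical names.

```python
import json
import subprocess


def encode_request(method, params=None, request_id=1):
    """Serialize a JSON-RPC 2.0 request as a single newline-terminated line."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": method,
        "params": params or {},
    }
    return json.dumps(message) + "\n"


def list_tools(binary="./cremote-mcp", timeout=10):
    """Pipe a `tools/list` request into the server and decode its first reply line."""
    proc = subprocess.run(
        [binary],
        input=encode_request("tools/list"),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    lines = proc.stdout.splitlines()
    if not lines:
        raise RuntimeError("server produced no output")
    return json.loads(lines[0])


if __name__ == "__main__":
    print(encode_request("tools/list"), end="")
```

Calling `list_tools()` requires the built `cremote-mcp` binary and a running cremote daemon.
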
## Response Format

All tool responses include the following fields:

```json
{
  "success": true,
  "data": "...",
  "screenshot": "/tmp/screenshot.png",
  "current_tab": "tab-id-123",
  "tab_history": ["tab-id-123", "tab-id-456"],
  "iframe_mode": false,
  "error": null,
  "metadata": {}
}
```

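Because every response has the same fields, a client can decode it into one typed structure and treat `success: false` as an exception. The following sketch assumes the client already has the JSON text of the object shown above; how that text is wrapped inside the MCP result is not covered here. `ToolResponse`, `ToolError`, and `parse_response` are hypothetical names.

```python
import json
from dataclasses import dataclass, field, fields
from typing import Any, Optional


class ToolError(RuntimeError):
    """Raised when a tool response reports success = false."""


@dataclass
class ToolResponse:
    success: bool = False
    data: Any = None
    screenshot: Optional[str] = None
    current_tab: Optional[str] = None
    tab_history: list = field(default_factory=list)
    iframe_mode: bool = False
    error: Optional[str] = None
    metadata: dict = field(default_factory=dict)


def parse_response(raw):
    """Decode a response object, ignoring unknown keys and raising on failure."""
    payload = json.loads(raw)
    known = {f.name: payload[f.name] for f in fields(ToolResponse) if f.name in payload}
    response = ToolResponse(**known)
    if not response.success:
        raise ToolError(response.error or "tool call failed")
    return response


ok = parse_response(
    '{"success": true, "data": "...", "current_tab": "tab-id-123",'
    ' "tab_history": ["tab-id-123", "tab-id-456"], "iframe_mode": false,'
    ' "error": null, "metadata": {}}'
)
print(ok.current_tab, ok.tab_history)
```
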
## Example Workflow

```json
// 1. Navigate to a page
{
  "name": "web_navigate",
  "arguments": {
    "url": "https://example.com/login",
    "screenshot": true
  }
}

// 2. Fill login form
{
  "name": "web_interact",
  "arguments": {
    "action": "fill",
    "selector": "#username",
    "value": "testuser"
  }
}

{
  "name": "web_interact",
  "arguments": {
    "action": "fill",
    "selector": "#password",
    "value": "password123"
  }
}

// 3. Submit form
{
  "name": "web_interact",
  "arguments": {
    "action": "click",
    "selector": "#login-button"
  }
}

// 4. Extract result
{
  "name": "web_extract",
  "arguments": {
    "type": "javascript",
    "code": "document.querySelector('.welcome-message')?.textContent"
  }
}

// 5. Take final screenshot
{
  "name": "web_screenshot",
  "arguments": {
    "output": "/tmp/login-success.png",
    "full_page": true
  }
}
```

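The steps above run in order, and a later step is rarely useful if an earlier one failed. A client could drive the sequence with a loop like the one sketched below. `run_workflow` is a hypothetical helper, and `send` stands in for whatever function delivers a tool call and returns the decoded response. The stub used here only pretends to succeed.

```python
def run_workflow(steps, send):
    """Send each step in order and stop after the first response whose `success` is falsy."""
    results = []
    for step in steps:
        response = send(step)
        results.append(response)
        if not response.get("success"):
            break
    return results


LOGIN_STEPS = [
    {"name": "web_navigate", "arguments": {"url": "https://example.com/login", "screenshot": True}},
    {"name": "web_interact", "arguments": {"action": "fill", "selector": "#username", "value": "testuser"}},
    {"name": "web_interact", "arguments": {"action": "fill", "selector": "#password", "value": "password123"}},
    {"name": "web_interact", "arguments": {"action": "click", "selector": "#login-button"}},
    {"name": "web_extract", "arguments": {"type": "javascript", "code": "document.querySelector('.welcome-message')?.textContent"}},
    {"name": "web_screenshot", "arguments": {"output": "/tmp/login-success.png", "full_page": True}},
]


def stub_send(step):
    """Stand-in transport for illustration: reports success without contacting a server."""
    return {"success": True, "data": step["name"], "error": None}


results = run_workflow(LOGIN_STEPS, stub_send)
print(f"{len(results)} of {len(LOGIN_STEPS)} steps ran")
```
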
## Benefits Over CLI

- **State Management**: No need to manually track tab IDs
- **Better Error Context**: Rich error information for debugging
- **Automatic Screenshots**: Built-in screenshot capture for documentation
- **Intelligent Defaults**: Smart parameter handling and fallbacks
- **Resource Cleanup**: Automatic management of tabs and files
- **Structured Responses**: Consistent, parseable response format

## Development

To extend the MCP server with new tools:

1. Add the tool definition to `handleToolsList()`
2. Add a case in `handleToolCall()`
3. Implement the handler function following the pattern of existing handlers
4. Update this documentation

The server is designed to be easily extensible while maintaining consistency with the cremote client library.