Chrome Remote Daemon (cremote)
A command-line utility for automating browser interactions through Chrome's remote debugging protocol. It uses a daemon-client architecture to maintain a persistent connection to the browser.
Architecture
The tool consists of two main components:
- Daemon (cremotedaemon): A long-running process that connects to Chrome and manages browser state
- Client (cremote): A command-line client that sends commands to the daemon
This architecture provides several benefits:
- Persistent browser connection across multiple commands
- Reliable tab management
- No need to reconnect for each command
- Better performance
MCP Server
Cremote includes a Model Context Protocol (MCP) server that provides a structured API for LLMs and AI agents. Instead of using CLI commands, the MCP server offers:
- State Management: Automatic tracking of tabs, history, and iframe context
- Intelligent Abstractions: High-level tools that combine multiple operations
- Better Error Handling: Rich error context for debugging
- Automatic Screenshots: Built-in screenshot capture for documentation
See the MCP Server Documentation for setup and usage instructions.
Prerequisites
- Go 1.16 or higher
- A running instance of Chromium/Chrome with remote debugging enabled
Starting Chromium with Remote Debugging
Before using this tool, you must start Chromium with remote debugging enabled on port 9222:
chromium --remote-debugging-port=9222 --user-data-dir=/tmp/chromium-debug
or for Chrome:
google-chrome --remote-debugging-port=9222 --user-data-dir=/tmp/chrome-debug
Important: The --user-data-dir flag is required to prevent conflicts with existing browser instances.
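Before starting the daemon, it can help to confirm that the debugging endpoint is reachable. The following is a minimal sketch, assuming curl is installed; the function name wait_for_devtools and its retry parameters are hypothetical and not part of cremote. It polls the /json/version endpoint on the debugging port until it answers.

```shell
# Hypothetical helper (not part of cremote): poll Chrome's DevTools HTTP
# endpoint until it answers or the retry budget is exhausted.
# Usage: wait_for_devtools [port] [attempts] [delay-seconds]
wait_for_devtools() {
  local port="${1:-9222}"
  local attempts="${2:-10}"
  local delay="${3:-1}"
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # -s: silent, -f: treat HTTP error responses as failures
    if curl -sf "http://localhost:${port}/json/version" >/dev/null; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "DevTools endpoint on port ${port} did not respond" >&2
  return 1
}
```

A typical call would be wait_for_devtools 9222 && cremotedaemon, so the daemon only starts once the browser is ready.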
Usage
Starting the Daemon
First, start the daemon:
cremotedaemon
By default, the daemon listens on port 8989. You can specify a different port:
cremotedaemon --port=9090
Using the Client
Once the daemon is running, you can use the client to send commands:
cremote <command> [options]
Commands
- version: Show version information for CLI and daemon
- open-tab: Open a new tab and return its ID
- load-url: Load a URL in a tab
- fill-form: Fill a form field with a value (also handles checkboxes, radio buttons, and dropdown selections)
- upload-file: Upload a file to a file input
- submit-form: Submit a form
- get-source: Get the source code of a page
- get-element: Get the HTML of an element
- click-element: Click on an element
- close-tab: Close a tab
- wait-navigation: Wait for a navigation event
- eval-js: Execute JavaScript code in a tab
- switch-iframe: Switch to iframe context for subsequent commands
- switch-main: Switch back to main page context
- list-tabs: List all open tabs
- disable-cache: Disable browser cache for a tab
- enable-cache: Enable browser cache for a tab
- clear-cache: Clear browser cache for a tab
- clear-all-site-data: Clear all site data (cookies, storage, cache, etc.)
- clear-cookies: Clear cookies for a tab
- clear-storage: Clear web storage (localStorage, sessionStorage, etc.)
- drag-and-drop: Drag and drop from source element to target element
- drag-and-drop-coordinates: Drag and drop from source element to specific coordinates
- drag-and-drop-offset: Drag and drop from source element by relative offset
- right-click: Right-click on an element to open context menus
- double-click: Double-click on an element for file operations or text selection
- middle-click: Middle-click on an element (typically opens links in new tabs)
- hover: Hover over an element to trigger tooltips or dropdowns
- mouse-move: Move mouse to specific coordinates without clicking
- scroll-wheel: Scroll with mouse wheel at specific coordinates
- key-combination: Send key combinations (Ctrl+C, Alt+Tab, Shift+Enter, etc.)
- special-key: Send special keys (Enter, Escape, Tab, F1-F12, Arrow keys, etc.)
- modifier-click: Click with modifier keys (Ctrl+click, Shift+click for multi-selection)
- status: Check if the daemon is running
Current Tab Feature
The tool tracks the current tab, so you can omit the --tab flag to use the most recently used tab. This makes interactive use more convenient.
For example, after opening a tab:
# Open a tab
TAB_ID=$(cremote open-tab)
# Load a URL in the current tab (no need to specify --tab)
cremote load-url --url="https://example.com"
# Click an element in the current tab
cremote click-element --selector="a.button"
You can still specify a tab ID explicitly if you need to work with multiple tabs.
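When a script mixes both styles, a small wrapper can decide whether to pass --tab. This is a sketch only; cremote_in_tab is a hypothetical name, not something cremote provides. It places --tab directly after the command name, matching the examples in this document.

```shell
# Hypothetical wrapper (not part of cremote): pass --tab only when a tab ID
# is supplied; otherwise let the daemon fall back to its current tab.
# Usage: cremote_in_tab <tab-id-or-empty> <command> [options...]
cremote_in_tab() {
  local tab_id="$1"
  local cmd="$2"
  shift 2
  if [ -n "$tab_id" ]; then
    cremote "$cmd" --tab="$tab_id" "$@"
  else
    cremote "$cmd" "$@"
  fi
}
```

For example, cremote_in_tab "$TAB_ID" load-url --url="https://example.com" targets a specific tab, while cremote_in_tab "" load-url --url="https://example.com" uses the current one.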
Run cremote <command> -h for more information on a specific command.
Examples
Check Daemon Status
cremote status
Open a new tab
cremote open-tab [--timeout=5]
This will return a tab ID that you can use in subsequent commands. The --timeout parameter specifies how many seconds to wait for the tab to open (default: 5 seconds).
Load a URL in a tab
cremote load-url --tab="<tab-id>" --url="https://example.com" [--timeout=5]
The --timeout parameter specifies how many seconds to wait for the URL to load (default: 5 seconds).
Fill a form field
cremote fill-form --tab="<tab-id>" --selector="#username" --value="user123" [--timeout=5]
The --timeout parameter specifies how many seconds to wait for the fill operation to complete (default: 5 seconds).
Check/uncheck a checkbox or select a radio button
The same fill-form command can be used to check/uncheck checkboxes or select radio buttons:
# Check a checkbox
cremote fill-form --tab="<tab-id>" --selector="#agree" --value="true"
# Uncheck a checkbox
cremote fill-form --tab="<tab-id>" --selector="#agree" --value="false"
# Select a radio button
cremote fill-form --tab="<tab-id>" --selector="#option2" --value="true"
Accepted values for checking a checkbox or selecting a radio button: true, 1, yes, on, checked.
Any other value will uncheck the checkbox or deselect the radio button.
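The rule above can be mirrored in a script to predict what a given --value will do before sending it. This is an illustrative sketch; checks_box is a hypothetical name, and the exact-match, case-sensitive comparison is an assumption, since the daemon may handle letter case differently.

```shell
# Illustrative only (not part of cremote): mirrors the documented list of
# --value strings that check a checkbox or select a radio button.
# Exact, case-sensitive matching is an assumption.
checks_box() {
  case "$1" in
    true|1|yes|on|checked) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, checks_box "$VALUE" && echo "will check" reports whether a value falls in the accepted set.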
Select dropdown options
The fill-form command can also be used to select options in dropdown elements:
# Select by option text (visible text)
cremote fill-form --tab="<tab-id>" --selector="#country" --value="United States"
# Select by option value (value attribute)
cremote fill-form --tab="<tab-id>" --selector="#state" --value="CA"
The command automatically detects dropdown elements and tries both option text and option value matching. This works with both <select> elements and custom dropdown implementations.
Upload a file
cremote upload-file --tab="<tab-id>" --selector="input[type=file]" --file="/path/to/file.jpg" [--timeout=5]
This command automatically:
- Transfers the file from your local machine to the daemon container (if running in a container)
- Uploads the file to the specified file input element on the web page
The --timeout parameter specifies how many seconds to wait for the upload operation to complete (default: 5 seconds).
Note: The file path should be the local path on your machine. The command will handle transferring it to the daemon container automatically.
Submit a form
cremote submit-form --tab="<tab-id>" --selector="form#login" [--timeout=5]
The --timeout parameter specifies how many seconds to wait for the form submission to complete (default: 5 seconds).
Get the source code of a page
cremote get-source --tab="<tab-id>" [--timeout=5]
The --timeout parameter specifies how many seconds to wait for getting the page source (default: 5 seconds).
Get the HTML of an element
cremote get-element --tab="<tab-id>" --selector=".content" [--timeout=5]
The --timeout parameter specifies how many seconds to wait for the element to appear in the DOM (default: 5 seconds).
Click on an element
cremote click-element --tab="<tab-id>" --selector="button.submit" [--timeout=5]
The --timeout parameter specifies how many seconds to wait for the click operation to complete (default: 5 seconds).
Close a tab
cremote close-tab --tab="<tab-id>" [--timeout=5]
The --timeout parameter specifies how many seconds to wait for the tab to close (default: 5 seconds).
Wait for navigation to complete
cremote wait-navigation --tab="<tab-id>" [--timeout=5]
The --timeout parameter specifies how many seconds to wait for navigation to complete (default: 5 seconds).
Note: wait-navigation detects whether a navigation is actually in progress and returns immediately if the page is already stable, so it does not wait unnecessarily.
Execute JavaScript code
cremote eval-js --tab="<tab-id>" --code="document.getElementById('myElement').innerHTML = 'Hello World!'" [--timeout=5]
The --timeout parameter specifies how many seconds to wait for the JavaScript execution to complete (default: 5 seconds).
This command allows you to execute arbitrary JavaScript code in a tab. Examples:
- Set element content: --code="document.getElementById('tinymce').innerHTML='Foo!'"
- Get element text: --code="document.querySelector('.result').textContent"
- Trigger events: --code="document.getElementById('button').click()"
- Manipulate DOM: --code="document.body.style.backgroundColor = 'red'"
The command handles both JavaScript expressions and statements:
- Expressions (return values): document.title, 2 + 3, element.textContent
- Statements (assignments/actions): document.title = 'New Title', element.click()
For statements, the command returns "undefined". For expressions, it returns the result as a string.
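Because results come back as strings, a script has to compare them textually. The sketch below wraps that comparison; js_is_true is a hypothetical name, and it assumes a boolean result is printed as exactly "true", which is how the automation example later in this document treats it.

```shell
# Hypothetical helper (not part of cremote): succeed only when a JavaScript
# expression evaluates to true. Assumes the result string for a boolean is
# exactly "true"; statements yield "undefined" and therefore fail the check.
js_is_true() {
  local result
  result="$(cremote eval-js --code="$1")" || return 1
  [ "$result" = "true" ]
}
```

It could be used as: if js_is_true "document.querySelector('.result') !== null"; then echo "found"; fi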
Take a screenshot
cremote screenshot --tab="<tab-id>" --output="/path/to/screenshot.png" [--full-page] [--timeout=5]
The --output parameter specifies where to save the screenshot (PNG format).
The --full-page flag captures the entire page instead of just the viewport (default: viewport only).
The --timeout parameter specifies how many seconds to wait for the screenshot to complete (default: 5 seconds).
Working with iframes
To interact with content inside an iframe, you need to switch the context:
# Switch to iframe context
cremote switch-iframe --tab="<tab-id>" --selector="iframe#payment-form"
# Now all subsequent commands will operate within the iframe
cremote fill-form --selector="#card-number" --value="4111111111111111"
cremote fill-form --selector="#expiry" --value="12/25"
cremote click-element --selector="#submit-payment"
# Switch back to main page context
cremote switch-main --tab="<tab-id>"
# Now commands operate on the main page again
cremote get-element --selector=".success-message"
Important Notes:
- Once you switch to an iframe, all subsequent commands (fill-form, click-element, eval-js, etc.) operate within that iframe
- You must use switch-main to return to the main page context
- Each tab maintains its own iframe context independently
- Iframe context persists until explicitly switched back to main or the tab is closed
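Since the iframe context persists until it is switched back, a failed command can leave later commands running in the wrong context. One way to guard against that is a wrapper that always restores the main context. This is a sketch; in_iframe is a hypothetical name, and it operates on the current tab.

```shell
# Hypothetical helper (not part of cremote): run one cremote command inside
# an iframe, then return to the main page context even if the command fails.
# Usage: in_iframe <iframe-selector> <command> [options...]
in_iframe() {
  local selector="$1"
  shift
  cremote switch-iframe --selector="$selector" || return 1
  local rc=0
  cremote "$@" || rc=$?
  cremote switch-main
  # Report the wrapped command's exit status, not switch-main's
  return "$rc"
}
```

For example: in_iframe "iframe#payment-form" fill-form --selector="#card-number" --value="4111111111111111"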
List all open tabs
cremote list-tabs
This will display all open tabs with their IDs and URLs. The current tab is marked with an asterisk (*).
Cache and Site Data Management
You can control browser cache and site data for testing, performance optimization, and privacy:
# Cache Management
# Disable cache for current tab (useful for testing)
cremote disable-cache [--tab="<tab-id>"] [--timeout=5]
# Enable cache for current tab
cremote enable-cache [--tab="<tab-id>"] [--timeout=5]
# Clear browser cache for current tab
cremote clear-cache [--tab="<tab-id>"] [--timeout=5]
# Site Data Management
# Clear ALL site data (cookies, storage, cache, etc.)
cremote clear-all-site-data [--tab="<tab-id>"] [--timeout=10]
# Clear only cookies
cremote clear-cookies [--tab="<tab-id>"] [--timeout=5]
# Clear only web storage (localStorage, sessionStorage, IndexedDB, etc.)
cremote clear-storage [--tab="<tab-id>"] [--timeout=5]
Use Cases:
- Testing: Disable cache to ensure fresh page loads without cached resources
- Performance Testing: Clear cache to test cold load performance
- Debugging: Clear cache to resolve cache-related issues
- Development: Disable cache during development to see changes immediately
- Authentication Testing: Clear cookies to test login/logout flows
- Privacy Testing: Clear all site data to test clean state scenarios
- Storage Testing: Clear web storage to test application state management
The --timeout parameter specifies how many seconds to wait for the operation to complete (default: 5 seconds; use a longer timeout for comprehensive data clearing).
Drag and Drop Operations
You can perform drag and drop operations for testing interactive web applications:
# Drag and Drop Between Elements
# Drag from source element to target element
cremote drag-and-drop --source=".draggable-item" --target=".drop-zone" [--tab="<tab-id>"] [--timeout=5]
# Drag and Drop to Specific Coordinates
# Drag from source element to specific x,y coordinates
cremote drag-and-drop-coordinates --source=".draggable-item" --x=300 --y=200 [--tab="<tab-id>"] [--timeout=5]
# Drag and Drop by Relative Offset
# Drag from source element by relative pixel offset
cremote drag-and-drop-offset --source=".draggable-item" --offset-x=100 --offset-y=50 [--tab="<tab-id>"] [--timeout=5]
Use Cases:
- File Upload: Drag files to upload areas
- Sortable Lists: Reorder items in sortable lists
- Kanban Boards: Move cards between columns
- Image Galleries: Rearrange images or media
- Form Builders: Drag form elements to build layouts
- Dashboard Widgets: Rearrange dashboard components
- Game Testing: Test drag-based game mechanics
- UI Component Testing: Test custom drag and drop components
Technical Details:
- Enhanced HTML5 Support: Automatically injects JavaScript helpers to trigger proper HTML5 drag and drop events (dragstart, dragover, drop, dragend)
- Smart Target Detection: For coordinate/offset drags, automatically detects and targets valid drop zones at destination coordinates
- Hybrid Approach: Tries HTML5 drag events first, falls back to Chrome DevTools Protocol mouse events if needed
- Intelligent Fallback: Automatically switches between element-to-element and coordinate-based approaches for optimal compatibility
- Realistic Event Simulation: Performs drag operations with proper timing and intermediate mouse movements
- Automatic Element Detection: Calculates element center points automatically for accurate targeting
- Robust Error Handling: Supports timeout handling and graceful degradation for complex drag operations
- Universal Compatibility: Works with all modern drag and drop implementations (HTML5 Drag and Drop, jQuery UI, custom implementations)
The --timeout parameter specifies how many seconds to wait for the drag and drop operation to complete (default: 5 seconds).
Advanced Input Operations
Cremote provides sophisticated mouse and keyboard interactions for comprehensive testing of modern web applications:
Mouse Operations
# Right-click to open context menus
cremote right-click --selector=".file-item" [--tab="<tab-id>"] [--timeout=5]
# Double-click for file operations or text selection
cremote double-click --selector=".file-icon" [--tab="<tab-id>"] [--timeout=5]
# Middle-click to open links in new tabs
cremote middle-click --selector="a[href='/dashboard']" [--tab="<tab-id>"] [--timeout=5]
# Hover to trigger tooltips or dropdowns
cremote hover --selector=".tooltip-trigger" [--tab="<tab-id>"] [--timeout=5]
# Move mouse to specific coordinates without clicking
cremote mouse-move --x=400 --y=300 [--tab="<tab-id>"] [--timeout=5]
# Scroll with mouse wheel at specific coordinates
cremote scroll-wheel --x=400 --y=300 --delta-y=-120 [--delta-x=0] [--tab="<tab-id>"] [--timeout=5]
Keyboard Operations
# Send key combinations (Ctrl+C, Alt+Tab, Shift+Enter, etc.)
cremote key-combination --keys="Ctrl+C" [--tab="<tab-id>"] [--timeout=5]
cremote key-combination --keys="Alt+Tab" [--tab="<tab-id>"] [--timeout=5]
cremote key-combination --keys="Ctrl+Shift+T" [--tab="<tab-id>"] [--timeout=5]
# Send special keys (Enter, Escape, Tab, F1-F12, Arrow keys, etc.)
cremote special-key --key="Enter" [--tab="<tab-id>"] [--timeout=5]
cremote special-key --key="Escape" [--tab="<tab-id>"] [--timeout=5]
cremote special-key --key="ArrowUp" [--tab="<tab-id>"] [--timeout=5]
cremote special-key --key="F1" [--tab="<tab-id>"] [--timeout=5]
# Click with modifier keys (Ctrl+click, Shift+click for multi-selection)
cremote modifier-click --selector=".selectable-item" --modifiers="Ctrl" [--tab="<tab-id>"] [--timeout=5]
cremote modifier-click --selector=".list-item" --modifiers="Shift" [--tab="<tab-id>"] [--timeout=5]
cremote modifier-click --selector=".table-row" --modifiers="Ctrl+Shift" [--tab="<tab-id>"] [--timeout=5]
Advanced Use Cases:
- Context Menu Testing: Right-click to test context menus and their functionality
- Accessibility Testing: Full keyboard navigation support for accessibility compliance
- Tooltip/Dropdown Testing: Hover interactions for UI elements that appear on mouse over
- Multi-Selection Testing: Ctrl+click and Shift+click for testing selection interfaces
- Copy/Paste Workflows: Test clipboard operations with Ctrl+A, Ctrl+C, Ctrl+V
- Precise Mouse Control: Pixel-perfect mouse positioning and scrolling
- Function Key Testing: Test application shortcuts using F1-F12 keys
- Arrow Key Navigation: Test keyboard navigation in lists, tables, and forms
Technical Details:
- Uses Chrome DevTools Protocol's Input domain for precise control
- Supports all modifier keys: Ctrl, Alt, Shift, Meta/Cmd
- Comprehensive key mapping for 60+ keys including letters, numbers, function keys, special keys
- Proper modifier key sequencing (key down → action → key up)
- Element positioning using content quads for pixel-perfect accuracy
- Mouse button differentiation (Left, Right, Middle)
- Realistic interaction patterns matching human behavior
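The copy/paste workflow mentioned above can be chained from the documented commands. The following is a rough sketch rather than a tested recipe; copy_paste_into is a hypothetical name, it assumes the source field already has focus, and clicking the target to focus it before pasting is an assumption about the page under test.

```shell
# Hypothetical helper (not part of cremote): select and copy everything in
# the focused field, then paste it into another element in the current tab.
# Usage: copy_paste_into <target-selector>
copy_paste_into() {
  local target="$1"
  cremote key-combination --keys="Ctrl+A" &&
    cremote key-combination --keys="Ctrl+C" &&
    cremote click-element --selector="$target" &&  # focus the target (assumed)
    cremote key-combination --keys="Ctrl+V"
}
```

The && chain stops at the first failing step, so a missing target element does not trigger a stray paste.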
Connecting to a Remote Daemon
By default, the client connects to a daemon running on localhost. To connect to a daemon running on a different host:
cremote open-tab --host="remote-host" --port=8989
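To avoid repeating the connection flags on every call, a script can wrap the client. This is a sketch only: rcremote, CREMOTE_HOST, and CREMOTE_PORT are made-up names for illustration, and cremote itself is not known to read these variables.

```shell
# Hypothetical wrapper (not part of cremote): append --host and --port to
# every call. CREMOTE_HOST and CREMOTE_PORT are illustrative variable names
# read only by this wrapper.
rcremote() {
  cremote "$@" --host="${CREMOTE_HOST:-localhost}" --port="${CREMOTE_PORT:-8989}"
}
```

With CREMOTE_HOST=remote-host set, rcremote open-tab expands to the same command shown above.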
Automation Example
Here's an example of how to use cremote in a shell script to automate a login process:
#!/bin/bash
# Make sure Chromium is running with remote debugging enabled
# chromium --remote-debugging-port=9222 --user-data-dir=/tmp/chromium-debug &
# Make sure the daemon is running
# cremotedaemon &
# Open a new tab
TAB_ID=$(cremote open-tab)
# Load the login page (using the current tab)
cremote load-url --url="https://example.com/login"
# Fill in the username and password (using the current tab)
cremote fill-form --selector="#username" --value="user123"
cremote fill-form --selector="#password" --value="password123"
# Check the 'Remember me' checkbox
cremote fill-form --selector="#remember" --value="true"
# Accept the terms and conditions
cremote fill-form --selector="#terms" --value="true"
# Either submit the form using the form selector (using the current tab)
cremote submit-form --selector="form#login"
# Or click the login button (using the current tab)
# cremote click-element --selector="#login-button"
# You can still specify a tab ID explicitly if needed
# cremote load-url --tab="$TAB_ID" --url="https://example.com/login"
# Wait for navigation to complete (using the current tab)
cremote wait-navigation --timeout=30
# Execute JavaScript to check if login was successful
LOGIN_STATUS=$(cremote eval-js --code="document.querySelector('.welcome-message') !== null")
if [ "$LOGIN_STATUS" = "true" ]; then
  echo "Login successful!"
fi
# Example: Working with an iframe (e.g., payment form)
# Switch to iframe context
cremote switch-iframe --selector="iframe.payment-frame"
# Fill payment form inside iframe
cremote fill-form --selector="#card-number" --value="4111111111111111"
cremote fill-form --selector="#expiry-date" --value="12/25"
cremote click-element --selector="#pay-button"
# Switch back to main page
cremote switch-main
# Get the source code of the page after login (using the current tab)
cremote get-source
# Take a screenshot of the logged-in page
cremote screenshot --output="/tmp/login-success.png"
# Take a full-page screenshot for documentation
cremote screenshot --output="/tmp/full-page.png" --full-page
# Close the current tab
cremote close-tab
Troubleshooting
Daemon Not Running
If you see an error like "connection refused", make sure the daemon is running:
cremote status
If the daemon is not running, start it:
cremotedaemon
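These two steps can be combined in a script. The sketch below assumes that cremote status exits with a non-zero status when the daemon is unreachable, which this document does not confirm; ensure_daemon and its delay argument are hypothetical.

```shell
# Hypothetical helper (not part of cremote): start the daemon if the status
# check fails. Assumes `cremote status` exits non-zero when the daemon is
# unreachable (unverified).
# Usage: ensure_daemon [startup-delay-seconds]
ensure_daemon() {
  if cremote status >/dev/null 2>&1; then
    return 0
  fi
  cremotedaemon >/dev/null 2>&1 &
  sleep "${1:-1}"  # give the daemon a moment to start listening
  cremote status >/dev/null 2>&1
}
```

A script could then begin with ensure_daemon || exit 1 before issuing any other commands.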
Connection Issues
If the daemon can't connect to Chromium, check the following:
- Make sure Chromium/Chrome is running with remote debugging enabled on port 9222
- Verify that Chromium was started with the correct flags: --remote-debugging-port=9222 --user-data-dir=/tmp/chromium-debug
- Check if you can access the Chromium DevTools Protocol by opening http://localhost:9222/json/version in your browser
Tab Management
The daemon manages tab IDs for you, so you don't need to worry about tab persistence between commands. However:
- Tab IDs are only valid for the duration of the browser session
- If Chromium is restarted, you'll need to get new tab IDs
- Store the tab ID returned by open-tab in a variable for use in subsequent commands
- If a tab is closed by Chromium (not through the tool), you may need to run open-tab again
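A script can recover from a stale tab ID by checking the tab list before reusing it. This is a minimal sketch: ensure_tab is a hypothetical name, and it assumes the tab ID appears verbatim in the list-tabs output, since the exact output format is not specified here.

```shell
# Hypothetical helper (not part of cremote): reuse a stored tab ID if it is
# still listed, otherwise open a new tab and print the new ID.
# Assumes the ID appears verbatim in `cremote list-tabs` output.
# Usage: TAB_ID=$(ensure_tab "$TAB_ID")
ensure_tab() {
  local tab_id="$1"
  if [ -n "$tab_id" ] && cremote list-tabs | grep -qF -- "$tab_id"; then
    echo "$tab_id"
  else
    cremote open-tab
  fi
}
```

Calling TAB_ID=$(ensure_tab "$TAB_ID") at the start of each step keeps the variable valid after a browser restart.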
License
MIT