Josh at WLTechBlog 88d8202b0d Cleanup
2025-10-21 13:31:49 -05:00

Chrome Remote Daemon (cremote)

A command-line utility for automating browser interactions using Chrome's remote debugging protocol. The tool uses a daemon-client architecture to maintain a persistent connection to the browser.

Architecture

The tool consists of two main components:

  1. Daemon (cremotedaemon): A long-running process that connects to Chrome and manages browser state
  2. Client (cremote): A command-line client that sends commands to the daemon

This architecture provides several benefits:

  • Persistent browser connection across multiple commands
  • Reliable tab management
  • No need to reconnect for each command
  • Better performance

MCP Server

Cremote includes a Model Context Protocol (MCP) server that provides a structured API for LLMs and AI agents. As an alternative to individual CLI commands, the MCP server offers:

  • State Management: Automatic tracking of tabs, history, and iframe context
  • Intelligent Abstractions: High-level tools that combine multiple operations
  • Better Error Handling: Rich error context for debugging
  • Automatic Screenshots: Built-in screenshot capture for documentation

See the MCP Server Documentation for setup and usage instructions.

Prerequisites

  • Go 1.16 or higher
  • A running instance of Chromium/Chrome with remote debugging enabled

Starting Chromium with Remote Debugging

Before using this tool, you must start Chromium with remote debugging enabled on port 9222:

chromium --remote-debugging-port=9222 --user-data-dir=/tmp/chromium-debug

or for Chrome:

google-chrome --remote-debugging-port=9222 --user-data-dir=/tmp/chrome-debug

Important: The --user-data-dir flag is required to prevent conflicts with existing browser instances.

Usage

Starting the Daemon

First, start the daemon:

cremotedaemon

By default, the daemon listens on port 8989. You can specify a different port:

cremotedaemon --port=9090
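
When the daemon is launched from a script, the first client command can run before the daemon is ready. The following is a minimal sketch of a readiness loop. It assumes that cremote status exits with a non-zero code while the daemon is unreachable (this README does not document exit codes), and wait_for_daemon is a hypothetical helper name, not part of cremote.

```shell
# wait_for_daemon [ATTEMPTS] [DELAY_SECONDS]
# Poll `cremote status` until it succeeds or the attempts run out.
# Assumption: `cremote status` exits non-zero when the daemon is unreachable.
wait_for_daemon() {
    local attempts="${1:-10}"
    local delay="${2:-1}"
    local try=1
    while [ "$try" -le "$attempts" ]; do
        if cremote status >/dev/null 2>&1; then
            return 0
        fi
        sleep "$delay"
        try=$((try + 1))
    done
    echo "daemon did not respond after $attempts attempts" >&2
    return 1
}
```

A script could start cremotedaemon in the background and then call wait_for_daemon 20 1 before issuing its first command.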

Using the Client

Once the daemon is running, you can use the client to send commands:

cremote <command> [options]

Commands

  • version: Show version information for CLI and daemon
  • open-tab: Open a new tab and return its ID
  • load-url: Load a URL in a tab
  • fill-form: Fill a form field with a value (also handles checkboxes, radio buttons, and dropdown selections)
  • upload-file: Upload a file to a file input
  • submit-form: Submit a form
  • get-source: Get the source code of a page
  • get-element: Get the HTML of an element
  • click-element: Click on an element
  • close-tab: Close a tab
  • wait-navigation: Wait for a navigation event
  • eval-js: Execute JavaScript code in a tab
  • screenshot: Take a screenshot of a tab (viewport or full page)
  • switch-iframe: Switch to iframe context for subsequent commands
  • switch-main: Switch back to main page context
  • list-tabs: List all open tabs
  • disable-cache: Disable browser cache for a tab
  • enable-cache: Enable browser cache for a tab
  • clear-cache: Clear browser cache for a tab
  • clear-all-site-data: Clear all site data (cookies, storage, cache, etc.)
  • clear-cookies: Clear cookies for a tab
  • clear-storage: Clear web storage (localStorage, sessionStorage, etc.)
  • drag-and-drop: Drag and drop from source element to target element
  • drag-and-drop-coordinates: Drag and drop from source element to specific coordinates
  • drag-and-drop-offset: Drag and drop from source element by relative offset
  • right-click: Right-click on an element to open context menus
  • double-click: Double-click on an element for file operations or text selection
  • middle-click: Middle-click on an element (typically opens links in new tabs)
  • hover: Hover over an element to trigger tooltips or dropdowns
  • mouse-move: Move mouse to specific coordinates without clicking
  • scroll-wheel: Scroll with mouse wheel at specific coordinates
  • key-combination: Send key combinations (Ctrl+C, Alt+Tab, Shift+Enter, etc.)
  • special-key: Send special keys (Enter, Escape, Tab, F1-F12, Arrow keys, etc.)
  • modifier-click: Click with modifier keys (Ctrl+click, Shift+click for multi-selection)
  • status: Check if the daemon is running

Current Tab Feature

The tool tracks the current tab, so you can omit the --tab flag to use the most recently used tab. This makes interactive use more convenient.

For example, after opening a tab:

# Open a tab
TAB_ID=$(cremote open-tab)

# Load a URL in the current tab (no need to specify --tab)
cremote load-url --url="https://example.com"

# Click an element in the current tab
cremote click-element --selector="a.button"

You can still specify a tab ID explicitly if you need to work with multiple tabs.

Run cremote <command> -h for more information on a specific command.
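
When a script juggles several tabs, relying on the current tab gets fragile, so it can help to capture each ID and pass --tab explicitly. The sketch below wraps the two documented commands open-tab and load-url; open_url is a hypothetical helper name, and it assumes open-tab prints only the tab ID, as the TAB_ID=$(cremote open-tab) example above implies.

```shell
# open_url URL
# Open a new tab, load URL into it, and print the new tab's ID.
# Assumption: `cremote open-tab` prints only the tab ID on stdout.
open_url() {
    local tab
    tab=$(cremote open-tab) || return 1
    cremote load-url --tab="$tab" --url="$1" >/dev/null || return 1
    printf '%s\n' "$tab"
}

# Example use (commented out so the snippet only defines the helper):
# DOCS_TAB=$(open_url "https://example.com/docs")
# APP_TAB=$(open_url "https://example.com/app")
# cremote get-source --tab="$DOCS_TAB"
```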

Examples

Check Daemon Status

cremote status

Open a new tab

cremote open-tab [--timeout=5]

This will return a tab ID that you can use in subsequent commands. The --timeout parameter specifies how many seconds to wait for the tab to open (default: 5 seconds).

Load a URL in a tab

cremote load-url --tab="<tab-id>" --url="https://example.com" [--timeout=5]

The --timeout parameter specifies how many seconds to wait for the URL to load (default: 5 seconds).

Fill a form field

cremote fill-form --tab="<tab-id>" --selector="#username" --value="user123" [--timeout=5]

The --timeout parameter specifies how many seconds to wait for the fill operation to complete (default: 5 seconds).

Check/uncheck a checkbox or select a radio button

The same fill-form command can be used to check/uncheck checkboxes or select radio buttons:

# Check a checkbox
cremote fill-form --tab="<tab-id>" --selector="#agree" --value="true"

# Uncheck a checkbox
cremote fill-form --tab="<tab-id>" --selector="#agree" --value="false"

# Select a radio button
cremote fill-form --tab="<tab-id>" --selector="#option2" --value="true"

Accepted values for checking a checkbox or selecting a radio button: true, 1, yes, on, checked. Any other value will uncheck the checkbox or deselect the radio button.
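
Because any value outside that list unchecks the box, a typo such as "ture" silently does the opposite of what was intended. The sketch below mirrors the accepted-value list in a small guard. The helper names is_checked_value and set_checkbox are hypothetical, and the exact, lowercase matching is an assumption; the daemon itself may be more lenient.

```shell
# is_checked_value VALUE
# Succeed if VALUE is one the README lists as "checked" for fill-form.
# Assumption: matching is exact and lowercase.
is_checked_value() {
    case "$1" in
        true|1|yes|on|checked) return 0 ;;
        *) return 1 ;;
    esac
}

# set_checkbox SELECTOR VALUE
# Warn on stderr before sending a value that will uncheck or deselect.
set_checkbox() {
    if ! is_checked_value "$2"; then
        echo "note: value '$2' will uncheck or deselect $1" >&2
    fi
    cremote fill-form --selector="$1" --value="$2"
}
```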

Select dropdown options

The fill-form command can also be used to select options in dropdown elements:

# Select by option text (visible text)
cremote fill-form --tab="<tab-id>" --selector="#country" --value="United States"

# Select by option value (value attribute)
cremote fill-form --tab="<tab-id>" --selector="#state" --value="CA"

The command automatically detects dropdown elements and tries both option text and option value matching. This works with both <select> elements and custom dropdown implementations.

Upload a file

cremote upload-file --tab="<tab-id>" --selector="input[type=file]" --file="/path/to/file.jpg" [--timeout=5]

This command automatically:

  1. Transfers the file from your local machine to the daemon container (if running in a container)
  2. Uploads the file to the specified file input element on the web page

The --timeout parameter specifies how many seconds to wait for the upload operation to complete (default: 5 seconds).

Note: The file path should be the local path on your machine. The command will handle transferring it to the daemon container automatically.

Submit a form

cremote submit-form --tab="<tab-id>" --selector="form#login" [--timeout=5]

The --timeout parameter specifies how many seconds to wait for the form submission to complete (default: 5 seconds).

Get the source code of a page

cremote get-source --tab="<tab-id>" [--timeout=5]

The --timeout parameter specifies how many seconds to wait for getting the page source (default: 5 seconds).

Get the HTML of an element

cremote get-element --tab="<tab-id>" --selector=".content" [--timeout=5]

The --timeout parameter specifies how many seconds to wait for the element to appear in the DOM (default: 5 seconds).

Click on an element

cremote click-element --tab="<tab-id>" --selector="button.submit" [--timeout=5]

The --timeout parameter specifies how many seconds to wait for the click operation to complete (default: 5 seconds).

Close a tab

cremote close-tab --tab="<tab-id>" [--timeout=5]

The --timeout parameter specifies how many seconds to wait for the tab to close (default: 5 seconds).

Wait for navigation to complete

cremote wait-navigation --tab="<tab-id>" [--timeout=5]

The --timeout parameter specifies how many seconds to wait for navigation to complete (default: 5 seconds).

Note: wait-navigation first checks whether a navigation is actually in progress and returns immediately if the page is already stable, which avoids unnecessary waiting.
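
A common pattern is to click a link or button and then wait for the resulting page. The sketch below chains the two documented commands on the current tab; click_and_wait is a hypothetical helper name, and the default of 5 seconds simply repeats the documented default.

```shell
# click_and_wait SELECTOR [TIMEOUT_SECONDS]
# Click an element, then wait for navigation to settle.
# The wait is skipped if the click itself fails.
click_and_wait() {
    cremote click-element --selector="$1" &&
        cremote wait-navigation --timeout="${2:-5}"
}
```

For example, click_and_wait "a.next-page" 30 clicks the link and allows up to 30 seconds for the next page.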

Execute JavaScript code

cremote eval-js --tab="<tab-id>" --code="document.getElementById('myElement').innerHTML = 'Hello World!'" [--timeout=5]

The --timeout parameter specifies how many seconds to wait for the JavaScript execution to complete (default: 5 seconds).

This command allows you to execute arbitrary JavaScript code in a tab. Examples:

  • Set element content: --code="document.getElementById('tinymce').innerHTML='Foo!'"
  • Get element text: --code="document.querySelector('.result').textContent"
  • Trigger events: --code="document.getElementById('button').click()"
  • Manipulate DOM: --code="document.body.style.backgroundColor = 'red'"

The command handles both JavaScript expressions and statements:

  • Expressions (return values): document.title, 2 + 3, element.textContent
  • Statements (assignments/actions): document.title = 'New Title', element.click()

For statements, the command returns "undefined". For expressions, it returns the result as a string.
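
In scripts, the "undefined" result is easy to mistake for real data. The sketch below treats it as a failure so callers can branch on the exit status. js_expr is a hypothetical helper name; note that an expression which legitimately evaluates to undefined is also reported as a failure here.

```shell
# js_expr CODE
# Print the result of a JavaScript expression evaluated in the current tab.
# Returns 1 if cremote fails or the result is the string "undefined".
js_expr() {
    local result
    result=$(cremote eval-js --code="$1") || return 1
    if [ "$result" = "undefined" ]; then
        return 1
    fi
    printf '%s\n' "$result"
}

# Example use (commented out so the snippet only defines the helper):
# TITLE=$(js_expr "document.title") || echo "no value returned" >&2
```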

Take a screenshot

cremote screenshot --tab="<tab-id>" --output="/path/to/screenshot.png" [--full-page] [--timeout=5]

The --output parameter specifies where to save the screenshot (PNG format). The --full-page flag captures the entire page instead of just the viewport (default: viewport only). The --timeout parameter specifies how many seconds to wait for the screenshot to complete (default: 5 seconds).

Working with iframes

To interact with content inside an iframe, you need to switch the context:

# Switch to iframe context
cremote switch-iframe --tab="<tab-id>" --selector="iframe#payment-form"

# Now all subsequent commands will operate within the iframe
cremote fill-form --selector="#card-number" --value="4111111111111111"
cremote fill-form --selector="#expiry" --value="12/25"
cremote click-element --selector="#submit-payment"

# Switch back to main page context
cremote switch-main --tab="<tab-id>"

# Now commands operate on the main page again
cremote get-element --selector=".success-message"

Important Notes:

  • Once you switch to an iframe, all subsequent commands (fill-form, click-element, eval-js, etc.) operate within that iframe
  • You must use switch-main to return to the main page context
  • Each tab maintains its own iframe context independently
  • Iframe context persists until explicitly switched back to main or the tab is closed
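
Since the iframe context persists until switch-main is called, a failed step can leave later commands running inside the iframe. One way to guard against that is a wrapper that always switches back. This is only a sketch: in_iframe is a hypothetical helper name, and it operates on the current tab.

```shell
# in_iframe SELECTOR COMMAND [ARGS...]
# Switch into the iframe, run one command, then always switch back to the
# main page. The wrapped command's exit status is preserved.
in_iframe() {
    local selector="$1"
    local status=0
    shift
    cremote switch-iframe --selector="$selector" || return 1
    "$@" || status=$?
    cremote switch-main
    return "$status"
}

# Example use (commented out so the snippet only defines the helper):
# in_iframe "iframe#payment-form" \
#     cremote fill-form --selector="#card-number" --value="4111111111111111"
```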

List all open tabs

cremote list-tabs

This will display all open tabs with their IDs and URLs. The current tab is marked with an asterisk (*).

Cache and Site Data Management

You can control browser cache and site data for testing, performance optimization, and privacy:

# Cache Management
# Disable cache for current tab (useful for testing)
cremote disable-cache [--tab="<tab-id>"] [--timeout=5]

# Enable cache for current tab
cremote enable-cache [--tab="<tab-id>"] [--timeout=5]

# Clear browser cache for current tab
cremote clear-cache [--tab="<tab-id>"] [--timeout=5]

# Site Data Management
# Clear ALL site data (cookies, storage, cache, etc.)
cremote clear-all-site-data [--tab="<tab-id>"] [--timeout=10]

# Clear only cookies
cremote clear-cookies [--tab="<tab-id>"] [--timeout=5]

# Clear only web storage (localStorage, sessionStorage, IndexedDB, etc.)
cremote clear-storage [--tab="<tab-id>"] [--timeout=5]

Use Cases:

  • Testing: Disable cache to ensure fresh page loads without cached resources
  • Performance Testing: Clear cache to test cold load performance
  • Debugging: Clear cache to resolve cache-related issues
  • Development: Disable cache during development to see changes immediately
  • Authentication Testing: Clear cookies to test login/logout flows
  • Privacy Testing: Clear all site data to test clean state scenarios
  • Storage Testing: Clear web storage to test application state management

The --timeout parameter specifies how many seconds to wait for the operation to complete (default: 5 seconds). Use a longer timeout for comprehensive data clearing, as in the clear-all-site-data example above, which uses 10 seconds.
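
For cold-load tests it can be convenient to combine these commands. The sketch below clears all site data with the longer timeout and then disables the cache; fresh_state is a hypothetical helper name, and the ordering of the two steps is a design choice rather than a requirement stated by cremote.

```shell
# fresh_state [TAB_ID]
# Clear all site data, then disable the cache, so the next load starts cold.
# Uses the current tab when no tab ID is given.
fresh_state() {
    if [ -n "${1:-}" ]; then
        cremote clear-all-site-data --tab="$1" --timeout=10 &&
            cremote disable-cache --tab="$1"
    else
        cremote clear-all-site-data --timeout=10 &&
            cremote disable-cache
    fi
}
```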

Drag and Drop Operations

You can perform drag and drop operations for testing interactive web applications:

# Drag and Drop Between Elements
# Drag from source element to target element
cremote drag-and-drop --source=".draggable-item" --target=".drop-zone" [--tab="<tab-id>"] [--timeout=5]

# Drag and Drop to Specific Coordinates
# Drag from source element to specific x,y coordinates
cremote drag-and-drop-coordinates --source=".draggable-item" --x=300 --y=200 [--tab="<tab-id>"] [--timeout=5]

# Drag and Drop by Relative Offset
# Drag from source element by relative pixel offset
cremote drag-and-drop-offset --source=".draggable-item" --offset-x=100 --offset-y=50 [--tab="<tab-id>"] [--timeout=5]

Use Cases:

  • File Upload: Drag files to upload areas
  • Sortable Lists: Reorder items in sortable lists
  • Kanban Boards: Move cards between columns
  • Image Galleries: Rearrange images or media
  • Form Builders: Drag form elements to build layouts
  • Dashboard Widgets: Rearrange dashboard components
  • Game Testing: Test drag-based game mechanics
  • UI Component Testing: Test custom drag and drop components

Technical Details:

  • Enhanced HTML5 Support: Automatically injects JavaScript helpers to trigger proper HTML5 drag and drop events (dragstart, dragover, drop, dragend)
  • Smart Target Detection: For coordinate/offset drags, automatically detects and targets valid drop zones at destination coordinates
  • Hybrid Approach: Tries HTML5 drag events first, falls back to Chrome DevTools Protocol mouse events if needed
  • Intelligent Fallback: Automatically switches between element-to-element and coordinate-based approaches for optimal compatibility
  • Realistic Event Simulation: Performs drag operations with proper timing and intermediate mouse movements
  • Automatic Element Detection: Calculates element center points automatically for accurate targeting
  • Robust Error Handling: Supports timeout handling and graceful degradation for complex drag operations
  • Broad Compatibility: Designed to work with common drag and drop implementations (HTML5 Drag and Drop, jQuery UI, custom implementations)

The --timeout parameter specifies how many seconds to wait for the drag and drop operation to complete (default: 5 seconds).

Advanced Input Operations

Cremote provides sophisticated mouse and keyboard interactions for comprehensive testing of modern web applications:

Mouse Operations

# Right-click to open context menus
cremote right-click --selector=".file-item" [--tab="<tab-id>"] [--timeout=5]

# Double-click for file operations or text selection
cremote double-click --selector=".file-icon" [--tab="<tab-id>"] [--timeout=5]

# Middle-click to open links in new tabs
cremote middle-click --selector="a[href='/dashboard']" [--tab="<tab-id>"] [--timeout=5]

# Hover to trigger tooltips or dropdowns
cremote hover --selector=".tooltip-trigger" [--tab="<tab-id>"] [--timeout=5]

# Move mouse to specific coordinates without clicking
cremote mouse-move --x=400 --y=300 [--tab="<tab-id>"] [--timeout=5]

# Scroll with mouse wheel at specific coordinates
cremote scroll-wheel --x=400 --y=300 --delta-y=-120 [--delta-x=0] [--tab="<tab-id>"] [--timeout=5]

Keyboard Operations

# Send key combinations (Ctrl+C, Alt+Tab, Shift+Enter, etc.)
cremote key-combination --keys="Ctrl+C" [--tab="<tab-id>"] [--timeout=5]
cremote key-combination --keys="Alt+Tab" [--tab="<tab-id>"] [--timeout=5]
cremote key-combination --keys="Ctrl+Shift+T" [--tab="<tab-id>"] [--timeout=5]

# Send special keys (Enter, Escape, Tab, F1-F12, Arrow keys, etc.)
cremote special-key --key="Enter" [--tab="<tab-id>"] [--timeout=5]
cremote special-key --key="Escape" [--tab="<tab-id>"] [--timeout=5]
cremote special-key --key="ArrowUp" [--tab="<tab-id>"] [--timeout=5]
cremote special-key --key="F1" [--tab="<tab-id>"] [--timeout=5]

# Click with modifier keys (Ctrl+click, Shift+click for multi-selection)
cremote modifier-click --selector=".selectable-item" --modifiers="Ctrl" [--tab="<tab-id>"] [--timeout=5]
cremote modifier-click --selector=".list-item" --modifiers="Shift" [--tab="<tab-id>"] [--timeout=5]
cremote modifier-click --selector=".table-row" --modifiers="Ctrl+Shift" [--tab="<tab-id>"] [--timeout=5]

Advanced Use Cases:

  • Context Menu Testing: Right-click to test context menus and their functionality
  • Accessibility Testing: Full keyboard navigation support for accessibility compliance
  • Tooltip/Dropdown Testing: Hover interactions for UI elements that appear on mouse over
  • Multi-Selection Testing: Ctrl+click and Shift+click for testing selection interfaces
  • Copy/Paste Workflows: Test clipboard operations with Ctrl+A, Ctrl+C, Ctrl+V
  • Precise Mouse Control: Pixel-perfect mouse positioning and scrolling
  • Function Key Testing: Test application shortcuts using F1-F12 keys
  • Arrow Key Navigation: Test keyboard navigation in lists, tables, and forms

Technical Details:

  • Uses Chrome DevTools Protocol's Input domain for precise control
  • Supports all modifier keys: Ctrl, Alt, Shift, Meta/Cmd
  • Comprehensive key mapping for 60+ keys including letters, numbers, function keys, special keys
  • Proper modifier key sequencing (key down → action → key up)
  • Element positioning using content quads for pixel-perfect accuracy
  • Mouse button differentiation (Left, Right, Middle)
  • Realistic interaction patterns matching human behavior

Connecting to a Remote Daemon

By default, the client connects to a daemon running on localhost. To connect to a daemon running on a different host:

cremote open-tab --host="remote-host" --port=8989
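
Repeating --host and --port on every command is error-prone in longer scripts. The sketch below adds them in one place. rcremote and the CREMOTE_HOST and CREMOTE_PORT variables are hypothetical names chosen for this example (cremote is not documented to read them), and it assumes every command accepts --host and --port the way open-tab does above.

```shell
# rcremote COMMAND [ARGS...]
# Run a cremote command against the daemon named by CREMOTE_HOST and
# CREMOTE_PORT (hypothetical variables). Defaults: localhost and port 8989.
# Assumption: all commands accept --host and --port, as open-tab does.
rcremote() {
    local cmd="$1"
    shift
    cremote "$cmd" \
        --host="${CREMOTE_HOST:-localhost}" \
        --port="${CREMOTE_PORT:-8989}" \
        "$@"
}

# Example use (commented out so the snippet only defines the helper):
# CREMOTE_HOST="remote-host"
# rcremote open-tab
```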

Automation Example

Here's an example of how to use cremote in a shell script to automate a login process:

#!/bin/bash

# Make sure Chromium is running with remote debugging enabled
# chromium --remote-debugging-port=9222 --user-data-dir=/tmp/chromium-debug &

# Make sure the daemon is running
# cremotedaemon &

# Open a new tab
TAB_ID=$(cremote open-tab)

# Load the login page (using the current tab)
cremote load-url --url="https://example.com/login"

# Fill in the username and password (using the current tab)
cremote fill-form --selector="#username" --value="user123"
cremote fill-form --selector="#password" --value="password123"

# Check the 'Remember me' checkbox
cremote fill-form --selector="#remember" --value="true"

# Accept the terms and conditions
cremote fill-form --selector="#terms" --value="true"

# Either submit the form using the form selector (using the current tab)
cremote submit-form --selector="form#login"

# Or click the login button (using the current tab)
# cremote click-element --selector="#login-button"

# You can still specify a tab ID explicitly if needed
# cremote load-url --tab="$TAB_ID" --url="https://example.com/login"

# Wait for navigation to complete (using the current tab)
cremote wait-navigation --timeout=30

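# Execute JavaScript to check if login was successful
LOGIN_STATUS=$(cremote eval-js --code="document.querySelector('.welcome-message') !== null")
if [ "$LOGIN_STATUS" = "true" ]; then
    echo "Login successful!"
else
    # Stop here so the later steps do not run against the login page
    echo "Login failed" >&2
    cremote close-tab
    exit 1
fi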

# Example: Working with an iframe (e.g., payment form)
# Switch to iframe context
cremote switch-iframe --selector="iframe.payment-frame"

# Fill payment form inside iframe
cremote fill-form --selector="#card-number" --value="4111111111111111"
cremote fill-form --selector="#expiry-date" --value="12/25"
cremote click-element --selector="#pay-button"

# Switch back to main page
cremote switch-main

# Get the source code of the page after login (using the current tab)
cremote get-source

# Take a screenshot of the logged-in page
cremote screenshot --output="/tmp/login-success.png"

# Take a full-page screenshot for documentation
cremote screenshot --output="/tmp/full-page.png" --full-page

# Close the current tab
cremote close-tab

Troubleshooting

Daemon Not Running

If you see an error like "connection refused", make sure the daemon is running:

cremote status

If the daemon is not running, start it:

cremotedaemon

Connection Issues

If the daemon can't connect to Chromium, check the following:

  1. Make sure Chromium/Chrome is running with remote debugging enabled on port 9222
  2. Verify that Chromium was started with the correct flags: --remote-debugging-port=9222 --user-data-dir=/tmp/chromium-debug
  3. Check if you can access the Chromium DevTools Protocol by opening http://localhost:9222/json/version in your browser
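
The third check can also be scripted. The sketch below requests the same /json/version URL with curl and reports success through its exit status; check_cdp is a hypothetical helper name, and it assumes curl is installed on the machine running the check.

```shell
# check_cdp [PORT]
# Succeed if the browser's DevTools HTTP endpoint answers on localhost.
# --fail makes curl return non-zero on HTTP errors; --max-time bounds the wait.
check_cdp() {
    local port="${1:-9222}"
    curl --silent --fail --max-time 3 \
        "http://localhost:${port}/json/version" >/dev/null
}

# Example use (commented out so the snippet only defines the helper):
# check_cdp 9222 || echo "Chromium is not reachable on port 9222" >&2
```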

Tab Management

The daemon manages tab IDs for you, so you don't need to worry about tab persistence between commands. However:

  1. Tab IDs are only valid for the duration of the browser session
  2. If Chromium is restarted, you'll need to get new tab IDs
  3. Store the tab ID returned by open-tab in a variable for use in subsequent commands
  4. If a tab is closed by Chromium (not through the tool), you may need to run open-tab again
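
A script that stores a tab ID can verify it before reuse. The sketch below looks for the stored ID in the output of list-tabs and opens a new tab if it is missing. ensure_tab is a hypothetical helper name, and since the exact list-tabs output format is not documented here, the sketch assumes only that each tab's ID appears verbatim in it. The match is a plain substring search, so an ID that is a prefix of another ID would also match.

```shell
# ensure_tab [TAB_ID]
# Print a usable tab ID: TAB_ID if `cremote list-tabs` still mentions it,
# otherwise the ID of a newly opened tab.
# Assumption: list-tabs prints each tab's ID verbatim in its output.
ensure_tab() {
    local id="${1:-}"
    if [ -n "$id" ] && cremote list-tabs | grep -qF -- "$id"; then
        printf '%s\n' "$id"
    else
        cremote open-tab
    fi
}

# Example use (commented out so the snippet only defines the helper):
# TAB_ID=$(ensure_tab "$TAB_ID")
```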

License

MIT
