Josh at WLTechBlog e1f2c45c3a mcp iframe updates
2025-08-16 07:13:33 -05:00

Chrome Remote Daemon (cremote)

A command-line utility for automating browser interactions using Chrome's remote debugging protocol (the Chrome DevTools Protocol). The tool uses a daemon-client architecture to maintain a persistent connection to the browser.

Architecture

The tool consists of two main components:

  1. Daemon (cremotedaemon): A long-running process that connects to Chrome and manages browser state
  2. Client (cremote): A command-line client that sends commands to the daemon

This architecture provides several benefits:

  • Persistent browser connection across multiple commands
  • Reliable tab management
  • No need to reconnect for each command, which reduces per-command overhead

MCP Server

Cremote includes a Model Context Protocol (MCP) server that provides a structured API for LLMs and AI agents. As an alternative to the CLI commands, the MCP server offers:

  • State Management: Automatic tracking of tabs, history, and iframe context
  • Intelligent Abstractions: High-level tools that combine multiple operations
  • Better Error Handling: Rich error context for debugging
  • Automatic Screenshots: Built-in screenshot capture for documentation

See the MCP Server Documentation for setup and usage instructions.

Prerequisites

  • Go 1.16 or higher
  • A running instance of Chromium/Chrome with remote debugging enabled

Starting Chromium with Remote Debugging

Before using this tool, you must start Chromium with remote debugging enabled on port 9222:

chromium --remote-debugging-port=9222 --user-data-dir=/tmp/chromium-debug

or for Chrome:

google-chrome --remote-debugging-port=9222 --user-data-dir=/tmp/chrome-debug

Important: The --user-data-dir flag is required to prevent conflicts with existing browser instances.

Usage

Starting the Daemon

First, start the daemon:

cremotedaemon

By default, the daemon listens on port 8989. You can specify a different port:

cremotedaemon --port=9090

Using the Client

Once the daemon is running, you can use the client to send commands:

cremote <command> [options]

Commands

  • open-tab: Open a new tab and return its ID
  • load-url: Load a URL in a tab
  • fill-form: Fill a form field with a value
  • upload-file: Upload a file to a file input
  • submit-form: Submit a form
  • get-source: Get the source code of a page
  • get-element: Get the HTML of an element
  • click-element: Click on an element
  • close-tab: Close a tab
  • wait-navigation: Wait for a navigation event
  • eval-js: Execute JavaScript code in a tab
  • screenshot: Take a screenshot of a tab and save it as a PNG file
  • switch-iframe: Switch to iframe context for subsequent commands
  • switch-main: Switch back to main page context
  • list-tabs: List all open tabs
  • status: Check if the daemon is running

Current Tab Feature

The tool tracks the current tab, so you can omit the --tab flag to use the most recently used tab. This makes interactive use more convenient.

For example, after opening a tab:

# Open a tab
TAB_ID=$(cremote open-tab)

# Load a URL in the current tab (no need to specify --tab)
cremote load-url --url="https://example.com"

# Click an element in the current tab
cremote click-element --selector="a.button"

You can still specify a tab ID explicitly if you need to work with multiple tabs.

Run cremote <command> -h for more information on a specific command.

Examples

Check Daemon Status

cremote status

Open a new tab

cremote open-tab [--timeout=5]

This returns a tab ID that you can use in subsequent commands. The --timeout parameter specifies how many seconds to wait for the tab to open (default: 5 seconds).

Load a URL in a tab

cremote load-url --tab="<tab-id>" --url="https://example.com" [--timeout=5]

The --timeout parameter specifies how many seconds to wait for the URL to load (default: 5 seconds).

Fill a form field

cremote fill-form --tab="<tab-id>" --selector="#username" --value="user123" [--selection-timeout=5] [--action-timeout=5]

The --selection-timeout parameter specifies how many seconds to wait for the element to appear in the DOM (default: 5 seconds). The --action-timeout parameter specifies how many seconds to wait for the fill action to complete (default: 5 seconds).

Check/uncheck a checkbox or select a radio button

The same fill-form command can be used to check/uncheck checkboxes or select radio buttons:

# Check a checkbox
cremote fill-form --tab="<tab-id>" --selector="#agree" --value="true"

# Uncheck a checkbox
cremote fill-form --tab="<tab-id>" --selector="#agree" --value="false"

# Select a radio button
cremote fill-form --tab="<tab-id>" --selector="#option2" --value="true"

Accepted values for checking a checkbox or selecting a radio button: true, 1, yes, on, checked. Any other value will uncheck the checkbox or deselect the radio button.
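
The rule above can be sketched as a small shell helper, which is handy for predicting what a value will do before passing it to fill-form. The name is_check_value is hypothetical, and the exact lowercase matching is an assumption; cremote itself may treat case or whitespace differently.

```shell
# Hypothetical helper mirroring the documented rule: only these five
# values check a checkbox or select a radio button; anything else unchecks.
# Assumption: matching is exact and lowercase.
is_check_value() {
    case "$1" in
        true|1|yes|on|checked) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, is_check_value on succeeds, while is_check_value off fails, so passing --value="off" would uncheck the box.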

Upload a file

cremote upload-file --tab="<tab-id>" --selector="input[type=file]" --file="/path/to/file.jpg" [--selection-timeout=5] [--action-timeout=5]

The --selection-timeout parameter specifies how many seconds to wait for the element to appear in the DOM (default: 5 seconds). The --action-timeout parameter specifies how many seconds to wait for the upload action to complete (default: 5 seconds).

Submit a form

cremote submit-form --tab="<tab-id>" --selector="form#login" [--selection-timeout=5] [--action-timeout=5]

The --selection-timeout parameter specifies how many seconds to wait for the element to appear in the DOM (default: 5 seconds). The --action-timeout parameter specifies how many seconds to wait for the form submission to complete (default: 5 seconds).

Get the source code of a page

cremote get-source --tab="<tab-id>" [--timeout=5]

The --timeout parameter specifies how many seconds to wait for getting the page source (default: 5 seconds).

Get the HTML of an element

cremote get-element --tab="<tab-id>" --selector=".content" [--selection-timeout=5]

The --selection-timeout parameter specifies how many seconds to wait for the element to appear in the DOM (default: 5 seconds).

Click on an element

cremote click-element --tab="<tab-id>" --selector="button.submit" [--selection-timeout=5] [--action-timeout=5]

The --selection-timeout parameter specifies how many seconds to wait for the element to appear in the DOM (default: 5 seconds). The --action-timeout parameter specifies how many seconds to wait for the click action to complete (default: 5 seconds).

Close a tab

cremote close-tab --tab="<tab-id>" [--timeout=5]

The --timeout parameter specifies how many seconds to wait for the tab to close (default: 5 seconds).

Wait for navigation to complete

cremote wait-navigation --tab="<tab-id>" [--timeout=5]

The --timeout parameter specifies how many seconds to wait for navigation to complete (default: 5 seconds).

Note: wait-navigation first checks whether a navigation is actually in progress. If the page is already stable, it returns immediately instead of waiting for the timeout.

Execute JavaScript code

cremote eval-js --tab="<tab-id>" --code="document.getElementById('myElement').innerHTML = 'Hello World!'" [--timeout=5]

The --timeout parameter specifies how many seconds to wait for the JavaScript execution to complete (default: 5 seconds).

This command allows you to execute arbitrary JavaScript code in a tab. Examples:

  • Set element content: --code="document.getElementById('tinymce').innerHTML='Foo!'"
  • Get element text: --code="document.querySelector('.result').textContent"
  • Trigger events: --code="document.getElementById('button').click()"
  • Manipulate DOM: --code="document.body.style.backgroundColor = 'red'"

The command handles both JavaScript expressions and statements:

  • Expressions (return values): document.title, 2 + 3, element.textContent
  • Statements (assignments/actions): document.title = 'New Title', element.click()

For statements, the command returns "undefined". For expressions, it returns the result as a string.
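
Because results come back as strings, a script can only compare them textually. The following is a minimal sketch of a boolean check, assuming eval-js prints the bare string true for a true expression (the same assumption the automation example in this README makes). The helper name js_true is hypothetical.

```shell
# Hypothetical helper: succeed when a JavaScript expression evaluates to true.
# Assumption: cremote prints the result as the plain string "true".
# Statements print "undefined", so pass an expression, not an assignment.
js_true() {
    [ "$(cremote eval-js --code="$1")" = "true" ]
}
```

A script could then write: if js_true "document.querySelector('.result') !== null"; then echo "found"; fi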

Take a screenshot

cremote screenshot --tab="<tab-id>" --output="/path/to/screenshot.png" [--full-page] [--timeout=5]

The --output parameter specifies where to save the screenshot (PNG format). The --full-page flag captures the entire page instead of just the viewport (default: viewport only). The --timeout parameter specifies how many seconds to wait for the screenshot to complete (default: 5 seconds).

Working with iframes

To interact with content inside an iframe, you need to switch the context:

# Switch to iframe context
cremote switch-iframe --tab="<tab-id>" --selector="iframe#payment-form"

# Now all subsequent commands will operate within the iframe
cremote fill-form --selector="#card-number" --value="4111111111111111"
cremote fill-form --selector="#expiry" --value="12/25"
cremote click-element --selector="#submit-payment"

# Switch back to main page context
cremote switch-main --tab="<tab-id>"

# Now commands operate on the main page again
cremote get-element --selector=".success-message"

Important Notes:

  • Once you switch to an iframe, all subsequent commands (fill-form, click-element, eval-js, etc.) operate within that iframe
  • You must use switch-main to return to the main page context
  • Each tab maintains its own iframe context independently
  • Iframe context persists until explicitly switched back to main or the tab is closed
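
Since the iframe context persists until you switch back, a failed command can leave a script stuck inside the iframe. One way to guard against that is a wrapper that always restores the main context. This is a sketch only; in_iframe is a hypothetical helper, not part of cremote, and it relies on the current tab rather than passing --tab.

```shell
# Hypothetical wrapper: run one command inside an iframe, then always
# switch back to the main page, even if the command fails.
in_iframe() {
    in_iframe_selector=$1
    shift
    cremote switch-iframe --selector="$in_iframe_selector" || return 1
    # Capture the command's exit status without aborting under `set -e`.
    if "$@"; then
        in_iframe_status=0
    else
        in_iframe_status=$?
    fi
    cremote switch-main
    return "$in_iframe_status"
}
```

Usage: in_iframe "iframe#payment-form" cremote fill-form --selector="#card-number" --value="4111111111111111"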

List all open tabs

cremote list-tabs

This displays all open tabs with their IDs and URLs. The current tab is marked with an asterisk (*).

Connecting to a Remote Daemon

By default, the client connects to a daemon running on localhost. To connect to a daemon running on a different host:

cremote open-tab --host="remote-host" --port=8989
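
If every command in a script targets the same remote daemon, repeating the flags gets tedious. A small wrapper can append them, as sketched below. CREMOTE_HOST and CREMOTE_PORT are illustrative variable names chosen for this example; cremote is not documented to read them itself.

```shell
# Hypothetical wrapper: forward every call to a daemon on another host.
# CREMOTE_HOST and CREMOTE_PORT are names made up for this sketch.
rcremote() {
    cremote "$@" --host="${CREMOTE_HOST:-localhost}" --port="${CREMOTE_PORT:-8989}"
}
```

Usage: CREMOTE_HOST=remote-host rcremote open-tab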

Automation Example

Here's an example of how to use cremote in a shell script to automate a login process:

#!/bin/bash

# Make sure Chromium is running with remote debugging enabled
# chromium --remote-debugging-port=9222 --user-data-dir=/tmp/chromium-debug &

# Make sure the daemon is running
# cremotedaemon &

# Open a new tab
TAB_ID=$(cremote open-tab)

# Load the login page (using the current tab)
cremote load-url --url="https://example.com/login"

# Fill in the username and password (using the current tab)
cremote fill-form --selector="#username" --value="user123"
cremote fill-form --selector="#password" --value="password123"

# Check the 'Remember me' checkbox
cremote fill-form --selector="#remember" --value="true"

# Accept the terms and conditions
cremote fill-form --selector="#terms" --value="true"

# Either submit the form using the form selector (using the current tab)
cremote submit-form --selector="form#login"

# Or click the login button (using the current tab)
# cremote click-element --selector="#login-button"

# You can still specify a tab ID explicitly if needed
# cremote load-url --tab="$TAB_ID" --url="https://example.com/login"

# Wait for navigation to complete (using the current tab)
cremote wait-navigation --timeout=30

# Execute JavaScript to check if login was successful
LOGIN_STATUS=$(cremote eval-js --code="document.querySelector('.welcome-message') !== null")
if [ "$LOGIN_STATUS" = "true" ]; then
    echo "Login successful!"
fi

# Example: Working with an iframe (e.g., payment form)
# Switch to iframe context
cremote switch-iframe --selector="iframe.payment-frame"

# Fill payment form inside iframe
cremote fill-form --selector="#card-number" --value="4111111111111111"
cremote fill-form --selector="#expiry-date" --value="12/25"
cremote click-element --selector="#pay-button"

# Switch back to main page
cremote switch-main

# Get the source code of the page after login (using the current tab)
cremote get-source

# Take a screenshot of the logged-in page
cremote screenshot --output="/tmp/login-success.png"

# Take a full-page screenshot for documentation
cremote screenshot --output="/tmp/full-page.png" --full-page

# Close the current tab
cremote close-tab

Troubleshooting

Daemon Not Running

If you see an error like "connection refused", make sure the daemon is running:

cremote status

If the daemon is not running, start it:

cremotedaemon
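
When a script starts the daemon in the background, the first client command can race it. A polling loop like the following sketch avoids that. It assumes cremote status exits with a non-zero code while the daemon is unreachable, which this README does not confirm; wait_for_daemon is a hypothetical helper.

```shell
# Hypothetical helper: poll `cremote status` once per second until it
# succeeds or the given number of attempts (default 10) runs out.
# Assumption: `cremote status` exits non-zero when the daemon is down.
wait_for_daemon() {
    wfd_attempts=${1:-10}
    while [ "$wfd_attempts" -gt 0 ]; do
        if cremote status >/dev/null 2>&1; then
            return 0
        fi
        wfd_attempts=$((wfd_attempts - 1))
        sleep 1
    done
    return 1
}
```

Usage: cremotedaemon & wait_for_daemon 15 || { echo "daemon did not start" >&2; exit 1; }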

Connection Issues

If the daemon can't connect to Chromium, check the following:

  1. Make sure Chromium/Chrome is running with remote debugging enabled on port 9222
  2. Verify that Chromium was started with the correct flags: --remote-debugging-port=9222 --user-data-dir=/tmp/chromium-debug
  3. Check that you can reach the Chrome DevTools Protocol endpoint by opening http://localhost:9222/json/version in your browser
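
The last check can also be run from a script. This sketch assumes curl is installed; debug_endpoint_ok is a hypothetical helper, and the port defaults to the 9222 used throughout this README.

```shell
# Hypothetical helper: succeed if the browser's remote debugging endpoint
# answers on the given port (default 9222).
# Assumption: curl is available on the machine running the check.
debug_endpoint_ok() {
    curl -sf "http://localhost:${1:-9222}/json/version" >/dev/null
}
```

Usage: debug_endpoint_ok || echo "Chromium is not reachable on port 9222" >&2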

Tab Management

The daemon manages tab IDs for you, so you don't need to worry about tab persistence between commands. However, keep the following in mind:

  1. Tab IDs are only valid for the duration of the browser session
  2. If Chromium is restarted, you'll need to get new tab IDs
  3. Store the tab ID returned by open-tab in a variable for use in subsequent commands
  4. If a tab is closed by Chromium (not through the tool), you may need to run open-tab again
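
To make the points above concrete, a script can refuse to continue when open-tab does not produce an ID. The helper below is a sketch; open_tab_or_fail is a hypothetical name, and it assumes open-tab prints only the tab ID on standard output.

```shell
# Hypothetical helper: open a tab and fail loudly if no ID comes back.
# Assumption: `cremote open-tab` prints just the tab ID on stdout.
open_tab_or_fail() {
    otf_tab_id=$(cremote open-tab) || return 1
    if [ -z "$otf_tab_id" ]; then
        echo "open-tab returned no tab ID" >&2
        return 1
    fi
    printf '%s\n' "$otf_tab_id"
}
```

Usage: TAB_ID=$(open_tab_or_fail) || exit 1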

License

MIT
