first commit

Josh at WLTechBlog 2025-08-12 10:00:49 -05:00
commit 70d9ed30de
1 changed files with 394 additions and 0 deletions

README.md Normal file

@ -0,0 +1,394 @@
# Chrome Remote Daemon (cremote)
A command-line utility for automating browser interactions using Chrome's remote debugging protocol. The tool uses a daemon-client architecture to maintain a persistent connection to the browser.
## Architecture
The tool consists of two main components:
1. **Daemon (`cremotedaemon`)**: A long-running process that connects to Chrome and manages browser state
2. **Client (`cremote`)**: A command-line client that sends commands to the daemon
This architecture provides several benefits:
- Persistent browser connection across multiple commands
- Reliable tab management
- No need to reconnect for each command
- Better performance
## Prerequisites
- Go 1.16 or higher
- A running instance of Chromium/Chrome with remote debugging enabled
### Starting Chromium with Remote Debugging
Before using this tool, you **must** start Chromium with remote debugging enabled on port 9222:
```bash
chromium --remote-debugging-port=9222 --user-data-dir=/tmp/chromium-debug
```
or for Chrome:
```bash
google-chrome --remote-debugging-port=9222 --user-data-dir=/tmp/chrome-debug
```
**Important**: The `--user-data-dir` flag is required to prevent conflicts with existing browser instances.
## Usage
### Starting the Daemon
First, start the daemon:
```bash
cremotedaemon
```
By default, the daemon listens on port 8989. You can specify a different port:
```bash
cremotedaemon --port=9090
```
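Scripts that launch the daemon in the background may need to wait until it accepts commands. The following is a minimal sketch, assuming that `cremote status` exits with a non-zero status while the daemon is unreachable (the exit-code behavior is not documented here). `wait_for_daemon` and its arguments are hypothetical names, not part of the tool.

```bash
#!/usr/bin/env bash
# Poll `cremote status` until it succeeds or the attempts run out.
# Assumption: `cremote status` exits non-zero while the daemon is unreachable.
# Usage: wait_for_daemon [max_attempts] [delay_seconds]
wait_for_daemon() {
    local attempts="${1:-10}" delay="${2:-1}" i
    for ((i = 1; i <= attempts; i++)); do
        if cremote status >/dev/null 2>&1; then
            return 0
        fi
        # Pause between attempts, but not after the last one.
        if ((i < attempts)); then
            sleep "$delay"
        fi
    done
    echo "daemon did not respond after $attempts attempts" >&2
    return 1
}

# Example (commented out so the block only defines the helper):
# cremotedaemon &
# wait_for_daemon 10 1 && cremote open-tab
```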
### Using the Client
Once the daemon is running, you can use the client to send commands:
```
cremote <command> [options]
```
### Commands
- `open-tab`: Open a new tab and return its ID
- `load-url`: Load a URL in a tab
- `fill-form`: Fill a form field with a value
- `upload-file`: Upload a file to a file input
- `submit-form`: Submit a form
- `get-source`: Get the source code of a page
- `get-element`: Get the HTML of an element
- `click-element`: Click on an element
- `close-tab`: Close a tab
- `wait-navigation`: Wait for a navigation event
- `eval-js`: Execute JavaScript code in a tab
- `screenshot`: Take a screenshot of a tab
- `switch-iframe`: Switch to iframe context for subsequent commands
- `switch-main`: Switch back to main page context
- `list-tabs`: List all open tabs
- `status`: Check if the daemon is running
### Current Tab Feature
The tool tracks the current tab, so you can omit the `--tab` flag to use the most recently used tab. This makes interactive use more convenient.
For example, after opening a tab:
```bash
# Open a tab
TAB_ID=$(cremote open-tab)
# Load a URL in the current tab (no need to specify --tab)
cremote load-url --url="https://example.com"
# Click an element in the current tab
cremote click-element --selector="a.button"
```
You can still specify a tab ID explicitly if you need to work with multiple tabs.
Run `cremote <command> -h` for more information on a specific command.
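When a script juggles several tabs, passing `--tab` explicitly avoids relying on whichever tab happens to be current. A minimal sketch, assuming `open-tab` prints only the tab ID on stdout (as the `TAB_ID=$(cremote open-tab)` pattern suggests); `open_tab_with_url` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Open a new tab, load a URL into it by explicit ID, and print the ID.
# Assumption: `open-tab` prints only the tab ID on stdout.
open_tab_with_url() {
    local url="$1" tab_id
    tab_id="$(cremote open-tab)" || return 1
    [ -n "$tab_id" ] || return 1
    cremote load-url --tab="$tab_id" --url="$url" >/dev/null || return 1
    printf '%s\n' "$tab_id"
}

# Example (commented out so the block only defines the helper):
# DOCS_TAB=$(open_tab_with_url "https://example.com/docs")
# LOGIN_TAB=$(open_tab_with_url "https://example.com/login")
# cremote get-source --tab="$DOCS_TAB"
```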
### Examples
#### Check daemon status
```bash
cremote status
```
#### Open a new tab
```bash
cremote open-tab [--timeout=5]
```
This will return a tab ID that you can use in subsequent commands. The `--timeout` parameter specifies how many seconds to wait for the tab to open (default: 5 seconds).
#### Load a URL in a tab
```bash
cremote load-url --tab="<tab-id>" --url="https://example.com" [--timeout=5]
```
The `--timeout` parameter specifies how many seconds to wait for the URL to load (default: 5 seconds).
#### Fill a form field
```bash
cremote fill-form --tab="<tab-id>" --selector="#username" --value="user123" [--selection-timeout=5] [--action-timeout=5]
```
The `--selection-timeout` parameter specifies how many seconds to wait for the element to appear in the DOM (default: 5 seconds).
The `--action-timeout` parameter specifies how many seconds to wait for the fill action to complete (default: 5 seconds).
#### Check/uncheck a checkbox or select a radio button
The same `fill-form` command can be used to check/uncheck checkboxes or select radio buttons:
```bash
# Check a checkbox
cremote fill-form --tab="<tab-id>" --selector="#agree" --value="true"
# Uncheck a checkbox
cremote fill-form --tab="<tab-id>" --selector="#agree" --value="false"
# Select a radio button
cremote fill-form --tab="<tab-id>" --selector="#option2" --value="true"
```
Accepted values for checking a checkbox or selecting a radio button: `true`, `1`, `yes`, `on`, `checked`.
Any other value will uncheck the checkbox or deselect the radio button.
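The value rule above can be mirrored in a script so that the state is always sent as an explicit `true` or `false`. This is only a sketch: `is_checked_value` and `set_checkbox` are hypothetical helpers, and exact lowercase matching is an assumption, since case sensitivity is not documented here.

```bash
#!/usr/bin/env bash
# Return 0 if the value would check/select the control, 1 otherwise.
# Mirrors the documented list; exact lowercase matching is an assumption.
is_checked_value() {
    case "$1" in
        true|1|yes|on|checked) return 0 ;;
        *) return 1 ;;
    esac
}

# Normalize any input to "true"/"false" before calling fill-form
# on the current tab.
set_checkbox() {
    local selector="$1" value="false"
    if is_checked_value "$2"; then
        value="true"
    fi
    cremote fill-form --selector="$selector" --value="$value"
}

# Example (commented out so the block only defines the helpers):
# set_checkbox "#agree" yes
```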
#### Upload a file
```bash
cremote upload-file --tab="<tab-id>" --selector="input[type=file]" --file="/path/to/file.jpg" [--selection-timeout=5] [--action-timeout=5]
```
The `--selection-timeout` parameter specifies how many seconds to wait for the element to appear in the DOM (default: 5 seconds).
The `--action-timeout` parameter specifies how many seconds to wait for the upload action to complete (default: 5 seconds).
#### Submit a form
```bash
cremote submit-form --tab="<tab-id>" --selector="form#login" [--selection-timeout=5] [--action-timeout=5]
```
The `--selection-timeout` parameter specifies how many seconds to wait for the element to appear in the DOM (default: 5 seconds).
The `--action-timeout` parameter specifies how many seconds to wait for the form submission to complete (default: 5 seconds).
#### Get the source code of a page
```bash
cremote get-source --tab="<tab-id>" [--timeout=5]
```
The `--timeout` parameter specifies how many seconds to wait for getting the page source (default: 5 seconds).
#### Get the HTML of an element
```bash
cremote get-element --tab="<tab-id>" --selector=".content" [--selection-timeout=5]
```
The `--selection-timeout` parameter specifies how many seconds to wait for the element to appear in the DOM (default: 5 seconds).
#### Click on an element
```bash
cremote click-element --tab="<tab-id>" --selector="button.submit" [--selection-timeout=5] [--action-timeout=5]
```
The `--selection-timeout` parameter specifies how many seconds to wait for the element to appear in the DOM (default: 5 seconds).
The `--action-timeout` parameter specifies how many seconds to wait for the click action to complete (default: 5 seconds).
#### Close a tab
```bash
cremote close-tab --tab="<tab-id>" [--timeout=5]
```
The `--timeout` parameter specifies how many seconds to wait for the tab to close (default: 5 seconds).
#### Wait for navigation to complete
```bash
cremote wait-navigation --tab="<tab-id>" [--timeout=5]
```
The `--timeout` parameter specifies how many seconds to wait for navigation to complete (default: 5 seconds).
**Note**: `wait-navigation` first checks whether a navigation is actually in progress and returns immediately if the page is already stable, so it does not wait out the timeout unnecessarily.
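A common pattern is to click something that triggers a page load and then wait for it. A minimal sketch, assuming both commands exit non-zero on failure or timeout; `click_and_wait` is a hypothetical helper name, and it operates on the current tab.

```bash
#!/usr/bin/env bash
# Click an element, then wait for any resulting navigation to settle.
# Assumption: both commands exit non-zero on failure or timeout.
# Usage: click_and_wait <selector> [timeout_seconds]
click_and_wait() {
    local selector="$1" timeout="${2:-5}"
    cremote click-element --selector="$selector" || return 1
    cremote wait-navigation --timeout="$timeout"
}

# Example (commented out so the block only defines the helper):
# click_and_wait "button.submit" 30
```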
#### Execute JavaScript code
```bash
cremote eval-js --tab="<tab-id>" --code="document.getElementById('myElement').innerHTML = 'Hello World!'" [--timeout=5]
```
The `--timeout` parameter specifies how many seconds to wait for the JavaScript execution to complete (default: 5 seconds).
This command allows you to execute arbitrary JavaScript code in a tab. Examples:
- Set element content: `--code="document.getElementById('tinymce').innerHTML='Foo!'"`
- Get element text: `--code="document.querySelector('.result').textContent"`
- Trigger events: `--code="document.getElementById('button').click()"`
- Manipulate DOM: `--code="document.body.style.backgroundColor = 'red'"`
The command handles both JavaScript expressions and statements:
- **Expressions** (return values): `document.title`, `2 + 3`, `element.textContent`
- **Statements** (assignments/actions): `document.title = 'New Title'`, `element.click()`
For statements, the command returns "undefined". For expressions, it returns the result as a string.
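Because expression results come back as strings, a script can branch on them with a plain string comparison. The sketch below wraps that in a hypothetical `element_exists` helper; it assumes `eval-js` prints `true` or `false` on stdout for a boolean expression, and it deliberately rejects selectors that contain single quotes or backslashes rather than trying to escape them.

```bash
#!/usr/bin/env bash
# Return 0 if an element matching the CSS selector exists in the current tab.
# Assumption: eval-js prints the expression result ("true"/"false") on stdout.
element_exists() {
    local selector="$1" result
    # Keep the generated JavaScript simple: refuse selectors that would
    # break out of the single-quoted string literal.
    case "$selector" in
        *"'"*|*"\\"*)
            echo "unsupported selector: $selector" >&2
            return 2
            ;;
    esac
    result="$(cremote eval-js --code="document.querySelector('$selector') !== null")" || return 1
    [ "$result" = "true" ]
}

# Example (commented out so the block only defines the helper):
# if element_exists ".welcome-message"; then echo "logged in"; fi
```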
#### Take a screenshot
```bash
cremote screenshot --tab="<tab-id>" --output="/path/to/screenshot.png" [--full-page] [--timeout=5]
```
The `--output` parameter specifies where to save the screenshot (PNG format).
The `--full-page` flag captures the entire page instead of just the viewport (default: viewport only).
The `--timeout` parameter specifies how many seconds to wait for the screenshot to complete (default: 5 seconds).
#### Working with iframes
To interact with content inside an iframe, you need to switch the context:
```bash
# Switch to iframe context
cremote switch-iframe --tab="<tab-id>" --selector="iframe#payment-form"
# Now all subsequent commands will operate within the iframe
cremote fill-form --selector="#card-number" --value="4111111111111111"
cremote fill-form --selector="#expiry" --value="12/25"
cremote click-element --selector="#submit-payment"
# Switch back to main page context
cremote switch-main --tab="<tab-id>"
# Now commands operate on the main page again
cremote get-element --selector=".success-message"
```
**Important Notes:**
- Once you switch to an iframe, all subsequent commands (fill-form, click-element, eval-js, etc.) operate within that iframe
- You must use `switch-main` to return to the main page context
- Each tab maintains its own iframe context independently
- Iframe context persists until explicitly switched back to main or the tab is closed
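Since the iframe context persists until `switch-main` is called, a failed command can leave a script stuck inside the iframe. One way to guard against that is a small wrapper that always switches back. This is a sketch, not part of the tool: `in_iframe` is a hypothetical name, it works on the current tab, and it assumes commands exit non-zero on failure.

```bash
#!/usr/bin/env bash
# Run a command inside an iframe and always switch back to the main page.
# Usage: in_iframe <iframe-selector> <command> [args...]
in_iframe() {
    local selector="$1" status=0
    shift
    cremote switch-iframe --selector="$selector" || return 1
    "$@" || status=$?
    # Restore the main context even if the wrapped command failed.
    cremote switch-main || return 1
    return "$status"
}

# Example (commented out so the block only defines the helper):
# in_iframe "iframe#payment-form" \
#     cremote fill-form --selector="#card-number" --value="4111111111111111"
```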
#### List all open tabs
```bash
cremote list-tabs
```
This will display all open tabs with their IDs and URLs. The current tab is marked with an asterisk (`*`).
### Connecting to a Remote Daemon
By default, the client connects to a daemon running on localhost. To connect to a daemon running on a different host:
```bash
cremote open-tab --host="remote-host" --port=8989
```
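Repeating `--host` and `--port` on every call gets tedious in longer scripts. The wrapper below is a sketch under two assumptions: that every subcommand accepts `--host` and `--port` after its name (only `open-tab` is shown doing so here), and that `CREMOTE_HOST` and `CREMOTE_PORT` are hypothetical variable names read by this wrapper only, not by `cremote` itself.

```bash
#!/usr/bin/env bash
# Forward any cremote command to a remote daemon.
# CREMOTE_HOST / CREMOTE_PORT are hypothetical names used only by this wrapper.
# Assumption: every subcommand accepts --host and --port after its name.
rcremote() {
    local cmd="$1"
    shift
    cremote "$cmd" \
        --host="${CREMOTE_HOST:-localhost}" \
        --port="${CREMOTE_PORT:-8989}" \
        "$@"
}

# Example (commented out so the block only defines the wrapper):
# CREMOTE_HOST="remote-host" rcremote open-tab
```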
## Automation Example
Here's an example of how to use cremote in a shell script to automate a login process:
```bash
#!/bin/bash
# Make sure Chromium is running with remote debugging enabled
# chromium --remote-debugging-port=9222 --user-data-dir=/tmp/chromium-debug &
# Make sure the daemon is running
# cremotedaemon &
# Open a new tab
TAB_ID=$(cremote open-tab)
# Load the login page (using the current tab)
cremote load-url --url="https://example.com/login"
# Fill in the username and password (using the current tab)
cremote fill-form --selector="#username" --value="user123"
cremote fill-form --selector="#password" --value="password123"
# Check the 'Remember me' checkbox
cremote fill-form --selector="#remember" --value="true"
# Accept the terms and conditions
cremote fill-form --selector="#terms" --value="true"
# Either submit the form using the form selector (using the current tab)
cremote submit-form --selector="form#login"
# Or click the login button (using the current tab)
# cremote click-element --selector="#login-button"
# You can still specify a tab ID explicitly if needed
# cremote load-url --tab="$TAB_ID" --url="https://example.com/login"
# Wait for navigation to complete (using the current tab)
cremote wait-navigation --timeout=30
# Execute JavaScript to check if login was successful
LOGIN_STATUS=$(cremote eval-js --code="document.querySelector('.welcome-message') !== null")
if [ "$LOGIN_STATUS" = "true" ]; then
    echo "Login successful!"
fi
# Example: Working with an iframe (e.g., payment form)
# Switch to iframe context
cremote switch-iframe --selector="iframe.payment-frame"
# Fill payment form inside iframe
cremote fill-form --selector="#card-number" --value="4111111111111111"
cremote fill-form --selector="#expiry-date" --value="12/25"
cremote click-element --selector="#pay-button"
# Switch back to main page
cremote switch-main
# Get the source code of the page after login (using the current tab)
cremote get-source
# Take a screenshot of the logged-in page
cremote screenshot --output="/tmp/login-success.png"
# Take a full-page screenshot for documentation
cremote screenshot --output="/tmp/full-page.png" --full-page
# Close the current tab
cremote close-tab
```
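The script above does not check whether each step succeeded. A minimal sketch of one way to add that, assuming `cremote` commands exit non-zero when they fail or time out (not confirmed here); `run_step` is a hypothetical helper that reports the failing step and makes a best-effort attempt to close the current tab.

```bash
#!/usr/bin/env bash
# Run one automation step; on failure, report it and close the current tab.
# Assumption: cremote commands exit non-zero when they fail or time out.
# Usage: run_step <description> <command> [args...]
run_step() {
    local description="$1" status=0
    shift
    "$@" || status=$?
    if [ "$status" -ne 0 ]; then
        echo "step failed: $description (exit $status)" >&2
        # Best-effort cleanup; ignore errors from close-tab itself.
        cremote close-tab >/dev/null 2>&1 || true
        return "$status"
    fi
}

# Example (commented out so the block only defines the helper):
# run_step "load login page" cremote load-url --url="https://example.com/login" || exit 1
# run_step "submit form" cremote submit-form --selector="form#login" || exit 1
```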
## Troubleshooting
### Daemon Not Running
If you see an error like "connection refused", make sure the daemon is running:
```bash
cremote status
```
If the daemon is not running, start it:
```bash
cremotedaemon
```
### Connection Issues
If the daemon can't connect to Chromium, check the following:
1. Make sure Chromium/Chrome is running with remote debugging enabled on port 9222
2. Verify that Chromium was started with the correct flags: `--remote-debugging-port=9222 --user-data-dir=/tmp/chromium-debug`
3. Check if you can access the Chromium DevTools Protocol by opening `http://localhost:9222/json/version` in your browser
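The third check can also be done from a terminal or script. The sketch below uses `curl` (assumed to be installed) against the same `/json/version` URL; `devtools_reachable` is a hypothetical helper name, and the two-second limit is an arbitrary choice.

```bash
#!/usr/bin/env bash
# Return 0 if the DevTools endpoint answers on the given host and port.
# Requires curl. Host and port default to the values used in this README.
# Usage: devtools_reachable [host] [port]
devtools_reachable() {
    local host="${1:-localhost}" port="${2:-9222}"
    curl --silent --fail --max-time 2 \
        "http://${host}:${port}/json/version" >/dev/null
}

# Example (commented out so the block only defines the helper):
# devtools_reachable || echo "Chromium is not listening on port 9222" >&2
```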
### Tab Management
The daemon manages tab IDs for you, so you don't need to worry about tab persistence between commands. However:
1. Tab IDs are only valid for the duration of the browser session
2. If Chromium is restarted, you'll need to get new tab IDs
3. Store the tab ID returned by `open-tab` in a variable for use in subsequent commands
4. If a tab is closed by Chromium (not through the tool), you may need to run `open-tab` again
## License
MIT