LLM Agent Guide: Using cremote for Web Application Testing

This document provides comprehensive guidance for LLM coding agents on how to use cremote (Chrome Remote Daemon) as a testing tool for web applications. cremote enables automated browser testing of public web interfaces through programmatic control of Chrome browser tabs.

What is cremote?

cremote is a browser automation tool that allows you to:

  • Control Chromium browser tabs programmatically
  • Fill forms and interact with web elements
  • Navigate web pages and wait for content to load
  • Extract page content and element HTML
  • Test user workflows end-to-end

It uses a daemon-client architecture where a background daemon maintains persistent connections to Chromium, and a command-line client sends testing commands.

Prerequisites for Testing

Before using cremote for web application testing, ensure:

  1. Check whether everything is already running; if it is, skip the startup steps below:

    cremote status
    

    Note: cremote status always exits with code 0, whether the daemon is running or not. Check the output message to determine status.

  2. Chromium/Chrome is running with remote debugging enabled:

    # Use a dedicated user data directory for the debug instance
    chromium --remote-debugging-port=9222 --user-data-dir=/tmp/chromium-debug &
    # or
    google-chrome --remote-debugging-port=9222 --user-data-dir=/tmp/chrome-debug &
    
    # Alternative: use a fresh temporary directory created by mktemp
    chromium --remote-debugging-port=9222 --user-data-dir="$(mktemp -d)" &
    

    Important: The --user-data-dir flag is required. Without it, if Chromium is already running, the new command hands off to that existing instance instead of starting a new debug-enabled one.

  3. cremote daemon is running:

    cremotedaemon &
    

    Note: The daemon will automatically check if Chromium is running and provide helpful error messages if:

    • Chromium is not running on port 9222
    • Something else is using port 9222 (not Chromium DevTools)
    • Chromium is running but not accepting connections
  4. Verify connectivity:

    cremote status
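
Because cremote status exits with code 0 either way, a script has to look at the text it prints rather than the exit code. The sketch below wraps that check in a function. The wording of the status message is not specified in this guide, so the patterns "not running" and "running" are assumptions, as is the helper name daemon_is_running; adjust them to the real output.

```shell
# Sketch: decide from the status TEXT (not the exit code) whether the daemon is up.
# ASSUMPTION: the message contains "not running" when the daemon is down and
# "running" when it is up. Adjust both patterns to the real output.
daemon_is_running() {
    # Lower-case the output so the match is case-insensitive.
    status_text=$(cremote status 2>&1 | tr '[:upper:]' '[:lower:]')
    case "$status_text" in
        *"not running"*) return 1 ;;
        *running*)       return 0 ;;
        *)               return 1 ;;  # unknown message: treat as not running
    esac
}

# Usage:
#   if daemon_is_running; then echo "daemon already up"; else cremotedaemon & fi
```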
    

Core Testing Workflow

1. Basic Test Session Setup

Every testing session follows this pattern:

# 1. Open a new browser tab for testing
TAB_ID=$(cremote open-tab)

# 2. Navigate to the application under test
cremote load-url --url="https://your-app.com"

# 3. Perform test actions (forms, clicks, navigation)
# ... testing commands ...

# 4. Clean up (optional)
cremote close-tab --tab="$TAB_ID"
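
If a test step fails partway through, the cleanup in step 4 may never run. One possible way to guarantee it is to wrap the pattern above in a function that closes the tab whatever the outcome. The helper name run_in_fresh_tab is hypothetical; it relies only on open-tab printing the tab ID and on close-tab --tab, both used above.

```shell
# Sketch: open a tab, run one command against it, and always close the tab.
# run_in_fresh_tab is a hypothetical helper name, not a cremote command.
run_in_fresh_tab() {
    fresh_tab_id=$(cremote open-tab) || return 1
    "$@"                      # the test body; it runs against the current tab
    fresh_tab_rc=$?
    cremote close-tab --tab="$fresh_tab_id"
    return "$fresh_tab_rc"    # report the test body's result, not close-tab's
}

# Usage:
#   run_in_fresh_tab ./test-login.sh
```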

2. Current Tab Feature

cremote automatically tracks the "current tab" - the most recently used tab. This means you can omit the --tab flag in most commands:

# Open tab (becomes current tab)
cremote open-tab

# All subsequent commands use current tab automatically
cremote load-url --url="https://example.com"
cremote fill-form --selector="#username" --value="testuser"
cremote click-element --selector="#login-btn"

Essential Testing Commands

Navigation and Page Loading

# Load a specific URL
cremote load-url --url="https://your-app.com/login"

# Wait for navigation to complete (useful after form submissions)
cremote wait-navigation --timeout=10

# Note: wait-navigation does not block for the full timeout when no navigation is in progress

# Get current page source for verification
cremote get-source

Form Testing

# Fill text inputs
cremote fill-form --selector="#username" --value="testuser"
cremote fill-form --selector="#password" --value="testpass123"

# Handle checkboxes (check)
cremote fill-form --selector="#remember-me" --value="true"
cremote fill-form --selector="#terms-agreed" --value="checked"

# Handle checkboxes (uncheck)
cremote fill-form --selector="#newsletter" --value="false"

# Handle radio buttons
cremote fill-form --selector="#payment-credit" --value="true"

# Submit forms
cremote submit-form --selector="form#login-form"

Checkbox/Radio Button Values:

  • To check/select: true, 1, yes, on, checked
  • To uncheck/deselect: false, 0, no, off, or any other value
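
The rule above is easy to get wrong in a test script (for example, "unchecked" falls under "any other value", as does "enabled"). The sketch below restates it as a small function so a script can compute the state it should expect. It is an illustration of the documented rule, not cremote's own code; exact, lower-case matching is an assumption, and checkbox_state_for is a hypothetical name.

```shell
# Sketch of the value rule listed above: five values check the box,
# everything else unchecks it.
# ASSUMPTION: matching is exact and lower-case; cremote's own parsing may differ.
checkbox_state_for() {
    case "$1" in
        true|1|yes|on|checked) echo "checked" ;;
        *)                     echo "unchecked" ;;
    esac
}

# Usage:
#   expected=$(checkbox_state_for "$VALUE")
```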

File Upload Testing

# Upload files to file inputs (automatically transfers to daemon container first)
cremote upload-file --selector="input[type=file]" --file="/path/to/test-file.pdf"
cremote upload-file --selector="#profile-photo" --file="/home/user/test-image.jpg"

# The command automatically:
# 1. Transfers the file from local machine to daemon container
# 2. Uploads the file to the web form input element

Element Interaction

# Click buttons, links, or any clickable elements
cremote click-element --selector="button.submit"
cremote click-element --selector="a[href='/dashboard']"
cremote click-element --selector="#save-changes"

# Get HTML content of specific elements for verification
cremote get-element --selector=".error-message"
cremote get-element --selector="#user-profile"

JavaScript Execution

# Execute JavaScript code directly in the page (default 5 second timeout)
cremote eval-js --code="document.getElementById('tinymce').innerHTML='Foo!'"

# Get values from the page
cremote eval-js --code="document.querySelector('#result').textContent"

# Manipulate page elements
cremote eval-js --code="document.body.style.backgroundColor = 'red'"

# Trigger JavaScript events
cremote eval-js --code="document.getElementById('submit-btn').click()"

# Count matching elements (complex objects are returned as a JSON string)
cremote eval-js --code="document.querySelectorAll('.item').length"

# Set form values programmatically (statement - returns "undefined")
cremote eval-js --code="document.getElementById('hidden-field').value = 'secret-value'"

# Get form values (expression - returns the value)
cremote eval-js --code="document.getElementById('hidden-field').value"

# Use custom timeout for long-running JavaScript
cremote eval-js --code="await new Promise(resolve => setTimeout(resolve, 8000))" --timeout=10
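
The value-returning calls above can be turned into pass/fail checks with a small wrapper. This is a minimal sketch that assumes eval-js prints the result as plain text on stdout; assert_js_equals is a hypothetical helper name.

```shell
# Sketch: compare the text printed by eval-js with an expected value.
# ASSUMPTION: eval-js prints the result as plain text on stdout.
assert_js_equals() {
    js_code=$1
    js_expected=$2
    js_actual=$(cremote eval-js --code="$js_code")
    if [ "$js_actual" = "$js_expected" ]; then
        echo "✓ $js_code = $js_expected"
    else
        echo "✗ $js_code: expected '$js_expected', got '$js_actual'"
        return 1
    fi
}

# Usage:
#   assert_js_equals "document.querySelectorAll('.item').length" "3" || exit 1
```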

Use Cases for JavaScript Execution:

  • Interact with rich text editors (TinyMCE, CKEditor, etc.)
  • Trigger JavaScript events that aren't accessible via normal clicks
  • Extract computed values or complex data structures
  • Manipulate hidden form fields
  • Test JavaScript functionality directly
  • Set up test data or page state

Expression vs Statement Handling:

  • Expressions return values: document.title, element.textContent, array.length
  • Statements perform actions and return "undefined": element.click(), variable = value
  • Both types are supported seamlessly in the same command

Screenshots

# Take a viewport screenshot (default)
cremote screenshot --output="/tmp/page-screenshot.png"

# Take a full page screenshot
cremote screenshot --output="/tmp/full-page.png" --full-page

# Screenshot with custom timeout
cremote screenshot --output="/tmp/slow-page.png" --timeout=10

# Screenshot of specific tab
cremote screenshot --tab="$TAB_ID" --output="/tmp/tab-screenshot.png"

Working with Iframes

# Switch to iframe context
cremote switch-iframe --selector="iframe#payment-form"

# All subsequent commands now operate within the iframe
cremote fill-form --selector="#card-number" --value="4111111111111111"
cremote fill-form --selector="#cvv" --value="123"
cremote click-element --selector="#submit-payment"

# Switch back to main page context
cremote switch-main

# Commands now operate on the main page again
cremote get-element --selector=".payment-success"

Common Iframe Scenarios:

  • Payment Forms: Credit card processing iframes
  • Embedded Widgets: Social media, maps, chat widgets
  • Third-party Content: Ads, analytics, external forms
  • Security Contexts: Sandboxed content, cross-origin frames

Important Notes:

  • Iframe context is maintained per tab
  • All commands (fill-form, click-element, eval-js, etc.) work within iframe context
  • Must explicitly switch back to main context with switch-main
  • Iframe context persists until switched back or tab is closed
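
Since the iframe context persists until switch-main is called, a failed step inside the iframe can leave later commands running in the wrong context. The sketch below shows one way to guard against that: a wrapper that switches in, runs a single command, and always switches back. The name in_iframe is hypothetical; only switch-iframe and switch-main come from cremote.

```shell
# Sketch: run one command inside an iframe and always return to the main page.
# in_iframe is a hypothetical helper name, not a cremote command.
in_iframe() {
    iframe_selector=$1
    shift
    cremote switch-iframe --selector="$iframe_selector" || return 1
    "$@"
    iframe_rc=$?
    cremote switch-main       # runs even when the command above failed
    return "$iframe_rc"
}

# Usage:
#   in_iframe "iframe#payment-form" cremote fill-form --selector="#cvv" --value="123"
```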

Tab Management

# List all open tabs (current tab marked with *)
cremote list-tabs

# Open multiple tabs for complex testing
TAB1=$(cremote open-tab)
TAB2=$(cremote open-tab)

# Work with specific tabs
cremote load-url --tab="$TAB1" --url="https://app.com/admin"
cremote load-url --tab="$TAB2" --url="https://app.com/user"

# Close specific tabs
cremote close-tab --tab="$TAB1"

Timeout Configuration

Many commands support timeout parameters for robust testing:

# Wait up to 10 seconds for operation to complete
cremote fill-form --selector="#slow-loading-field" --value="test" --timeout=10

# Wait for elements that load dynamically
cremote click-element --selector=".ajax-button" --timeout=15

# Get elements that may take time to render
cremote get-element --selector=".dynamic-content" --timeout=20

Timeout Parameter:

  • --timeout: Seconds to wait for operation to complete (default: 5 seconds)

Smart Navigation Waiting: The wait-navigation command intelligently detects if navigation is actually happening:

  • Returns immediately if the page is already stable and loaded
  • Monitors for 2 seconds to detect if navigation starts
  • Only waits for the full timeout if navigation is actually in progress
  • This prevents hanging when no navigation occurs

Common Testing Patterns

1. Login Flow Testing

#!/bin/bash
# Test user login functionality

# Setup
cremote open-tab
cremote load-url --url="https://myapp.com/login"

# Test valid login
cremote fill-form --selector="#email" --value="user@example.com"
cremote fill-form --selector="#password" --value="validpassword"
cremote click-element --selector="#login-button"

# Wait for redirect and verify success
cremote wait-navigation --timeout=10
PAGE_SOURCE=$(cremote get-source)

if echo "$PAGE_SOURCE" | grep -q "Welcome"; then
    echo "✓ Login successful"
else
    echo "✗ Login failed"
    exit 1
fi

2. Form Validation Testing

#!/bin/bash
# Test form validation

cremote open-tab
cremote load-url --url="https://myapp.com/register"

# Test empty form submission
cremote click-element --selector="#submit-btn"

# Check for validation errors
ERROR_MSG=$(cremote get-element --selector=".error-message" --timeout=5)
if [ -n "$ERROR_MSG" ]; then
    echo "✓ Validation working: $ERROR_MSG"
else
    echo "✗ No validation error shown"
fi

# Test invalid email format
cremote fill-form --selector="#email" --value="invalid-email"
cremote click-element --selector="#submit-btn"

# Verify email validation
EMAIL_ERROR=$(cremote get-element --selector="#email-error" --timeout=5)
if echo "$EMAIL_ERROR" | grep -q "valid email"; then
    echo "✓ Email validation working"
fi

# Test JavaScript validation directly
JS_VALIDATION=$(cremote eval-js --code="document.getElementById('email').validity.valid")
if [ "$JS_VALIDATION" = "false" ]; then
    echo "✓ JavaScript validation also working"
fi

3. Multi-Step Workflow Testing

#!/bin/bash
# Test complete user workflow

# Step 1: Registration
cremote open-tab
cremote load-url --url="https://myapp.com/register"
cremote fill-form --selector="#username" --value="newuser123"
cremote fill-form --selector="#email" --value="newuser@test.com"
cremote fill-form --selector="#password" --value="securepass123"
cremote fill-form --selector="#confirm-password" --value="securepass123"
cremote click-element --selector="#register-btn"
cremote wait-navigation --timeout=15

# Step 2: Email verification simulation
cremote load-url --url="https://myapp.com/verify?token=test-token"
cremote wait-navigation --timeout=10

# Step 3: Profile setup
cremote fill-form --selector="#first-name" --value="Test"
cremote fill-form --selector="#last-name" --value="User"
cremote upload-file --selector="#profile-photo" --file="/tmp/avatar.jpg"
cremote click-element --selector="#save-profile"

# Step 4: Verify completion
cremote wait-navigation --timeout=10
PROFILE_PAGE=$(cremote get-source)
if echo "$PROFILE_PAGE" | grep -q "Profile completed"; then
    echo "✓ Complete workflow successful"
fi

4. Error Handling and Edge Cases

#!/bin/bash
# Test error scenarios

# Test network timeout handling
cremote open-tab
cremote load-url --url="https://httpbin.org/delay/30"
# This should time out; check how the app handles it

# Test invalid form data
cremote load-url --url="https://myapp.com/contact"
cremote fill-form --selector="#phone" --value="invalid-phone-123abc"
cremote submit-form --selector="#contact-form"

# Check error handling
ERROR_RESPONSE=$(cremote get-element --selector=".validation-error")
echo "Error handling: $ERROR_RESPONSE"

# Test file upload limits
cremote upload-file --selector="#file-upload" --file="/path/to/large-file.zip"
UPLOAD_ERROR=$(cremote get-element --selector=".upload-error" --timeout=10)
echo "Upload error: $UPLOAD_ERROR"

# Test iframe interaction (e.g., payment form)
cremote switch-iframe --selector="iframe.payment-widget"
cremote fill-form --selector="#card-number" --value="4111111111111111"
cremote fill-form --selector="#expiry" --value="12/25"
cremote click-element --selector="#pay-now"

# Check for payment processing within iframe
PAYMENT_STATUS=$(cremote get-element --selector=".payment-status" --timeout=10)
echo "Payment status: $PAYMENT_STATUS"

# Switch back to main page to check results
cremote switch-main
MAIN_STATUS=$(cremote get-element --selector=".order-confirmation" --timeout=10)
echo "Order confirmation: $MAIN_STATUS"
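
The upload-limit check above needs a file whose size is known in advance. A minimal sketch for generating one with dd follows; make_test_file and the example path are hypothetical, and the size to use depends on the limit configured in the application under test.

```shell
# Sketch: create a zero-filled file of an exact size for upload-limit tests.
# make_test_file is a hypothetical helper; arguments are path and size in KiB.
make_test_file() {
    dd if=/dev/zero of="$1" bs=1024 count="$2" 2>/dev/null
}

# Usage (6 MiB file, hypothetical path):
#   make_test_file /tmp/upload-6mib.bin 6144
#   cremote upload-file --selector="#file-upload" --file="/tmp/upload-6mib.bin"
```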

Testing Best Practices

1. Robust Element Selection

Use specific, stable selectors:

# Good - specific and stable
cremote click-element --selector="#submit-button"
cremote click-element --selector="button[data-testid='login-submit']"
cremote fill-form --selector="input[name='username']" --value="test"

# Avoid - fragile selectors
cremote click-element --selector="div > div:nth-child(3) > button"
cremote click-element --selector=".btn.btn-primary.mt-2"

2. Wait for Dynamic Content

Always use appropriate timeouts for dynamic content:

# Wait for AJAX content to load
cremote get-element --selector=".search-results" --timeout=15

# Wait for form submission to complete
cremote submit-form --selector="#payment-form" --timeout=30
cremote wait-navigation --timeout=20

3. Verify Test Results

Always verify that actions had the expected effect:

# After login, verify we're on the dashboard
cremote click-element --selector="#login-btn"
cremote wait-navigation --timeout=10
# Read the tab's actual URL; grepping the page source would also match plain links to the dashboard
CURRENT_URL=$(cremote eval-js --code="window.location.href")
if echo "$CURRENT_URL" | grep -q "dashboard"; then
    echo "✓ Successfully redirected to dashboard"
fi

# After form submission, check for success message
cremote submit-form --selector="#contact-form"
SUCCESS_MSG=$(cremote get-element --selector=".success-message" --timeout=10)
if echo "$SUCCESS_MSG" | grep -q "Thank you"; then
    echo "✓ Form submitted successfully"
fi
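
The echo-and-grep checks above repeat the same shape, so they can be folded into one helper that prints the result and returns a usable exit code. This is a sketch only; assert_contains is a hypothetical name, and its arguments are a label, the text to search, and the expected substring.

```shell
# Sketch: print a ✓/✗ line and return non-zero when the text lacks the substring.
# assert_contains is a hypothetical helper: label, text to search, expected substring.
assert_contains() {
    case "$2" in
        *"$3"*) echo "✓ $1" ;;   # quoted pattern: the substring is matched literally
        *)      echo "✗ $1 (expected to find: $3)"; return 1 ;;
    esac
}

# Usage:
#   assert_contains "Form submitted" "$SUCCESS_MSG" "Thank you" || exit 1
```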

4. Clean Test Environment

# Start each test with a fresh tab
cremote open-tab

# Clear any existing state if needed
cremote load-url --url="https://myapp.com/logout"
cremote wait-navigation --timeout=5

# Begin actual test
cremote load-url --url="https://myapp.com/test-page"

5. Iframe Context Management

Always manage iframe context properly:

# Good - explicit context management
cremote switch-iframe --selector="iframe.payment-form"
cremote fill-form --selector="#card-number" --value="4111111111111111"
cremote switch-main  # Always switch back

# Good - verify iframe exists before switching
IFRAME_EXISTS=$(cremote get-element --selector="iframe.payment-form" --timeout=5)
if [ -n "$IFRAME_EXISTS" ]; then
    cremote switch-iframe --selector="iframe.payment-form"
    # ... iframe operations ...
    cremote switch-main
fi

# Avoid - forgetting to switch back to main context
cremote switch-iframe --selector="iframe.widget"
cremote fill-form --selector="#field" --value="test"
# Missing: cremote switch-main

Debugging Failed Tests

1. Inspect Current State

# Check what's currently on the page
cremote get-source > debug-page-source.html

# Check specific elements
cremote get-element --selector=".error-message"
cremote get-element --selector="form"

# List all tabs to verify state
cremote list-tabs
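
When a test fails, it can help to capture all of this state in one step. The sketch below collects the page source, a full-page screenshot, and the tab list into one directory. The helper name capture_debug_state, the default directory, and the file names are hypothetical; it also assumes the screenshot path is readable from where the script runs, which may not hold if the daemon runs in a separate container.

```shell
# Sketch: save the page source, a screenshot, and the tab list for later inspection.
# capture_debug_state and the file names are hypothetical; the cremote
# commands and flags are the ones used elsewhere in this guide.
# ASSUMPTION: the --output path is reachable from the machine running this script.
capture_debug_state() {
    debug_dir=${1:-/tmp/cremote-debug}
    mkdir -p "$debug_dir" || return 1
    cremote get-source > "$debug_dir/page-source.html"
    cremote screenshot --output="$debug_dir/screenshot.png" --full-page
    cremote list-tabs > "$debug_dir/tabs.txt"
    echo "Debug state saved to $debug_dir"
}

# Usage:
#   ./test-login.sh || capture_debug_state /tmp/login-failure
```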

2. Verify Element Selectors

# Test if element exists before interacting
ELEMENT=$(cremote get-element --selector="#target-button" --timeout=5)
if [ -n "$ELEMENT" ]; then
    cremote click-element --selector="#target-button"
else
    echo "Element not found - check selector"
fi

3. Increase Timeouts for Slow Pages

# For slow-loading applications
cremote fill-form --selector="#username" --value="test" --timeout=30

cremote wait-navigation --timeout=60

Troubleshooting Chromium Connection Issues

Chromium Not Running

If you see: "Chromium is not running with remote debugging enabled on port 9222"

Solution: Start Chromium with the correct flags:

chromium --remote-debugging-port=9222 --user-data-dir=/tmp/chromium-debug &

Port Conflict

If you see: "Something is listening on port 9222 but it's not Chromium DevTools protocol"

Cause: Another application is using port 9222.

Solution:

  1. Find what's using the port: netstat -tlnp | grep 9222
  2. Stop the conflicting process
  3. Start Chromium with the correct flags

Chromium Running But Not Connecting

If Chromium appears to be running but cremotedaemon can't connect:

Possible causes:

  • Chromium started without --remote-debugging-port=9222
  • Chromium started with a different port
  • Firewall blocking connections

Solution: Restart Chromium with the correct command:

pkill -f chromium  # Stop existing Chromium
chromium --remote-debugging-port=9222 --user-data-dir=/tmp/chromium-debug &

Verify Chromium DevTools is Working

You can manually check if Chromium DevTools is responding:

curl http://localhost:9222/json/version

This should return JSON with Chromium version information.
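
In scripts, the same check can replace a fixed sleep after launching Chromium. The sketch below polls the endpoint until it answers or a retry budget runs out. The helper name wait_for_devtools and its defaults (port 9222, 10 attempts, one second apart) are illustrative choices, not part of cremote.

```shell
# Sketch: poll the DevTools endpoint until it answers or the attempts run out.
# wait_for_devtools is a hypothetical helper; arguments are port and max attempts.
wait_for_devtools() {
    devtools_port=${1:-9222}
    devtools_attempts=${2:-10}
    while [ "$devtools_attempts" -gt 0 ]; do
        # -s: silent, -f: treat HTTP error responses as failure
        if curl -sf "http://localhost:$devtools_port/json/version" >/dev/null; then
            return 0
        fi
        devtools_attempts=$((devtools_attempts - 1))
        sleep 1
    done
    echo "DevTools did not respond on port $devtools_port" >&2
    return 1
}

# Usage:
#   wait_for_devtools 9222 15 || exit 1
```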

Integration with Test Suites

cremote can be integrated into larger test suites:

#!/bin/bash
# test-suite.sh

# Setup
echo "Starting Chromium with remote debugging..."
chromium --remote-debugging-port=9222 --user-data-dir="$(mktemp -d)" &
CHROMIUM_PID=$!
sleep 3

echo "Starting cremote daemon..."
cremotedaemon &
DAEMON_PID=$!
sleep 2

# Run tests
echo "Running login tests..."
./test-login.sh

echo "Running form tests..."
./test-forms.sh

echo "Running workflow tests..."
./test-workflows.sh

# Cleanup
echo "Stopping daemon..."
kill $DAEMON_PID
echo "Stopping Chromium..."
kill $CHROMIUM_PID
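
The script above runs every test but does not report an overall result, so a CI job calling it cannot tell whether anything failed. One possible approach is to count failures and use the count as the exit status. The helper name run_tests is hypothetical, and the script names in the usage line are the ones from the example above.

```shell
# Sketch: run each test script, count failures, and report an overall result.
# run_tests is a hypothetical helper; pass it the scripts (or commands) to run.
run_tests() {
    suite_failures=0
    for suite_test in "$@"; do
        echo "Running $suite_test..."
        if "$suite_test"; then
            echo "✓ $suite_test passed"
        else
            echo "✗ $suite_test failed"
            suite_failures=$((suite_failures + 1))
        fi
    done
    echo "$suite_failures test script(s) failed"
    [ "$suite_failures" -eq 0 ]   # exit status: 0 only when nothing failed
}

# Usage:
#   run_tests ./test-login.sh ./test-forms.sh ./test-workflows.sh
```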

This guide provides the foundation for using cremote as a comprehensive web application testing tool. Focus on testing real user workflows, handling edge cases, and verifying expected behaviors through the browser interface.