# LLM Agent Guide: Using cremote for Web Application Testing
This document provides comprehensive guidance for LLM coding agents on how to use **cremote** (Chrome Remote Daemon) as a testing tool for web applications. cremote enables automated browser testing of public web interfaces through programmatic control of Chrome browser tabs.
## What is cremote?
**cremote** is a browser automation tool that allows you to:
- Control Chromium browser tabs programmatically
- Fill forms and interact with web elements
- Navigate web pages and wait for content to load
- Extract page content and element HTML
- Test user workflows end-to-end
It uses a daemon-client architecture where a background daemon maintains persistent connections to Chromium, and a command-line client sends testing commands.
## Prerequisites for Testing
Before using cremote for web application testing, ensure:
0. **Check whether everything is already running; if so, skip the startup steps below:**
```bash
cremote status
```
**Note**: `cremote status` always exits with code 0, whether the daemon is running or not. Check the output message to determine status.
1. **Chromium/Chrome is running with remote debugging enabled:**
```bash
# Create a temporary user data directory for the debug instance
chromium --remote-debugging-port=9222 --user-data-dir=/tmp/chromium-debug &
# or
google-chrome --remote-debugging-port=9222 --user-data-dir=/tmp/chrome-debug &
# Alternative: Use a random temporary directory
chromium --remote-debugging-port=9222 --user-data-dir=$(mktemp -d) &
```
**Important**: The `--user-data-dir` flag is required to prevent Chromium from trying to use an existing window. Without it, Chromium will attempt to connect to an already running instance instead of starting a new debug-enabled instance.
2. **cremote daemon is running:**
```bash
cremotedaemon &
```
**Note**: The daemon will automatically check if Chromium is running and provide helpful error messages if:
- Chromium is not running on port 9222
- Something else is using port 9222 (not Chromium DevTools)
- Chromium is running but not accepting connections
3. **Verify connectivity:**
```bash
cremote status
```
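Because `cremote status` exits with 0 either way, a setup script may want an explicit readiness check before starting the daemon. The following is a minimal sketch, assuming `curl` is installed; it polls the `/json/version` DevTools endpoint on the debugging port. The function name `wait_for_devtools` and the `WAIT_DELAY` variable are hypothetical and not part of cremote.

```bash
# Hypothetical helper (not part of cremote): poll the DevTools endpoint
# until it responds. Arguments: port (default 9222), attempts (default 10).
# WAIT_DELAY is an assumed variable: seconds to sleep between attempts.
wait_for_devtools() {
    local port="${1:-9222}" attempts="${2:-10}" i=0
    while [ "$i" -lt "$attempts" ]; do
        # -f makes curl fail on HTTP errors, -s keeps it quiet
        if curl -sf "http://localhost:${port}/json/version" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep "${WAIT_DELAY:-1}"
    done
    return 1
}
```

A script could then run `wait_for_devtools 9222 15 && cremotedaemon &` instead of relying on a fixed `sleep`.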
## Core Testing Workflow
### 1. Basic Test Session Setup
Every testing session follows this pattern:
```bash
# 1. Open a new browser tab for testing
TAB_ID=$(cremote open-tab)
# 2. Navigate to the application under test
cremote load-url --url="https://your-app.com"
# 3. Perform test actions (forms, clicks, navigation)
# ... testing commands ...
# 4. Clean up (optional)
cremote close-tab --tab="$TAB_ID"
```
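If a test command fails midway, the cleanup step above may never run. One possible way to guarantee it is a small wrapper, sketched below under the assumption that `cremote open-tab` prints only the tab ID (as the `TAB_ID=$(...)` pattern implies). The name `with_fresh_tab` is hypothetical.

```bash
# Hypothetical helper (not part of cremote): open a tab, run the given
# command with the tab ID appended as its last argument, then close the
# tab even if the command failed. Returns the command's exit status.
with_fresh_tab() {
    local tab_id status=0
    tab_id=$(cremote open-tab) || return 1
    "$@" "$tab_id" || status=$?
    cremote close-tab --tab="$tab_id"
    return "$status"
}
```

For example, after defining `smoke_test() { cremote load-url --tab="$1" --url="https://your-app.com"; }`, calling `with_fresh_tab smoke_test` would run it in its own tab.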
### 2. Current Tab Feature
cremote automatically tracks the "current tab" - the most recently used tab. This means you can omit the `--tab` flag in most commands:
```bash
# Open tab (becomes current tab)
cremote open-tab
# All subsequent commands use current tab automatically
cremote load-url --url="https://example.com"
cremote fill-form --selector="#username" --value="testuser"
cremote click-element --selector="#login-btn"
```
## Essential Testing Commands
### Navigation and Page Loading
```bash
# Load a specific URL
cremote load-url --url="https://your-app.com/login"
# Wait for navigation to complete (useful after form submissions)
cremote wait-navigation --timeout=10
# Note: wait-navigation is smart - it returns immediately if no navigation is happening
# Get current page source for verification
cremote get-source
```
### Form Testing
```bash
# Fill text inputs
cremote fill-form --selector="#username" --value="testuser"
cremote fill-form --selector="#password" --value="testpass123"
# Handle checkboxes (check)
cremote fill-form --selector="#remember-me" --value="true"
cremote fill-form --selector="#terms-agreed" --value="checked"
# Handle checkboxes (uncheck)
cremote fill-form --selector="#newsletter" --value="false"
# Handle radio buttons
cremote fill-form --selector="#payment-credit" --value="true"
# Submit forms
cremote submit-form --selector="form#login-form"
```
**Checkbox/Radio Button Values:**
- To check/select: `true`, `1`, `yes`, `on`, `checked`
- To uncheck/deselect: `false`, `0`, `no`, `off`, or any other value
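Since any unlisted value unchecks the box, a typo such as `ture` would silently deselect it. The sketch below mirrors the rule above so a script can flag such values before calling `fill-form`. It matches the listed lowercase strings exactly; whether cremote itself ignores case is not stated here, and the name `classify_checkbox_value` is hypothetical.

```bash
# Hypothetical helper (not part of cremote): report how a value would be
# interpreted for a checkbox or radio button, per the list above.
classify_checkbox_value() {
    case "$1" in
        true|1|yes|on|checked) echo "check" ;;
        false|0|no|off)        echo "uncheck" ;;
        *)                     echo "uncheck (unlisted value)" ;;
    esac
}
```

Calling `classify_checkbox_value ture` prints `uncheck (unlisted value)`, which a test script could treat as a warning.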
### File Upload Testing
```bash
# Upload files to file inputs
cremote upload-file --selector="input[type=file]" --file="/path/to/test-file.pdf"
cremote upload-file --selector="#profile-photo" --file="/tmp/test-image.jpg"
```
### Element Interaction
```bash
# Click buttons, links, or any clickable elements
cremote click-element --selector="button.submit"
cremote click-element --selector="a[href='/dashboard']"
cremote click-element --selector="#save-changes"
# Get HTML content of specific elements for verification
cremote get-element --selector=".error-message"
cremote get-element --selector="#user-profile"
```
### JavaScript Execution
```bash
# Execute JavaScript code directly in the page (default 5 second timeout)
cremote eval-js --code="document.getElementById('tinymce').innerHTML='Foo!'"
# Get values from the page
cremote eval-js --code="document.querySelector('#result').textContent"
# Manipulate page elements
cremote eval-js --code="document.body.style.backgroundColor = 'red'"
# Trigger JavaScript events
cremote eval-js --code="document.getElementById('submit-btn').click()"
# Work with complex objects (serialize them to a JSON string)
cremote eval-js --code="JSON.stringify(Array.from(document.querySelectorAll('.item'), el => el.textContent))"
# Set form values programmatically (statement - returns "undefined")
cremote eval-js --code="document.getElementById('hidden-field').value = 'secret-value'"
# Get form values (expression - returns the value)
cremote eval-js --code="document.getElementById('hidden-field').value"
# Use custom timeout for long-running JavaScript
cremote eval-js --code="await new Promise(resolve => setTimeout(resolve, 8000))" --timeout=10
```
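The `eval-js` calls above can double as assertions. The following is a minimal sketch, assuming `eval-js` prints only the result on stdout; the helper name `expect_js` is hypothetical.

```bash
# Hypothetical helper (not part of cremote): evaluate a JavaScript
# expression in the page and compare its output to an expected string.
expect_js() {
    local code="$1" expected="$2" actual
    actual=$(cremote eval-js --code="$code")
    if [ "$actual" = "$expected" ]; then
        echo "✓ $code"
    else
        echo "✗ $code: expected '$expected', got '$actual'"
        return 1
    fi
}
```

A check such as `expect_js "document.querySelectorAll('.item').length" "3"` would then pass or fail with a readable message.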
**Use Cases for JavaScript Execution:**
- Interact with rich text editors (TinyMCE, CKEditor, etc.)
- Trigger JavaScript events that aren't accessible via normal clicks
- Extract computed values or complex data structures
- Manipulate hidden form fields
- Test JavaScript functionality directly
- Set up test data or page state
**Expression vs Statement Handling:**
- **Expressions** return values: `document.title`, `element.textContent`, `array.length`
- **Statements** perform actions and return "undefined": `element.click()`, `variable = value`
- Both types are supported seamlessly in the same command
### Screenshots
```bash
# Take a viewport screenshot (default)
cremote screenshot --output="/tmp/page-screenshot.png"
# Take a full page screenshot
cremote screenshot --output="/tmp/full-page.png" --full-page
# Screenshot with custom timeout
cremote screenshot --output="/tmp/slow-page.png" --timeout=10
# Screenshot of specific tab
cremote screenshot --tab="$TAB_ID" --output="/tmp/tab-screenshot.png"
```
### Working with Iframes
```bash
# Switch to iframe context
cremote switch-iframe --selector="iframe#payment-form"
# All subsequent commands now operate within the iframe
cremote fill-form --selector="#card-number" --value="4111111111111111"
cremote fill-form --selector="#cvv" --value="123"
cremote click-element --selector="#submit-payment"
# Switch back to main page context
cremote switch-main
# Commands now operate on the main page again
cremote get-element --selector=".payment-success"
```
**Common Iframe Scenarios:**
- **Payment Forms**: Credit card processing iframes
- **Embedded Widgets**: Social media, maps, chat widgets
- **Third-party Content**: Ads, analytics, external forms
- **Security Contexts**: Sandboxed content, cross-origin frames
**Important Notes:**
- Iframe context is maintained per tab
- All commands (fill-form, click-element, eval-js, etc.) work within iframe context
- Must explicitly switch back to main context with `switch-main`
- Iframe context persists until switched back or tab is closed
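Because the iframe context persists until `switch-main` is called, a failed command inside the iframe can leave later commands in the wrong context. A wrapper along these lines is one way to guard against that; the name `in_iframe` is hypothetical.

```bash
# Hypothetical helper (not part of cremote): run one command inside an
# iframe and always switch back to the main page context afterwards.
# Usage: in_iframe SELECTOR command [args...]
in_iframe() {
    local selector="$1" status=0
    shift
    cremote switch-iframe --selector="$selector" || return 1
    "$@" || status=$?
    cremote switch-main
    return "$status"
}
```

For instance, `in_iframe "iframe#payment-form" cremote fill-form --selector="#cvv" --value="123"` fills the field and returns to the main page whether or not the fill succeeded.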
### Tab Management
```bash
# List all open tabs (current tab marked with *)
cremote list-tabs
# Open multiple tabs for complex testing
TAB1=$(cremote open-tab)
TAB2=$(cremote open-tab)
# Work with specific tabs
cremote load-url --tab="$TAB1" --url="https://app.com/admin"
cremote load-url --tab="$TAB2" --url="https://app.com/user"
# Close specific tabs
cremote close-tab --tab="$TAB1"
```
## Timeout Configuration
Many commands support timeout parameters for robust testing:
```bash
# Wait up to 10 seconds for the element to appear, then up to 5 seconds for the action
cremote fill-form --selector="#slow-loading-field" --value="test" \
    --selection-timeout=10 --action-timeout=5
# Wait for elements that load dynamically
cremote click-element --selector=".ajax-button" \
    --selection-timeout=15 --action-timeout=10
# Get elements that may take time to render
cremote get-element --selector=".dynamic-content" --selection-timeout=20
```
**Timeout Parameters:**
- `--selection-timeout`: Seconds to wait for element to appear in DOM (default: 5 seconds)
- `--action-timeout`: Seconds to wait for action to complete (default: 5 seconds)
- `--timeout`: General timeout for operations (default: 5 seconds)
**Smart Navigation Waiting:**
The `wait-navigation` command intelligently detects if navigation is actually happening:
- Returns immediately if the page is already stable and loaded
- Monitors for 2 seconds to detect if navigation starts
- Only waits for the full timeout if navigation is actually in progress
- This prevents hanging when no navigation occurs
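The timeouts above cover a single wait. For conditions that need to be re-checked, such as text that appears only after background processing, a generic retry loop may help. This is a sketch in plain shell; the name `retry` and its argument order are assumptions, not cremote features.

```bash
# Hypothetical helper (not part of cremote): retry a command until it
# succeeds or the attempts are exhausted.
# Usage: retry ATTEMPTS DELAY_SECONDS command [args...]
retry() {
    local attempts="$1" delay="$2" n=1
    shift 2
    while true; do
        "$@" && return 0
        [ "$n" -ge "$attempts" ] && return 1
        n=$((n + 1))
        sleep "$delay"
    done
}
```

An example call, using a hypothetical `.job-status` selector: `retry 5 2 sh -c 'cremote get-element --selector=".job-status" | grep -q "Done"'`.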
## Common Testing Patterns
### 1. Login Flow Testing
```bash
#!/bin/bash
# Test user login functionality
# Setup
cremote open-tab
cremote load-url --url="https://myapp.com/login"
# Test valid login
cremote fill-form --selector="#email" --value="user@example.com"
cremote fill-form --selector="#password" --value="validpassword"
cremote click-element --selector="#login-button"
# Wait for redirect and verify success
cremote wait-navigation --timeout=10
PAGE_SOURCE=$(cremote get-source)
if echo "$PAGE_SOURCE" | grep -q "Welcome"; then
    echo "✓ Login successful"
else
    echo "✗ Login failed"
    exit 1
fi
```
### 2. Form Validation Testing
```bash
#!/bin/bash
# Test form validation
cremote open-tab
cremote load-url --url="https://myapp.com/register"
# Test empty form submission
cremote click-element --selector="#submit-btn"
# Check for validation errors
ERROR_MSG=$(cremote get-element --selector=".error-message" --selection-timeout=5)
if [ -n "$ERROR_MSG" ]; then
    echo "✓ Validation working: $ERROR_MSG"
else
    echo "✗ No validation error shown"
fi
# Test invalid email format
cremote fill-form --selector="#email" --value="invalid-email"
cremote click-element --selector="#submit-btn"
# Verify email validation
EMAIL_ERROR=$(cremote get-element --selector="#email-error" --selection-timeout=5)
if echo "$EMAIL_ERROR" | grep -q "valid email"; then
    echo "✓ Email validation working"
fi
# Test JavaScript validation directly
JS_VALIDATION=$(cremote eval-js --code="document.getElementById('email').validity.valid")
if [ "$JS_VALIDATION" = "false" ]; then
    echo "✓ JavaScript validation also working"
fi
```
### 3. Multi-Step Workflow Testing
```bash
#!/bin/bash
# Test complete user workflow
# Step 1: Registration
cremote open-tab
cremote load-url --url="https://myapp.com/register"
cremote fill-form --selector="#username" --value="newuser123"
cremote fill-form --selector="#email" --value="newuser@test.com"
cremote fill-form --selector="#password" --value="securepass123"
cremote fill-form --selector="#confirm-password" --value="securepass123"
cremote click-element --selector="#register-btn"
cremote wait-navigation --timeout=15
# Step 2: Email verification simulation
cremote load-url --url="https://myapp.com/verify?token=test-token"
cremote wait-navigation --timeout=10
# Step 3: Profile setup
cremote fill-form --selector="#first-name" --value="Test"
cremote fill-form --selector="#last-name" --value="User"
cremote upload-file --selector="#profile-photo" --file="/tmp/avatar.jpg"
cremote click-element --selector="#save-profile"
# Step 4: Verify completion
cremote wait-navigation --timeout=10
PROFILE_PAGE=$(cremote get-source)
if echo "$PROFILE_PAGE" | grep -q "Profile completed"; then
    echo "✓ Complete workflow successful"
fi
```
### 4. Error Handling and Edge Cases
```bash
#!/bin/bash
# Test error scenarios
# Test network timeout handling
cremote open-tab
cremote load-url --url="https://httpbin.org/delay/10"
# httpbin caps /delay at 10 seconds; with the default 5 second timeout this should time out - test how the app handles it
# Test invalid form data
cremote load-url --url="https://myapp.com/contact"
cremote fill-form --selector="#phone" --value="invalid-phone-123abc"
cremote submit-form --selector="#contact-form"
# Check error handling
ERROR_RESPONSE=$(cremote get-element --selector=".validation-error")
echo "Error handling: $ERROR_RESPONSE"
# Test file upload limits
cremote upload-file --selector="#file-upload" --file="/path/to/large-file.zip"
UPLOAD_ERROR=$(cremote get-element --selector=".upload-error" --selection-timeout=10)
# Test iframe interaction (e.g., payment form)
cremote switch-iframe --selector="iframe.payment-widget"
cremote fill-form --selector="#card-number" --value="4111111111111111"
cremote fill-form --selector="#expiry" --value="12/25"
cremote click-element --selector="#pay-now"
# Check for payment processing within iframe
PAYMENT_STATUS=$(cremote get-element --selector=".payment-status" --selection-timeout=10)
echo "Payment status: $PAYMENT_STATUS"
# Switch back to main page to check results
cremote switch-main
MAIN_STATUS=$(cremote get-element --selector=".order-confirmation" --selection-timeout=10)
```
## Testing Best Practices
### 1. Robust Element Selection
Use specific, stable selectors:
```bash
# Good - specific and stable
cremote click-element --selector="#submit-button"
cremote click-element --selector="button[data-testid='login-submit']"
cremote fill-form --selector="input[name='username']" --value="test"
# Avoid - fragile selectors
cremote click-element --selector="div > div:nth-child(3) > button"
cremote click-element --selector=".btn.btn-primary.mt-2"
```
### 2. Wait for Dynamic Content
Always use appropriate timeouts for dynamic content:
```bash
# Wait for AJAX content to load
cremote get-element --selector=".search-results" --selection-timeout=15
# Wait for form submission to complete
cremote submit-form --selector="#payment-form" --action-timeout=30
cremote wait-navigation --timeout=20
```
### 3. Verify Test Results
Always verify that actions had the expected effect:
```bash
# After login, verify we're on the dashboard
cremote click-element --selector="#login-btn"
cremote wait-navigation --timeout=10
# Read the actual URL from the page rather than searching the source for a dashboard link
CURRENT_URL=$(cremote eval-js --code="window.location.href")
if echo "$CURRENT_URL" | grep -q "dashboard"; then
    echo "✓ Successfully redirected to dashboard"
fi
# After form submission, check for success message
cremote submit-form --selector="#contact-form"
SUCCESS_MSG=$(cremote get-element --selector=".success-message" --selection-timeout=10)
if echo "$SUCCESS_MSG" | grep -q "Thank you"; then
    echo "✓ Form submitted successfully"
fi
```
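Verification checks like these tend to repeat across scripts, so they could be factored into a helper. The sketch below assumes `get-source` writes the page HTML to stdout, as the earlier examples do; the name `assert_page_contains` is hypothetical.

```bash
# Hypothetical helper (not part of cremote): succeed only if the current
# page source contains the given literal text.
assert_page_contains() {
    local needle="$1"
    # -F treats the text as a fixed string, not a regular expression
    if cremote get-source | grep -qF -- "$needle"; then
        echo "✓ page contains: $needle"
    else
        echo "✗ page is missing: $needle"
        return 1
    fi
}
```

A script can then write `assert_page_contains "Welcome" || exit 1` after a login step.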
### 4. Clean Test Environment
```bash
# Start each test with a fresh tab
cremote open-tab
# Clear any existing state if needed
cremote load-url --url="https://myapp.com/logout"
cremote wait-navigation --timeout=5
# Begin actual test
cremote load-url --url="https://myapp.com/test-page"
```
### 5. Iframe Context Management
Always manage iframe context properly:
```bash
# Good - explicit context management
cremote switch-iframe --selector="iframe.payment-form"
cremote fill-form --selector="#card-number" --value="4111111111111111"
cremote switch-main # Always switch back
# Good - verify iframe exists before switching
IFRAME_EXISTS=$(cremote get-element --selector="iframe.payment-form" --selection-timeout=5)
if [ -n "$IFRAME_EXISTS" ]; then
    cremote switch-iframe --selector="iframe.payment-form"
    # ... iframe operations ...
    cremote switch-main
fi
# Avoid - forgetting to switch back to main context
cremote switch-iframe --selector="iframe.widget"
cremote fill-form --selector="#field" --value="test"
# Missing: cremote switch-main
```
## Debugging Failed Tests
### 1. Inspect Current State
```bash
# Check what's currently on the page
cremote get-source > debug-page-source.html
# Check specific elements
cremote get-element --selector=".error-message"
cremote get-element --selector="form"
# List all tabs to verify state
cremote list-tabs
```
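When a test fails, it can be convenient to collect these artifacts in one step. The following sketch bundles the commands above with a screenshot; the function name `capture_debug_state` and the default directory are assumptions.

```bash
# Hypothetical helper (not part of cremote): save the page source, the
# tab list, and a screenshot into one directory for later inspection.
capture_debug_state() {
    local dir="${1:-./cremote-debug}"
    mkdir -p "$dir" || return 1
    # Resolve to an absolute path so the screenshot location is unambiguous
    dir=$(cd "$dir" && pwd) || return 1
    cremote get-source > "$dir/page-source.html"
    cremote list-tabs > "$dir/tabs.txt"
    cremote screenshot --output="$dir/screenshot.png"
    echo "Debug artifacts written to $dir"
}
```

Calling `capture_debug_state /tmp/failed-login` right after a failed assertion preserves the page state at the moment of failure.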
### 2. Verify Element Selectors
```bash
# Test if element exists before interacting
ELEMENT=$(cremote get-element --selector="#target-button" --selection-timeout=5)
if [ -n "$ELEMENT" ]; then
    cremote click-element --selector="#target-button"
else
    echo "Element not found - check selector"
fi
```
### 3. Increase Timeouts for Slow Pages
```bash
# For slow-loading applications
cremote fill-form --selector="#username" --value="test" \
    --selection-timeout=30 --action-timeout=15
cremote wait-navigation --timeout=60
```
## Troubleshooting Chromium Connection Issues
### Chromium Not Running
If you see: "Chromium is not running with remote debugging enabled on port 9222"
**Solution**: Start Chromium with the correct flags:
```bash
chromium --remote-debugging-port=9222 --user-data-dir=/tmp/chromium-debug &
```
### Port Conflict
If you see: "Something is listening on port 9222 but it's not Chromium DevTools protocol"
**Cause**: Another application is using port 9222
**Solution**:
1. Find what's using the port: `netstat -tlnp | grep 9222`
2. Stop the conflicting process
3. Start Chromium with the correct flags
### Chromium Running But Not Connecting
If Chromium appears to be running but cremotedaemon can't connect:
**Possible causes**:
- Chromium started without `--remote-debugging-port=9222`
- Chromium started with a different port
- Firewall blocking connections
**Solution**: Restart Chromium with the correct command:
```bash
pkill -f chromium # Stop existing Chromium
chromium --remote-debugging-port=9222 --user-data-dir=/tmp/chromium-debug &
```
### Verify Chromium DevTools is Working
You can manually check if Chromium DevTools is responding:
```bash
curl http://localhost:9222/json/version
```
This should return JSON with Chromium version information.
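To use that response in a script, the `Browser` field can be extracted without extra tools. This is a sketch that assumes `curl` and `sed` are available; the function name `devtools_browser` is hypothetical. It prints nothing when the port is answered by something other than DevTools.

```bash
# Hypothetical helper (not part of cremote): print the "Browser" field
# from the DevTools /json/version response. Argument: port (default 9222).
devtools_browser() {
    local port="${1:-9222}"
    curl -s "http://localhost:${port}/json/version" \
        | sed -n 's/.*"Browser"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}
```

An empty result from `devtools_browser` suggests the port-conflict case described above.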
## Integration with Test Suites
cremote can be integrated into larger test suites:
```bash
#!/bin/bash
# test-suite.sh
# Setup
echo "Starting Chromium with remote debugging..."
chromium --remote-debugging-port=9222 --user-data-dir=$(mktemp -d) &
CHROMIUM_PID=$!
sleep 3
echo "Starting cremote daemon..."
cremotedaemon &
DAEMON_PID=$!
sleep 2
# Run tests
echo "Running login tests..."
./test-login.sh
echo "Running form tests..."
./test-forms.sh
echo "Running workflow tests..."
./test-workflows.sh
# Cleanup
echo "Stopping daemon..."
kill $DAEMON_PID
echo "Stopping Chromium..."
kill $CHROMIUM_PID
```
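The script above runs every test but does not report an overall result. One possible refinement is a small runner that tracks failures, sketched here; the name `run_suite` is hypothetical.

```bash
# Hypothetical helper (not part of cremote): run each test script in
# turn, report pass/fail, and return non-zero if any of them failed.
run_suite() {
    local failed=0 script
    for script in "$@"; do
        if "$script"; then
            echo "PASS $script"
        else
            echo "FAIL $script"
            failed=$((failed + 1))
        fi
    done
    echo "$failed of $# test scripts failed"
    [ "$failed" -eq 0 ]
}
```

It could replace the three explicit calls with `run_suite ./test-login.sh ./test-forms.sh ./test-workflows.sh`. Pairing it with `trap 'kill "$DAEMON_PID" "$CHROMIUM_PID"' EXIT`, set once both processes are started, would let cleanup run even when the suite exits early.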
This guide provides the foundation for using cremote as a comprehensive web application testing tool. Focus on testing real user workflows, handling edge cases, and verifying expected behaviors through the browser interface.