Cremote MCP Tools - LLM Usage Guide

This guide explains how LLMs can use the cremote MCP (Model Context Protocol) tools for web automation tasks.

Available Tools

The cremote MCP server provides six web automation tools:

1. web_navigate_cremotemcp

Navigate to URLs and optionally take screenshots.

Parameters:

  • url (required): The URL to navigate to
  • screenshot (optional): Boolean, take a screenshot after navigation
  • tab (optional): Specific tab ID to use
  • timeout (optional): Timeout in seconds (default: 5)

Example Usage:

web_navigate_cremotemcp:
  url: "https://example.com"
  screenshot: true

2. web_interact_cremotemcp

Interact with web elements through various actions.

Parameters:

  • action (required): One of "click", "fill", "submit", "upload"
  • selector (required): CSS selector for the target element
  • value (optional): Value for fill/upload actions
  • tab (optional): Specific tab ID to use
  • timeout (optional): Timeout in seconds (default: 5)

Example Usage:

web_interact_cremotemcp:
  action: "click"
  selector: "button.submit"

web_interact_cremotemcp:
  action: "fill"
  selector: "input[name='email']"
  value: "user@example.com"

3. web_extract_cremotemcp

Extract data from web pages (HTML source, element content, or JavaScript execution).

Parameters:

  • type (required): One of "source", "element", "javascript"
  • selector (optional): CSS selector (required for "element" type)
  • code (optional): JavaScript code (required for "javascript" type)
  • tab (optional): Specific tab ID to use
  • timeout (optional): Timeout in seconds (default: 5)

Example Usage:

web_extract_cremotemcp:
  type: "source"

web_extract_cremotemcp:
  type: "element"
  selector: "div.content"

web_extract_cremotemcp:
  type: "javascript"
  code: "document.title"

4. web_screenshot_cremotemcp

Take screenshots of web pages.

Parameters:

  • output (required): File path where screenshot will be saved
  • full_page (optional): Capture full page (default: false)
  • tab (optional): Specific tab ID to use
  • timeout (optional): Timeout in seconds (default: 5)

Example Usage:

web_screenshot_cremotemcp:
  output: "/tmp/page-screenshot.png"
  full_page: true

5. web_manage_tabs_cremotemcp

Manage browser tabs (open, close, list, switch).

Parameters:

  • action (required): One of "open", "close", "list", "switch"
  • tab (optional): Tab ID (required for "close" and "switch" actions)
  • timeout (optional): Timeout in seconds (default: 5)

Example Usage:

web_manage_tabs_cremotemcp:
  action: "open"

web_manage_tabs_cremotemcp:
  action: "list"

web_manage_tabs_cremotemcp:
  action: "switch"
  tab: "ABC123"

6. web_iframe_cremotemcp

Switch iframe context for subsequent operations.

Parameters:

  • action (required): One of "enter", "exit"
  • selector (optional): Iframe CSS selector (required for "enter" action)
  • tab (optional): Specific tab ID to use

Example Usage:

web_iframe_cremotemcp:
  action: "enter"
  selector: "iframe#payment-form"

web_iframe_cremotemcp:
  action: "exit"
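
A typical sequence is to enter the iframe, act on elements inside it, and then exit. The sketch below assumes that selectors in calls made between "enter" and "exit" are resolved inside the iframe; the field selector and value are hypothetical.

```yaml
# Enter the iframe (selector taken from the example above)
web_iframe_cremotemcp:
  action: "enter"
  selector: "iframe#payment-form"

# Hypothetical field inside the iframe
web_interact_cremotemcp:
  action: "fill"
  selector: "input[name='cardholder']"
  value: "Test User"

# Return to the main page before touching elements outside the iframe
web_iframe_cremotemcp:
  action: "exit"
```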

Common Usage Patterns

1. Basic Web Navigation

# Navigate to a website
web_navigate_cremotemcp:
  url: "https://example.com"
  screenshot: true

2. Form Interaction Sequence

# 1. Navigate to the page
web_navigate_cremotemcp:
  url: "https://example.com/login"

# 2. Fill username field
web_interact_cremotemcp:
  action: "fill"
  selector: "input[name='username']"
  value: "myusername"

# 3. Fill password field
web_interact_cremotemcp:
  action: "fill"
  selector: "input[name='password']"
  value: "mypassword"

# 4. Submit the form
web_interact_cremotemcp:
  action: "submit"
  selector: "form"

3. Clicking Elements

# Click a button
web_interact_cremotemcp:
  action: "click"
  selector: "button#submit-btn"

# Click a link
web_interact_cremotemcp:
  action: "click"
  selector: "a[href='/dashboard']"

Best Practices for LLMs

1. Always Start with Navigation

Before interacting with elements, navigate to the target page:

web_navigate_cremotemcp:
  url: "https://target-website.com"

2. Use Specific CSS Selectors

Be as specific as possible with selectors to avoid ambiguity:

  • Good: input[name='email'], button.primary-submit
  • Avoid: input, button
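
One way to check that a selector is unambiguous before interacting with it is to count its matches using the "javascript" extract type. This is a sketch: it assumes the tool returns the value of the evaluated expression, as the document.title example suggests, but this guide does not specify the return format.

```yaml
# Count how many elements the selector matches; a result of 1 means it is unambiguous
web_extract_cremotemcp:
  type: "javascript"
  code: "document.querySelectorAll(\"input[name='email']\").length"
```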

3. Take Screenshots for Debugging

When troubleshooting or documenting, use screenshots:

web_navigate_cremotemcp:
  url: "https://example.com"
  screenshot: true

4. Handle Timeouts Appropriately

For slow-loading pages, increase the timeout beyond the 5-second default:

web_navigate_cremotemcp:
  url: "https://slow-website.com"
  timeout: 10

5. Sequential Operations

Perform operations in logical sequence:

  1. Navigate to page
  2. Fill required fields
  3. Submit forms
  4. Navigate to next page if needed
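
Between steps, it can help to confirm that the previous operation had the intended effect before continuing. A minimal sketch, assuming a hypothetical login page whose post-login view contains an element matching .welcome-banner:

```yaml
# Navigate, fill, and submit as usual
web_navigate_cremotemcp:
  url: "https://example.com/login"

web_interact_cremotemcp:
  action: "fill"
  selector: "input[name='username']"
  value: "myusername"

web_interact_cremotemcp:
  action: "submit"
  selector: "form"

# Check the result before moving on; an error here suggests the submit did not succeed
web_extract_cremotemcp:
  type: "element"
  selector: ".welcome-banner"
```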

Error Handling

Common Error Scenarios:

  1. Element not found: The selector doesn't match any element on the page
  2. Timeout: The page takes too long to load, or an element takes too long to appear
  3. Navigation failed: The URL is invalid or unreachable

Troubleshooting Tips:

  1. Verify the URL is correct and accessible
  2. Check CSS selectors using browser developer tools
  3. Increase timeout for slow-loading content
  4. Take screenshots to see current page state
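
These tips can be combined into a short diagnostic sequence after a failed interaction. The output path, selector, and timeout value below are illustrative, not prescribed by the tools.

```yaml
# See what the page currently looks like
web_screenshot_cremotemcp:
  output: "/tmp/debug-state.png"
  full_page: true

# Inspect the HTML to check whether the expected element exists
web_extract_cremotemcp:
  type: "source"

# Retry the failed action with a longer timeout
web_interact_cremotemcp:
  action: "click"
  selector: "button#submit-btn"
  timeout: 15
```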

Tab Management

The tools automatically manage browser tabs:

  • If no tab is specified, a new tab is created automatically
  • Tab IDs are returned in responses for reference
  • Multiple tabs can be managed by specifying tab IDs
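
A sketch of working with an explicit tab, assuming "ABC123" is a tab ID returned by an earlier response (real IDs will differ):

```yaml
# List open tabs to find the IDs currently available
web_manage_tabs_cremotemcp:
  action: "list"

# Target a specific tab by passing its ID
web_navigate_cremotemcp:
  url: "https://example.com"
  tab: "ABC123"

web_extract_cremotemcp:
  type: "javascript"
  code: "document.title"
  tab: "ABC123"

# Close the tab when finished
web_manage_tabs_cremotemcp:
  action: "close"
  tab: "ABC123"
```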

Security Considerations

Safe Practices:

  • Only navigate to trusted websites
  • Be cautious with form submissions
  • Avoid entering sensitive information in examples
  • Use screenshots sparingly to avoid exposing sensitive data

Limitations:

  • Cannot bypass CAPTCHA or other anti-automation measures
  • Subject to same-origin policy restrictions
  • May not work with heavily JavaScript-dependent sites

Example: Complete Web Automation Task

Here's a complete example of automating a web form:

# Step 1: Navigate to the form page
web_navigate_cremotemcp:
  url: "https://example.com/contact"
  screenshot: true

# Step 2: Fill out the contact form
web_interact_cremotemcp:
  action: "fill"
  selector: "input[name='name']"
  value: "John Doe"

web_interact_cremotemcp:
  action: "fill"
  selector: "input[name='email']"
  value: "john@example.com"

web_interact_cremotemcp:
  action: "fill"
  selector: "textarea[name='message']"
  value: "Hello, this is a test message."

# Step 3: Submit the form
web_interact_cremotemcp:
  action: "submit"
  selector: "form#contact-form"

# Step 4: Take a screenshot of the result
web_screenshot_cremotemcp:
  output: "/tmp/contact-form-result.png"

Integration Notes

  • Tools use the _cremotemcp suffix to avoid naming conflicts
  • Responses include success status and descriptive messages
  • Screenshots taken during navigation are saved to the /tmp/ directory with timestamped filenames
  • The underlying cremote daemon handles browser management

Advanced Usage Examples

Testing Web Applications

# Navigate to application
web_navigate_cremotemcp:
  url: "https://myapp.com/login"
  screenshot: true

# Test login functionality
web_interact_cremotemcp:
  action: "fill"
  selector: "#username"
  value: "testuser"

web_interact_cremotemcp:
  action: "fill"
  selector: "#password"
  value: "testpass"

web_interact_cremotemcp:
  action: "click"
  selector: "button[type='submit']"

# Verify successful login by capturing the resulting page
web_screenshot_cremotemcp:
  output: "/tmp/login-result.png"

Data Extraction Workflows

# Navigate to data source
web_navigate_cremotemcp:
  url: "https://data-site.com/table"

# Click through pagination or filters
web_interact_cremotemcp:
  action: "click"
  selector: ".filter-button"

# Take screenshot to document current state
web_screenshot_cremotemcp:
  output: "/tmp/table-filtered.png"
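
To pull the data itself rather than only documenting the page, the workflow can end with web_extract_cremotemcp. The selectors below are hypothetical, and the JavaScript variant assumes the tool returns the evaluated expression's value as text.

```yaml
# Extract the table element's content
web_extract_cremotemcp:
  type: "element"
  selector: "table.results"

# Or collect cell text row by row with JavaScript, serialized as JSON
web_extract_cremotemcp:
  type: "javascript"
  code: "JSON.stringify(Array.from(document.querySelectorAll('table.results tr')).map(r => Array.from(r.cells).map(c => c.textContent.trim())))"
```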

File Upload Testing

# Navigate to upload form
web_navigate_cremotemcp:
  url: "https://example.com/upload"

# Upload a file
web_interact_cremotemcp:
  action: "upload"
  selector: "input[type='file']"
  value: "/path/to/test-file.pdf"

# Submit the upload form
web_interact_cremotemcp:
  action: "click"
  selector: "button.upload-submit"

Tool Response Format

All six tools return a short message describing the outcome:

Success Response:

"Successfully navigated to https://example.com in tab ABC123 (screenshot saved to /tmp/navigate-1234567890.png)"

Error Response:

"failed to load URL: context deadline exceeded"

CSS Selector Best Practices

  1. ID selectors: #unique-id (most reliable)
  2. Name attributes: input[name='fieldname']
  3. Class combinations: .primary.button
  4. Attribute selectors: button[data-action='submit']

Avoid These Selectors:

  1. Generic tags: div, span, input (too broad)
  2. Position-based: :nth-child() (fragile)
  3. Text-based: :contains() (not standard CSS)
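
When an element can only be identified by its text, one alternative to :contains() is to look it up with JavaScript and read back an attribute that can serve as a selector. A sketch, assuming a hypothetical button labelled "Sign in" and that the tool returns the evaluated expression's value:

```yaml
# Find the button by its visible text and return its id (empty string if not found or no id)
web_extract_cremotemcp:
  type: "javascript"
  code: "(Array.from(document.querySelectorAll('button')).find(b => b.textContent.trim() === 'Sign in') || {id: ''}).id"
```

If an id comes back, it can be used as an ID selector (for example #the-returned-id) in later web_interact_cremotemcp calls.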

This documentation should help LLMs effectively use the cremote MCP tools for web automation tasks.