Accessibility Tree Support in Cremote

Cremote exposes Chrome's accessibility tree through the Chrome DevTools Protocol. This lets AI agents and automation tools understand and interact with web pages through their accessibility information, which supports more inclusive and robust web automation.

Overview

The accessibility tree is a representation of the web page structure that assistive technologies (like screen readers) use to understand and navigate content. It provides semantic information about elements including their roles, names, descriptions, states, and relationships.

Features

1. Full Accessibility Tree Retrieval

Get the complete accessibility tree for a page, or limit the depth for better performance.

2. Partial Accessibility Tree

Retrieve accessibility information for a specific element and its relatives (ancestors, siblings, children).

3. Accessibility Tree Queries

Search for elements by accessible name, ARIA role, or within a specific scope.

API Reference

Daemon Commands

get-accessibility-tree

Retrieves the full accessibility tree for a tab.

Parameters:

  • tab (optional): Tab ID, uses current tab if not specified
  • depth (optional): Maximum depth to retrieve, omit for full tree
  • timeout (optional): Timeout in seconds, default 5

Example:

curl -X POST http://localhost:8989/command \
  -H "Content-Type: application/json" \
  -d '{"action": "get-accessibility-tree", "params": {"depth": "3"}}'
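
The same request can be built from Go. The following is a minimal sketch that mirrors the curl example above: the `command` type, `buildCommand`, and `newCommandRequest` are hypothetical helper names, and the choice of string-valued parameters is an assumption based on the examples in this document, not a confirmed wire format.

```go
package main

import (
	"bytes"
	"encoding/json"
	"net/http"
)

// command mirrors the JSON body shown in the curl example.
// Assumption: all parameters are sent as strings, as in the examples.
type command struct {
	Action string            `json:"action"`
	Params map[string]string `json:"params"`
}

// buildCommand encodes an action and its parameters as a JSON body.
func buildCommand(action string, params map[string]string) ([]byte, error) {
	return json.Marshal(command{Action: action, Params: params})
}

// newCommandRequest wraps the encoded command in a POST request to the
// daemon's /command endpoint. baseURL is e.g. "http://localhost:8989".
func newCommandRequest(baseURL, action string, params map[string]string) (*http.Request, error) {
	body, err := buildCommand(action, params)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/command", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}
```

The returned request can be sent with `http.DefaultClient.Do(req)`. The response format is not described in this section, so decoding it is left out of the sketch.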

get-partial-accessibility-tree

Retrieves the accessibility tree for a specific element and, optionally, its relatives.

Parameters:

  • selector: CSS selector for the target element
  • tab (optional): Tab ID, uses current tab if not specified
  • fetch-relatives (optional): Whether to fetch relatives, default "true"
  • timeout (optional): Timeout in seconds, default 5

Example:

curl -X POST http://localhost:8989/command \
  -H "Content-Type: application/json" \
  -d '{"action": "get-partial-accessibility-tree", "params": {"selector": "form", "fetch-relatives": "true"}}'

query-accessibility-tree

Queries the accessibility tree for nodes matching criteria.

Parameters:

  • tab (optional): Tab ID, uses current tab if not specified
  • selector (optional): CSS selector to limit search scope
  • accessible-name (optional): Accessible name to match
  • role (optional): ARIA role to match
  • timeout (optional): Timeout in seconds, default 5

Example:

curl -X POST http://localhost:8989/command \
  -H "Content-Type: application/json" \
  -d '{"action": "query-accessibility-tree", "params": {"role": "button", "accessible-name": "Submit"}}'

Client API

GetAccessibilityTree(tabID string, depth *int, timeout int) (*AccessibilityTreeResult, error)

Retrieves the full accessibility tree.

// Get full tree
result, err := client.GetAccessibilityTree("", nil, 10)

// Get tree with limited depth
depth := 2
result, err = client.GetAccessibilityTree("tab123", &depth, 10)

GetPartialAccessibilityTree(tabID, selector string, fetchRelatives bool, timeout int) (*AccessibilityTreeResult, error)

Retrieves the partial accessibility tree for an element.

result, err := client.GetPartialAccessibilityTree("", "form", true, 10)

QueryAccessibilityTree(tabID, selector, accessibleName, role string, timeout int) (*AccessibilityQueryResult, error)

Queries the accessibility tree by criteria.

// Find all buttons
result, err := client.QueryAccessibilityTree("", "", "", "button", 10)

// Find element by accessible name
result, err = client.QueryAccessibilityTree("", "", "Submit", "", 10)

// Find buttons within a form
result, err = client.QueryAccessibilityTree("", "form", "", "button", 10)

MCP Tools

get_accessibility_tree_cremotemcp

MCP tool for getting the full accessibility tree.

Parameters:

  • tab (optional): Tab ID
  • depth (optional): Maximum depth
  • timeout (optional): Timeout in seconds

get_partial_accessibility_tree_cremotemcp

MCP tool for getting a partial accessibility tree.

Parameters:

  • selector: CSS selector for target element
  • tab (optional): Tab ID
  • fetch_relatives (optional): Include relatives, default true
  • timeout (optional): Timeout in seconds

query_accessibility_tree_cremotemcp

MCP tool for querying the accessibility tree.

Parameters:

  • tab (optional): Tab ID
  • selector (optional): CSS selector scope
  • accessible_name (optional): Accessible name to match
  • role (optional): ARIA role to match
  • timeout (optional): Timeout in seconds

Data Structures

AXNode

Represents a node in the accessibility tree.

type AXNode struct {
    NodeID           string       `json:"nodeId"`
    Ignored          bool         `json:"ignored"`
    IgnoredReasons   []AXProperty `json:"ignoredReasons,omitempty"`
    Role             *AXValue     `json:"role,omitempty"`
    ChromeRole       *AXValue     `json:"chromeRole,omitempty"`
    Name             *AXValue     `json:"name,omitempty"`
    Description      *AXValue     `json:"description,omitempty"`
    Value            *AXValue     `json:"value,omitempty"`
    Properties       []AXProperty `json:"properties,omitempty"`
    ParentID         string       `json:"parentId,omitempty"`
    ChildIDs         []string     `json:"childIds,omitempty"`
    BackendDOMNodeID int          `json:"backendDOMNodeId,omitempty"`
    FrameID          string       `json:"frameId,omitempty"`
}
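
Because each node carries ParentID and ChildIDs rather than nested children, a consumer typically has to rebuild the hierarchy itself. The sketch below shows one way to do that. It assumes the result can be flattened into a slice of nodes (the result type's layout is not shown in this document), and `axNode`, `indexByID`, and `walk` are hypothetical names that use a trimmed copy of the struct above.

```go
package main

// axNode is a trimmed copy of AXNode with only the fields this sketch needs.
type axNode struct {
	NodeID   string
	Ignored  bool
	ParentID string
	ChildIDs []string
}

// indexByID maps each node ID to its node so child IDs can be resolved.
func indexByID(nodes []axNode) map[string]axNode {
	idx := make(map[string]axNode, len(nodes))
	for _, n := range nodes {
		idx[n.NodeID] = n
	}
	return idx
}

// walk visits the subtree under rootID depth-first, calling visit with each
// node and its depth. Child IDs missing from the index are skipped, which
// can happen when the tree was retrieved with a depth limit.
func walk(idx map[string]axNode, rootID string, depth int, visit func(n axNode, depth int)) {
	n, ok := idx[rootID]
	if !ok {
		return
	}
	visit(n, depth)
	for _, childID := range n.ChildIDs {
		walk(idx, childID, depth+1, visit)
	}
}
```

The sketch assumes the IDs form a tree. If cycles were possible, `walk` would need a visited set.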

AXValue

Represents a computed accessibility value.

type AXValue struct {
    Type         string          `json:"type"`
    Value        interface{}     `json:"value,omitempty"`
    RelatedNodes []AXRelatedNode `json:"relatedNodes,omitempty"`
    Sources      []AXValueSource `json:"sources,omitempty"`
}
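
Since Value is declared as interface{}, callers need a type assertion before using it, and the Role, Name, Description, and Value fields of AXNode may be nil pointers. A minimal sketch of a nil-safe accessor follows; `axValue` is a trimmed copy of the struct above and `valueString` is a hypothetical helper, not part of the Cremote API.

```go
package main

import "fmt"

// axValue is a trimmed copy of AXValue with only the fields this sketch needs.
type axValue struct {
	Type  string      `json:"type"`
	Value interface{} `json:"value,omitempty"`
}

// valueString renders an AXValue payload as text.
// A nil pointer or a nil payload yields the empty string.
func valueString(v *axValue) string {
	if v == nil || v.Value == nil {
		return ""
	}
	if s, ok := v.Value.(string); ok {
		return s
	}
	// Non-string payloads (booleans, numbers) fall back to default formatting.
	return fmt.Sprint(v.Value)
}
```

With this helper, `valueString(node.Role)` and `valueString(node.Name)` can be called without checking each pointer first.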

Use Cases

1. Accessibility Testing

Verify that web pages have proper accessibility attributes and structure.

2. Screen Reader Simulation

Understand how assistive technologies would interpret the page.

3. Semantic Web Automation

Use semantic information for more robust element selection and interaction.

4. Form Analysis

Analyze form structure and labeling for accessibility compliance.
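
As an illustration of form analysis, the sketch below flags form controls that have no accessible name. It assumes the nodes have already been retrieved (for example with a partial tree scoped to a form) and flattened into a slice. The types are trimmed copies of the structs documented above, `unlabeledControls` is a hypothetical helper, and the role list is an illustrative selection rather than a complete set.

```go
package main

import "strings"

// Trimmed copies of AXValue and AXNode with only the fields used here.
type axValue struct {
	Value interface{}
}

type axNode struct {
	NodeID  string
	Ignored bool
	Role    *axValue
	Name    *axValue
}

// text returns the string payload of v, or "" if v is nil or not a string.
func text(v *axValue) string {
	if v == nil {
		return ""
	}
	s, _ := v.Value.(string)
	return s
}

// formRoles lists the roles this sketch treats as form controls.
// Assumption: an illustrative subset, not an exhaustive list.
var formRoles = map[string]bool{
	"textbox":  true,
	"checkbox": true,
	"radio":    true,
	"combobox": true,
	"button":   true,
}

// unlabeledControls returns the IDs of non-ignored form controls whose
// accessible name is missing or blank.
func unlabeledControls(nodes []axNode) []string {
	var ids []string
	for _, n := range nodes {
		if n.Ignored || !formRoles[text(n.Role)] {
			continue
		}
		if strings.TrimSpace(text(n.Name)) == "" {
			ids = append(ids, n.NodeID)
		}
	}
	return ids
}
```

Each returned ID identifies a control that a screen reader would likely announce without a label, which makes the list a reasonable starting point for a manual review.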

5. Content Analysis

Extract structured content based on semantic roles and relationships.

Testing

Run the accessibility tree tests:

# Make the test script executable
chmod +x test_accessibility.sh

# Run the tests
./test_accessibility.sh

The test suite will:

  1. Verify daemon connectivity
  2. Test full accessibility tree retrieval
  3. Test partial accessibility tree retrieval
  4. Test accessibility tree queries by role and name
  5. Test scoped queries

Note: The test files are located in the tests/ directory to avoid conflicts with the main application build, so the script path may need adjusting depending on where you run the commands.

Best Practices

  1. Use Appropriate Depth: For performance, limit the tree depth when the full tree isn't needed
  2. Scope Queries: Use CSS selectors to limit query scope for better performance
  3. Handle Ignored Nodes: Check the Ignored field to filter out nodes that are ignored for accessibility
  4. Combine Criteria: Use multiple search criteria (for example, role plus accessible name) for more precise queries
  5. Error Handling: Always handle cases where elements might not have accessibility information

Troubleshooting

Common Issues

  1. Empty Results: Some elements may not have accessibility information if they're decorative or improperly marked up
  2. Performance: Large pages may have extensive accessibility trees; use depth limits or scoped queries
  3. Dynamic Content: The accessibility tree may change as page content updates; re-query as needed
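
For the dynamic-content case, re-querying can be wrapped in a small polling loop. The sketch below is one possible shape, not part of Cremote: `pollUntil` and `errNotReady` are hypothetical names, and the `query` callback stands in for a call such as client.QueryAccessibilityTree whose result has been reduced to a list of node IDs (that mapping is an assumption, since the result type's fields are not shown here).

```go
package main

import (
	"errors"
	"time"
)

// errNotReady is returned when no attempt produced a match.
var errNotReady = errors.New("accessibility query returned no match")

// pollUntil calls query up to attempts times, sleeping interval between
// tries, and returns the first non-empty result. A query error stops the
// loop immediately.
func pollUntil(attempts int, interval time.Duration, query func() ([]string, error)) ([]string, error) {
	for i := 0; i < attempts; i++ {
		ids, err := query()
		if err != nil {
			return nil, err
		}
		if len(ids) > 0 {
			return ids, nil
		}
		if i < attempts-1 {
			time.Sleep(interval)
		}
	}
	return nil, errNotReady
}
```

Keeping the attempt count and interval small avoids masking pages where the element never appears, and each query's own timeout parameter still applies.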

Debug Tips

  1. Use browser DevTools Accessibility panel to compare results
  2. Check element roles and names in the browser first
  3. Verify that accessibility features are enabled in Chrome
  4. Test with simple pages first before complex applications