CREMOTE ADA AUTOMATION ENHANCEMENT PLAN

Date: October 2, 2025
Status: APPROVED FOR IMPLEMENTATION
Goal: Increase automated testing coverage from 70% to 85%
Timeline: 6-8 weeks
Philosophy: KISS - Keep it Simple, Stupid


EXECUTIVE SUMMARY

This plan outlines practical enhancements to the cremote MCP accessibility testing suite. We will implement 6 new automated testing capabilities using proven, simple tools. The caption accuracy validation using speech-to-text is EXCLUDED as it's beyond our current platform capabilities.

Target Coverage Increase: 70% → 85% (15 percentage point improvement)


SCOPE EXCLUSIONS

NOT INCLUDED IN THIS PLAN:

  1. Speech-to-Text Caption Accuracy Validation

    • Reason: Requires external services (Whisper API, Google Speech-to-Text)
    • Complexity: High (video processing, audio extraction, STT integration)
    • Cost: Ongoing API costs or significant compute resources
    • Alternative: Manual review or future enhancement
  2. Real-time Live Caption Testing

    • Reason: Requires live streaming infrastructure
    • Complexity: Very high (real-time monitoring, streaming protocols)
    • Alternative: Manual testing during live events
  3. Complex Video Content Analysis

    • Reason: Determining if visual content requires audio description needs human judgment
    • Alternative: Flag all videos without descriptions for manual review

IMPLEMENTATION PHASES

PHASE 1: FOUNDATION (Weeks 1-2)

Goal: Implement high-impact, low-effort enhancements
Effort: 28-40 hours

1.1 Gradient Contrast Analysis (ImageMagick)

Priority: CRITICAL
Effort: 8-12 hours
Solves: "Incomplete" findings for text on gradient backgrounds

Deliverables:

  • New MCP tool: web_gradient_contrast_check_cremotemcp_cremotemcp
  • Takes element selector, analyzes background gradient
  • Returns worst-case contrast ratio
  • Integrates with existing contrast checker

Technical Approach:

# 1. Screenshot the element
web_screenshot_element(selector=".hero-section")

# 2. Extract the text color from computed styles
text_color = getComputedStyle(element).color

# 3. Sample 100 points across the background using ImageMagick
#    (-resize 10x10! forces a 10x10 grid, i.e. 100 sample pixels)
convert screenshot.png -resize 10x10! -depth 8 txt:- | parse_colors  # parse_colors: placeholder parser

# 4. Calculate contrast against the darkest/lightest sampled points
# 5. Return the worst-case ratio
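
The worst-case calculation in steps 4-5 uses the standard WCAG 2.x relative luminance and contrast ratio formulas. A minimal Go sketch of that math (package and function names are illustrative, not existing cremote code):

package gradientcontrast

import "math"

// linearize converts an 8-bit sRGB channel to linear light per WCAG 2.x.
func linearize(c uint8) float64 {
    s := float64(c) / 255.0
    if s <= 0.03928 {
        return s / 12.92
    }
    return math.Pow((s+0.055)/1.055, 2.4)
}

// relativeLuminance implements L = 0.2126*R + 0.7152*G + 0.0722*B.
func relativeLuminance(r, g, b uint8) float64 {
    return 0.2126*linearize(r) + 0.7152*linearize(g) + 0.0722*linearize(b)
}

// contrastRatio returns (L1 + 0.05) / (L2 + 0.05), lighter color on top.
func contrastRatio(l1, l2 float64) float64 {
    if l2 > l1 {
        l1, l2 = l2, l1
    }
    return (l1 + 0.05) / (l2 + 0.05)
}

// worstCaseContrast checks the text color against every sampled background
// pixel and returns the lowest ratio found.
func worstCaseContrast(text [3]uint8, samples [][3]uint8) float64 {
    textL := relativeLuminance(text[0], text[1], text[2])
    worst := math.Inf(1)
    for _, s := range samples {
        if ratio := contrastRatio(textL, relativeLuminance(s[0], s[1], s[2])); ratio < worst {
            worst = ratio
        }
    }
    return worst
}

A worst-case ratio below 4.5:1 for normal text (3:1 for large text) is reported as a WCAG 1.4.3 failure.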

Files to Create/Modify:

  • mcp/tools/gradient_contrast.go (new)
  • mcp/server.go (register new tool)
  • docs/llm_ada_testing.md (document usage)

1.2 Time-Based Media Validation (Basic)

Priority: CRITICAL
Effort: 8-12 hours
Solves: WCAG 1.2.2, 1.2.3, 1.2.5, 1.4.2 violations

Deliverables:

  • New MCP tool: web_media_validation_cremotemcp_cremotemcp
  • Detects all video/audio elements
  • Checks for caption tracks, audio description tracks, transcripts
  • Validates track files are accessible
  • Checks for autoplay violations

What We Test:

  • Presence of <track kind="captions">
  • Presence of <track kind="descriptions">
  • Presence of transcript links
  • Caption file accessibility (HTTP fetch)
  • Controls attribute present
  • Autoplay detection
  • Embedded player detection (YouTube, Vimeo)

What We DON'T Test:

  • Caption accuracy (requires speech-to-text)
  • Audio description quality (requires human judgment)
  • Transcript completeness (requires human judgment)

Technical Approach:

// JavaScript injection via console_command
const mediaInventory = {
  videos: Array.from(document.querySelectorAll('video')).map(v => ({
    src: v.src,
    hasCaptions: !!v.querySelector('track[kind="captions"], track[kind="subtitles"]'),
    hasDescriptions: !!v.querySelector('track[kind="descriptions"]'),
    hasControls: v.hasAttribute('controls'),
    autoplay: v.hasAttribute('autoplay'),
    captionTracks: Array.from(v.querySelectorAll('track')).map(t => ({
      kind: t.kind,
      src: t.src,
      srclang: t.srclang
    }))
  })),
  audios: Array.from(document.querySelectorAll('audio')).map(a => ({
    src: a.src,
    hasControls: a.hasAttribute('controls'),
    autoplay: a.hasAttribute('autoplay')
  })),
  embeds: Array.from(document.querySelectorAll('iframe[src*="youtube"], iframe[src*="vimeo"]')).map(i => ({
    src: i.src,
    type: i.src.includes('youtube') ? 'youtube' : 'vimeo'
  }))
};

// For each video, validate caption files (wrapped in try/catch so a
// CORS or network failure doesn't abort the whole scan)
for (const video of mediaInventory.videos) {
  for (const track of video.captionTracks) {
    try {
      const response = await fetch(track.src);
      track.accessible = response.ok;
    } catch (e) {
      track.accessible = false;
    }
  }
}

// Check for transcript links near videos
const transcriptLinks = Array.from(document.querySelectorAll('a[href*="transcript"]'));

return {mediaInventory, transcriptLinks};
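
Beyond an HTTP 200, a caption file can be sanity-checked as WebVTT, the text track format HTML5 <track> elements require. A Go sketch of a server-side version of that check (package and helper name are ours):

package mediavalidation

import (
    "io"
    "net/http"
)

// validateCaptionFile fetches a <track> src and reports whether it is
// reachable and starts with the mandatory "WEBVTT" signature (a BOM may
// precede it; handling that is omitted here for brevity).
func validateCaptionFile(url string) (accessible, isWebVTT bool, err error) {
    resp, err := http.Get(url)
    if err != nil {
        return false, false, err
    }
    defer resp.Body.Close()
    if resp.StatusCode != http.StatusOK {
        return false, false, nil
    }
    header := make([]byte, 6)
    if _, err := io.ReadFull(resp.Body, header); err != nil {
        return true, false, nil // reachable but too short to be WebVTT
    }
    return true, string(header) == "WEBVTT", nil
}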

Files to Create/Modify:

  • mcp/tools/media_validation.go (new)
  • mcp/server.go (register new tool)
  • docs/llm_ada_testing.md (document usage)

1.3 Hover/Focus Content Persistence Testing

Priority: HIGH
Effort: 12-16 hours
Solves: WCAG 1.4.13 violations (tooltips, dropdowns, popovers)

Deliverables:

  • New MCP tool: web_hover_focus_test_cremotemcp_cremotemcp
  • Identifies elements with hover/focus-triggered content
  • Tests dismissibility (Esc key)
  • Tests hoverability (can mouse move to triggered content)
  • Tests persistence (doesn't disappear immediately)

Technical Approach:

const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));
const results = [];

// 1. Find all elements with hover/focus handlers.
// Note: getEventListeners() is a DevTools Command Line API function; it is
// available to console-context evaluation, not to ordinary page scripts.
const interactiveElements = Array.from(document.querySelectorAll('*')).filter(el => {
  const events = getEventListeners(el);
  return events.mouseover || events.mouseenter || events.focus;
});

// 2. Test each element against the three WCAG 1.4.13 conditions:
// dismissible, hoverable, persistent
for (const el of interactiveElements) {
  // Trigger hover
  el.dispatchEvent(new MouseEvent('mouseover', {bubbles: true}));
  await sleep(100);

  // Check for new content
  const tooltip = document.querySelector('[role="tooltip"], .tooltip, .popover');

  if (tooltip) {
    // Test hoverability: the triggered content has a real hit area
    const rect = tooltip.getBoundingClientRect();
    const hoverable = rect.width > 0 && rect.height > 0;

    // Test persistence: content stays visible when the pointer leaves the trigger
    el.dispatchEvent(new MouseEvent('mouseout', {bubbles: true}));
    await sleep(500);
    const persistent = document.contains(tooltip);

    // Test dismissibility last: sending Esc first would detach the tooltip
    // and invalidate the other two checks
    document.dispatchEvent(new KeyboardEvent('keydown', {key: 'Escape'}));
    const dismissed = !document.contains(tooltip);

    results.push({element: el.tagName + (el.id ? '#' + el.id : ''), dismissed, hoverable, persistent});
  }
}

Files to Create/Modify:

  • mcp/tools/hover_focus_test.go (new)
  • mcp/server.go (register new tool)
  • docs/llm_ada_testing.md (document usage)

PHASE 2: EXPANSION (Weeks 3-4)

Goal: Add medium-complexity enhancements
Effort: 36-52 hours

2.1 Text-in-Images Detection (OCR)

Priority: HIGH
Effort: 12-16 hours
Solves: WCAG 1.4.5 violations (images of text)

Deliverables:

  • New MCP tool: web_text_in_images_check_cremotemcp_cremotemcp
  • Downloads all images from page
  • Runs Tesseract OCR on each image
  • Flags images containing significant text (>5 words)
  • Compares detected text with alt text
  • Excludes logos (configurable)

Technical Approach:

# 1. Extract all image URLs as JSON
images_json=$(console_command "JSON.stringify(Array.from(document.querySelectorAll('img')).map(img => ({src: img.src, alt: img.alt})))")

# 2. Download each image to the container (jq parses the JSON inventory)
i=0
echo "$images_json" | jq -c '.[]' | while read -r img; do
  src=$(echo "$img" | jq -r '.src')
  alt=$(echo "$img" | jq -r '.alt')
  curl -s -o "/tmp/img_$i.png" "$src"

  # 3. Run OCR (tesseract appends .txt to the output base name)
  tesseract "/tmp/img_$i.png" "/tmp/img_${i}_text" --psm 6

  # 4. Count words
  word_count=$(wc -w < "/tmp/img_${i}_text.txt")

  # 5. If >5 words, flag for review
  if [ "$word_count" -gt 5 ]; then
    echo "WARNING: Image contains text ($word_count words)"
    echo "Image: $src"
    echo "Alt text: $alt"
    echo "Detected text: $(cat "/tmp/img_${i}_text.txt")"
    echo "MANUAL REVIEW: Verify if this should be HTML text instead"
  fi
  i=$((i + 1))
done
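
The deliverables also call for comparing detected text against alt text. A crude token-overlap score is enough to flag likely mismatches for manual review; a Go sketch (the heuristic, package, and function name are ours, not an agreed scoring method):

package textinimages

import "strings"

// altTextCoverage returns the fraction of OCR-detected words that also
// appear in the image's alt text. A low score on a text-heavy image
// suggests the alt text does not convey the embedded text.
func altTextCoverage(ocrText, altText string) float64 {
    clean := func(w string) string { return strings.Trim(strings.ToLower(w), ".,!?:;\"'()") }
    altWords := map[string]bool{}
    for _, w := range strings.Fields(altText) {
        altWords[clean(w)] = true
    }
    ocrWords := strings.Fields(ocrText)
    if len(ocrWords) == 0 {
        return 1.0 // no detected text, nothing to cover
    }
    covered := 0
    for _, w := range ocrWords {
        if altWords[clean(w)] {
            covered++
        }
    }
    return float64(covered) / float64(len(ocrWords))
}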

Dependencies:

  • Tesseract OCR (install in container)
  • curl or wget for image download
  • jq for parsing the image inventory JSON

Files to Create/Modify:

  • mcp/tools/text_in_images.go (new)
  • Dockerfile (add tesseract-ocr)
  • mcp/server.go (register new tool)
  • docs/llm_ada_testing.md (document usage)

2.2 Cross-Page Consistency Analysis

Priority: MEDIUM
Effort: 16-24 hours
Solves: WCAG 3.2.3, 3.2.4 violations (consistent navigation/identification)

Deliverables:

  • New MCP tool: web_consistency_check_cremotemcp_cremotemcp
  • Crawls multiple pages (configurable limit)
  • Extracts navigation structure from each page
  • Compares navigation order across pages
  • Identifies common elements (search, login, cart)
  • Verifies consistent labeling

Technical Approach:

// 1. Crawl the site (capped at 20 pages for performance).
// Conceptual pseudocode: injected JS does not survive navigation, so in
// practice the Go tool drives navigation and re-runs the per-page extraction;
// navigateTo() stands in for that MCP navigation step.
const pages = [];
const visited = new Set();

async function crawlPage(url, depth = 0) {
  if (pages.length >= 20 || depth > 2 || visited.has(url)) return;
  visited.add(url);
  
  await navigateTo(url);
  
  pages.push({
    url,
    navigation: Array.from(document.querySelectorAll('nav a, header a')).map(a => ({
      text: a.textContent.trim(),
      href: a.href,
      order: Array.from(a.parentElement.children).indexOf(a)
    })),
    commonElements: {
      search: document.querySelector('[type="search"], [role="search"]')?.outerHTML,
      login: document.querySelector('a[href*="login"]')?.textContent,
      cart: document.querySelector('a[href*="cart"]')?.textContent
    }
  });
  
  // Find more pages
  const links = Array.from(document.querySelectorAll('a[href]'))
    .map(a => a.href)
    .filter(href => href.startsWith(window.location.origin))
    .slice(0, 10);
  
  for (const link of links) {
    await crawlPage(link, depth + 1);
  }
}

// 2. Analyze consistency
const navOrders = pages.map(p => p.navigation.map(n => n.text).join('|'));
const uniqueOrders = [...new Set(navOrders)];

if (uniqueOrders.length > 1) {
  // Navigation order varies - FAIL WCAG 3.2.3
}

// Check common element consistency
const searchLabels = pages.map(p => p.commonElements.search).filter(Boolean);
if (new Set(searchLabels).size > 1) {
  // Search identified inconsistently - FAIL WCAG 3.2.4
}
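
On the Go side, reporting where the navigation first diverges (not just that it diverges) makes the WCAG 3.2.3 finding actionable. A sketch with hypothetical struct types mirroring the JavaScript extraction above:

package consistency

import "fmt"

// NavLink and PageNav are hypothetical types mirroring the JS extraction.
type NavLink struct {
    Text string
    Href string
}

type PageNav struct {
    URL        string
    Navigation []NavLink
}

// navDivergences compares each page's navigation to the first page crawled
// and describes the first difference found per page.
func navDivergences(pages []PageNav) []string {
    var findings []string
    if len(pages) < 2 {
        return findings
    }
    base := pages[0]
    for _, p := range pages[1:] {
        n := len(base.Navigation)
        if len(p.Navigation) < n {
            n = len(p.Navigation)
        }
        for i := 0; i < n; i++ {
            if base.Navigation[i].Text != p.Navigation[i].Text {
                findings = append(findings, fmt.Sprintf(
                    "WCAG 3.2.3: nav item %d is %q on %s but %q on %s",
                    i, base.Navigation[i].Text, base.URL, p.Navigation[i].Text, p.URL))
                break
            }
        }
        if len(base.Navigation) != len(p.Navigation) {
            findings = append(findings, fmt.Sprintf(
                "WCAG 3.2.3: nav has %d items on %s but %d on %s",
                len(base.Navigation), base.URL, len(p.Navigation), p.URL))
        }
    }
    return findings
}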

Files to Create/Modify:

  • mcp/tools/consistency_check.go (new)
  • mcp/server.go (register new tool)
  • docs/llm_ada_testing.md (document usage)

2.3 Sensory Characteristics Detection (Pattern Matching)

Priority: MEDIUM
Effort: 8-12 hours
Solves: WCAG 1.3.3 violations (instructions relying on sensory characteristics)

Deliverables:

  • New MCP tool: web_sensory_check_cremotemcp_cremotemcp
  • Scans page text for sensory-only instructions
  • Flags phrases like "click the red button", "square icon", "on the right"
  • Uses regex pattern matching
  • Provides context for manual review

Technical Approach:

// Pattern matching for sensory-only instructions.
// Word boundaries (\b) keep substrings from matching, e.g. "copyright"
// would otherwise match "right".
const sensoryPatterns = [
  // Color-only
  /\bclick (the )?(red|green|blue|yellow|orange|purple|pink|gray|grey) (button|link|icon)\b/gi,
  /\bthe (red|green|blue|yellow|orange|purple|pink|gray|grey) (button|link|icon)\b/gi,

  // Shape-only
  /\b(round|square|circular|rectangular|triangular) (button|icon|shape)\b/gi,
  /\bclick (the )?(circle|square|triangle|rectangle)\b/gi,

  // Position-only (the positional prefix is required; matching every bare
  // "left" or "right" would flood the report with false positives)
  /\b(on the |at the )(left|right|top|bottom|above|below)\b/gi,
  /\bbutton (on the |at the )?(left|right|top|bottom)\b/gi,

  // Size-only
  /\b(large|small|big|little) (button|icon|link)\b/gi,

  // Sound-only
  /\bwhen you hear (the )?(beep|sound|tone|chime)\b/gi
];

const pageText = document.body.innerText;
const violations = [];

for (const pattern of sensoryPatterns) {
  const matches = pageText.matchAll(pattern);
  for (const match of matches) {
    // Get context (50 chars before and after)
    const index = match.index;
    const context = pageText.substring(index - 50, index + match[0].length + 50);
    
    violations.push({
      text: match[0],
      context,
      pattern: pattern.source,
      wcag: '1.3.3 Sensory Characteristics'
    });
  }
}

return violations;
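
The same patterns port directly to the Go tool: RE2 accepts this syntax, with (?i) standing in for the /gi flags. A minimal runnable sketch with a shortened pattern list:

package main

import (
    "fmt"
    "regexp"
)

// Shortened mirror of the JS pattern list above.
var sensoryPatterns = []*regexp.Regexp{
    regexp.MustCompile(`(?i)\bclick (the )?(red|green|blue|yellow) (button|link|icon)\b`),
    regexp.MustCompile(`(?i)\b(round|square|circular|rectangular) (button|icon|shape)\b`),
    regexp.MustCompile(`(?i)\bwhen you hear (the )?(beep|sound|tone|chime)\b`),
}

func main() {
    text := "To continue, click the red button next to the square icon."
    for _, p := range sensoryPatterns {
        for _, m := range p.FindAllString(text, -1) {
            // Hits are flagged for manual review, not auto-failed.
            fmt.Printf("possible WCAG 1.3.3 issue: %q\n", m)
        }
    }
}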

Files to Create/Modify:

  • mcp/tools/sensory_check.go (new)
  • mcp/server.go (register new tool)
  • docs/llm_ada_testing.md (document usage)

PHASE 3: ADVANCED (Weeks 5-6)

Goal: Add complex but valuable enhancements
Effort: 24-36 hours

3.1 Animation & Flash Detection (Video Analysis)

Priority: MEDIUM
Effort: 16-24 hours
Solves: WCAG 2.3.1 violations (three flashes or below threshold)

Deliverables:

  • New MCP tool: web_flash_detection_cremotemcp_cremotemcp
  • Records page for 10 seconds using CDP screencast
  • Analyzes frames for brightness changes
  • Counts flashes per second
  • Flags if >3 flashes/second detected

Technical Approach:

// Use the Chrome DevTools Protocol to capture a screencast
func (t *FlashDetectionTool) Execute(params map[string]interface{}) (interface{}, error) {
    // 1. Start the screencast (Quality applies to jpeg only; ignored for png)
    err := t.cdp.Page.StartScreencast(&page.StartScreencastArgs{
        Format:    "png",
        Quality:   80,
        MaxWidth:  1280,
        MaxHeight: 800,
    })
    if err != nil {
        return nil, err
    }

    // 2. Collect frames for 10 seconds. Each frame must be acknowledged via
    // Page.screencastFrameAck or Chrome stops sending new frames.
    frames := [][]byte{}
    timeout := time.After(10 * time.Second)

collect:
    for {
        select {
        case frame := <-t.cdp.Page.ScreencastFrame:
            frames = append(frames, frame.Data)
        case <-timeout:
            break collect
        }
    }

    // 3. Analyze brightness changes between consecutive frames
    flashes := 0
    for i := 1; i < len(frames); i++ {
        brightness1 := calculateBrightness(frames[i-1])
        brightness2 := calculateBrightness(frames[i])

        // Count a brightness change of more than 20% as a flash.
        // (Simplification: WCAG defines a flash as a pair of opposing changes.)
        if math.Abs(brightness2-brightness1) > 0.2 {
            flashes++
        }
    }

    // 4. Calculate flashes per second. WCAG 2.3.1 limits flashes in ANY
    // one-second period; averaging over 10s can under-count clustered
    // flashes, so a sliding one-second window is a stricter follow-up.
    flashesPerSecond := float64(flashes) / 10.0

    return map[string]interface{}{
        "flashes_detected":   flashes,
        "flashes_per_second": flashesPerSecond,
        "passes":             flashesPerSecond <= 3.0,
        "wcag":               "2.3.1 Three Flashes or Below Threshold",
    }, nil
}
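
The snippet above assumes a calculateBrightness helper. A minimal sketch using only the standard library, assuming each frame has already been base64-decoded to raw PNG bytes (most CDP client libraries decode for you):

package flashdetection

import (
    "bytes"
    "image/png"
)

// calculateBrightness decodes one screencast frame and returns its mean
// luminance normalized to [0, 1]. Samples every 8th pixel for speed.
func calculateBrightness(frame []byte) float64 {
    img, err := png.Decode(bytes.NewReader(frame))
    if err != nil {
        return 0 // undecodable frame: treat as black rather than abort
    }
    b := img.Bounds()
    var sum, count float64
    for y := b.Min.Y; y < b.Max.Y; y += 8 {
        for x := b.Min.X; x < b.Max.X; x += 8 {
            r, g, bl, _ := img.At(x, y).RGBA() // 16-bit channels, 0-65535
            // Rec. 601 luma approximation
            sum += (0.299*float64(r) + 0.587*float64(g) + 0.114*float64(bl)) / 65535.0
            count++
        }
    }
    if count == 0 {
        return 0
    }
    return sum / count
}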

Dependencies:

  • Chrome DevTools Protocol screencast API
  • Image processing library (Go image package)

Files to Create/Modify:

  • mcp/tools/flash_detection.go (new)
  • mcp/server.go (register new tool)
  • docs/llm_ada_testing.md (document usage)

3.2 Enhanced Accessibility Tree Analysis

Priority: MEDIUM
Effort: 8-12 hours
Solves: Better detection of ARIA issues, role/name/value problems

Deliverables:

  • Enhance existing get_accessibility_tree_cremotemcp_cremotemcp tool
  • Add validation rules for common ARIA mistakes
  • Check for invalid role combinations
  • Verify required ARIA properties
  • Detect orphaned ARIA references

Technical Approach:

// Validate ARIA usage
const ariaValidation = {
  // Check for invalid roles (list truncated for brevity)
  invalidRoles: Array.from(document.querySelectorAll('[role]')).filter(el => {
    const role = el.getAttribute('role');
    const validRoles = ['button', 'link', 'navigation', 'main', 'complementary' /* ...full WAI-ARIA role list... */];
    return !validRoles.includes(role);
  }),

  // Check for role="button" elements with no accessible name
  missingProperties: Array.from(document.querySelectorAll('[role="button"]')).filter(el => {
    return !el.hasAttribute('aria-label') &&
           !el.hasAttribute('aria-labelledby') &&
           !el.textContent.trim();
  }),

  // Check for orphaned aria-describedby/labelledby references
  // (both attributes take space-separated lists of IDs)
  orphanedReferences: Array.from(document.querySelectorAll('[aria-describedby], [aria-labelledby]')).filter(el => {
    const ids = [
      ...(el.getAttribute('aria-describedby') || '').split(/\s+/),
      ...(el.getAttribute('aria-labelledby') || '').split(/\s+/)
    ].filter(Boolean);
    return ids.some(id => !document.getElementById(id));
  })
};

Files to Create/Modify:

  • mcp/tools/accessibility_tree.go (enhance existing)
  • docs/llm_ada_testing.md (document new validations)

IMPLEMENTATION SCHEDULE

Week 1-2: Phase 1 Foundation

  • Day 1-3: Gradient contrast analysis (ImageMagick)
  • Day 4-6: Time-based media validation (basic)
  • Day 7-10: Hover/focus content testing

Week 3-4: Phase 2 Expansion

  • Day 11-14: Text-in-images detection (OCR)
  • Day 15-20: Cross-page consistency analysis
  • Day 21-23: Sensory characteristics detection

Week 5-6: Phase 3 Advanced

  • Day 24-30: Animation/flash detection
  • Day 31-35: Enhanced accessibility tree analysis

Week 7-8: Testing & Documentation

  • Day 36-40: Integration testing
  • Day 41-45: Documentation updates
  • Day 46-50: User acceptance testing

TECHNICAL REQUIREMENTS

Container Dependencies

# Add to Dockerfile
RUN apt-get update && apt-get install -y \
    imagemagick \
    tesseract-ocr \
    tesseract-ocr-eng \
    && rm -rf /var/lib/apt/lists/*

Go Dependencies

// Add to go.mod
require (
    github.com/chromedp/cdproto v0.0.0-20231011050154-1d073bb38998
    github.com/disintegration/imaging v1.6.2 // Image processing
)

Configuration

# Add to cremote config
automation_enhancements:
  gradient_contrast:
    enabled: true
    sample_points: 100
  
  media_validation:
    enabled: true
    check_embedded_players: true
    youtube_api_key: "" # Optional
  
  text_in_images:
    enabled: true
    min_word_threshold: 5
    exclude_logos: true
  
  consistency_check:
    enabled: true
    max_pages: 20
    max_depth: 2
  
  flash_detection:
    enabled: true
    recording_duration: 10 # seconds
    brightness_threshold: 0.2 # fraction of full brightness
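
On the Go side this block maps naturally onto a tagged struct; a sketch assuming gopkg.in/yaml.v3 for unmarshalling (struct and field names are ours):

package config

// AutomationEnhancements mirrors the YAML block above.
type AutomationEnhancements struct {
    GradientContrast struct {
        Enabled      bool `yaml:"enabled"`
        SamplePoints int  `yaml:"sample_points"`
    } `yaml:"gradient_contrast"`

    MediaValidation struct {
        Enabled              bool   `yaml:"enabled"`
        CheckEmbeddedPlayers bool   `yaml:"check_embedded_players"`
        YouTubeAPIKey        string `yaml:"youtube_api_key"`
    } `yaml:"media_validation"`

    TextInImages struct {
        Enabled          bool `yaml:"enabled"`
        MinWordThreshold int  `yaml:"min_word_threshold"`
        ExcludeLogos     bool `yaml:"exclude_logos"`
    } `yaml:"text_in_images"`

    ConsistencyCheck struct {
        Enabled  bool `yaml:"enabled"`
        MaxPages int  `yaml:"max_pages"`
        MaxDepth int  `yaml:"max_depth"`
    } `yaml:"consistency_check"`

    FlashDetection struct {
        Enabled             bool    `yaml:"enabled"`
        RecordingDuration   int     `yaml:"recording_duration"` // seconds
        BrightnessThreshold float64 `yaml:"brightness_threshold"`
    } `yaml:"flash_detection"`
}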

SUCCESS METRICS

Coverage Targets

  • Current: 70% automated coverage
  • After Phase 1: 78% automated coverage (+8 points)
  • After Phase 2: 83% automated coverage (+5 points)
  • After Phase 3: 85% automated coverage (+2 points)

Quality Metrics

  • False Positive Rate: <10%
  • False Negative Rate: <5%
  • Test Execution Time: <5 minutes per page
  • Report Clarity: 100% actionable findings

Performance Targets

  • Gradient contrast: <2 seconds per element
  • Media validation: <5 seconds per page
  • Text-in-images: <1 second per image
  • Consistency check: <30 seconds for 20 pages
  • Flash detection: 10 seconds (fixed recording time)

RISK MITIGATION

Technical Risks

  1. ImageMagick performance on large images

    • Mitigation: Resize images before analysis (see the sketch after this list)
    • Fallback: Skip images >5MB
  2. Tesseract OCR accuracy

    • Mitigation: Set confidence threshold
    • Fallback: Flag low-confidence results for manual review
  3. CDP screencast reliability

    • Mitigation: Implement retry logic
    • Fallback: Skip flash detection if screencast fails
  4. Cross-page crawling performance

    • Mitigation: Limit to 20 pages, depth 2
    • Fallback: Allow user to specify page list
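
For risk 1 above, the resize step can reuse the disintegration/imaging dependency already listed in go.mod instead of shelling out to ImageMagick; a sketch (helper name is ours):

package textinimages

import (
    "image"

    "github.com/disintegration/imaging"
)

// downscaleForAnalysis caps an image at 2000x2000 before contrast or OCR
// work, preserving aspect ratio; small images pass through untouched.
func downscaleForAnalysis(img image.Image) image.Image {
    b := img.Bounds()
    if b.Dx() <= 2000 && b.Dy() <= 2000 {
        return img
    }
    return imaging.Fit(img, 2000, 2000, imaging.Lanczos)
}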

Operational Risks

  1. Container size increase

    • Mitigation: Use multi-stage Docker builds
    • Monitor: Keep container <500MB
  2. Increased test execution time

    • Mitigation: Make all enhancements optional
    • Allow: Users to enable/disable specific tests

DELIVERABLES

Code

  • 6 new MCP tools (gradient, media, hover, OCR, consistency, flash)
  • 1 enhanced tool (accessibility tree)
  • Updated Dockerfile with dependencies
  • Updated configuration schema
  • Integration tests for all new tools

Documentation

  • Updated docs/llm_ada_testing.md with new tools
  • Updated enhanced_chromium_ada_checklist.md with automation notes
  • New docs/AUTOMATION_TOOLS.md with technical details
  • Updated README with new capabilities
  • Example usage for each new tool

Testing

  • Unit tests for each new tool
  • Integration tests with real websites
  • Performance benchmarks
  • Accuracy validation against manual testing

MAINTENANCE PLAN

Ongoing Support

  • Monitor false positive/negative rates
  • Update pattern matching rules (sensory characteristics)
  • Keep dependencies updated (ImageMagick, Tesseract)
  • Add new ARIA validation rules as spec evolves

Future Enhancements (Post-Plan)

  • LLM-assisted semantic analysis (if budget allows)
  • Speech-to-text caption validation (if external service available)
  • Real-time live caption testing (if streaming infrastructure added)
  • Advanced video content analysis (if AI/ML resources available)

APPROVAL & SIGN-OFF

Plan Status: READY FOR APPROVAL

Estimated Total Effort: 88-128 hours (11-16 business days)

Estimated Timeline: 6-8 weeks (with testing and documentation)

Budget Impact: Minimal (only open-source dependencies)

Risk Level: LOW (all technologies proven and stable)


Next Steps:

  1. Review and approve this plan
  2. Set up development environment with new dependencies
  3. Begin Phase 1 implementation
  4. Schedule weekly progress reviews

Document Prepared By: Cremote Development Team
Date: October 2, 2025
Version: 1.0