CREMOTE ADA AUTOMATION ENHANCEMENT PLAN

Date: October 2, 2025
Status: APPROVED FOR IMPLEMENTATION
Goal: Increase automated testing coverage from 70% to 85%
Timeline: 6-8 weeks
Philosophy: KISS - Keep it Simple, Stupid


EXECUTIVE SUMMARY

This plan outlines practical enhancements to the cremote MCP accessibility testing suite. We will implement 6 new automated testing capabilities using proven, simple tools. The caption accuracy validation using speech-to-text is EXCLUDED as it's beyond our current platform capabilities.

Target Coverage Increase: 70% → 85% (15 percentage point improvement)


SCOPE EXCLUSIONS

NOT INCLUDED IN THIS PLAN:

  1. Speech-to-Text Caption Accuracy Validation

    • Reason: Requires external services (Whisper API, Google Speech-to-Text)
    • Complexity: High (video processing, audio extraction, STT integration)
    • Cost: Ongoing API costs or significant compute resources
    • Alternative: Manual review or future enhancement
  2. Real-time Live Caption Testing

    • Reason: Requires live streaming infrastructure
    • Complexity: Very high (real-time monitoring, streaming protocols)
    • Alternative: Manual testing during live events
  3. Complex Video Content Analysis

    • Reason: Determining if visual content requires audio description needs human judgment
    • Alternative: Flag all videos without descriptions for manual review

IMPLEMENTATION PHASES

PHASE 1: FOUNDATION (Weeks 1-2)

Goal: Implement high-impact, low-effort enhancements
Effort: 28-40 hours

1.1 Gradient Contrast Analysis (ImageMagick)

Priority: CRITICAL
Effort: 8-12 hours
Solves: "Incomplete" findings for text on gradient backgrounds

Deliverables:

  • New MCP tool: web_gradient_contrast_check_cremotemcp_cremotemcp
  • Takes element selector, analyzes background gradient
  • Returns worst-case contrast ratio
  • Integrates with existing contrast checker

Technical Approach:

# 1. Screenshot the element
web_screenshot_element(selector=".hero-section")

# 2. Extract the text color from computed styles
text_color = getComputedStyle(element).color

# 3. Sample 100 points across the background using ImageMagick
#    (-resize 10x10! forces a 10x10 grid, i.e. 100 sample pixels)
convert screenshot.png -resize 10x10! -depth 8 txt:- | parse_colors  # parse_colors: placeholder parser

# 4. Calculate contrast against the darkest/lightest sampled points
# 5. Return the worst-case ratio
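
The worst-case calculation in steps 4-5 uses the standard WCAG 2.x relative luminance and contrast ratio formulas. A minimal Go sketch of that math (package and function names are illustrative, not existing cremote code):

package gradientcontrast

import "math"

// linearize converts an 8-bit sRGB channel to linear light per WCAG 2.x.
func linearize(c uint8) float64 {
    s := float64(c) / 255.0
    if s <= 0.03928 {
        return s / 12.92
    }
    return math.Pow((s+0.055)/1.055, 2.4)
}

// relativeLuminance implements L = 0.2126*R + 0.7152*G + 0.0722*B.
func relativeLuminance(r, g, b uint8) float64 {
    return 0.2126*linearize(r) + 0.7152*linearize(g) + 0.0722*linearize(b)
}

// contrastRatio returns (L1 + 0.05) / (L2 + 0.05), lighter color on top.
func contrastRatio(l1, l2 float64) float64 {
    if l2 > l1 {
        l1, l2 = l2, l1
    }
    return (l1 + 0.05) / (l2 + 0.05)
}

// worstCaseContrast checks the text color against every sampled background
// pixel and returns the lowest ratio found.
func worstCaseContrast(text [3]uint8, samples [][3]uint8) float64 {
    textL := relativeLuminance(text[0], text[1], text[2])
    worst := math.Inf(1)
    for _, s := range samples {
        if ratio := contrastRatio(textL, relativeLuminance(s[0], s[1], s[2])); ratio < worst {
            worst = ratio
        }
    }
    return worst
}

A worst-case ratio below 4.5:1 for normal text (3:1 for large text) is reported as a WCAG 1.4.3 failure.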

Files to Create/Modify:

  • mcp/tools/gradient_contrast.go (new)
  • mcp/server.go (register new tool)
  • docs/llm_ada_testing.md (document usage)

1.2 Time-Based Media Validation (Basic)

Priority: CRITICAL
Effort: 8-12 hours
Solves: WCAG 1.2.2, 1.2.3, 1.2.5, 1.4.2 violations

Deliverables:

  • New MCP tool: web_media_validation_cremotemcp_cremotemcp
  • Detects all video/audio elements
  • Checks for caption tracks, audio description tracks, transcripts
  • Validates track files are accessible
  • Checks for autoplay violations

What We Test:

  • Presence of <track kind="captions">
  • Presence of <track kind="descriptions">
  • Presence of transcript links
  • Caption file accessibility (HTTP fetch)
  • Controls attribute present
  • Autoplay detection
  • Embedded player detection (YouTube, Vimeo)

What We DON'T Test:

  • Caption accuracy (requires speech-to-text)
  • Audio description quality (requires human judgment)
  • Transcript completeness (requires human judgment)

Technical Approach:

// JavaScript injection via console_command
const mediaInventory = {
  videos: Array.from(document.querySelectorAll('video')).map(v => ({
    src: v.src,
    hasCaptions: !!v.querySelector('track[kind="captions"], track[kind="subtitles"]'),
    hasDescriptions: !!v.querySelector('track[kind="descriptions"]'),
    hasControls: v.hasAttribute('controls'),
    autoplay: v.hasAttribute('autoplay'),
    captionTracks: Array.from(v.querySelectorAll('track')).map(t => ({
      kind: t.kind,
      src: t.src,
      srclang: t.srclang
    }))
  })),
  audios: Array.from(document.querySelectorAll('audio')).map(a => ({
    src: a.src,
    hasControls: a.hasAttribute('controls'),
    autoplay: a.hasAttribute('autoplay')
  })),
  embeds: Array.from(document.querySelectorAll('iframe[src*="youtube"], iframe[src*="vimeo"]')).map(i => ({
    src: i.src,
    type: i.src.includes('youtube') ? 'youtube' : 'vimeo'
  }))
};

// For each video, validate caption files (wrapped in try/catch so a
// CORS or network failure doesn't abort the whole scan)
for (const video of mediaInventory.videos) {
  for (const track of video.captionTracks) {
    try {
      const response = await fetch(track.src);
      track.accessible = response.ok;
    } catch (e) {
      track.accessible = false;
    }
  }
}

// Check for transcript links near videos
const transcriptLinks = Array.from(document.querySelectorAll('a[href*="transcript"]'));

return {mediaInventory, transcriptLinks};
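
Beyond an HTTP 200, a caption file can be sanity-checked as WebVTT, the text track format HTML5 <track> elements require. A Go sketch of a server-side version of that check (package and helper name are ours):

package mediavalidation

import (
    "io"
    "net/http"
)

// validateCaptionFile fetches a <track> src and reports whether it is
// reachable and starts with the mandatory "WEBVTT" signature (a BOM may
// precede it; handling that is omitted here for brevity).
func validateCaptionFile(url string) (accessible, isWebVTT bool, err error) {
    resp, err := http.Get(url)
    if err != nil {
        return false, false, err
    }
    defer resp.Body.Close()
    if resp.StatusCode != http.StatusOK {
        return false, false, nil
    }
    header := make([]byte, 6)
    if _, err := io.ReadFull(resp.Body, header); err != nil {
        return true, false, nil // reachable but too short to be WebVTT
    }
    return true, string(header) == "WEBVTT", nil
}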

Files to Create/Modify:

  • mcp/tools/media_validation.go (new)
  • mcp/server.go (register new tool)
  • docs/llm_ada_testing.md (document usage)

1.3 Hover/Focus Content Persistence Testing

Priority: HIGH
Effort: 12-16 hours
Solves: WCAG 1.4.13 violations (tooltips, dropdowns, popovers)

Deliverables:

  • New MCP tool: web_hover_focus_test_cremotemcp_cremotemcp
  • Identifies elements with hover/focus-triggered content
  • Tests dismissibility (Esc key)
  • Tests hoverability (can mouse move to triggered content)
  • Tests persistence (doesn't disappear immediately)

Technical Approach:

const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));
const results = [];

// 1. Find all elements with hover/focus handlers.
// Note: getEventListeners() is a DevTools Command Line API function; it is
// available to console-context evaluation, not to ordinary page scripts.
const interactiveElements = Array.from(document.querySelectorAll('*')).filter(el => {
  const events = getEventListeners(el);
  return events.mouseover || events.mouseenter || events.focus;
});

// 2. Test each element against the three WCAG 1.4.13 conditions:
// dismissible, hoverable, persistent
for (const el of interactiveElements) {
  // Trigger hover
  el.dispatchEvent(new MouseEvent('mouseover', {bubbles: true}));
  await sleep(100);

  // Check for new content
  const tooltip = document.querySelector('[role="tooltip"], .tooltip, .popover');

  if (tooltip) {
    // Test hoverability: the triggered content has a real hit area
    const rect = tooltip.getBoundingClientRect();
    const hoverable = rect.width > 0 && rect.height > 0;

    // Test persistence: content stays visible when the pointer leaves the trigger
    el.dispatchEvent(new MouseEvent('mouseout', {bubbles: true}));
    await sleep(500);
    const persistent = document.contains(tooltip);

    // Test dismissibility last: sending Esc first would detach the tooltip
    // and invalidate the other two checks
    document.dispatchEvent(new KeyboardEvent('keydown', {key: 'Escape'}));
    const dismissed = !document.contains(tooltip);

    results.push({element: el.tagName + (el.id ? '#' + el.id : ''), dismissed, hoverable, persistent});
  }
}

Files to Create/Modify:

  • mcp/tools/hover_focus_test.go (new)
  • mcp/server.go (register new tool)
  • docs/llm_ada_testing.md (document usage)

PHASE 2: EXPANSION (Weeks 3-4)

Goal: Add medium-complexity enhancements
Effort: 36-52 hours

2.1 Text-in-Images Detection (OCR)

Priority: HIGH
Effort: 12-16 hours
Solves: WCAG 1.4.5 violations (images of text)

Deliverables:

  • New MCP tool: web_text_in_images_check_cremotemcp_cremotemcp
  • Downloads all images from page
  • Runs Tesseract OCR on each image
  • Flags images containing significant text (>5 words)
  • Compares detected text with alt text
  • Excludes logos (configurable)

Technical Approach:

# 1. Extract all image URLs as JSON
images_json=$(console_command "JSON.stringify(Array.from(document.querySelectorAll('img')).map(img => ({src: img.src, alt: img.alt})))")

# 2. Download each image to the container (jq parses the JSON inventory)
i=0
echo "$images_json" | jq -c '.[]' | while read -r img; do
  src=$(echo "$img" | jq -r '.src')
  alt=$(echo "$img" | jq -r '.alt')
  curl -s -o "/tmp/img_$i.png" "$src"

  # 3. Run OCR (tesseract appends .txt to the output base name)
  tesseract "/tmp/img_$i.png" "/tmp/img_${i}_text" --psm 6

  # 4. Count words
  word_count=$(wc -w < "/tmp/img_${i}_text.txt")

  # 5. If >5 words, flag for review
  if [ "$word_count" -gt 5 ]; then
    echo "WARNING: Image contains text ($word_count words)"
    echo "Image: $src"
    echo "Alt text: $alt"
    echo "Detected text: $(cat "/tmp/img_${i}_text.txt")"
    echo "MANUAL REVIEW: Verify if this should be HTML text instead"
  fi
  i=$((i + 1))
done
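
The deliverables also call for comparing detected text against alt text. A crude token-overlap score is enough to flag likely mismatches for manual review; a Go sketch (the heuristic, package, and function name are ours, not an agreed scoring method):

package textinimages

import "strings"

// altTextCoverage returns the fraction of OCR-detected words that also
// appear in the image's alt text. A low score on a text-heavy image
// suggests the alt text does not convey the embedded text.
func altTextCoverage(ocrText, altText string) float64 {
    clean := func(w string) string { return strings.Trim(strings.ToLower(w), ".,!?:;\"'()") }
    altWords := map[string]bool{}
    for _, w := range strings.Fields(altText) {
        altWords[clean(w)] = true
    }
    ocrWords := strings.Fields(ocrText)
    if len(ocrWords) == 0 {
        return 1.0 // no detected text, nothing to cover
    }
    covered := 0
    for _, w := range ocrWords {
        if altWords[clean(w)] {
            covered++
        }
    }
    return float64(covered) / float64(len(ocrWords))
}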

Dependencies:

  • Tesseract OCR (install in container)
  • curl or wget for image download
  • jq for parsing the image inventory JSON

Files to Create/Modify:

  • mcp/tools/text_in_images.go (new)
  • Dockerfile (add tesseract-ocr)
  • mcp/server.go (register new tool)
  • docs/llm_ada_testing.md (document usage)

2.2 Cross-Page Consistency Analysis

Priority: MEDIUM
Effort: 16-24 hours
Solves: WCAG 3.2.3, 3.2.4 violations (consistent navigation/identification)

Deliverables:

  • New MCP tool: web_consistency_check_cremotemcp_cremotemcp
  • Crawls multiple pages (configurable limit)
  • Extracts navigation structure from each page
  • Compares navigation order across pages
  • Identifies common elements (search, login, cart)
  • Verifies consistent labeling

Technical Approach:

// 1. Crawl the site (capped at 20 pages for performance).
// Conceptual pseudocode: injected JS does not survive navigation, so in
// practice the Go tool drives navigation and re-runs the per-page extraction;
// navigateTo() stands in for that MCP navigation step.
const pages = [];
const visited = new Set();

async function crawlPage(url, depth = 0) {
  if (pages.length >= 20 || depth > 2 || visited.has(url)) return;
  visited.add(url);
  
  await navigateTo(url);
  
  pages.push({
    url,
    navigation: Array.from(document.querySelectorAll('nav a, header a')).map(a => ({
      text: a.textContent.trim(),
      href: a.href,
      order: Array.from(a.parentElement.children).indexOf(a)
    })),
    commonElements: {
      search: document.querySelector('[type="search"], [role="search"]')?.outerHTML,
      login: document.querySelector('a[href*="login"]')?.textContent,
      cart: document.querySelector('a[href*="cart"]')?.textContent
    }
  });
  
  // Find more pages
  const links = Array.from(document.querySelectorAll('a[href]'))
    .map(a => a.href)
    .filter(href => href.startsWith(window.location.origin))
    .slice(0, 10);
  
  for (const link of links) {
    await crawlPage(link, depth + 1);
  }
}

// 2. Analyze consistency
const navOrders = pages.map(p => p.navigation.map(n => n.text).join('|'));
const uniqueOrders = [...new Set(navOrders)];

if (uniqueOrders.length > 1) {
  // Navigation order varies - FAIL WCAG 3.2.3
}

// Check common element consistency
const searchLabels = pages.map(p => p.commonElements.search).filter(Boolean);
if (new Set(searchLabels).size > 1) {
  // Search identified inconsistently - FAIL WCAG 3.2.4
}
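
On the Go side, reporting where the navigation first diverges (not just that it diverges) makes the WCAG 3.2.3 finding actionable. A sketch with hypothetical struct types mirroring the JavaScript extraction above:

package consistency

import "fmt"

// NavLink and PageNav are hypothetical types mirroring the JS extraction.
type NavLink struct {
    Text string
    Href string
}

type PageNav struct {
    URL        string
    Navigation []NavLink
}

// navDivergences compares each page's navigation to the first page crawled
// and describes the first difference found per page.
func navDivergences(pages []PageNav) []string {
    var findings []string
    if len(pages) < 2 {
        return findings
    }
    base := pages[0]
    for _, p := range pages[1:] {
        n := len(base.Navigation)
        if len(p.Navigation) < n {
            n = len(p.Navigation)
        }
        for i := 0; i < n; i++ {
            if base.Navigation[i].Text != p.Navigation[i].Text {
                findings = append(findings, fmt.Sprintf(
                    "WCAG 3.2.3: nav item %d is %q on %s but %q on %s",
                    i, base.Navigation[i].Text, base.URL, p.Navigation[i].Text, p.URL))
                break
            }
        }
        if len(base.Navigation) != len(p.Navigation) {
            findings = append(findings, fmt.Sprintf(
                "WCAG 3.2.3: nav has %d items on %s but %d on %s",
                len(base.Navigation), base.URL, len(p.Navigation), p.URL))
        }
    }
    return findings
}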

Files to Create/Modify:

  • mcp/tools/consistency_check.go (new)
  • mcp/server.go (register new tool)
  • docs/llm_ada_testing.md (document usage)

2.3 Sensory Characteristics Detection (Pattern Matching)

Priority: MEDIUM
Effort: 8-12 hours
Solves: WCAG 1.3.3 violations (instructions relying on sensory characteristics)

Deliverables:

  • New MCP tool: web_sensory_check_cremotemcp_cremotemcp
  • Scans page text for sensory-only instructions
  • Flags phrases like "click the red button", "square icon", "on the right"
  • Uses regex pattern matching
  • Provides context for manual review

Technical Approach:

// Pattern matching for sensory-only instructions.
// Word boundaries (\b) keep substrings from matching, e.g. "copyright"
// would otherwise match "right".
const sensoryPatterns = [
  // Color-only
  /\bclick (the )?(red|green|blue|yellow|orange|purple|pink|gray|grey) (button|link|icon)\b/gi,
  /\bthe (red|green|blue|yellow|orange|purple|pink|gray|grey) (button|link|icon)\b/gi,

  // Shape-only
  /\b(round|square|circular|rectangular|triangular) (button|icon|shape)\b/gi,
  /\bclick (the )?(circle|square|triangle|rectangle)\b/gi,

  // Position-only (the positional prefix is required; matching every bare
  // "left" or "right" would flood the report with false positives)
  /\b(on the |at the )(left|right|top|bottom|above|below)\b/gi,
  /\bbutton (on the |at the )?(left|right|top|bottom)\b/gi,

  // Size-only
  /\b(large|small|big|little) (button|icon|link)\b/gi,

  // Sound-only
  /\bwhen you hear (the )?(beep|sound|tone|chime)\b/gi
];

const pageText = document.body.innerText;
const violations = [];

for (const pattern of sensoryPatterns) {
  const matches = pageText.matchAll(pattern);
  for (const match of matches) {
    // Get context (50 chars before and after)
    const index = match.index;
    const context = pageText.substring(index - 50, index + match[0].length + 50);
    
    violations.push({
      text: match[0],
      context,
      pattern: pattern.source,
      wcag: '1.3.3 Sensory Characteristics'
    });
  }
}

return violations;
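
The same patterns port directly to the Go tool: RE2 accepts this syntax, with (?i) standing in for the /gi flags. A minimal runnable sketch with a shortened pattern list:

package main

import (
    "fmt"
    "regexp"
)

// Shortened mirror of the JS pattern list above.
var sensoryPatterns = []*regexp.Regexp{
    regexp.MustCompile(`(?i)\bclick (the )?(red|green|blue|yellow) (button|link|icon)\b`),
    regexp.MustCompile(`(?i)\b(round|square|circular|rectangular) (button|icon|shape)\b`),
    regexp.MustCompile(`(?i)\bwhen you hear (the )?(beep|sound|tone|chime)\b`),
}

func main() {
    text := "To continue, click the red button next to the square icon."
    for _, p := range sensoryPatterns {
        for _, m := range p.FindAllString(text, -1) {
            // Hits are flagged for manual review, not auto-failed.
            fmt.Printf("possible WCAG 1.3.3 issue: %q\n", m)
        }
    }
}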

Files to Create/Modify:

  • mcp/tools/sensory_check.go (new)
  • mcp/server.go (register new tool)
  • docs/llm_ada_testing.md (document usage)

PHASE 3: ADVANCED (Weeks 5-6)

Goal: Add complex but valuable enhancements
Effort: 24-36 hours

3.1 Animation & Flash Detection (Video Analysis)

Priority: MEDIUM
Effort: 16-24 hours
Solves: WCAG 2.3.1 violations (three flashes or below threshold)

Deliverables:

  • New MCP tool: web_flash_detection_cremotemcp_cremotemcp
  • Records page for 10 seconds using CDP screencast
  • Analyzes frames for brightness changes
  • Counts flashes per second
  • Flags if >3 flashes/second detected

Technical Approach:

// Use the Chrome DevTools Protocol to capture a screencast
func (t *FlashDetectionTool) Execute(params map[string]interface{}) (interface{}, error) {
    // 1. Start the screencast (Quality applies to jpeg only; ignored for png)
    err := t.cdp.Page.StartScreencast(&page.StartScreencastArgs{
        Format:    "png",
        Quality:   80,
        MaxWidth:  1280,
        MaxHeight: 800,
    })
    if err != nil {
        return nil, err
    }

    // 2. Collect frames for 10 seconds. Each frame must be acknowledged via
    // Page.screencastFrameAck or Chrome stops sending new frames.
    frames := [][]byte{}
    timeout := time.After(10 * time.Second)

collect:
    for {
        select {
        case frame := <-t.cdp.Page.ScreencastFrame:
            frames = append(frames, frame.Data)
        case <-timeout:
            break collect
        }
    }

    // 3. Analyze brightness changes between consecutive frames
    flashes := 0
    for i := 1; i < len(frames); i++ {
        brightness1 := calculateBrightness(frames[i-1])
        brightness2 := calculateBrightness(frames[i])

        // Count a brightness change of more than 20% as a flash.
        // (Simplification: WCAG defines a flash as a pair of opposing changes.)
        if math.Abs(brightness2-brightness1) > 0.2 {
            flashes++
        }
    }

    // 4. Calculate flashes per second. WCAG 2.3.1 limits flashes in ANY
    // one-second period; averaging over 10s can under-count clustered
    // flashes, so a sliding one-second window is a stricter follow-up.
    flashesPerSecond := float64(flashes) / 10.0

    return map[string]interface{}{
        "flashes_detected":   flashes,
        "flashes_per_second": flashesPerSecond,
        "passes":             flashesPerSecond <= 3.0,
        "wcag":               "2.3.1 Three Flashes or Below Threshold",
    }, nil
}
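
The snippet above assumes a calculateBrightness helper. A minimal sketch using only the standard library, assuming each frame has already been base64-decoded to raw PNG bytes (most CDP client libraries decode for you):

package flashdetection

import (
    "bytes"
    "image/png"
)

// calculateBrightness decodes one screencast frame and returns its mean
// luminance normalized to [0, 1]. Samples every 8th pixel for speed.
func calculateBrightness(frame []byte) float64 {
    img, err := png.Decode(bytes.NewReader(frame))
    if err != nil {
        return 0 // undecodable frame: treat as black rather than abort
    }
    b := img.Bounds()
    var sum, count float64
    for y := b.Min.Y; y < b.Max.Y; y += 8 {
        for x := b.Min.X; x < b.Max.X; x += 8 {
            r, g, bl, _ := img.At(x, y).RGBA() // 16-bit channels, 0-65535
            // Rec. 601 luma approximation
            sum += (0.299*float64(r) + 0.587*float64(g) + 0.114*float64(bl)) / 65535.0
            count++
        }
    }
    if count == 0 {
        return 0
    }
    return sum / count
}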

Dependencies:

  • Chrome DevTools Protocol screencast API
  • Image processing library (Go image package)

Files to Create/Modify:

  • mcp/tools/flash_detection.go (new)
  • mcp/server.go (register new tool)
  • docs/llm_ada_testing.md (document usage)

3.2 Enhanced Accessibility Tree Analysis

Priority: MEDIUM
Effort: 8-12 hours
Solves: Better detection of ARIA issues, role/name/value problems

Deliverables:

  • Enhance existing get_accessibility_tree_cremotemcp_cremotemcp tool
  • Add validation rules for common ARIA mistakes
  • Check for invalid role combinations
  • Verify required ARIA properties
  • Detect orphaned ARIA references

Technical Approach:

// Validate ARIA usage
const ariaValidation = {
  // Check for invalid roles (list truncated for brevity)
  invalidRoles: Array.from(document.querySelectorAll('[role]')).filter(el => {
    const role = el.getAttribute('role');
    const validRoles = ['button', 'link', 'navigation', 'main', 'complementary' /* ...full WAI-ARIA role list... */];
    return !validRoles.includes(role);
  }),

  // Check for role="button" elements with no accessible name
  missingProperties: Array.from(document.querySelectorAll('[role="button"]')).filter(el => {
    return !el.hasAttribute('aria-label') &&
           !el.hasAttribute('aria-labelledby') &&
           !el.textContent.trim();
  }),

  // Check for orphaned aria-describedby/labelledby references
  // (both attributes take space-separated lists of IDs)
  orphanedReferences: Array.from(document.querySelectorAll('[aria-describedby], [aria-labelledby]')).filter(el => {
    const ids = [
      ...(el.getAttribute('aria-describedby') || '').split(/\s+/),
      ...(el.getAttribute('aria-labelledby') || '').split(/\s+/)
    ].filter(Boolean);
    return ids.some(id => !document.getElementById(id));
  })
};

Files to Create/Modify:

  • mcp/tools/accessibility_tree.go (enhance existing)
  • docs/llm_ada_testing.md (document new validations)

IMPLEMENTATION SCHEDULE

Week 1-2: Phase 1 Foundation

  • Day 1-3: Gradient contrast analysis (ImageMagick)
  • Day 4-6: Time-based media validation (basic)
  • Day 7-10: Hover/focus content testing

Week 3-4: Phase 2 Expansion

  • Day 11-14: Text-in-images detection (OCR)
  • Day 15-20: Cross-page consistency analysis
  • Day 21-23: Sensory characteristics detection

Week 5-6: Phase 3 Advanced

  • Day 24-30: Animation/flash detection
  • Day 31-35: Enhanced accessibility tree analysis

Week 7-8: Testing & Documentation

  • Day 36-40: Integration testing
  • Day 41-45: Documentation updates
  • Day 46-50: User acceptance testing

TECHNICAL REQUIREMENTS

Container Dependencies

# Add to Dockerfile
RUN apt-get update && apt-get install -y \
    imagemagick \
    tesseract-ocr \
    tesseract-ocr-eng \
    && rm -rf /var/lib/apt/lists/*

Go Dependencies

// Add to go.mod
require (
    github.com/chromedp/cdproto v0.0.0-20231011050154-1d073bb38998
    github.com/disintegration/imaging v1.6.2 // Image processing
)

Configuration

# Add to cremote config
automation_enhancements:
  gradient_contrast:
    enabled: true
    sample_points: 100
  
  media_validation:
    enabled: true
    check_embedded_players: true
    youtube_api_key: "" # Optional
  
  text_in_images:
    enabled: true
    min_word_threshold: 5
    exclude_logos: true
  
  consistency_check:
    enabled: true
    max_pages: 20
    max_depth: 2
  
  flash_detection:
    enabled: true
    recording_duration: 10 # seconds
    brightness_threshold: 0.2 # fraction of full brightness
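
On the Go side this block maps naturally onto a tagged struct; a sketch assuming gopkg.in/yaml.v3 for unmarshalling (struct and field names are ours):

package config

// AutomationEnhancements mirrors the YAML block above.
type AutomationEnhancements struct {
    GradientContrast struct {
        Enabled      bool `yaml:"enabled"`
        SamplePoints int  `yaml:"sample_points"`
    } `yaml:"gradient_contrast"`

    MediaValidation struct {
        Enabled              bool   `yaml:"enabled"`
        CheckEmbeddedPlayers bool   `yaml:"check_embedded_players"`
        YouTubeAPIKey        string `yaml:"youtube_api_key"`
    } `yaml:"media_validation"`

    TextInImages struct {
        Enabled          bool `yaml:"enabled"`
        MinWordThreshold int  `yaml:"min_word_threshold"`
        ExcludeLogos     bool `yaml:"exclude_logos"`
    } `yaml:"text_in_images"`

    ConsistencyCheck struct {
        Enabled  bool `yaml:"enabled"`
        MaxPages int  `yaml:"max_pages"`
        MaxDepth int  `yaml:"max_depth"`
    } `yaml:"consistency_check"`

    FlashDetection struct {
        Enabled             bool    `yaml:"enabled"`
        RecordingDuration   int     `yaml:"recording_duration"` // seconds
        BrightnessThreshold float64 `yaml:"brightness_threshold"`
    } `yaml:"flash_detection"`
}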

SUCCESS METRICS

Coverage Targets

  • Current: 70% automated coverage
  • After Phase 1: 78% automated coverage (+8 points)
  • After Phase 2: 83% automated coverage (+5 points)
  • After Phase 3: 85% automated coverage (+2 points)

Quality Metrics

  • False Positive Rate: <10%
  • False Negative Rate: <5%
  • Test Execution Time: <5 minutes per page
  • Report Clarity: 100% actionable findings

Performance Targets

  • Gradient contrast: <2 seconds per element
  • Media validation: <5 seconds per page
  • Text-in-images: <1 second per image
  • Consistency check: <30 seconds for 20 pages
  • Flash detection: 10 seconds (fixed recording time)

RISK MITIGATION

Technical Risks

  1. ImageMagick performance on large images

    • Mitigation: Resize images before analysis (see the sketch after this list)
    • Fallback: Skip images >5MB
  2. Tesseract OCR accuracy

    • Mitigation: Set confidence threshold
    • Fallback: Flag low-confidence results for manual review
  3. CDP screencast reliability

    • Mitigation: Implement retry logic
    • Fallback: Skip flash detection if screencast fails
  4. Cross-page crawling performance

    • Mitigation: Limit to 20 pages, depth 2
    • Fallback: Allow user to specify page list
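
For risk 1 above, the resize step can reuse the disintegration/imaging dependency already listed in go.mod instead of shelling out to ImageMagick; a sketch (helper name is ours):

package textinimages

import (
    "image"

    "github.com/disintegration/imaging"
)

// downscaleForAnalysis caps an image at 2000x2000 before contrast or OCR
// work, preserving aspect ratio; small images pass through untouched.
func downscaleForAnalysis(img image.Image) image.Image {
    b := img.Bounds()
    if b.Dx() <= 2000 && b.Dy() <= 2000 {
        return img
    }
    return imaging.Fit(img, 2000, 2000, imaging.Lanczos)
}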

Operational Risks

  1. Container size increase

    • Mitigation: Use multi-stage Docker builds
    • Monitor: Keep container <500MB
  2. Increased test execution time

    • Mitigation: Make all enhancements optional
    • Allow: Users to enable/disable specific tests

DELIVERABLES

Code

  • 6 new MCP tools (gradient, media, hover, OCR, consistency, flash)
  • 1 enhanced tool (accessibility tree)
  • Updated Dockerfile with dependencies
  • Updated configuration schema
  • Integration tests for all new tools

Documentation

  • Updated docs/llm_ada_testing.md with new tools
  • Updated enhanced_chromium_ada_checklist.md with automation notes
  • New docs/AUTOMATION_TOOLS.md with technical details
  • Updated README with new capabilities
  • Example usage for each new tool

Testing

  • Unit tests for each new tool
  • Integration tests with real websites
  • Performance benchmarks
  • Accuracy validation against manual testing

MAINTENANCE PLAN

Ongoing Support

  • Monitor false positive/negative rates
  • Update pattern matching rules (sensory characteristics)
  • Keep dependencies updated (ImageMagick, Tesseract)
  • Add new ARIA validation rules as spec evolves

Future Enhancements (Post-Plan)

  • LLM-assisted semantic analysis (if budget allows)
  • Speech-to-text caption validation (if external service available)
  • Real-time live caption testing (if streaming infrastructure added)
  • Advanced video content analysis (if AI/ML resources available)

APPROVAL & SIGN-OFF

Plan Status: READY FOR APPROVAL

Estimated Total Effort: 88-128 hours (11-16 business days)

Estimated Timeline: 6-8 weeks (with testing and documentation)

Budget Impact: Minimal (only open-source dependencies)

Risk Level: LOW (all technologies proven and stable)


Next Steps:

  1. Review and approve this plan
  2. Set up development environment with new dependencies
  3. Begin Phase 1 implementation
  4. Schedule weekly progress reviews

Document Prepared By: Cremote Development Team
Date: October 2, 2025
Version: 1.0