# CREMOTE ADA AUTOMATION ENHANCEMENT PLAN
**Date:** October 2, 2025
**Status:** APPROVED FOR IMPLEMENTATION
**Goal:** Increase automated testing coverage from 70% to 85%
**Timeline:** 6-8 weeks
**Philosophy:** KISS - Keep it Simple, Stupid
---
## EXECUTIVE SUMMARY
This plan outlines practical enhancements to the cremote MCP accessibility testing suite. We will implement 6 new automated testing capabilities using proven, simple tools. The caption accuracy validation using speech-to-text is **EXCLUDED** as it's beyond our current platform capabilities.
**Target Coverage Increase:** 70% → 85% (15 percentage point improvement)
---
## SCOPE EXCLUSIONS
### ❌ NOT INCLUDED IN THIS PLAN:
1. **Speech-to-Text Caption Accuracy Validation**
- Reason: Requires external services (Whisper API, Google Speech-to-Text)
- Complexity: High (video processing, audio extraction, STT integration)
- Cost: Ongoing API costs or significant compute resources
- Alternative: Manual review or future enhancement
2. **Real-time Live Caption Testing**
- Reason: Requires live streaming infrastructure
- Complexity: Very high (real-time monitoring, streaming protocols)
- Alternative: Manual testing during live events
3. **Complex Video Content Analysis**
- Reason: Determining whether visual content requires audio description needs human judgment
- Alternative: Flag all videos without descriptions for manual review
---
## IMPLEMENTATION PHASES
### **PHASE 1: FOUNDATION (Weeks 1-2)**
**Goal:** Implement high-impact, low-effort enhancements
**Effort:** 28-36 hours
#### 1.1 Gradient Contrast Analysis (ImageMagick)
**Priority:** CRITICAL
**Effort:** 8-12 hours
**Solves:** "Incomplete" findings for text on gradient backgrounds
**Deliverables:**
- New MCP tool: `web_gradient_contrast_check_cremotemcp_cremotemcp`
- Takes element selector, analyzes background gradient
- Returns worst-case contrast ratio
- Integrates with existing contrast checker
**Technical Approach:**
```bash
# 1. Screenshot the target element via the MCP tool:
#    web_screenshot_element(selector=".hero-section")
# 2. Read the text color from computed styles (JS injection):
#    getComputedStyle(element).color
# 3. Sample 100 points across the background with ImageMagick
#    (force-resize to a 10x10 grid, then dump pixel values):
convert screenshot.png -resize '10x10!' -depth 8 txt:- | parse_colors
# 4. Compute contrast of the text color against the darkest and
#    lightest sampled points
# 5. Return the worst-case ratio
```
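Step 4 is standard WCAG 2.x math that the Go tool can apply to every sampled point. A minimal sketch of the formula (helper names are ours, not existing cremote code):
```go
package main

import (
	"fmt"
	"math"
)

// linearize converts an 8-bit sRGB channel to linear light per WCAG 2.x.
func linearize(c uint8) float64 {
	v := float64(c) / 255.0
	if v <= 0.04045 {
		return v / 12.92
	}
	return math.Pow((v+0.055)/1.055, 2.4)
}

// relativeLuminance implements the WCAG relative-luminance formula.
func relativeLuminance(r, g, b uint8) float64 {
	return 0.2126*linearize(r) + 0.7152*linearize(g) + 0.0722*linearize(b)
}

// contrastRatio returns (L1 + 0.05) / (L2 + 0.05) with L1 >= L2.
func contrastRatio(l1, l2 float64) float64 {
	if l2 > l1 {
		l1, l2 = l2, l1
	}
	return (l1 + 0.05) / (l2 + 0.05)
}

func main() {
	text := relativeLuminance(255, 255, 255) // white text
	bg := relativeLuminance(64, 64, 64)      // one sampled gradient point
	fmt.Printf("contrast: %.2f:1\n", contrastRatio(text, bg))
}
```
The worst-case ratio is then the minimum contrast between the text color and any of the 100 sampled background points.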
**Files to Create/Modify:**
- `mcp/tools/gradient_contrast.go` (new)
- `mcp/server.go` (register new tool)
- `docs/llm_ada_testing.md` (document usage)
---
#### 1.2 Time-Based Media Validation (Basic)
**Priority:** CRITICAL
**Effort:** 8-12 hours
**Solves:** WCAG 1.2.2, 1.2.3, 1.2.5, 1.4.2 violations
**Deliverables:**
- New MCP tool: `web_media_validation_cremotemcp_cremotemcp`
- Detects all video/audio elements
- Checks for caption tracks, audio description tracks, transcripts
- Validates track files are accessible
- Checks for autoplay violations
**What We Test:**
✅ Presence of `<track kind="captions">`
✅ Presence of `<track kind="descriptions">`
✅ Presence of transcript links
✅ Caption file accessibility (HTTP fetch)
✅ Controls attribute present
✅ Autoplay detection
✅ Embedded player detection (YouTube, Vimeo)
**What We DON'T Test:**
❌ Caption accuracy (requires speech-to-text)
❌ Audio description quality (requires human judgment)
❌ Transcript completeness (requires human judgment)
**Technical Approach:**
```javascript
// JavaScript injection via console_command (wrapped in an async IIFE
// so top-level await/return are valid)
(async () => {
  const mediaInventory = {
    videos: Array.from(document.querySelectorAll('video')).map(v => ({
      src: v.src,
      hasCaptions: !!v.querySelector('track[kind="captions"], track[kind="subtitles"]'),
      hasDescriptions: !!v.querySelector('track[kind="descriptions"]'),
      hasControls: v.hasAttribute('controls'),
      autoplay: v.hasAttribute('autoplay'),
      captionTracks: Array.from(v.querySelectorAll('track')).map(t => ({
        kind: t.kind,
        src: t.src,
        srclang: t.srclang
      }))
    })),
    audios: Array.from(document.querySelectorAll('audio')).map(a => ({
      src: a.src,
      hasControls: a.hasAttribute('controls'),
      autoplay: a.hasAttribute('autoplay')
    })),
    embeds: Array.from(document.querySelectorAll('iframe[src*="youtube"], iframe[src*="vimeo"]')).map(i => ({
      src: i.src,
      type: i.src.includes('youtube') ? 'youtube' : 'vimeo'
    }))
  };

  // For each video, validate that caption files are fetchable
  // (cross-origin tracks may throw, so treat errors as inaccessible)
  for (const video of mediaInventory.videos) {
    for (const track of video.captionTracks) {
      try {
        const response = await fetch(track.src);
        track.accessible = response.ok;
      } catch {
        track.accessible = false;
      }
    }
  }

  // Check for transcript links near videos (serialize to plain data,
  // since DOM nodes don't survive the MCP round trip)
  const transcriptLinks = Array.from(document.querySelectorAll('a[href*="transcript"]'))
    .map(a => ({text: a.textContent.trim(), href: a.href}));

  return {mediaInventory, transcriptLinks};
})();
```
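An HTTP 200 alone doesn't prove a caption file is usable. As a cheap extra signal, the Go side could verify that the fetched track carries the WebVTT signature (the format requires files to begin with `WEBVTT`, optionally after a BOM). A hedged sketch; `checkCaptionTrack` is illustrative, not existing cremote code:
```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"strings"
)

// checkCaptionTrack fetches a <track> URL and verifies it looks like
// a WebVTT file (must begin with the "WEBVTT" signature, optionally
// after a UTF-8 BOM).
func checkCaptionTrack(url string) (accessible, looksLikeVTT bool, err error) {
	resp, err := http.Get(url)
	if err != nil {
		return false, false, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return false, false, nil
	}
	head, err := io.ReadAll(io.LimitReader(resp.Body, 16))
	if err != nil {
		return true, false, err
	}
	body := strings.TrimPrefix(string(head), "\ufeff")
	return true, strings.HasPrefix(body, "WEBVTT"), nil
}

func main() {
	ok, vtt, err := checkCaptionTrack("https://example.com/captions.vtt")
	fmt.Println(ok, vtt, err)
}
```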
**Files to Create/Modify:**
- `mcp/tools/media_validation.go` (new)
- `mcp/server.go` (register new tool)
- `docs/llm_ada_testing.md` (document usage)
---
#### 1.3 Hover/Focus Content Persistence Testing
**Priority:** HIGH
**Effort:** 12-16 hours
**Solves:** WCAG 1.4.13 violations (tooltips, dropdowns, popovers)
**Deliverables:**
- New MCP tool: `web_hover_focus_test_cremotemcp_cremotemcp`
- Identifies elements with hover/focus-triggered content
- Tests dismissibility (Esc key)
- Tests hoverability (can mouse move to triggered content)
- Tests persistence (doesn't disappear immediately)
**Technical Approach:**
```javascript
// Runs via console_command. Note: getEventListeners() is a DevTools
// Console API and only exists in that context.
(async () => {
  const sleep = ms => new Promise(r => setTimeout(r, ms));
  const results = [];

  // 1. Find all elements with hover/focus handlers
  const interactiveElements = Array.from(document.querySelectorAll('*')).filter(el => {
    const events = getEventListeners(el);
    return events.mouseover || events.mouseenter || events.focus;
  });

  // 2. Test each element
  for (const el of interactiveElements) {
    // Trigger hover
    el.dispatchEvent(new MouseEvent('mouseover', {bubbles: true}));
    await sleep(100);

    // Check for new content
    const tooltip = document.querySelector('[role="tooltip"], .tooltip, .popover');
    if (!tooltip) continue;

    // Test hoverability first, while the content is still visible
    const rect = tooltip.getBoundingClientRect();
    const hoverable = rect.width > 0 && rect.height > 0;

    // Test persistence (content must not vanish on mouseout)
    el.dispatchEvent(new MouseEvent('mouseout', {bubbles: true}));
    await sleep(500);
    const persistent = document.contains(tooltip);

    // Test dismissibility: re-trigger hover, then send Esc
    el.dispatchEvent(new MouseEvent('mouseover', {bubbles: true}));
    await sleep(100);
    document.dispatchEvent(new KeyboardEvent('keydown', {key: 'Escape'}));
    const dismissed = !document.contains(tooltip);

    results.push({
      element: el.tagName + (el.id ? '#' + el.id : ''),
      dismissed, hoverable, persistent
    });
  }
  return results;
})();
```
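All three checks must hold together under WCAG 1.4.13, so the Go tool can fold them into a single finding. A small sketch (function name is ours):
```go
package main

import (
	"fmt"
	"strings"
)

// verdictForTooltip reduces one hover/focus test to a 1.4.13 finding.
// Triggered content must be dismissible (Esc closes it), hoverable
// (the pointer can move onto it), and persistent (it stays visible
// until dismissed or the trigger loses hover/focus).
func verdictForTooltip(dismissed, hoverable, persistent bool) string {
	if dismissed && hoverable && persistent {
		return "pass"
	}
	failures := []string{}
	if !dismissed {
		failures = append(failures, "not dismissible")
	}
	if !hoverable {
		failures = append(failures, "not hoverable")
	}
	if !persistent {
		failures = append(failures, "not persistent")
	}
	return "fail WCAG 1.4.13: " + strings.Join(failures, ", ")
}

func main() {
	fmt.Println(verdictForTooltip(true, true, false))
	// -> fail WCAG 1.4.13: not persistent
}
```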
**Files to Create/Modify:**
- `mcp/tools/hover_focus_test.go` (new)
- `mcp/server.go` (register new tool)
- `docs/llm_ada_testing.md` (document usage)
---
### **PHASE 2: EXPANSION (Weeks 3-4)**
**Goal:** Add medium-complexity enhancements
**Effort:** 32-44 hours
#### 2.1 Text-in-Images Detection (OCR)
**Priority:** HIGH
**Effort:** 12-16 hours
**Solves:** WCAG 1.4.5 violations (images of text)
**Deliverables:**
- New MCP tool: `web_text_in_images_check_cremotemcp_cremotemcp`
- Downloads all images from page
- Runs Tesseract OCR on each image
- Flags images containing significant text (>5 words)
- Compares detected text with alt text
- Excludes logos (configurable)
**Technical Approach:**
```bash
# 1. Extract all image URLs (console_command returns JSON; jq assumed)
console_command "JSON.stringify(Array.from(document.querySelectorAll('img')).map(img => ({src: img.src, alt: img.alt})))" > /tmp/images.json

# 2. Download each image to the container
i=0
jq -c '.[]' /tmp/images.json | while read -r img; do
  i=$((i + 1))
  src=$(echo "$img" | jq -r '.src')
  alt=$(echo "$img" | jq -r '.alt')
  curl -so "/tmp/img_$i.png" "$src"

  # 3. Run OCR (--psm 6: assume a uniform block of text)
  tesseract "/tmp/img_$i.png" "/tmp/img_${i}_text" --psm 6

  # 4. Count words
  word_count=$(wc -w < "/tmp/img_${i}_text.txt")

  # 5. If >5 words, flag for review
  if [ "$word_count" -gt 5 ]; then
    echo "WARNING: Image contains text ($word_count words)"
    echo "Image: $src"
    echo "Alt text: $alt"
    echo "Detected text: $(cat "/tmp/img_${i}_text.txt")"
    echo "MANUAL REVIEW: Verify if this should be HTML text instead"
  fi
done
```
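The deliverables also call for comparing OCR output against the image's alt text. A simple token-overlap score is enough to flag mismatches for manual review; a hedged Go sketch (the function name and the ~0.5 threshold are our assumptions):
```go
package main

import (
	"fmt"
	"strings"
)

// altCoverage returns the fraction of OCR-detected words that also
// appear in the alt text (case-insensitive). A low score suggests the
// alt text does not convey the text baked into the image (WCAG 1.4.5).
func altCoverage(ocrText, altText string) float64 {
	const punct = ".,!?:;\"'()"
	altWords := map[string]bool{}
	for _, w := range strings.Fields(strings.ToLower(altText)) {
		altWords[strings.Trim(w, punct)] = true
	}
	ocrWords := strings.Fields(strings.ToLower(ocrText))
	if len(ocrWords) == 0 {
		return 1.0 // no detected text, nothing to cover
	}
	matched := 0
	for _, w := range ocrWords {
		if altWords[strings.Trim(w, punct)] {
			matched++
		}
	}
	return float64(matched) / float64(len(ocrWords))
}

func main() {
	score := altCoverage("Summer Sale 50% off", "Summer sale banner")
	fmt.Printf("alt coverage: %.2f\n", score) // flag for review if below ~0.5
}
```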
**Dependencies:**
- Tesseract OCR (install in container)
- curl or wget for image download
**Files to Create/Modify:**
- `mcp/tools/text_in_images.go` (new)
- `Dockerfile` (add tesseract-ocr)
- `mcp/server.go` (register new tool)
- `docs/llm_ada_testing.md` (document usage)
---
#### 2.2 Cross-Page Consistency Analysis
**Priority:** MEDIUM
**Effort:** 16-24 hours
**Solves:** WCAG 3.2.3, 3.2.4 violations (consistent navigation/identification)
**Deliverables:**
- New MCP tool: `web_consistency_check_cremotemcp_cremotemcp`
- Crawls multiple pages (configurable limit)
- Extracts navigation structure from each page
- Compares navigation order across pages
- Identifies common elements (search, login, cart)
- Verifies consistent labeling
**Technical Approach:**
```javascript
// Pseudocode orchestrated by the MCP driver: navigateTo() stands in
// for the existing navigation tool, and the extraction below runs as
// injected JS on each page after it loads.
// 1. Crawl site (limit to 20 pages for performance)
const pages = [];
const visited = new Set();
async function crawlPage(url, depth = 0) {
  if (depth > 2 || visited.has(url) || pages.length >= 20) return;
  visited.add(url);
  await navigateTo(url);
  pages.push({
    url,
    navigation: Array.from(document.querySelectorAll('nav a, header a')).map(a => ({
      text: a.textContent.trim(),
      href: a.href,
      order: Array.from(a.parentElement.children).indexOf(a)
    })),
    commonElements: {
      search: document.querySelector('[type="search"], [role="search"]')?.outerHTML,
      login: document.querySelector('a[href*="login"]')?.textContent,
      cart: document.querySelector('a[href*="cart"]')?.textContent
    }
  });
  // Find more same-origin pages to visit
  const links = Array.from(document.querySelectorAll('a[href]'))
    .map(a => a.href)
    .filter(href => href.startsWith(window.location.origin))
    .slice(0, 10);
  for (const link of links) {
    await crawlPage(link, depth + 1);
  }
}
await crawlPage(window.location.href);

// 2. Analyze consistency
const navOrders = pages.map(p => p.navigation.map(n => n.text).join('|'));
const uniqueOrders = [...new Set(navOrders)];
if (uniqueOrders.length > 1) {
  // Navigation order varies across pages - FAIL WCAG 3.2.3
}

// Check common element consistency
const searchLabels = pages.map(p => p.commonElements.search).filter(Boolean);
if (new Set(searchLabels).size > 1) {
  // Search identified inconsistently across pages - FAIL WCAG 3.2.4
}
```
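Joining labels with `|` detects *that* orders differ; for an actionable finding the Go tool should also report *where* two pages first diverge. A minimal sketch (helper name is ours):
```go
package main

import "fmt"

// firstNavDivergence compares two pages' navigation label sequences
// and reports the position where they first differ.
func firstNavDivergence(a, b []string) (int, bool) {
	n := len(a)
	if len(b) < n {
		n = len(b)
	}
	for i := 0; i < n; i++ {
		if a[i] != b[i] {
			return i, true
		}
	}
	if len(a) != len(b) {
		return n, true // one nav has extra trailing items
	}
	return -1, false
}

func main() {
	home := []string{"Home", "Products", "About", "Contact"}
	blog := []string{"Home", "About", "Products", "Contact"}
	if i, diff := firstNavDivergence(home, blog); diff {
		fmt.Printf("navigation order diverges at position %d\n", i) // WCAG 3.2.3
	}
}
```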
**Files to Create/Modify:**
- `mcp/tools/consistency_check.go` (new)
- `mcp/server.go` (register new tool)
- `docs/llm_ada_testing.md` (document usage)
---
#### 2.3 Sensory Characteristics Detection (Pattern Matching)
**Priority:** MEDIUM
**Effort:** 8-12 hours
**Solves:** WCAG 1.3.3 violations (instructions relying on sensory characteristics)
**Deliverables:**
- New MCP tool: `web_sensory_check_cremotemcp_cremotemcp`
- Scans page text for sensory-only instructions
- Flags phrases like "click the red button", "square icon", "on the right"
- Uses regex pattern matching
- Provides context for manual review
**Technical Approach:**
```javascript
// Pattern matching for sensory-only instructions. The bare position
// pattern below is deliberately broad (high recall); matches are
// surfaced with context for manual review, not auto-failed.
(() => {
  const sensoryPatterns = [
    // Color-only
    /click (the )?(red|green|blue|yellow|orange|purple|pink|gray|grey) (button|link|icon)/gi,
    /the (red|green|blue|yellow|orange|purple|pink|gray|grey) (button|link|icon)/gi,
    // Shape-only
    /(round|square|circular|rectangular|triangular) (button|icon|shape)/gi,
    /click (the )?(circle|square|triangle|rectangle)/gi,
    // Position-only
    /(on the |at the )?(left|right|top|bottom|above|below)/gi,
    /button (on the |at the )?(left|right|top|bottom)/gi,
    // Size-only
    /(large|small|big|little) (button|icon|link)/gi,
    // Sound-only
    /when you hear (the )?(beep|sound|tone|chime)/gi
  ];
  const pageText = document.body.innerText;
  const violations = [];
  for (const pattern of sensoryPatterns) {
    for (const match of pageText.matchAll(pattern)) {
      // Capture ~50 chars of context on each side
      const index = match.index;
      const context = pageText.substring(Math.max(0, index - 50), index + match[0].length + 50);
      violations.push({
        text: match[0],
        context,
        pattern: pattern.source,
        wcag: '1.3.3 Sensory Characteristics'
      });
    }
  }
  return violations;
})();
```
**Files to Create/Modify:**
- `mcp/tools/sensory_check.go` (new)
- `mcp/server.go` (register new tool)
- `docs/llm_ada_testing.md` (document usage)
---
### **PHASE 3: ADVANCED (Weeks 5-6)**
**Goal:** Add complex but valuable enhancements
**Effort:** 24-32 hours
#### 3.1 Animation & Flash Detection (Video Analysis)
**Priority:** MEDIUM
**Effort:** 16-24 hours
**Solves:** WCAG 2.3.1 violations (three flashes or below threshold)
**Deliverables:**
- New MCP tool: `web_flash_detection_cremotemcp_cremotemcp`
- Records page for 10 seconds using CDP screencast
- Analyzes frames for brightness changes
- Counts flashes per second
- Flags if >3 flashes/second detected
**Technical Approach:**
```go
// Capture a short screencast via the Chrome DevTools Protocol.
// Sketch only: frame acknowledgements, error handling, and the exact
// event plumbing depend on the CDP client in use.
func (t *FlashDetectionTool) Execute(params map[string]interface{}) (interface{}, error) {
	// 1. Start screencast (CDP's quality option applies to JPEG only,
	// so it is omitted for PNG)
	err := t.cdp.Page.StartScreencast(&page.StartScreencastArgs{
		Format:    "png",
		MaxWidth:  1280,
		MaxHeight: 800,
	})
	if err != nil {
		return nil, err
	}

	// 2. Collect frames for 10 seconds (Data is assumed to be
	// base64-decoded PNG bytes from the client)
	frames := [][]byte{}
	timeout := time.After(10 * time.Second)
collect:
	for {
		select {
		case frame := <-t.cdp.Page.ScreencastFrame:
			frames = append(frames, frame.Data)
		case <-timeout:
			break collect
		}
	}

	// 3. Count large luminance changes between consecutive frames.
	// Conservative: every >20% delta counts, although a "flash" is
	// strictly a pair of opposing changes.
	flashes := 0
	for i := 1; i < len(frames); i++ {
		brightness1 := calculateBrightness(frames[i-1])
		brightness2 := calculateBrightness(frames[i])
		if math.Abs(brightness2-brightness1) > 0.2 {
			flashes++
		}
	}

	// 4. Average over the capture window. WCAG 2.3.1 limits flashes in
	// any one-second period, so this average is a first approximation.
	flashesPerSecond := float64(flashes) / 10.0
	return map[string]interface{}{
		"flashes_detected":   flashes,
		"flashes_per_second": flashesPerSecond,
		"passes":             flashesPerSecond <= 3.0,
		"wcag":               "2.3.1 Three Flashes or Below Threshold",
	}, nil
}
```
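`calculateBrightness` is referenced above but not yet defined in this plan. A minimal sketch using only the standard library: decode the PNG frame and return mean luminance normalized to [0,1] (Rec. 601 luma weights, an assumption that suffices for frame-to-frame comparison):
```go
package main

import (
	"bytes"
	"fmt"
	"image"
	_ "image/png" // register the PNG decoder
	"os"
)

// calculateBrightness decodes a PNG frame and returns its mean
// luminance normalized to [0,1].
func calculateBrightness(frame []byte) float64 {
	img, _, err := image.Decode(bytes.NewReader(frame))
	if err != nil {
		return 0
	}
	b := img.Bounds()
	pixels := float64(b.Dx() * b.Dy())
	if pixels == 0 {
		return 0
	}
	var sum float64
	for y := b.Min.Y; y < b.Max.Y; y++ {
		for x := b.Min.X; x < b.Max.X; x++ {
			r, g, bl, _ := img.At(x, y).RGBA() // 16-bit channels
			sum += (0.299*float64(r) + 0.587*float64(g) + 0.114*float64(bl)) / 65535.0
		}
	}
	return sum / pixels
}

func main() {
	data, err := os.ReadFile("frame.png")
	if err != nil {
		panic(err)
	}
	fmt.Printf("mean luminance: %.3f\n", calculateBrightness(data))
}
```
Iterating every pixel of a 1280x800 frame is slow across a 10-second capture; sampling every Nth pixel (or reusing the Phase 1 ImageMagick resize trick) keeps this within the performance targets.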
**Dependencies:**
- Chrome DevTools Protocol screencast API
- Image processing library (Go image package)
**Files to Create/Modify:**
- `mcp/tools/flash_detection.go` (new)
- `mcp/server.go` (register new tool)
- `docs/llm_ada_testing.md` (document usage)
---
#### 3.2 Enhanced Accessibility Tree Analysis
**Priority:** MEDIUM
**Effort:** 8-12 hours
**Solves:** Better detection of ARIA issues, role/name/value problems
**Deliverables:**
- Enhance existing `get_accessibility_tree_cremotemcp_cremotemcp` tool
- Add validation rules for common ARIA mistakes
- Check for invalid role combinations
- Verify required ARIA properties
- Detect orphaned ARIA references
**Technical Approach:**
```javascript
// Validate ARIA usage
const ariaValidation = {
  // Check for unknown role values (list abridged; the real tool should
  // validate against the full WAI-ARIA role list)
  invalidRoles: Array.from(document.querySelectorAll('[role]')).filter(el => {
    const role = el.getAttribute('role');
    const validRoles = ['button', 'link', 'navigation', 'main', 'complementary',
                        'banner', 'contentinfo', 'search', 'dialog', 'tooltip'];
    return !validRoles.includes(role);
  }),
  // Check for buttons with no accessible name
  missingProperties: Array.from(document.querySelectorAll('[role="button"]')).filter(el => {
    return !el.hasAttribute('aria-label') &&
           !el.hasAttribute('aria-labelledby') &&
           !el.textContent.trim();
  }),
  // Check for orphaned aria-describedby/labelledby (both attributes
  // take space-separated ID lists, so check every referenced ID)
  orphanedReferences: Array.from(document.querySelectorAll('[aria-describedby], [aria-labelledby]')).filter(el => {
    const ids = [
      ...(el.getAttribute('aria-describedby') || '').split(/\s+/),
      ...(el.getAttribute('aria-labelledby') || '').split(/\s+/)
    ].filter(Boolean);
    return ids.some(id => !document.getElementById(id));
  })
};
```
**Files to Create/Modify:**
- `mcp/tools/accessibility_tree.go` (enhance existing)
- `docs/llm_ada_testing.md` (document new validations)
---
## IMPLEMENTATION SCHEDULE
### Week 1-2: Phase 1 Foundation
- [ ] Day 1-3: Gradient contrast analysis (ImageMagick)
- [ ] Day 4-6: Time-based media validation (basic)
- [ ] Day 7-10: Hover/focus content testing
### Week 3-4: Phase 2 Expansion
- [ ] Day 11-14: Text-in-images detection (OCR)
- [ ] Day 15-20: Cross-page consistency analysis
- [ ] Day 21-23: Sensory characteristics detection
### Week 5-6: Phase 3 Advanced
- [ ] Day 24-30: Animation/flash detection
- [ ] Day 31-35: Enhanced accessibility tree analysis
### Week 7-8: Testing & Documentation
- [ ] Day 36-40: Integration testing
- [ ] Day 41-45: Documentation updates
- [ ] Day 46-50: User acceptance testing
---
## TECHNICAL REQUIREMENTS
### Container Dependencies
```dockerfile
# Add to Dockerfile
RUN apt-get update && apt-get install -y \
imagemagick \
tesseract-ocr \
tesseract-ocr-eng \
&& rm -rf /var/lib/apt/lists/*
```
### Go Dependencies
```go
// Add to go.mod
require (
github.com/chromedp/cdproto v0.0.0-20231011050154-1d073bb38998
github.com/disintegration/imaging v1.6.2 // Image processing
)
```
### Configuration
```yaml
# Add to cremote config
automation_enhancements:
gradient_contrast:
enabled: true
sample_points: 100
media_validation:
enabled: true
check_embedded_players: true
youtube_api_key: "" # Optional
text_in_images:
enabled: true
min_word_threshold: 5
exclude_logos: true
consistency_check:
enabled: true
max_pages: 20
max_depth: 2
flash_detection:
enabled: true
recording_duration: 10
brightness_threshold: 0.2
```
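On the Go side, a typed struct mirroring this block keeps the options honest at parse time. A hedged sketch; field names and `yaml` tags are our assumption, not the existing cremote schema:
```go
package config

// AutomationEnhancements mirrors the automation_enhancements YAML
// block above.
type AutomationEnhancements struct {
	GradientContrast struct {
		Enabled      bool `yaml:"enabled"`
		SamplePoints int  `yaml:"sample_points"`
	} `yaml:"gradient_contrast"`
	MediaValidation struct {
		Enabled              bool   `yaml:"enabled"`
		CheckEmbeddedPlayers bool   `yaml:"check_embedded_players"`
		YouTubeAPIKey        string `yaml:"youtube_api_key"`
	} `yaml:"media_validation"`
	TextInImages struct {
		Enabled          bool `yaml:"enabled"`
		MinWordThreshold int  `yaml:"min_word_threshold"`
		ExcludeLogos     bool `yaml:"exclude_logos"`
	} `yaml:"text_in_images"`
	ConsistencyCheck struct {
		Enabled  bool `yaml:"enabled"`
		MaxPages int  `yaml:"max_pages"`
		MaxDepth int  `yaml:"max_depth"`
	} `yaml:"consistency_check"`
	FlashDetection struct {
		Enabled             bool    `yaml:"enabled"`
		RecordingDuration   int     `yaml:"recording_duration"`
		BrightnessThreshold float64 `yaml:"brightness_threshold"`
	} `yaml:"flash_detection"`
}
```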
---
## SUCCESS METRICS
### Coverage Targets
- **Current:** 70% automated coverage
- **After Phase 1:** 78% automated coverage (+8 points)
- **After Phase 2:** 83% automated coverage (+5 points)
- **After Phase 3:** 85% automated coverage (+2 points)
### Quality Metrics
- **False Positive Rate:** <10%
- **False Negative Rate:** <5%
- **Test Execution Time:** <5 minutes per page
- **Report Clarity:** 100% of findings actionable
### Performance Targets
- Gradient contrast: <2 seconds per element
- Media validation: <5 seconds per page
- Text-in-images: <1 second per image
- Consistency check: <30 seconds for 20 pages
- Flash detection: 10 seconds (fixed recording time)
---
## RISK MITIGATION
### Technical Risks
1. **ImageMagick performance on large images**
- Mitigation: Resize images before analysis
- Fallback: Skip images >5MB
2. **Tesseract OCR accuracy**
- Mitigation: Set confidence threshold
- Fallback: Flag low-confidence results for manual review
3. **CDP screencast reliability**
- Mitigation: Implement retry logic
- Fallback: Skip flash detection if screencast fails
4. **Cross-page crawling performance**
- Mitigation: Limit to 20 pages, depth 2
- Fallback: Allow user to specify page list
### Operational Risks
1. **Container size increase**
- Mitigation: Use multi-stage Docker builds
- Monitor: Keep container <500MB
2. **Increased test execution time**
- Mitigation: Make all enhancements optional
- Allow: Users to enable/disable specific tests
---
## DELIVERABLES
### Code
- [ ] 6 new MCP tools (gradient, media, hover, OCR, consistency, flash)
- [ ] 1 enhanced tool (accessibility tree)
- [ ] Updated Dockerfile with dependencies
- [ ] Updated configuration schema
- [ ] Integration tests for all new tools
### Documentation
- [ ] Updated `docs/llm_ada_testing.md` with new tools
- [ ] Updated `enhanced_chromium_ada_checklist.md` with automation notes
- [ ] New `docs/AUTOMATION_TOOLS.md` with technical details
- [ ] Updated README with new capabilities
- [ ] Example usage for each new tool
### Testing
- [ ] Unit tests for each new tool
- [ ] Integration tests with real websites
- [ ] Performance benchmarks
- [ ] Accuracy validation against manual testing
---
## MAINTENANCE PLAN
### Ongoing Support
- Monitor false positive/negative rates
- Update pattern matching rules (sensory characteristics)
- Keep dependencies updated (ImageMagick, Tesseract)
- Add new ARIA validation rules as spec evolves
### Future Enhancements (Post-Plan)
- LLM-assisted semantic analysis (if budget allows)
- Speech-to-text caption validation (if external service available)
- Real-time live caption testing (if streaming infrastructure added)
- Advanced video content analysis (if AI/ML resources available)
---
## APPROVAL & SIGN-OFF
**Plan Status:** READY FOR APPROVAL
**Estimated Total Effort:** 84-112 hours (10-14 business days)
**Estimated Timeline:** 6-8 weeks (with testing and documentation)
**Budget Impact:** Minimal (only open-source dependencies)
**Risk Level:** LOW (all technologies proven and stable)
---
**Next Steps:**
1. Review and approve this plan
2. Set up development environment with new dependencies
3. Begin Phase 1 implementation
4. Schedule weekly progress reviews
---
**Document Prepared By:** Cremote Development Team
**Date:** October 2, 2025
**Version:** 1.0