Compliance Scoring and Professional Reporting Standards Update
Date: October 7, 2025
Issues:
- Confusion between test execution success and compliance scoring
- Reports mentioning tools, automation, and AI instead of professional assessor identity
Status: RESOLVED
Problems Identified
During the AMF Electric accessibility assessment, two critical reporting issues were identified:
Problem 1: Scoring Confusion
The Issue
Reports were showing:
**Overall Score:** 100/100 (with noted issues)
**Compliance Status:** COMPLIANT (with remediation needed)
**Contrast Analysis:**
- Failed: 70 (32.3%)
This is contradictory and misleading because:
- A page with 32% contrast failures should NOT score 100/100
- A page with multiple WCAG violations should NOT be marked "COMPLIANT"
- The "100" score was actually indicating test execution success, not compliance
Root Cause
The `web_page_accessibility_report_cremotemcp` tool returns an `overall_score` or `status` field that indicates:
- ✅ All tests ran successfully without errors
- ✅ The page loaded and tools executed properly
This was incorrectly interpreted as a compliance/accessibility score when it was actually just a test execution status indicator.
Problem 2: Unprofessional Report Content
The original report included:
**Assessor:** Augment AI Agent with cremotemcp Tools
**Methodology:** Comprehensive automated testing using axe-core, contrast analysis...
**Assessment Tool:** Augment AI Agent with cremotemcp MCP Tools
### Tools and Technologies Used
1. **axe-core v4.8.0** - Comprehensive automated WCAG testing
2. **Chromium Accessibility Tree** - Semantic structure validation
3. **cremote MCP tools** - Automated testing suite
This is unprofessional and inappropriate for client-facing reports because:
- ❌ Mentions specific tools and automation
- ❌ References AI agents and technical implementation
- ❌ Includes tool version numbers and command syntax
- ❌ Does not use the company name "Shortcut Solutions"
- ❌ Focuses on how testing was done rather than what was found
Client-facing reports should:
- ✅ Use "Shortcut Solutions" as the assessor
- ✅ Focus on findings, impact, and remediation
- ✅ Use professional, generic methodology descriptions
- ✅ Never mention specific tools, AI, or automation
Solutions Implemented
Documentation Updates
Two key documentation files have been updated with comprehensive guidance:
1. docs/llm_instructions.md
Added Section 1: "CRITICAL: Understanding Tool Output vs Compliance Scoring"
- Clear distinction between test execution success and compliance scores
- Detailed compliance scoring formula with point deductions
- Compliance status thresholds (FULLY COMPLIANT to NON-COMPLIANT)
- Legal risk assessment guidelines
- Correct vs incorrect reporting examples
- Page-by-page reporting template
- Site-wide summary template
Added Section 2: "Professional Reporting Standards"
- Assessor identity requirements (always "Shortcut Solutions")
- Prohibited terminology (never mention tools, AI, automation)
- Professional methodology descriptions
- Client-facing report format requirements
- What to never include in reports
- Focus on findings, impact, and remediation
Total additions: ~480 lines of new guidance
2. enhanced_chromium_ada_checklist.md
Added Section 1: "⚠️ CRITICAL: Professional Reporting Standards"
- Assessor identity requirements (always "Shortcut Solutions")
- Professional vs unprofessional terminology
- Report header format requirements
- What to never include in reports
- Focus on findings and remediation
Added Section 2: "⚠️ CRITICAL: COMPLIANCE SCORING METHODOLOGY"
- Warning about test execution vs compliance confusion
- Compliance scoring formula
- Compliance status thresholds with legal risk levels
- Example calculation with real numbers
- Correct vs incorrect reporting format comparison
- Page assessment template
Total additions: ~220 lines of new guidance
Professional Reporting Standards
Assessor Identity
ALWAYS use:
**Assessment Date:** October 7, 2025
**Assessor:** Shortcut Solutions
**Standard:** WCAG 2.1 Level AA
**Methodology:** Comprehensive accessibility testing including automated scanning, manual verification, and assistive technology evaluation
NEVER use:
**Assessor:** Augment AI Agent with cremotemcp Tools ← WRONG
**Assessment Tool:** cremote MCP Tools ← WRONG
**Powered by:** Augment Code ← WRONG
Methodology Description
Professional (Correct):
## Testing Methodology
This assessment evaluates compliance with Web Content Accessibility Guidelines (WCAG) 2.1 Level AA through a comprehensive multi-faceted approach:
1. **Automated Scanning**
- Comprehensive accessibility scanning
- Color contrast ratio analysis
- HTML/ARIA validation
2. **Manual Testing**
- Keyboard-only navigation testing
- Focus indicator verification
- Form accessibility evaluation
3. **Assistive Technology Evaluation**
- Screen reader compatibility assessment
- Accessibility tree structure validation
Unprofessional (Incorrect):
## Testing Methodology
### Tools Used
1. **axe-core v4.8.0** - Automated WCAG testing ← NEVER
2. **cremote MCP tools** - Automated testing ← NEVER
3. **Augment AI Agent** - Assessment automation ← NEVER
What to NEVER Include
Prohibited content:
- ❌ cremote, cremotemcp, MCP tools
- ❌ Augment AI, Augment Agent, AI mentions
- ❌ Specific tool names and versions (axe-core v4.8.0)
- ❌ Tool command syntax or parameters
- ❌ Automation or AI references
- ❌ Technical implementation details
- ❌ Container paths (/tmp/, etc.)
- ❌ "Powered by", "Using", "Automated by"
Required content:
- ✅ "Shortcut Solutions" as assessor
- ✅ Professional methodology descriptions
- ✅ Focus on findings and impact
- ✅ WCAG criteria and standards
- ✅ Remediation recommendations
- ✅ User impact descriptions
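The prohibited-content list above lends itself to a mechanical pre-submission check. The following is a minimal sketch, not part of the documented workflow: the function name `find_prohibited_terms` and the pattern table are hypothetical, and the patterns are only a rough transcription of the bullets above. It assumes that only versioned tool names are flagged (the report templates in this document use the unversioned "Axe-core violations" label) and it does not try to detect the bare word "Using", which is too common to match safely.

```python
import re

# Hypothetical pattern table, loosely transcribed from the prohibited-content
# list. Categories and regexes are illustrative assumptions.
PROHIBITED_PATTERNS = {
    "internal tool name": r"\bcremote(?:mcp)?\b|\bMCP\b",
    "AI or automation reference": r"\bAugment\b|\bAI\b|\bLLM\b|\bautomation\b",
    # Only versioned tool names are flagged (e.g. "axe-core v4.8.0").
    "tool version": r"\baxe-core\s+v?\d[\d.]*",
    "container path": r"/tmp/",
    "attribution phrase": r"\bPowered by\b|\bAutomated by\b",
}


def find_prohibited_terms(report_text: str) -> list[tuple[int, str, str]]:
    """Return (line_number, category, matched_text) for every prohibited match."""
    hits = []
    for line_number, line in enumerate(report_text.splitlines(), start=1):
        for category, pattern in PROHIBITED_PATTERNS.items():
            for match in re.finditer(pattern, line, flags=re.IGNORECASE):
                hits.append((line_number, category, match.group(0)))
    return hits


if __name__ == "__main__":
    draft = (
        "**Assessor:** Augment AI Agent with cremotemcp Tools\n"
        "**Assessor:** Shortcut Solutions\n"
    )
    for line_number, category, text in find_prohibited_terms(draft):
        print(f"line {line_number}: {category}: {text!r}")
```

A check like this can only catch known strings; the report still needs a human read-through against the required-content list.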
Compliance Scoring Formula
Point Deduction System
Base Score: 100 points
Deductions:
1. Axe-core Violations:
- Critical violations: -10 points each
- Serious violations: -5 points each
- Moderate violations: -2 points each
- Minor violations: -1 point each
2. Contrast Failures:
- Up to 10% failure rate (any failures present): -5 points
- 11-20% failure rate: -10 points
- 21-30% failure rate: -15 points
- 31-40% failure rate: -20 points
- 41%+ failure rate: -25 points
3. Keyboard Accessibility:
- 1-10 missing focus indicators: -5 points
- 11-25 missing focus indicators: -10 points
- 26-50 missing focus indicators: -15 points
- 51+ missing focus indicators: -20 points
- Keyboard traps detected: -15 points each
4. Form Accessibility:
- Missing labels: -5 points per form
- No ARIA compliance: -10 points per form
- Not keyboard accessible: -10 points per form
5. Structural Issues:
- Missing landmarks: -10 points
- Duplicate IDs: -5 points each
- Invalid ARIA: -5 points per violation
Final Score = Base Score - Total Deductions (minimum 0)
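As a rough illustration of the point deduction system, the sketch below computes a score from per-page findings. The `PageFindings` class, its field names, and the helper functions are hypothetical names introduced here, not part of any tool output. Two assumptions are made where the formula is silent: a page with no contrast failures receives no contrast deduction, and fractional rates that fall between the listed bands (for example 10.5%) are assigned to the next higher band.

```python
from dataclasses import dataclass


@dataclass
class PageFindings:
    """Hypothetical container for one page's findings (field names are assumptions)."""
    critical: int = 0
    serious: int = 0
    moderate: int = 0
    minor: int = 0
    contrast_failure_pct: float = 0.0
    missing_focus_indicators: int = 0
    keyboard_traps: int = 0
    forms_missing_labels: int = 0
    forms_without_aria: int = 0
    forms_not_keyboard_accessible: int = 0
    missing_landmarks: bool = False
    duplicate_ids: int = 0
    invalid_aria: int = 0


def contrast_deduction(failure_pct: float) -> int:
    if failure_pct <= 0:
        return 0  # assumption: no failures, no deduction
    for upper_bound, points in ((10, 5), (20, 10), (30, 15), (40, 20)):
        if failure_pct <= upper_bound:
            return points
    return 25


def focus_deduction(count: int) -> int:
    if count <= 0:
        return 0
    for upper_bound, points in ((10, 5), (25, 10), (50, 15)):
        if count <= upper_bound:
            return points
    return 20


def deductions(f: PageFindings) -> dict[str, int]:
    """Points lost per category, following the five deduction groups."""
    return {
        "axe_core": 10 * f.critical + 5 * f.serious + 2 * f.moderate + f.minor,
        "contrast": contrast_deduction(f.contrast_failure_pct),
        "keyboard": focus_deduction(f.missing_focus_indicators) + 15 * f.keyboard_traps,
        "forms": (5 * f.forms_missing_labels + 10 * f.forms_without_aria
                  + 10 * f.forms_not_keyboard_accessible),
        "structure": ((10 if f.missing_landmarks else 0)
                      + 5 * f.duplicate_ids + 5 * f.invalid_aria),
    }


def compliance_score(f: PageFindings) -> int:
    """Base score of 100 minus total deductions, floored at 0."""
    return max(0, 100 - sum(deductions(f).values()))


if __name__ == "__main__":
    # AMF Electric homepage figures from this document.
    amf = PageFindings(serious=2, contrast_failure_pct=32.3, missing_focus_indicators=15)
    print(deductions(amf))
    print(compliance_score(amf))  # 60
```

Keeping the per-category deductions in a dictionary makes it straightforward to print the score breakdown that reports are required to include.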
Compliance Status Thresholds
| Score Range | Status | Legal Risk | Description |
|---|---|---|---|
| 95-100 | FULLY COMPLIANT | VERY LOW | Minor issues only |
| 80-94 | SUBSTANTIALLY COMPLIANT | LOW | Some moderate issues |
| 60-79 | PARTIALLY COMPLIANT | MODERATE | Multiple serious issues |
| 40-59 | MINIMALLY COMPLIANT | HIGH | Major accessibility barriers |
| 0-39 | NON-COMPLIANT | CRITICAL | Critical failures |
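The threshold table can be expressed as a small lookup so that status and legal risk are always derived from the score rather than typed by hand. This is a sketch under that assumption; the names `THRESHOLDS` and `classify` are hypothetical.

```python
# (minimum score, status, legal risk), highest band first, copied from the table.
THRESHOLDS = [
    (95, "FULLY COMPLIANT", "VERY LOW"),
    (80, "SUBSTANTIALLY COMPLIANT", "LOW"),
    (60, "PARTIALLY COMPLIANT", "MODERATE"),
    (40, "MINIMALLY COMPLIANT", "HIGH"),
    (0, "NON-COMPLIANT", "CRITICAL"),
]


def classify(score: int) -> tuple[str, str]:
    """Map a 0-100 compliance score to (status, legal_risk)."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    for minimum, status, risk in THRESHOLDS:
        if score >= minimum:
            return status, risk
    raise AssertionError("unreachable: the last band starts at 0")


if __name__ == "__main__":
    print(classify(60))  # ('PARTIALLY COMPLIANT', 'MODERATE')
    print(classify(55))  # ('MINIMALLY COMPLIANT', 'HIGH')
```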
Example: AMF Electric Homepage Corrected Scoring
Original (Incorrect) Report
**Overall Score:** 100/100 (with noted issues)
**Compliance Status:** COMPLIANT (with remediation needed)
**Contrast Analysis:**
- Total elements checked: 217
- Passed: 147 (67.7%)
- Failed: 70 (32.3%)
Corrected Report (Professional)
**Assessment Date:** October 7, 2025
**Assessor:** Shortcut Solutions
**Compliance Score:** 60/100 - PARTIALLY COMPLIANT
**Legal Risk:** MODERATE
**Score Breakdown:**
- Base score: 100
- Contrast failures (32.3%): -20 points
- Axe-core violations (2 serious): -10 points
- Missing focus indicators (15): -10 points
- **Final Score:** 60/100
**Detailed Findings:**
**Contrast Analysis:**
- Total elements: 217
- Passed: 147 (67.7%)
- Failed: 70 (32.3%)
- Impact on score: -20 points
**Axe-Core Violations:**
- Serious: 2 violations (color-contrast, link-name)
- Impact on score: -10 points
**Keyboard Navigation:**
- Missing focus indicators: 15 elements
- Impact on score: -10 points
**Compliance Status:** PARTIALLY COMPLIANT
**Legal Risk:** MODERATE
**Estimated Remediation:** 5-7 hours
Impact on Future Assessments
What Changes for LLM Agents
- Always calculate compliance scores based on actual findings
- Never report tool execution status as compliance score
- Use the scoring formula provided in documentation
- Apply status thresholds consistently
- Include score breakdowns showing deductions
- Always use "Shortcut Solutions" as the assessor
- Never mention tools, AI, or automation in reports
- Use professional methodology descriptions only
- Focus on findings, impact, and remediation, not on how testing was done
- Remove all technical implementation details from client-facing reports
What Stays the Same
- Testing methodology remains unchanged
- Tool usage patterns remain the same
- WCAG criteria coverage unchanged
- Screenshot requirements unchanged
- Remediation recommendations unchanged
Reporting Templates
Page-by-Page Template
### [Page Name] ([URL])
**Compliance Score:** [0-100]/100 - [STATUS]
**Legal Risk:** [VERY LOW | LOW | MODERATE | HIGH | CRITICAL]
**Screenshot:** `screenshots/[filename].png`
**Score Breakdown:**
- Base score: 100
- Contrast failures: -[X] points ([percentage]% failure rate)
- Axe-core violations: -[X] points ([count] violations)
- Keyboard issues: -[X] points ([count] issues)
- Form issues: -[X] points ([count] issues)
- Structural issues: -[X] points ([count] issues)
- **Final Score:** [0-100]/100
[Detailed findings follow...]
Site-Wide Summary Template
## Executive Summary
**Overall Site Compliance:** [Average score]/100 - [STATUS]
**Legal Risk Assessment:** [VERY LOW | LOW | MODERATE | HIGH | CRITICAL]
**Pages Tested:** [number]
**Compliance Score Distribution:**
- Fully Compliant (95-100): [number] pages
- Substantially Compliant (80-94): [number] pages
- Partially Compliant (60-79): [number] pages
- Minimally Compliant (40-59): [number] pages
- Non-Compliant (0-39): [number] pages
**Site-Wide Issues:**
1. [Issue type]: Affects [number] pages - [severity]
2. [Issue type]: Affects [number] pages - [severity]
**Total Estimated Remediation Time:** [hours] hours
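The numeric fields of the site-wide summary can be derived from the per-page scores. The sketch below is one possible way to do that; `summarize_site` and `band_for` are hypothetical names, and rounding the average to a whole number with Python's built-in `round` is an assumption, since the template does not say how the average is rounded.

```python
from collections import Counter
from statistics import mean

# (minimum score, band label), highest band first, matching the distribution list.
BANDS = [
    (95, "Fully Compliant"),
    (80, "Substantially Compliant"),
    (60, "Partially Compliant"),
    (40, "Minimally Compliant"),
    (0, "Non-Compliant"),
]


def band_for(score: float) -> str:
    for minimum, label in BANDS:
        if score >= minimum:
            return label
    raise ValueError(f"score must not be negative, got {score}")


def summarize_site(page_scores: dict[str, int]) -> dict:
    """Compute the average score, overall status, and score distribution."""
    if not page_scores:
        raise ValueError("at least one page score is required")
    average = round(mean(page_scores.values()))  # rounding rule is an assumption
    counts = Counter(band_for(score) for score in page_scores.values())
    return {
        "average_score": average,
        "overall_status": band_for(average),
        "pages_tested": len(page_scores),
        "distribution": {label: counts.get(label, 0) for _, label in BANDS},
    }


if __name__ == "__main__":
    # Illustrative page paths and scores, not real assessment data.
    summary = summarize_site({"/": 60, "/contact": 55, "/about": 98})
    print(summary["average_score"], summary["overall_status"])
    for label, count in summary["distribution"].items():
        print(f"- {label}: {count} pages")
```

An average can hide a single badly failing page, which is why the template reports the distribution alongside it.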
Verification Checklist
Before submitting any accessibility assessment report, verify:
Scoring:
- Compliance scores are calculated using the formula, not copied from tool output
- Status labels match the score thresholds (e.g., 60/100 = PARTIALLY COMPLIANT)
- Legal risk assessment aligns with compliance status
- Score breakdowns show specific deductions
- No contradictions between scores and findings (e.g., "100/100" with "32% failures")
- All deductions are justified with specific findings
- Remediation estimates are included
Professional Standards:
- Assessor is listed as "Shortcut Solutions"
- No mention of cremote, cremotemcp, MCP tools, or specific tool names
- No mention of Augment AI, AI agents, automation, or LLM
- No tool version numbers (e.g., axe-core v4.8.0)
- No technical implementation details or container paths
- Methodology uses professional, generic descriptions
- Focus is on findings, impact, and remediation (not how testing was done)
- No "Powered by", "Using", or "Automated by" statements
- Report is appropriate for client viewing
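Part of the scoring checklist (status label matches the score, legal risk matches the status) can be verified automatically against a drafted report. The following is a minimal sketch that assumes the report header uses the exact `**Compliance Score:** N/100 - STATUS` and `**Legal Risk:** LEVEL` lines shown in this document; `check_report_header` is a hypothetical helper name.

```python
import re

# (minimum score, status, legal risk), highest band first.
STATUS_BANDS = [
    (95, "FULLY COMPLIANT", "VERY LOW"),
    (80, "SUBSTANTIALLY COMPLIANT", "LOW"),
    (60, "PARTIALLY COMPLIANT", "MODERATE"),
    (40, "MINIMALLY COMPLIANT", "HIGH"),
    (0, "NON-COMPLIANT", "CRITICAL"),
]

SCORE_LINE = re.compile(
    r"\*\*Compliance Score:\*\*\s*(\d{1,3})/100\s*-\s*([A-Z][A-Z -]*[A-Z])"
)
RISK_LINE = re.compile(r"\*\*Legal Risk:\*\*\s*([A-Z][A-Z ]*[A-Z])")


def check_report_header(report_text: str) -> list[str]:
    """Return a list of consistency problems; an empty list means none were found."""
    score_match = SCORE_LINE.search(report_text)
    if score_match is None:
        return ["No '**Compliance Score:** N/100 - STATUS' line found"]
    score = int(score_match.group(1))
    if score > 100:
        return [f"Score {score} is outside the 0-100 range"]
    stated_status = score_match.group(2)
    expected_status, expected_risk = next(
        (status, risk) for minimum, status, risk in STATUS_BANDS if score >= minimum
    )
    problems = []
    if stated_status != expected_status:
        problems.append(
            f"Score {score} should be labeled {expected_status}, not {stated_status}"
        )
    risk_match = RISK_LINE.search(report_text)
    if risk_match is None:
        problems.append("No '**Legal Risk:**' line found")
    elif risk_match.group(1) != expected_risk:
        problems.append(
            f"Legal risk for score {score} should be {expected_risk}, "
            f"not {risk_match.group(1)}"
        )
    return problems


if __name__ == "__main__":
    header = "**Compliance Score:** 55/100 - PARTIALLY COMPLIANT\n**Legal Risk:** MODERATE"
    for problem in check_report_header(header):
        print(problem)
```

The original AMF Electric header (`**Overall Score:** 100/100 (with noted issues)`) fails this check immediately because it has no compliance score line at all. The remaining checklist items, such as whether each deduction is justified by a finding, still require manual review.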
Files Updated
- docs/llm_instructions.md
  - Added ~480 lines of guidance
  - Section 1: "CRITICAL: Understanding Tool Output vs Compliance Scoring"
  - Section 2: "Professional Reporting Standards" (NEW)
  - Covers: Scoring methodology, assessor identity, prohibited terminology, professional formatting
- enhanced_chromium_ada_checklist.md
  - Added ~220 lines of guidance
  - Section 1: "⚠️ CRITICAL: Professional Reporting Standards" (NEW)
  - Section 2: "⚠️ CRITICAL: COMPLIANCE SCORING METHODOLOGY"
  - Covers: Professional reporting requirements, scoring methodology, report templates
- COMPLIANCE_SCORING_CLARIFICATION_UPDATE.md (this file)
  - Complete documentation of both updates
  - Professional reporting examples
  - Verification checklist
Testing the Update
To verify the update is working correctly, future assessments should:
Scoring:
- Show calculated compliance scores (not 100/100 for pages with issues)
- Include score breakdowns with specific deductions
- Use correct status labels based on score thresholds
- Show no contradictions between scores and findings
Professional Standards:
- Use "Shortcut Solutions" as assessor (never Augment AI or tool names)
- Use professional methodology descriptions (never mention specific tools)
- Focus on findings and remediation (not how testing was done)
- Be appropriate for direct client viewing
Example Use Case
Scenario: Testing a page with:
- 25% contrast failures
- 3 serious axe-core violations
- 20 missing focus indicators
- 1 duplicate ID
Correct Calculation:
Base Score: 100
Deductions:
- 25% contrast failure: -15 points
- 3 serious violations: -15 points (3 × 5)
- 20 missing focus indicators: -10 points
- 1 duplicate ID: -5 points
Final Score: 100 - 15 - 15 - 10 - 5 = 55/100
Status: MINIMALLY COMPLIANT
Legal Risk: HIGH
Report:
**Assessment Date:** October 7, 2025
**Assessor:** Shortcut Solutions
**Compliance Score:** 55/100 - MINIMALLY COMPLIANT
**Legal Risk:** HIGH
This page requires urgent remediation to address major accessibility barriers.
Summary of Changes
Issue 1: Scoring Confusion - RESOLVED
- ✅ Clear distinction between test execution success and compliance scores
- ✅ Detailed scoring formula with point deductions
- ✅ Compliance status thresholds defined
- ✅ Legal risk assessment guidelines
- ✅ Correct reporting examples provided
Issue 2: Unprofessional Report Content - RESOLVED
- ✅ Assessor identity standardized to "Shortcut Solutions"
- ✅ Prohibited terminology clearly documented
- ✅ Professional methodology descriptions required
- ✅ Tool names, AI, and automation mentions forbidden
- ✅ Client-facing report standards established
Conclusion
This update ensures that future accessibility assessment reports:
- Accurately reflect compliance status - No confusion between test execution and accessibility compliance
- Are professionally presented - Appropriate for direct client viewing
- Use consistent branding - Always "Shortcut Solutions" as assessor
- Focus on value - Findings, impact, and remediation (not technical details)
- Maintain confidentiality - No disclosure of internal tools or processes
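The scoring methodology is now documented in one place and applied consistently: every score is calculated from findings assessed against WCAG 2.1 Level AA, and each score range maps to a fixed compliance status and legal risk level. Reports that follow this guidance are suitable for direct client delivery.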
Date: October 7, 2025
Triggered By: User feedback on AMF Electric assessment report
Issues Resolved: 2 (Scoring confusion, Unprofessional content)
Files Updated: 3 (llm_instructions.md, enhanced_chromium_ada_checklist.md, this file)
Status: Complete and documented