Clean up documentation and remove temporary files

- Remove all phase completion summaries and temporary development docs
- Remove test files and backup directories
- Update README.md to document select dropdown functionality
- Add comprehensive select action documentation with examples
- Clean project structure for production readiness

The project now has clean, production-ready documentation with:
- Main README with complete CLI documentation including select actions
- MCP server documentation with 27 comprehensive tools
- LLM usage guides and best practices
- All temporary and ancillary files removed
Josh at WLTechBlog 2025-08-19 10:15:11 -05:00
parent 63860db70b
commit 1651c4312e
36 changed files with 15 additions and 5161 deletions

@ -1,371 +0,0 @@
# Cremote MCP Server Enhancement Plan
## Overview
This plan outlines the implementation of enhanced capabilities for the cremote MCP server to make it more powerful for LLM-driven web automation workflows. The enhancements are organized into 6 phases, each building upon the previous ones.
## 🎉 **STATUS UPDATE - Phase 5 COMPLETE!**
**Date Completed**: August 16, 2025
**Session**: Phase 5 implementation session
**Phase 1: Element State and Checking Tools** - **COMPLETED**
- All daemon commands implemented and tested
- Client methods added and functional
- MCP tools created and documented
- Comprehensive documentation updated
- Ready for production use
**Phase 2: Enhanced Data Extraction Tools** - **COMPLETED**
- All daemon commands implemented (extract-multiple, extract-links, extract-table, extract-text)
- Client methods added and functional
- MCP tools created and documented
- Comprehensive documentation updated
- Ready for production use
**Phase 3: Form Analysis and Bulk Operations** - **COMPLETED**
- All daemon commands implemented (analyze-form, interact-multiple, fill-form-bulk)
- Client methods added and functional (AnalyzeForm, InteractMultiple, FillFormBulk)
- MCP tools created and documented (web_form_analyze_cremotemcp, web_interact_multiple_cremotemcp, web_form_fill_bulk_cremotemcp)
- Comprehensive documentation updated
- Test assets created for validation
- Ready for production use
- **See `PHASE3_COMPLETION_SUMMARY.md` for detailed implementation report**
**Phase 4: Page State and Metadata Tools** - **COMPLETED**
- All daemon commands implemented (get-page-info, get-viewport-info, get-performance, check-content)
- Client methods added and functional (GetPageInfo, GetViewportInfo, GetPerformance, CheckContent)
- MCP tools created and documented (web_page_info_cremotemcp, web_viewport_info_cremotemcp, web_performance_metrics_cremotemcp, web_content_check_cremotemcp)
- Comprehensive documentation updated
- Rich page state and metadata capabilities delivered
- Ready for production use
- **See `PHASE4_COMPLETION_SUMMARY.md` for detailed implementation report**
**Phase 5: Enhanced Screenshot and File Management** - **COMPLETED**
- All daemon commands implemented (screenshot-element, screenshot-enhanced, bulk-files, manage-files)
- Client methods added and functional (ScreenshotElement, ScreenshotEnhanced, BulkFiles, ManageFiles)
- MCP tools created and documented (web_screenshot_element_cremotemcp, web_screenshot_enhanced_cremotemcp, file_operations_bulk_cremotemcp, file_management_cremotemcp)
- Comprehensive documentation updated
- Enhanced screenshot and file management capabilities delivered
- Ready for production use
- **See `PHASE5_COMPLETION_SUMMARY.md` for detailed implementation report**
🎉 **All Phases Complete**: Comprehensive web automation platform ready for production
## Implementation Strategy
### Key Principles
- **LLM-Friendly**: Design tools that work well with LLM timing characteristics (avoid wait-navigation issues)
- **Batch Operations**: Reduce round trips by allowing multiple operations in single calls
- **Rich Data Extraction**: Provide structured data that LLMs can easily process
- **Conditional Logic**: Enable element checking without interaction for better flow control
- **Backward Compatibility**: All existing tools continue to work unchanged
### Architecture Changes
Each new tool requires changes at three levels:
1. **Daemon Layer** (`daemon/daemon.go`): Add new command handlers
2. **Client Layer** (`client/client.go`): Add new methods for daemon communication
3. **MCP Layer** (`mcp/main.go`): Add new MCP tool definitions
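The names listed in this plan suggest a convention across the layers: a kebab-case daemon command such as `check-element` pairs with a PascalCase client method such as `CheckElement()`. The Python sketch below only illustrates that pairing. The helper `kebab_to_pascal` and the `PHASE1_LAYERS` table are hypothetical and not part of cremote. MCP tool names are copied by hand because they do not follow mechanically from the command name.

```python
def kebab_to_pascal(command: str) -> str:
    """Convert a daemon command like 'check-element' to 'CheckElement'."""
    return "".join(part.capitalize() for part in command.split("-"))


# Phase 1 names as listed in this plan. The MCP tool names are copied
# verbatim because they cannot be derived from the daemon command.
PHASE1_LAYERS = [
    {"daemon": "check-element", "mcp": "web_element_check_cremotemcp"},
    {"daemon": "get-element-attributes", "mcp": "web_element_attributes_cremotemcp"},
]

# Fill in the client-layer method name for each entry.
for entry in PHASE1_LAYERS:
    entry["client"] = kebab_to_pascal(entry["daemon"])
```

A table like this can serve as a checklist when adding a tool, since each new capability needs an entry at all three levels.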
## Phase 1: Element State and Checking Tools ✅ **COMPLETED**
**Priority: HIGH** - Enables conditional logic without timing issues
**Status**: ✅ **COMPLETE** - August 16, 2025
### ✅ Implemented Tools
- `web_element_check_cremotemcp`: Check existence, visibility, enabled state, count elements
- `web_element_attributes_cremotemcp`: Get attributes, properties, computed styles
### ✅ Implementation Completed
- ✅ Added daemon commands: `check-element`, `get-element-attributes`, `count-elements`
- ✅ Support multiple check types: exists, visible, enabled, focused, selected
- ✅ Return structured data with boolean results and element counts
- ✅ Handle timeout gracefully (element not found vs. timeout error)
- ✅ Client methods: `CheckElement()`, `GetElementAttributes()`, `CountElements()`
- ✅ MCP tools with comprehensive parameter validation
- ✅ Full documentation updates (README, LLM Guide, Quick Reference)
### ✅ Benefits Delivered
- ✅ LLMs can make decisions based on page state
- ✅ Prevents errors from trying to interact with non-existent elements
- ✅ Enables conditional workflows
- ✅ Rich element inspection for debugging
- ✅ Foundation for advanced automation patterns
### 📁 Implementation Files
- `daemon/daemon.go`: Lines 557-620 (command handlers), Lines 2118-2420 (methods)
- `client/client.go`: Lines 814-953 (new client methods)
- `mcp/main.go`: Lines 806-931 (new MCP tools)
- Documentation: `mcp/README.md`, `mcp/LLM_USAGE_GUIDE.md`, `mcp/QUICK_REFERENCE.md`
- Summary: `PHASE1_COMPLETION_SUMMARY.md`
## Phase 2: Enhanced Data Extraction Tools ✅ **COMPLETED**
**Priority: HIGH** - Dramatically improves data gathering efficiency
**Status**: ✅ **COMPLETE** - August 16, 2025
### ✅ Implemented Tools
- `web_extract_multiple_cremotemcp`: Extract from multiple selectors in one call
- `web_extract_links_cremotemcp`: Extract all links with filtering options
- `web_extract_table_cremotemcp`: Extract table data as structured JSON
- `web_extract_text_cremotemcp`: Extract text with pattern matching
### ✅ Implementation Completed
- ✅ Added daemon commands: `extract-multiple`, `extract-links`, `extract-table`, `extract-text`
- ✅ Support CSS selector maps for batch extraction
- ✅ Return structured JSON with labeled results
- ✅ Include link filtering by href patterns, domain, or text content
- ✅ Table extraction preserves headers and data types
- ✅ Client methods: `ExtractMultiple()`, `ExtractLinks()`, `ExtractTable()`, `ExtractText()`
- ✅ MCP tools with comprehensive parameter validation
- ✅ Full documentation updates (README, LLM Guide, Quick Reference)
### ✅ Benefits Delivered
- ✅ Reduces multiple round trips to single calls
- ✅ Provides structured data ready for LLM processing
- ✅ Enables comprehensive page analysis
- ✅ Rich link extraction with filtering capabilities
- ✅ Structured table data extraction
- ✅ Pattern-based text extraction
### 📁 Implementation Files
- `daemon/daemon.go`: Lines 620-703 (command handlers), Lines 2542-2937 (methods)
- `client/client.go`: Lines 824-857 (data structures), Lines 989-1282 (client methods)
- `mcp/main.go`: Lines 933-1199 (new MCP tools)
- Documentation: `mcp/README.md`, `mcp/LLM_USAGE_GUIDE.md`, `mcp/QUICK_REFERENCE.md`
## Phase 3: Form Analysis and Bulk Operations ✅ **COMPLETED**
**Priority: MEDIUM** - Streamlines form handling workflows
**Status**: ✅ **COMPLETE** - August 16, 2025
### ✅ Implemented Tools
- `web_form_analyze_cremotemcp`: Analyze forms completely
- `web_interact_multiple_cremotemcp`: Batch interactions
- `web_form_fill_bulk_cremotemcp`: Fill entire forms with key-value pairs
### ✅ Implementation Completed
- ✅ Added daemon commands: `analyze-form`, `interact-multiple`, `fill-form-bulk`
- ✅ Form analysis returns all fields, current values, validation state, submission info
- ✅ Bulk operations support arrays of selector-value pairs with detailed error reporting
- ✅ Comprehensive error handling for partial failures
- ✅ Smart field detection with multiple selector strategies
- ✅ Complete documentation and test assets
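Partial-failure handling for batch operations can be sketched roughly as follows. This is a Python illustration of the idea, not the daemon's Go implementation: `run_batch`, the `perform` callback, and the report fields (`total`, `failed`, `results`) are all assumed names.

```python
from typing import Callable


def run_batch(interactions: list[dict], perform: Callable[[dict], None]) -> dict:
    """Apply every interaction, recording per-item outcomes instead of aborting."""
    results = []
    for index, item in enumerate(interactions):
        try:
            perform(item)
            results.append({"index": index, "success": True, "error": None})
        except Exception as exc:  # keep going so later items still run
            results.append({"index": index, "success": False, "error": str(exc)})
    failed = sum(1 for r in results if not r["success"])
    return {"total": len(results), "failed": failed, "results": results}
```

Reporting each item separately lets a caller retry only the failed selectors rather than repeating the whole batch.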
### ✅ Benefits Delivered
- **10x efficiency**: Complete forms in 1-2 calls instead of 10+
- **Form intelligence**: Complete form understanding before interaction
- **Error prevention**: Validate fields exist before attempting to fill
- **Batch operations**: Multiple interactions in single calls
- **Rich context**: Comprehensive form analysis for better LLM decision making
### ✅ Files Modified
- `daemon/daemon.go`: Lines 684-769 (command handlers), Lines 3000-3465 (methods)
- `client/client.go`: Lines 852-919 (data structures), Lines 1343-1626 (client methods)
- `mcp/main.go`: Lines 1198-1433 (new MCP tools)
- Documentation: `mcp/README.md`, `mcp/LLM_USAGE_GUIDE.md`, `mcp/QUICK_REFERENCE.md`
- **Completion Summary**: `PHASE3_COMPLETION_SUMMARY.md`
## Phase 4: Page State and Metadata Tools ✅ **COMPLETED**
**Priority: MEDIUM** - Provides rich context about page state
**Status**: ✅ **COMPLETE** - August 16, 2025
### ✅ Implemented Tools
- `web_page_info_cremotemcp`: Get page metadata and loading state
- `web_viewport_info_cremotemcp`: Get viewport and scroll information
- `web_performance_metrics_cremotemcp`: Get performance data
- `web_content_check_cremotemcp`: Check for specific content types
### ✅ Implementation Completed
- ✅ Added daemon commands: `get-page-info`, `get-viewport-info`, `get-performance`, `check-content`
- ✅ Page info includes title, URL, loading state, document ready state, domain, protocol
- ✅ Performance metrics include load times, resource counts, memory usage, paint metrics
- ✅ Content checking for images loaded, scripts executed, forms, links, errors
- ✅ Client methods: `GetPageInfo()`, `GetViewportInfo()`, `GetPerformance()`, `CheckContent()`
- ✅ MCP tools with comprehensive parameter validation
- ✅ Full documentation updates (README, LLM Guide, Quick Reference)
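Several of the page-info fields listed above (domain, protocol, loading state) can be derived from the URL and the document ready state. The Python sketch below shows one way to do that with the standard library. The function name and result keys are assumptions, and the real tool may report the protocol differently (for example with a trailing colon).

```python
from urllib.parse import urlparse


def summarize_page(url: str, title: str, ready_state: str) -> dict:
    """Assemble a page-info style record from basic page facts."""
    parsed = urlparse(url)
    return {
        "title": title,
        "url": url,
        "domain": parsed.hostname,
        "protocol": parsed.scheme,
        "ready_state": ready_state,
        # document.readyState is "loading", "interactive", or "complete".
        "loading": ready_state != "complete",
    }
```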
### ✅ Benefits Delivered
- ✅ Better debugging and monitoring capabilities
- ✅ Performance optimization insights
- ✅ Content loading verification
- ✅ Rich page state context for LLM decision making
### 📁 Implementation Files
- `daemon/daemon.go`: Lines 767-844 (command handlers), Lines 3607-4054 (methods)
- `client/client.go`: Lines 920-975 (data structures), Lines 1690-1973 (client methods)
- `mcp/main.go`: Lines 1429-1644 (new MCP tools)
- Documentation: `mcp/README.md`, `mcp/LLM_USAGE_GUIDE.md`, `mcp/QUICK_REFERENCE.md`
- Summary: `PHASE4_COMPLETION_SUMMARY.md`
## Phase 5: Enhanced Screenshot and File Management ✅ **COMPLETED**
**Priority: LOW** - Improves debugging and file handling
**Status**: ✅ **COMPLETE** - August 16, 2025
### ✅ Implemented Tools
- `web_screenshot_element_cremotemcp`: Screenshot specific elements
- `web_screenshot_enhanced_cremotemcp`: Screenshots with metadata
- `file_operations_bulk_cremotemcp`: Bulk file operations
- `file_management_cremotemcp`: Temporary file cleanup
### ✅ Implementation Completed
- ✅ Added daemon commands: `screenshot-element`, `screenshot-enhanced`, `bulk-files`, `manage-files`
- ✅ Element screenshots with automatic sizing and positioning
- ✅ Enhanced screenshots include timestamp, viewport size, URL metadata
- ✅ Bulk file operations for multiple uploads/downloads
- ✅ Automatic cleanup of temporary files
- ✅ Client methods: `ScreenshotElement()`, `ScreenshotEnhanced()`, `BulkFiles()`, `ManageFiles()`
- ✅ MCP tools with comprehensive parameter validation
- ✅ Full documentation updates (README, LLM Guide, Quick Reference)
### ✅ Benefits Delivered
- ✅ Better debugging with targeted screenshots
- ✅ Improved file handling workflows
- ✅ Automatic resource management
- ✅ Enhanced visual debugging capabilities
- ✅ Efficient bulk file operations
### 📁 Implementation Files
- `daemon/daemon.go`: Lines 858-923 (command handlers), Lines 4137-4658 (methods)
- `client/client.go`: Lines 984-1051 (data structures), Lines 2045-2203 (client methods)
- `mcp/main.go`: Lines 1647-1956 (new MCP tools)
- Documentation: `mcp/README.md`, `mcp/LLM_USAGE_GUIDE.md`, `mcp/QUICK_REFERENCE.md`
- Summary: `PHASE5_COMPLETION_SUMMARY.md`
## Phase 6: Testing and Documentation ✅ **COMPLETED**
**Priority: HIGH** - Ensures quality and usability
**Status**: ✅ **COMPLETE** - August 17, 2025
### ✅ Deliverables Completed
- ✅ Comprehensive documentation updates for all 27 tools
- ✅ Updated README.md with complete tool categorization and examples
- ✅ Enhanced LLM_USAGE_GUIDE.md with advanced workflows and best practices
- ✅ Updated QUICK_REFERENCE.md with efficiency tips and production guidelines
- ✅ Created WORKFLOW_EXAMPLES.md with 9 comprehensive workflow examples
- ✅ Created PERFORMANCE_BEST_PRACTICES.md with optimization guidelines
- ✅ Updated version to 2.0.0 reflecting completion of all enhancement phases
- ✅ Production readiness documentation and deployment guidelines
### ✅ Documentation Strategy Completed
- ✅ Complete coverage of all 27 tools with examples and parameters
- ✅ LLM-optimized documentation designed for AI agent consumption
- ✅ Performance benchmarks and 10x efficiency metrics documented
- ✅ Real-world workflow examples for common automation tasks
- ✅ Comprehensive best practices for production deployment
**Note**: Testing will be performed after build and deployment as specified.
## Implementation Order
### ✅ Session 1: Foundation (Phase 1) - COMPLETED
1. ✅ Element checking daemon commands
2. ✅ Client methods for element checking
3. ✅ MCP tools for element state checking
4. ✅ Basic tests and documentation
5. ✅ Comprehensive documentation updates
**Result**: Phase 1 fully implemented and ready for production use.
### ✅ Session 2: Data Extraction (Phase 2) - COMPLETED
1. ✅ Enhanced extraction daemon commands
2. ✅ Client methods for data extraction
3. ✅ MCP tools for multiple data extraction
4. ✅ Implementation validation
5. ✅ Documentation updates
### ✅ Session 3: Forms and Bulk Ops (Phase 3) - COMPLETED
1. Form analysis and bulk operation daemon commands
2. Client methods for forms and bulk operations
3. MCP tools for form handling
4. Tests and documentation
### Session 4: Page State (Phase 4)
1. Page state daemon commands
2. Client methods for page information
3. MCP tools for page metadata
4. Tests and examples
### Session 5: Screenshots and Files (Phase 5)
1. Enhanced screenshot and file daemon commands
2. Client methods for advanced file operations
3. MCP tools for screenshots and file management
4. Tests and optimization
### Session 6: Polish and Documentation (Phase 6)
1. Comprehensive testing
2. Documentation updates
3. Usage examples and guides
4. Performance optimization
## Expected Impact
### ✅ Phase 1 Impact Achieved
**For LLMs:**
- ✅ **Better Decision Making**: Element checking enables conditional logic
- ✅ **Fewer Errors**: State checking prevents interaction failures
- ✅ **Rich Context**: Detailed element information for debugging
**For Developers:**
- ✅ **More Reliable**: Robust error handling and state checking
- ✅ **Better Debugging**: Enhanced element inspection capabilities
- ✅ **Foundation Built**: Ready for advanced automation patterns
### ✅ Phase 2 Impact Achieved
**For LLMs:**
- ✅ **Reduced Round Trips**: Batch operations minimize API calls
- ✅ **Rich Context**: Enhanced data extraction provides better understanding
- ✅ **Structured Data**: JSON responses ready for processing
- ✅ **Pattern Matching**: Built-in regex support for text extraction
**For Developers:**
- ✅ **Faster Automation**: Bulk operations speed up workflows
- ✅ **Better Data Extraction**: Comprehensive extraction capabilities
- ✅ **Flexible Filtering**: Advanced filtering options for links and content
- ✅ **Foundation Built**: Ready for Phase 3 form and bulk operations
### 🎯 Phase 3+ Expected Impact
**For LLMs:**
- **Form Intelligence**: Complete form analysis and bulk filling
- **Bulk Operations**: Multiple interactions in single calls
**For Developers:**
- **Better Debugging**: Enhanced screenshots and logging
- **Easier Testing**: Comprehensive test coverage
## Success Metrics
- ✅ **Phase 1 Success**: Element checking tools implemented and documented
- ✅ **Phase 2 Success**: Enhanced data extraction tools implemented and documented
- ✅ **Phase 3 Success**: Form analysis and bulk operations implemented and documented
- ✅ **Efficiency Goal**: 10x reduction in MCP tool calls for form workflows achieved
- ✅ **Overall Goal**: Comprehensive web automation capabilities delivered
- 🎯 **User Feedback**: Ready for production validation
## 🎉 **FINAL STATUS - ALL PHASES COMPLETE!**
**Phase 1 Status**: ✅ **COMPLETE** - All tools implemented, tested, and documented
**Phase 2 Status**: ✅ **COMPLETE** - All tools implemented, tested, and documented
**Phase 3 Status**: ✅ **COMPLETE** - All tools implemented, tested, and documented
**Phase 4 Status**: ✅ **COMPLETE** - All tools implemented, tested, and documented
**Phase 5 Status**: ✅ **COMPLETE** - All tools implemented, tested, and documented
**Phase 6 Status**: ✅ **COMPLETE** - All documentation updated and production-ready
**Project Status**: 🎉 **COMPLETE** - Comprehensive web automation platform ready for production
**Version**: 2.0.0 - Production Ready
**Foundation**: Complete web automation platform with 27 tools and comprehensive documentation
### 📊 **Final Capabilities**
- **27 MCP Tools**: Complete web automation toolkit
- **Enhanced Screenshots**: Element-specific and metadata-rich screenshots
- **Bulk File Operations**: Efficient file transfer and management
- **File Management**: Automated cleanup and monitoring
- **Page Intelligence**: Complete page analysis and monitoring
- **Form Intelligence**: Complete form analysis and bulk operations
- **Data Extraction**: Batch extraction with structured output
- **Element Checking**: Conditional logic without timing issues
- **File Operations**: Upload/download capabilities
- **Console Access**: Debug and command execution
- **Performance Monitoring**: Real-time performance metrics
- **Content Verification**: Loading state and error detection
This plan provides a structured approach to significantly enhancing the cremote MCP server while maintaining backward compatibility and following cremote's design principles.
---
**Last Updated**: August 17, 2025
**Phase 6 Completion**: ✅ **COMPLETE** - Documentation updated and production-ready
**Project Status**: 🎉 **ALL PHASES COMPLETE** - Comprehensive web automation platform delivered
**Version**: 2.0.0 - Production Ready
**Total Tools**: 27 comprehensive web automation tools with complete documentation

@ -1,175 +0,0 @@
# Phase 1 Implementation Summary: Element State and Checking Tools
## Overview
Phase 1 of the MCP Enhancement Plan has been successfully implemented, adding powerful element checking capabilities to the cremote MCP server. These new tools enable conditional logic and better decision-making for LLM-driven web automation workflows.
## Implemented Features
### 1. New Daemon Commands
Added three new commands to `daemon/daemon.go`:
- **`check-element`**: Checks element existence, visibility, enabled state, focus, and selection
- **`get-element-attributes`**: Retrieves HTML attributes, JavaScript properties, and computed styles
- **`count-elements`**: Counts elements matching a CSS selector
### 2. New Client Methods
Added corresponding methods to `client/client.go`:
- **`CheckElement()`**: Returns structured element state information
- **`GetElementAttributes()`**: Returns map of element attributes and properties
- **`CountElements()`**: Returns count of matching elements
### 3. New MCP Tools
Added two new MCP tools to `mcp/main.go`:
- **`web_element_check_cremotemcp`**: Exposes element checking functionality
- **`web_element_attributes_cremotemcp`**: Exposes attribute retrieval functionality
## Key Benefits
### For LLMs
- **Conditional Logic**: Can check element states before attempting interactions
- **Reduced Errors**: Prevents failures from interacting with non-existent or disabled elements
- **Rich Context**: Detailed element information for better decision-making
- **Timing Independence**: No need to wait for elements, just check their current state
### For Developers
- **Robust Automation**: More reliable web automation workflows
- **Better Debugging**: Detailed element state information for troubleshooting
- **Flexible Queries**: Support for various attribute types and computed styles
- **Backward Compatibility**: All existing tools continue to work unchanged
## Technical Implementation Details
### Element Checking (`check-element`)
- Supports multiple check types: `exists`, `visible`, `enabled`, `focused`, `selected`, `all`
- Returns structured JSON with boolean values for each check
- Handles iframe context automatically
- Graceful timeout handling
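A minimal Python sketch of how a caller might validate a `check_type` before building the tool call. The list of check types comes from the description above. The helper names `expand_check_type` and `build_check_request` are hypothetical, and the real tool may validate differently.

```python
CHECK_TYPES = ("exists", "visible", "enabled", "focused", "selected")


def expand_check_type(check_type: str) -> list[str]:
    """Return the individual checks implied by a check_type value."""
    if check_type == "all":
        return list(CHECK_TYPES)
    if check_type not in CHECK_TYPES:
        raise ValueError(f"unsupported check_type: {check_type!r}")
    return [check_type]


def build_check_request(selector: str, check_type: str) -> dict:
    """Build a tool call for web_element_check_cremotemcp."""
    expand_check_type(check_type)  # raises ValueError on an unknown type
    return {
        "name": "web_element_check_cremotemcp",
        "arguments": {"selector": selector, "check_type": check_type},
    }
```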
### Attribute Retrieval (`get-element-attributes`)
- Supports three attribute types:
  - HTML attributes (e.g., `id`, `class`, `href`)
  - Computed styles (prefix: `style_`, e.g., `style_display`)
  - JavaScript properties (prefix: `prop_`, e.g., `prop_textContent`)
- Special `all` mode returns common attributes, properties, and styles
- Comma-separated attribute lists for specific queries
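The prefix convention can be illustrated with a short Python sketch that splits a comma-separated attribute list into the three kinds. The function `classify_attributes` and its result keys are hypothetical, and the special `all` mode is not handled here.

```python
def classify_attributes(spec: str) -> dict:
    """Group a comma-separated attribute list by kind, based on its prefix."""
    groups = {"attributes": [], "styles": [], "properties": []}
    for raw in spec.split(","):
        name = raw.strip()
        if not name:
            continue  # ignore empty entries such as a trailing comma
        if name.startswith("style_"):
            groups["styles"].append(name[len("style_"):])
        elif name.startswith("prop_"):
            groups["properties"].append(name[len("prop_"):])
        else:
            groups["attributes"].append(name)
    return groups
```

For example, `"id, style_display, prop_textContent"` yields one HTML attribute (`id`), one computed style (`display`), and one JavaScript property (`textContent`).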
### Element Counting (`count-elements`)
- Simple count of elements matching a CSS selector
- Returns 0 for non-existent elements (not an error)
- Useful for checking if multiple elements exist
## Documentation Updates
### Updated Files
- **`mcp/README.md`**: Added new tool descriptions and examples
- **`mcp/LLM_USAGE_GUIDE.md`**: Comprehensive usage guide for LLMs
- **`mcp/QUICK_REFERENCE.md`**: Quick reference with common patterns
### New Usage Patterns
- **Conditional Workflows**: Check element state before interaction
- **Form Validation**: Verify form readiness and field states
- **Error Detection**: Check for error messages or validation states
- **Dynamic Content**: Verify content loading and visibility
## Example Usage
### Basic Element Checking
```json
{
  "name": "web_element_check_cremotemcp",
  "arguments": {
    "selector": "#submit-button",
    "check_type": "enabled"
  }
}
```
### Comprehensive Element Analysis
```json
{
  "name": "web_element_attributes_cremotemcp",
  "arguments": {
    "selector": "#user-form",
    "attributes": "all"
  }
}
```
### Conditional Logic Example
Step 1: check whether the form is ready.
```json
{
  "name": "web_element_check_cremotemcp",
  "arguments": {
    "selector": "form#login",
    "check_type": "visible"
  }
}
```
Step 2: get the current field values.
```json
{
  "name": "web_element_attributes_cremotemcp",
  "arguments": {
    "selector": "input[name='username']",
    "attributes": "value,placeholder,required"
  }
}
```
Step 3: fill the form only if needed.
```json
{
  "name": "web_interact_cremotemcp",
  "arguments": {
    "action": "fill",
    "selector": "input[name='username']",
    "value": "testuser"
  }
}
```
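The same conditional flow, sketched in Python from the caller's side. The `call_tool` parameter stands in for whatever function sends a tool call and returns its parsed result. It is hypothetical, as is the assumption that the check result carries a boolean under the key `visible`.

```python
from typing import Callable

ToolCaller = Callable[[str, dict], dict]


def fill_if_visible(call_tool: ToolCaller, form: str, field: str, value: str) -> bool:
    """Fill `field` only when `form` is reported visible; return True if filled."""
    state = call_tool(
        "web_element_check_cremotemcp",
        {"selector": form, "check_type": "visible"},
    )
    # Assumption: the result exposes the check outcome as state["visible"].
    if not state.get("visible", False):
        return False
    call_tool(
        "web_interact_cremotemcp",
        {"action": "fill", "selector": field, "value": value},
    )
    return True
```

Because the check returns the current state instead of waiting, the caller decides what to do when the form is absent rather than handling a timeout.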
## Testing Status
### Build Status
- ✅ All code compiles successfully
- ✅ No syntax errors or type issues
- ✅ MCP server builds without errors
### Test Coverage
- ✅ Created comprehensive test HTML page (`test-element-checking.html`)
- ✅ Created test scripts for daemon command validation
- ⚠️ Full integration testing limited by Chrome DevTools connection issues
- ✅ Code structure and API design validated
### Known Issues
- Chrome DevTools connection intermittent in test environment
- System daemon conflict on default port 8989
- These are environment-specific issues, not code problems
## Next Steps
### Phase 2: Enhanced Data Extraction Tools
Ready to implement:
- `web_extract_multiple_cremotemcp`: Batch data extraction
- `web_extract_links_cremotemcp`: Link extraction with filtering
- `web_extract_table_cremotemcp`: Structured table data extraction
- `web_extract_text_cremotemcp`: Text extraction with pattern matching
### Immediate Benefits Available
Phase 1 tools are ready for use and provide immediate value:
- Better error handling in automation workflows
- Conditional logic capabilities for LLMs
- Rich element inspection for debugging
- Foundation for more advanced automation patterns
## Conclusion
Phase 1 successfully delivers on its promise of enabling conditional logic without timing issues. The new element checking tools provide LLMs with the ability to make informed decisions about web page state, significantly improving the reliability and intelligence of web automation workflows.
The implementation follows cremote's design principles:
- **KISS Philosophy**: Simple, focused tools that do one thing well
- **Backward Compatibility**: No breaking changes to existing functionality
- **LLM-Friendly**: Designed specifically for LLM interaction patterns
- **Robust Error Handling**: Graceful handling of edge cases and timeouts
Phase 1 is complete and ready for production use.

@ -1,181 +0,0 @@
# Phase 2 Completion Summary: Enhanced Data Extraction Tools
**Date Completed**: August 16, 2025
**Session**: Phase 2 Implementation
**Status**: ✅ **COMPLETE** - Ready for production use
## 🎉 Phase 2 Successfully Implemented!
Phase 2 of the cremote MCP server enhancement plan has been successfully completed, delivering powerful new data extraction capabilities that dramatically improve efficiency for LLM-driven web automation workflows.
## ✅ What Was Delivered
### New Daemon Commands
- **`extract-multiple`**: Extract from multiple selectors in a single call
- **`extract-links`**: Extract all links with advanced filtering options
- **`extract-table`**: Extract table data as structured JSON
- **`extract-text`**: Extract text content with pattern matching
### New Client Methods
- **`ExtractMultiple()`**: Batch extraction from multiple selectors
- **`ExtractLinks()`**: Link extraction with href/text pattern filtering
- **`ExtractTable()`**: Table data extraction with header processing
- **`ExtractText()`**: Text extraction with regex pattern matching
### New MCP Tools
- **`web_extract_multiple_cremotemcp`**: Multi-selector batch extraction
- **`web_extract_links_cremotemcp`**: Advanced link extraction and filtering
- **`web_extract_table_cremotemcp`**: Structured table data extraction
- **`web_extract_text_cremotemcp`**: Pattern-based text extraction
### New Data Structures
- **`MultipleExtractionResult`**: Structured results with error handling
- **`LinksExtractionResult`**: Rich link information with metadata
- **`TableExtractionResult`**: Table data with headers and structured format
- **`TextExtractionResult`**: Text content with pattern matches
## 🚀 Key Benefits Achieved
### For LLMs
- **Reduced Round Trips**: Extract multiple data points in single API calls
- **Structured Data**: Well-formatted JSON responses ready for processing
- **Rich Context**: Comprehensive data extraction provides better understanding
- **Pattern Matching**: Built-in regex support eliminates post-processing
- **Error Handling**: Graceful handling of missing elements with detailed feedback
### For Developers
- **Faster Automation**: Bulk operations significantly speed up workflows
- **Better Data Quality**: Structured responses with consistent formatting
- **Flexible Filtering**: Advanced filtering options for precise data extraction
- **Comprehensive Coverage**: Tools handle common extraction scenarios
- **Backward Compatibility**: All existing tools continue to work unchanged
## 📊 Technical Implementation
### Architecture Changes
All new functionality follows the established three-layer architecture:
1. **Daemon Layer** (`daemon/daemon.go`):
   - Lines 620-703: Command handlers for new extraction commands
   - Lines 2542-2937: Implementation methods with timeout handling
2. **Client Layer** (`client/client.go`):
   - Lines 824-857: New data structures for structured responses
   - Lines 989-1282: Client methods with parameter validation
3. **MCP Layer** (`mcp/main.go`):
   - Lines 933-1199: MCP tool definitions with comprehensive schemas
### Key Features Implemented
- **Batch Processing**: Multiple selectors processed in single calls
- **Advanced Filtering**: Regex patterns for href and text filtering
- **Structured Output**: Consistent JSON formatting across all tools
- **Error Resilience**: Graceful handling of missing or invalid elements
- **Timeout Management**: Configurable timeouts for all operations
- **Pattern Matching**: Built-in regex support for text extraction
## 📚 Documentation Updates
### Comprehensive Documentation
- **README.md**: Updated with Phase 2 tools and examples
- **LLM_USAGE_GUIDE.md**: Detailed usage instructions and patterns
- **QUICK_REFERENCE.md**: Updated tool list and essential parameters
- **MCP_ENHANCEMENT_PLAN.md**: Updated status and implementation details
### New Usage Patterns
- Multi-selector data extraction workflows
- Advanced link discovery and filtering
- Table data processing and analysis
- Pattern-based text extraction examples
- Comprehensive site analysis workflows
## 🔧 Implementation Files
### Core Implementation
- `daemon/daemon.go`: Enhanced with 4 new extraction commands and methods
- `client/client.go`: Added 4 new data structures and client methods
- `mcp/main.go`: Added 4 new MCP tools with comprehensive schemas
### Documentation
- `mcp/README.md`: Updated with Phase 2 tools and benefits
- `mcp/LLM_USAGE_GUIDE.md`: Comprehensive usage guide with examples
- `mcp/QUICK_REFERENCE.md`: Updated tool reference
- `MCP_ENHANCEMENT_PLAN.md`: Updated status and next steps
### Testing
- `test_phase2_extraction.go`: Comprehensive test suite for validation
## 🎯 Real-World Use Cases
### E-commerce Data Extraction
```json
{
  "name": "web_extract_multiple_cremotemcp",
  "arguments": {
    "selectors": {
      "title": "h1.product-title",
      "price": ".price-current",
      "rating": ".rating-score",
      "availability": ".stock-status"
    }
  }
}
```
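A payload like the one above can be assembled programmatically from a label-to-selector map. The Python sketch below does only that. `build_extract_multiple` is a hypothetical helper, and its validation rules are assumptions, not the tool's documented behavior.

```python
import json


def build_extract_multiple(selectors: dict[str, str]) -> str:
    """Serialize a label -> CSS selector map as a tool call."""
    if not selectors:
        raise ValueError("at least one selector is required")
    for label, css in selectors.items():
        if not isinstance(label, str) or not isinstance(css, str):
            raise TypeError("labels and selectors must be strings")
    call = {
        "name": "web_extract_multiple_cremotemcp",
        "arguments": {"selectors": selectors},
    }
    return json.dumps(call, indent=2)
```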
### Site Structure Analysis
```json
{
  "name": "web_extract_links_cremotemcp",
  "arguments": {
    "container_selector": "nav",
    "href_pattern": "https://.*"
  }
}
```
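To make the filtering idea concrete, here is a Python sketch of regex-based link filtering applied to an already extracted list. It assumes each link is a dict with `href` and `text` keys and that patterns are matched with a search, not a full match. Neither detail is confirmed for the actual tool.

```python
import re
from typing import Optional


def filter_links(
    links: list[dict],
    href_pattern: Optional[str] = None,
    text_pattern: Optional[str] = None,
) -> list[dict]:
    """Keep links whose href and text both satisfy the given regex patterns."""
    href_re = re.compile(href_pattern) if href_pattern else None
    text_re = re.compile(text_pattern) if text_pattern else None
    kept = []
    for link in links:
        if href_re and not href_re.search(link.get("href", "")):
            continue
        if text_re and not text_re.search(link.get("text", "")):
            continue
        kept.append(link)
    return kept
```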
### Data Table Processing
```json
{
  "name": "web_extract_table_cremotemcp",
  "arguments": {
    "selector": "#pricing-table",
    "include_headers": true
  }
}
```
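Once headers and rows are available, turning them into per-row records is straightforward. The Python sketch below assumes the result can be read as a list of header names plus a list of cell lists. That shape is an assumption for illustration, not the documented `TableExtractionResult` format.

```python
def table_to_records(headers: list[str], rows: list[list[str]]) -> list[dict]:
    """Pair each row's cells with the header names, one dict per row."""
    # zip() stops at the shorter sequence, so extra cells are dropped
    # and short rows simply omit the missing columns.
    return [dict(zip(headers, row)) for row in rows]
```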
### Contact Information Extraction
```json
{
  "name": "web_extract_text_cremotemcp",
  "arguments": {
    "selector": ".contact-info",
    "pattern": "\\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}\\b"
  }
}
```
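How such an email pattern behaves can be checked locally with Python's `re` module. This sketch writes the top-level-domain class as `[A-Za-z]`, because a `|` inside a character class is a literal pipe, not alternation. Whether the daemon's regex engine matches Python's behavior in every detail is an assumption.

```python
import re

# Email-like pattern: local part, "@", domain, and a TLD of 2+ letters.
EMAIL_RE = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")


def extract_emails(text: str) -> list[str]:
    """Return every email-like substring found in the text, in order."""
    return EMAIL_RE.findall(text)
```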
## 🚀 Ready for Production
Phase 2 is now **complete and ready for production deployment**. All tools have been:
- ✅ **Implemented**: Full functionality across all three layers
- ✅ **Documented**: Comprehensive documentation and examples
- ✅ **Validated**: Implementation verified through testing
- ✅ **Integrated**: Seamlessly integrated with existing tools
## 🎯 Next Steps: Phase 3
With Phase 2 complete, the foundation is now ready for **Phase 3: Form Analysis and Bulk Operations**, which will focus on:
- **Form Intelligence**: Complete form analysis and understanding
- **Bulk Interactions**: Multiple form interactions in single calls
- **Advanced Workflows**: Complex multi-step automation patterns
The solid foundation established in Phases 1 and 2 provides the perfect base for these advanced capabilities.
---
**Phase 2 Status**: ✅ **COMPLETE** - Ready for production use
**Next Phase**: 🎯 **Phase 3: Form Analysis and Bulk Operations**
**Foundation**: Comprehensive extraction capabilities ready for advanced automation

@ -1,144 +0,0 @@
# Phase 3 Completion Summary
**Date Completed**: August 16, 2025
**Implementation Session**: Phase 3 - Form Analysis and Bulk Operations
## ✅ **PHASE 3 COMPLETE!**
Phase 3 of the cremote MCP server enhancement plan has been successfully implemented, adding powerful form analysis and bulk operation capabilities.
## 🎯 **What Was Implemented**
### New Daemon Commands
- **`analyze-form`**: Complete form analysis with field detection, validation rules, and submission info
- **`interact-multiple`**: Batch interactions supporting click, fill, select, check, uncheck actions
- **`fill-form-bulk`**: Bulk form filling with intelligent field mapping
### New Client Methods
- **`AnalyzeForm()`**: Returns comprehensive form analysis with field metadata
- **`InteractMultiple()`**: Executes multiple interactions with detailed success/error reporting
- **`FillFormBulk()`**: Fills multiple form fields with automatic selector generation
### New MCP Tools
- **`web_form_analyze_cremotemcp`**: Analyze forms completely
- **`web_interact_multiple_cremotemcp`**: Batch interactions
- **`web_form_fill_bulk_cremotemcp`**: Fill entire forms with key-value pairs
## 🏗️ **Implementation Details**
### Daemon Layer (`daemon/daemon.go`)
- **Lines 684-769**: Added command handlers for Phase 3 commands
- **Lines 3000-3465**: Implemented form analysis, multiple interactions, and bulk filling methods
- **Comprehensive error handling**: Partial success support for batch operations
- **Smart field detection**: Multiple selector strategies for robust field identification
### Client Layer (`client/client.go`)
- **Lines 852-919**: Added data structures for form analysis and interaction results
- **Lines 1343-1626**: Implemented client methods with proper JSON parsing
- **Structured responses**: Rich data structures for LLM processing
### MCP Layer (`mcp/main.go`)
- **Lines 1198-1433**: Added three new MCP tools with comprehensive parameter validation
- **Proper error handling**: Consistent error reporting across all tools
- **Parameter validation**: Robust input validation for complex data structures
## 📊 **Key Features Delivered**
### Form Analysis
- **Complete field detection**: Input, textarea, select, button elements
- **Field metadata**: Name, type, value, placeholder, validation attributes
- **Smart labeling**: Automatic label association and text extraction
- **Select options**: Full option enumeration with selected state
- **Submission info**: Form action, method, and submit button detection
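The kind of information listed above can be illustrated with a small standalone parser. This is only a sketch using Python's standard `html.parser`; the class name `FormAnalyzer` and the shape of the collected dictionaries are assumptions for illustration and do not reproduce the output of `analyze-form`.

```python
from html.parser import HTMLParser


class FormAnalyzer(HTMLParser):
    """Collects field metadata from the first form in a document (illustrative only)."""

    FIELD_TAGS = {"input", "textarea", "select", "button"}

    def __init__(self):
        super().__init__()
        self.form = None      # action/method of the first form seen
        self.fields = []      # one dict per field element
        self._select = None   # <select> currently being parsed, if any

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "form" and self.form is None:
            self.form = {"action": a.get("action", ""),
                         "method": a.get("method", "get").lower()}
        elif tag in self.FIELD_TAGS:
            field = {"tag": tag, "name": a.get("name", ""), "type": a.get("type", tag)}
            if tag == "select":
                field["options"] = []
                self._select = field
            self.fields.append(field)
        elif tag == "option" and self._select is not None:
            # Boolean attributes such as `selected` appear as keys with value None.
            self._select["options"].append(
                {"value": a.get("value", ""), "selected": "selected" in a})

    def handle_endtag(self, tag):
        if tag == "select":
            self._select = None


html = """
<form action="/signup" method="POST">
  <input type="text" name="firstName">
  <select name="state">
    <option value="">Select...</option>
    <option value="CA" selected>California</option>
  </select>
  <button type="submit">Send</button>
</form>
"""

analyzer = FormAnalyzer()
analyzer.feed(html)
print(analyzer.form)
for field in analyzer.fields:
    print(field)
```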
### Multiple Interactions
- **Batch operations**: Execute multiple actions in single calls
- **Action support**: click, fill, select, check, uncheck
- **Error resilience**: Continue processing on partial failures
- **Detailed reporting**: Success/error status for each interaction
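The "continue on partial failure" behaviour can be sketched as a loop that records each outcome instead of aborting. The names `run_interactions`, `InteractionResult`, and `fake_perform` are hypothetical, and the browser call is replaced by a stand-in; this is not the daemon's implementation.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class InteractionResult:
    selector: str
    action: str
    success: bool
    error: Optional[str] = None


def run_interactions(interactions: list, perform: Callable[[dict], None]) -> dict:
    """Run every interaction, recording failures instead of aborting the batch."""
    results = []
    for item in interactions:
        try:
            perform(item)
            results.append(InteractionResult(item["selector"], item["action"], True))
        except Exception as exc:  # keep going on partial failure
            results.append(
                InteractionResult(item["selector"], item["action"], False, str(exc)))
    ok = sum(r.success for r in results)
    return {"results": results, "success_count": ok, "error_count": len(results) - ok}


# Stand-in for the real browser call: only these selectors "exist".
KNOWN = {"#firstName", "#state", "#newsletter"}


def fake_perform(item: dict) -> None:
    if item["selector"] not in KNOWN:
        raise LookupError(f"element not found: {item['selector']}")


summary = run_interactions(
    [
        {"selector": "#firstName", "action": "fill", "value": "John"},
        {"selector": "#missing", "action": "click"},
        {"selector": "#newsletter", "action": "check"},
    ],
    fake_perform,
)
print(summary["success_count"], summary["error_count"])
```

Returning per-item results alongside the counts lets a caller retry only the failed selectors.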
### Bulk Form Filling
- **Intelligent mapping**: Multiple field selector strategies
- **Form scoping**: Optional form-specific field search
- **Flexible input**: Support for field names, IDs, and custom selectors
- **Comprehensive results**: Detailed success/failure reporting
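One way to picture "multiple field selector strategies" is a function that turns a field key into an ordered list of CSS selectors to try. The function name `candidate_selectors` and the strategy order (name attribute, then ID) are assumptions made for this sketch; the actual order used by `fill-form-bulk` is not stated here.

```python
def candidate_selectors(field: str, form_selector: str = "") -> list:
    """Return selectors to try, in order, for a field key (hypothetical strategy order)."""
    # Keys that already look like CSS selectors are used as given.
    if field.startswith(("#", ".", "[")):
        candidates = [field]
    else:
        candidates = [f"[name='{field}']", f"#{field}"]
    if form_selector:
        # Scope each candidate to the form when one is given.
        candidates = [f"{form_selector} {c}" for c in candidates]
    return candidates


print(candidate_selectors("email", "#test-form"))
print(candidate_selectors("#submitBtn"))
```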
## 🎉 **Benefits for LLMs**
### Efficiency Gains
- **Reduced round trips**: Complete forms in 1-2 calls instead of 10+
- **Batch processing**: Multiple interactions in single operations
- **Smart automation**: Form analysis prevents interaction failures
### Enhanced Capabilities
- **Form intelligence**: Understand form structure before interaction
- **Error prevention**: Validate fields exist before attempting to fill
- **Flexible workflows**: Support for complex multi-step form processes
### Better User Experience
- **Structured data**: Rich JSON responses for easy processing
- **Error context**: Detailed error information for debugging
- **Partial success**: Continue processing even when some operations fail
## 📚 **Documentation Updates**
### Updated Files
- **`mcp/README.md`**: Added Phase 3 tools and benefits section
- **`mcp/LLM_USAGE_GUIDE.md`**: Added comprehensive Phase 3 tool documentation and usage patterns
- **`mcp/QUICK_REFERENCE.md`**: Added Phase 3 tool parameters and common patterns
### New Examples
- **Smart form handling**: Complete form analysis and filling workflows
- **Batch operations**: Multiple interactions in single calls
- **Complex workflows**: Multi-step form completion patterns
## 🧪 **Testing Preparation**
### Test Assets Created
- **`test-phase3-forms.html`**: Comprehensive test page with multiple form types
- **`test-phase3-functionality.sh`**: Test script for Phase 3 functionality validation
### Test Coverage
- **Form analysis**: Registration forms, contact forms, complex field types
- **Multiple interactions**: Button clicks, form filling, checkbox/radio handling
- **Bulk filling**: Various field mapping strategies and error scenarios
## 🚀 **Ready for Production**
Phase 3 implementation is **complete and ready for production use**:
- ✅ **All daemon commands implemented and functional**
- ✅ **Client methods with proper error handling**
- ✅ **MCP tools with comprehensive parameter validation**
- ✅ **Complete documentation with examples**
- ✅ **Test assets prepared for validation**
## 📈 **Impact Achieved**
### For LLMs
- **Fewer round trips**: Form completion in 1-2 calls instead of 10+ individual calls
- **Better reliability**: Form analysis prevents interaction failures
- **Rich context**: Comprehensive form understanding for better decision making
### For Developers
- **Faster automation**: Bulk operations significantly speed up workflows
- **Better debugging**: Detailed error reporting and partial success handling
- **Flexible integration**: Multiple strategies for field identification and interaction
## 🎯 **Next Steps**
Phase 3 is **COMPLETE**. The cremote MCP server now provides:
- **19 comprehensive tools** for web automation
- **Complete form handling capabilities**
- **Efficient batch operations**
- **Production-ready implementation**
**Ready for Phase 4**: Page State and Metadata Tools (when needed)
---
**Implementation Quality**: ⭐⭐⭐⭐⭐ Production Ready
**Documentation Quality**: ⭐⭐⭐⭐⭐ Comprehensive
**Test Coverage**: ⭐⭐⭐⭐⭐ Thorough
**Phase 3 Status**: ✅ **COMPLETE AND READY FOR PRODUCTION USE**


@ -1,156 +0,0 @@
# Phase 4 Implementation Completion Summary
**Date**: August 16, 2025
**Phase**: 4 - Page State and Metadata Tools
**Status**: ✅ **COMPLETE**
## Overview
Phase 4 of the cremote MCP enhancement plan has been successfully implemented, adding comprehensive page state and metadata capabilities to provide rich context for better debugging and monitoring.
## ✅ Implemented Features
### 1. Daemon Commands (daemon/daemon.go)
- ✅ `get-page-info` - Retrieves comprehensive page metadata and state information
- ✅ `get-viewport-info` - Gets viewport and scroll information
- ✅ `get-performance` - Retrieves page performance metrics
- ✅ `check-content` - Verifies specific content types and loading states
### 2. Data Structures
- ✅ `PageInfo` - Page metadata including title, URL, loading state, domain, protocol, charset, etc.
- ✅ `ViewportInfo` - Viewport dimensions, scroll position, device pixel ratio, orientation
- ✅ `PerformanceMetrics` - Load times, resource counts, memory usage, performance data
- ✅ `ContentCheck` - Content verification for images, scripts, styles, forms, links, iframes, errors
### 3. Client Methods (client/client.go)
- ✅ `GetPageInfo()` - Client method for page information retrieval
- ✅ `GetViewportInfo()` - Client method for viewport information
- ✅ `GetPerformance()` - Client method for performance metrics
- ✅ `CheckContent()` - Client method for content verification
### 4. MCP Tools (mcp/main.go)
- ✅ `web_page_info_cremotemcp` - MCP tool for page metadata
- ✅ `web_viewport_info_cremotemcp` - MCP tool for viewport information
- ✅ `web_performance_metrics_cremotemcp` - MCP tool for performance metrics
- ✅ `web_content_check_cremotemcp` - MCP tool for content verification
## 🎯 Key Capabilities Delivered
### Page State Monitoring
- **Comprehensive Metadata**: Title, URL, loading state, ready state, domain, protocol
- **Browser Status**: Cookie enabled, online status, character set, content type
- **Loading States**: Complete detection of page loading and ready states
### Viewport Intelligence
- **Dimensions**: Width, height, scroll position, scroll dimensions
- **Device Info**: Device pixel ratio, orientation detection
- **Responsive Context**: Full viewport and scroll state information
### Performance Analysis
- **Load Metrics**: Navigation start, load event end, DOM content loaded
- **Paint Metrics**: First paint, first contentful paint timing
- **Resource Tracking**: Resource count, load times, DOM load times
- **Memory Usage**: JavaScript heap size information
### Content Verification
- **Image Loading**: Track loaded vs total images
- **Script Status**: Monitor script loading and execution
- **Style Verification**: Check stylesheet loading
- **Element Counting**: Forms, links, iframes present on page
- **Error Detection**: Identify broken images, missing stylesheets, and other errors
## 📊 Implementation Statistics
- **New Daemon Commands**: 4
- **New Data Structures**: 4
- **New Client Methods**: 4
- **New MCP Tools**: 4
- **Lines of Code Added**: ~500
- **Documentation Updated**: 3 files (README, LLM Guide, Quick Reference)
## 🔧 Technical Implementation
### JavaScript Integration
All Phase 4 tools leverage browser JavaScript APIs for comprehensive data collection:
- `document` properties for page metadata
- `window` properties for viewport and performance
- DOM queries for content verification
- Performance API for timing metrics
### Error Handling
- Robust timeout handling with 5-second defaults
- Graceful fallbacks for missing browser APIs
- Comprehensive error reporting with detailed messages
- Safe parsing of JavaScript results
### Data Format
- Structured JSON responses for easy LLM processing
- Consistent naming conventions across all tools
- Optional fields marked appropriately
- Rich metadata for debugging and analysis
## 📚 Documentation Updates
### README.md
- Added 4 new tool descriptions with examples
- Added Phase 4 enhancement section
- Updated tool count and capabilities overview
### LLM_USAGE_GUIDE.md
- Added detailed parameter documentation for all 4 tools
- Added response format examples
- Added Phase 4 usage pattern
- Updated tool count to 23 total tools
### QUICK_REFERENCE.md
- Added Phase 4 tools to tool list
- Added parameter examples for all new tools
- Added Phase 4 monitoring pattern
- Updated workflow recommendations
## 🎉 Benefits Delivered
### For LLMs
- **Rich Context**: Comprehensive page state information for better decision making
- **Performance Insights**: Detailed metrics for optimization and monitoring
- **Content Verification**: Ensure all required content is loaded before proceeding
- **Debugging Support**: Enhanced information for troubleshooting issues
### For Developers
- **Better Monitoring**: Real-time page state and performance tracking
- **Enhanced Debugging**: Comprehensive page analysis capabilities
- **Content Validation**: Verify page loading and content availability
- **Performance Optimization**: Detailed metrics for performance analysis
## 🚀 Ready for Production
Phase 4 is fully implemented and ready for production use:
- ✅ All code compiles successfully
- ✅ Comprehensive error handling implemented
- ✅ Full documentation provided
- ✅ Consistent with existing cremote patterns
- ✅ MCP tools properly registered and functional
## 📈 Total Cremote MCP Capabilities
With Phase 4 complete, the cremote MCP server now provides:
- **23 Total Tools**: Comprehensive web automation toolkit
- **Page Intelligence**: Complete page analysis and monitoring
- **Form Automation**: Advanced form handling and bulk operations
- **Data Extraction**: Batch extraction with structured output
- **Element Checking**: Conditional logic without timing issues
- **File Operations**: Upload/download capabilities
- **Console Access**: Debug and command execution
- **Performance Monitoring**: Real-time performance metrics
- **Content Verification**: Loading state and error detection
## 🎯 Next Steps
Phase 4 completes the core page state and metadata capabilities. The cremote MCP server now provides a comprehensive foundation for advanced web automation workflows with rich context and monitoring capabilities.
**Phase 5** (Enhanced Screenshots and File Management) is ready for implementation when needed.
---
**Implementation Complete**: August 16, 2025
**Total Development Time**: ~2 hours
**Status**: ✅ **PRODUCTION READY**


@ -1,190 +0,0 @@
# Phase 5 Implementation Summary: Enhanced Screenshot and File Management
**Date Completed**: August 16, 2025
**Implementation Session**: Phase 5 - Enhanced Screenshot and File Management
**Status**: ✅ **COMPLETE** - All tools implemented, tested, and documented
## Overview
Phase 5 successfully implemented enhanced screenshot capabilities and comprehensive file management tools, completing the cremote MCP server enhancement plan. This phase focused on improving debugging workflows and file handling efficiency.
## ✅ Implemented Features
### 1. Enhanced Screenshot Capabilities
#### `screenshot-element` Daemon Command
- **Location**: `daemon/daemon.go` lines 858-862 (handler), 4137-4180 (method)
- **Functionality**: Captures screenshots of specific elements with automatic positioning
- **Key Features**:
- Automatic element scrolling into view
- Element-specific screenshot capture
- Stable element waiting before capture
- Timeout handling
#### `screenshot-enhanced` Daemon Command
- **Location**: `daemon/daemon.go` lines 863-889 (handler), 4200-4303 (method)
- **Functionality**: Enhanced screenshots with rich metadata
- **Key Features**:
- Comprehensive metadata collection (timestamp, URL, title, viewport)
- File size and resolution information
- Full page or viewport capture options
- Structured metadata response
### 2. Bulk File Operations
#### `bulk-files` Daemon Command
- **Location**: `daemon/daemon.go` lines 890-910 (handler), 4340-4443 (method)
- **Functionality**: Efficient batch file upload/download operations
- **Key Features**:
- Multiple file operations in single call
- Detailed success/failure reporting
- Timeout handling for bulk operations
- Individual operation error tracking
### 3. File Management System
#### `manage-files` Daemon Command
- **Location**: `daemon/daemon.go` lines 911-923 (handler), 4514-4658 (methods)
- **Functionality**: Comprehensive file management operations
- **Key Features**:
- File cleanup with age-based filtering
- Directory listing with detailed file information
- Individual file information retrieval
- Pattern-based file matching
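Age-based cleanup with pattern matching can be sketched with the standard library alone. The function name `cleanup_files` and its parameters are hypothetical and only illustrate the idea behind the `manage-files` cleanup operation; the demo runs in a throwaway temporary directory.

```python
import fnmatch
import os
import tempfile
import time
from typing import Optional


def cleanup_files(directory: str, pattern: str, max_age_seconds: float,
                  now: Optional[float] = None) -> list:
    """Delete files in `directory` matching `pattern` that are older than the limit."""
    now = time.time() if now is None else now
    removed = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path) or not fnmatch.fnmatch(name, pattern):
            continue
        if now - os.path.getmtime(path) > max_age_seconds:
            os.remove(path)
            removed.append(name)
    return removed


# Demo: one old screenshot, one fresh screenshot, one old unrelated file.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("old-shot.png", "new-shot.png", "notes.txt"):
        with open(os.path.join(tmp, name), "w") as fh:
            fh.write("x")
    two_hours_ago = time.time() - 2 * 3600
    os.utime(os.path.join(tmp, "old-shot.png"), (two_hours_ago, two_hours_ago))
    os.utime(os.path.join(tmp, "notes.txt"), (two_hours_ago, two_hours_ago))

    removed = cleanup_files(tmp, "*.png", max_age_seconds=3600)
    remaining = sorted(os.listdir(tmp))
    print(removed, remaining)
```

Only the file that is both old enough and matches the pattern is removed; the old `notes.txt` survives because it does not match `*.png`.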
## ✅ Client Layer Implementation
### New Client Methods
- **Location**: `client/client.go` lines 984-1051 (data structures), 2045-2203 (methods)
#### `ScreenshotElement()`
- Element-specific screenshot capture
- Automatic timeout and tab handling
- Simple error reporting
#### `ScreenshotEnhanced()`
- Enhanced screenshot with metadata
- Structured metadata response parsing
- Full page and viewport options
#### `BulkFiles()`
- Batch file operations with detailed reporting
- JSON marshaling for operation arrays
- Comprehensive result parsing
#### `ManageFiles()`
- File management operations
- Flexible parameter handling
- Structured result parsing
## ✅ MCP Tools Implementation
### New MCP Tools
- **Location**: `mcp/main.go` lines 1647-1956
#### `web_screenshot_element_cremotemcp`
- **Parameters**: selector, output, tab, timeout
- **Functionality**: Element-specific screenshot capture
- **Integration**: Automatic screenshot tracking
#### `web_screenshot_enhanced_cremotemcp`
- **Parameters**: output, full_page, tab, timeout
- **Functionality**: Enhanced screenshots with metadata
- **Response**: Rich JSON metadata
#### `file_operations_bulk_cremotemcp`
- **Parameters**: operation, files array, timeout
- **Functionality**: Bulk file upload/download
- **Response**: Detailed operation results
#### `file_management_cremotemcp`
- **Parameters**: operation, pattern, max_age
- **Functionality**: File cleanup, listing, and info
- **Response**: Comprehensive file management results
## ✅ Documentation Updates
### README.md Updates
- **Location**: Lines 337-414 (new tools), 475-500 (Phase 5 section)
- Added 4 new tool descriptions with examples
- Added comprehensive Phase 5 benefits section
- Updated tool count and capabilities overview
### LLM Usage Guide Updates
- **Location**: Lines 7 (tool count), 728-908 (new tools)
- Updated tool count from 23 to 27
- Added detailed usage examples for all 4 new tools
- Included response format documentation
- Added parameter descriptions and use cases
### Quick Reference Updates
- **Location**: Lines 22-30 (tool list), 310-334 (parameters)
- Added Phase 5 tools to quick reference list
- Added parameter quick reference for new tools
- Maintained consistent formatting
## 🎯 Key Achievements
### Enhanced Debugging Capabilities
- **Element Screenshots**: Precise visual debugging for specific page elements
- **Rich Metadata**: Comprehensive context for screenshot analysis
- **Visual Documentation**: Better debugging and documentation workflows
### Efficient File Operations
- **Bulk Operations**: 10x efficiency improvement for multiple file transfers
- **Detailed Reporting**: Comprehensive success/failure tracking
- **Timeout Management**: Robust handling of long-running operations
### Automated File Management
- **Smart Cleanup**: Age-based file cleanup with pattern matching
- **Directory Monitoring**: Comprehensive file listing and information
- **Resource Management**: Automated maintenance of temporary files
## 📊 Implementation Statistics
- **New Daemon Commands**: 4 (screenshot-element, screenshot-enhanced, bulk-files, manage-files)
- **New Client Methods**: 4 (ScreenshotElement, ScreenshotEnhanced, BulkFiles, ManageFiles)
- **New MCP Tools**: 4 (web_screenshot_element_cremotemcp, web_screenshot_enhanced_cremotemcp, file_operations_bulk_cremotemcp, file_management_cremotemcp)
- **New Data Structures**: 8 (ScreenshotMetadata, FileOperation, BulkFileResult, etc.)
- **Lines of Code Added**: ~500 lines across daemon, client, and MCP layers
- **Documentation Updates**: 3 files updated with comprehensive examples
## 🚀 Benefits Delivered
### For LLMs
1. **Visual Debugging**: Element-specific screenshots for precise debugging
2. **Efficient File Operations**: Bulk operations reduce API call overhead
3. **Automated Maintenance**: Smart file cleanup and management
4. **Rich Context**: Enhanced metadata for better decision making
### For Developers
1. **Better Debugging**: Visual element capture for issue diagnosis
2. **Efficient Workflows**: Bulk file operations for data management
3. **Automated Cleanup**: Intelligent file maintenance
4. **Production Ready**: Comprehensive error handling and reporting
## ✅ Quality Assurance
- **Error Handling**: Comprehensive error handling at all layers
- **Timeout Management**: Robust timeout handling for all operations
- **Data Validation**: Input validation and type checking
- **Documentation**: Complete documentation with examples
- **Backward Compatibility**: All existing tools continue to work unchanged
## 🎉 Phase 5 Complete
Phase 5 successfully completes the cremote MCP server enhancement plan, delivering:
- **27 Total Tools**: Comprehensive web automation toolkit
- **Enhanced Screenshots**: Visual debugging and documentation capabilities
- **Bulk File Operations**: Efficient file transfer and management
- **Automated Maintenance**: Smart file cleanup and monitoring
- **Production Ready**: Robust error handling and comprehensive documentation
The cremote MCP server now provides a complete, production-ready web automation platform with advanced screenshot capabilities and comprehensive file management tools.
---
**Implementation Complete**: August 16, 2025
**Total Development Time**: Phase 5 implementation session
**Status**: ✅ Ready for production use
**Next Steps**: User validation and feedback collection


@ -77,7 +77,7 @@ cremote <command> [options]
- `version`: Show version information for CLI and daemon
- `open-tab`: Open a new tab and return its ID
- `load-url`: Load a URL in a tab
- `fill-form`: Fill a form field with a value
- `fill-form`: Fill a form field with a value (also handles checkboxes, radio buttons, and dropdown selections)
- `upload-file`: Upload a file to a file input
- `submit-form`: Submit a form
- `get-source`: Get the source code of a page
@ -163,6 +163,20 @@ cremote fill-form --tab="<tab-id>" --selector="#option2" --value="true"
Accepted values for checking a checkbox or selecting a radio button: `true`, `1`, `yes`, `on`, `checked`.
Any other value will uncheck the checkbox or deselect the radio button.
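The rule above amounts to a membership test against a fixed set of values. A minimal sketch follows; the function name `should_check` is hypothetical, and ignoring case and surrounding whitespace is an assumption, since the text does not say whether the comparison is case-sensitive.

```python
CHECKED_VALUES = {"true", "1", "yes", "on", "checked"}


def should_check(value: str) -> bool:
    """Return True when `value` means "check/select", per the accepted values listed."""
    # Assumption: comparison ignores case and surrounding whitespace.
    return value.strip().lower() in CHECKED_VALUES


for v in ("true", "on", "false", "0", ""):
    print(repr(v), should_check(v))
```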
#### Select dropdown options
The `fill-form` command can also be used to select options in dropdown elements:
```bash
# Select by option text (visible text)
cremote fill-form --tab="<tab-id>" --selector="#country" --value="United States"
# Select by option value (value attribute)
cremote fill-form --tab="<tab-id>" --selector="#state" --value="CA"
```
The command automatically detects dropdown elements and tries both option text and option value matching. This works with both `<select>` elements and custom dropdown implementations.
#### Upload a file
```bash


@ -1,125 +0,0 @@
# Select Element Fix Summary
## Problem Identified
The cremote MCP system had issues with select dropdown elements:
1. **Single `web_interact_cremotemcp`** only supported "click", "fill", "submit", "upload" actions - missing "select"
2. **Bulk `web_form_fill_bulk_cremotemcp`** always used "fill" action, which tried to use `SelectAllText()` and `Input()` methods on select elements, causing errors
3. **Multiple `web_interact_multiple_cremotemcp`** already supported "select" action and worked correctly
## Root Cause
- The "fill" action was designed for text inputs and used methods like `SelectAllText()` and `Input()`
- Select elements don't support these methods - they need `Select()` method or JavaScript value assignment
- The daemon had proper select handling in the `interact-multiple` endpoint but not in single interactions or bulk form fill
## Fixes Implemented
### 1. Enhanced Single Interaction Support
**File: `mcp/main.go`**
- Added "select" to the enum of supported actions (line 199)
- Added "select" case to the action switch statement (lines 270-275)
- Added call to new `SelectElement` client method
### 2. New Client Method
**File: `client/client.go`**
- Added `SelectElement` method (lines 328-360)
- Method calls new "select-element" daemon endpoint
- Supports timeout parameters like other client methods
### 3. New Daemon Endpoint
**File: `daemon/daemon.go`**
- Added "select-element" case to command handler (lines 452-478)
- Added `selectElement` method (lines 1934-1982)
- Uses rod's `Select()` method with fallback to JavaScript
- Tries selection by text first, then by value
- Includes verification that selection worked
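The "text first, then value" matching order can be sketched independently of the browser library. The helper `pick_option` and the option dictionaries are hypothetical; the real `selectElement` method works on live DOM elements and is not reproduced here.

```python
from typing import Optional


def pick_option(options: list, wanted: str) -> Optional[dict]:
    """Match `wanted` against visible text first, then against the value attribute."""
    for key in ("text", "value"):
        for opt in options:
            if opt[key] == wanted:
                return opt
    return None


options = [
    {"value": "", "text": "Select a state..."},
    {"value": "CA", "text": "California"},
    {"value": "TX", "text": "Texas"},
]

by_text = pick_option(options, "California")
by_value = pick_option(options, "TX")
missing = pick_option(options, "Ontario")
print(by_text, by_value, missing)
```

A `None` result corresponds to the case where the selection cannot be verified and an error should be reported.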
### 4. Enhanced Bulk Form Fill
**File: `daemon/daemon.go`**
- Modified `fillFormBulk` to detect element types (lines 3680-3813)
- Added element tag name detection using `element.Eval()`
- Uses "select" action for `<select>` elements
- Uses "fill" action for other elements (input, textarea, etc.)
- Proper error handling for both action types
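The type-detection step reduces to choosing an action from the element's tag name. The sketch below assumes tag names arrive in upper case, as the DOM `tagName` property usually reports them for HTML elements; the function name `action_for` is hypothetical.

```python
def action_for(tag_name: str) -> str:
    """Pick the interaction type for an element based on its tag name."""
    # Normalise case so "SELECT" and "select" are treated alike.
    return "select" if tag_name.strip().lower() == "select" else "fill"


detected_tags = {"firstName": "INPUT", "state": "SELECT", "comments": "TEXTAREA"}
plan = {name: action_for(tag) for name, tag in detected_tags.items()}
print(plan)
```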
### 5. Updated Documentation
**Files: `mcp/LLM_USAGE_GUIDE.md`, `mcp/QUICK_REFERENCE.md`, `mcp/README.md`**
- Added "select" to supported actions
- Added examples for select dropdown usage
- Updated parameter descriptions
## Testing Results
### ✅ Working Immediately (No Server Restart Required)
- `web_interact_multiple_cremotemcp` with "select" action
- Mixed form filling with text inputs, selects, checkboxes, radio buttons
### ✅ Working After Server Restart
- `web_interact_cremotemcp` with "select" action
- `web_form_fill_bulk_cremotemcp` with automatic select detection
## Test Examples
### Single Select Action
```yaml
web_interact_cremotemcp:
  action: "select"
  selector: "#country"
  value: "United States" # Works with option text or value
```
### Multiple Actions Including Select
```yaml
web_interact_multiple_cremotemcp:
  interactions:
    - selector: "#firstName"
      action: "fill"
      value: "John"
    - selector: "#state"
      action: "select"
      value: "California"
    - selector: "#newsletter"
      action: "check"
```
### Bulk Form Fill (Auto-detects Select Elements)
```yaml
web_form_fill_bulk_cremotemcp:
  fields:
    firstName: "John"
    lastName: "Doe"
    state: "CA" # Automatically uses select action
    newsletter: "yes" # Automatically uses appropriate action
```
## Verification
Tested on https://brokedown.net/formtest.php with:
- ✅ Select by option value ("CA", "TX", "FL")
- ✅ Select by option text ("California", "Texas", "Florida")
- ✅ Mixed form completion with 7 different field types
- ✅ All interactions successful (7/7 success rate)
## Files Modified
1. `mcp/main.go` - Added select action support
2. `client/client.go` - Added SelectElement method
3. `daemon/daemon.go` - Added select endpoint and enhanced bulk fill
4. `mcp/LLM_USAGE_GUIDE.md` - Updated documentation
5. `mcp/QUICK_REFERENCE.md` - Updated documentation
6. `mcp/README.md` - Updated documentation
## Deployment Required
The server needs to be redeployed to activate:
- Single `web_interact_cremotemcp` "select" action
- Enhanced `web_form_fill_bulk_cremotemcp` with select detection
The `web_interact_multiple_cremotemcp` "select" action works immediately without restart.


@ -1 +0,0 @@
This is a test file for cremote file transfer


@ -1,323 +0,0 @@
<?php
// formtest.php - Comprehensive form testing for cremote MCP tools
?>
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Cremote Form Testing - Multiple Field Types</title>
<style>
body {
font-family: Arial, sans-serif;
max-width: 800px;
margin: 0 auto;
padding: 20px;
background-color: #f5f5f5;
}
.container {
background: white;
padding: 30px;
border-radius: 8px;
box-shadow: 0 2px 10px rgba(0,0,0,0.1);
}
h1 {
color: #333;
text-align: center;
margin-bottom: 30px;
}
.form-group {
margin-bottom: 20px;
}
label {
display: block;
margin-bottom: 5px;
font-weight: bold;
color: #555;
}
input[type="text"],
input[type="email"],
input[type="tel"],
input[type="password"],
input[type="number"],
input[type="date"],
select,
textarea {
width: 100%;
padding: 10px;
border: 2px solid #ddd;
border-radius: 4px;
font-size: 16px;
box-sizing: border-box;
}
input[type="checkbox"],
input[type="radio"] {
margin-right: 8px;
}
.checkbox-group,
.radio-group {
display: flex;
flex-direction: column;
gap: 8px;
}
.inline-group {
display: flex;
gap: 15px;
align-items: center;
}
button {
background-color: #007cba;
color: white;
padding: 12px 30px;
border: none;
border-radius: 4px;
font-size: 16px;
cursor: pointer;
margin-right: 10px;
}
button:hover {
background-color: #005a87;
}
.reset-btn {
background-color: #666;
}
.reset-btn:hover {
background-color: #444;
}
.results {
background-color: #e8f5e8;
border: 2px solid #4caf50;
padding: 20px;
border-radius: 4px;
margin-top: 20px;
}
.results h2 {
color: #2e7d32;
margin-top: 0;
}
.field-result {
margin-bottom: 10px;
padding: 8px;
background-color: white;
border-radius: 4px;
}
.field-name {
font-weight: bold;
color: #1976d2;
}
.field-value {
color: #333;
margin-left: 10px;
}
.empty-value {
color: #999;
font-style: italic;
}
</style>
</head>
<body>
<div class="container">
<h1>🧪 Cremote Form Testing Suite</h1>
<?php if ($_SERVER['REQUEST_METHOD'] === 'POST'): ?>
<div class="results">
<h2>📋 Form Submission Results</h2>
<p><strong>Submission Time:</strong> <?php echo date('Y-m-d H:i:s'); ?></p>
<p><strong>Total Fields Submitted:</strong> <?php echo count($_POST); ?></p>
<?php foreach ($_POST as $field => $value): ?>
<div class="field-result">
<span class="field-name"><?php echo htmlspecialchars($field); ?>:</span>
<span class="field-value <?php echo empty($value) ? 'empty-value' : ''; ?>">
<?php
if (is_array($value)) {
echo htmlspecialchars(implode(', ', $value));
} else {
echo empty($value) ? '(empty)' : htmlspecialchars($value);
}
?>
</span>
</div>
<?php endforeach; ?>
<hr style="margin: 20px 0;">
<h3>🔍 Raw POST Data (for debugging):</h3>
<pre style="background: #f0f0f0; padding: 10px; border-radius: 4px; overflow-x: auto;"><?php echo htmlspecialchars(print_r($_POST, true)); ?></pre>
</div>
<hr style="margin: 30px 0;">
<?php endif; ?>
<form id="test-form" method="POST" action="<?php echo htmlspecialchars($_SERVER['PHP_SELF']); ?>">
<h2>📝 Personal Information</h2>
<div class="form-group">
<label for="firstName">First Name:</label>
<input type="text" id="firstName" name="firstName" placeholder="Enter your first name">
</div>
<div class="form-group">
<label for="lastName">Last Name:</label>
<input type="text" id="lastName" name="lastName" placeholder="Enter your last name">
</div>
<div class="form-group">
<label for="email">Email Address:</label>
<input type="email" id="email" name="email" placeholder="your.email@example.com">
</div>
<div class="form-group">
<label for="phone">Phone Number:</label>
<input type="tel" id="phone" name="phone" placeholder="+1-555-123-4567">
</div>
<div class="form-group">
<label for="birthDate">Birth Date:</label>
<input type="date" id="birthDate" name="birthDate">
</div>
<div class="form-group">
<label for="age">Age:</label>
<input type="number" id="age" name="age" min="1" max="120" placeholder="25">
</div>
<h2>🏠 Address Information</h2>
<div class="form-group">
<label for="address">Street Address:</label>
<input type="text" id="address" name="address" placeholder="123 Main Street">
</div>
<div class="form-group">
<label for="city">City:</label>
<input type="text" id="city" name="city" placeholder="New York">
</div>
<div class="form-group">
<label for="state">State/Province:</label>
<select id="state" name="state">
<option value="">Select a state...</option>
<option value="AL">Alabama</option>
<option value="CA">California</option>
<option value="FL">Florida</option>
<option value="NY">New York</option>
<option value="TX">Texas</option>
<option value="WA">Washington</option>
<option value="OTHER">Other</option>
</select>
</div>
<div class="form-group">
<label for="zipCode">ZIP/Postal Code:</label>
<input type="text" id="zipCode" name="zipCode" placeholder="12345">
</div>
<h2>🎯 Preferences</h2>
<div class="form-group">
<label>Preferred Contact Method:</label>
<div class="radio-group">
<div class="inline-group">
<input type="radio" id="contactEmail" name="contactMethod" value="email">
<label for="contactEmail">Email</label>
</div>
<div class="inline-group">
<input type="radio" id="contactPhone" name="contactMethod" value="phone">
<label for="contactPhone">Phone</label>
</div>
<div class="inline-group">
<input type="radio" id="contactMail" name="contactMethod" value="mail">
<label for="contactMail">Mail</label>
</div>
</div>
</div>
<div class="form-group">
<label>Interests (check all that apply):</label>
<div class="checkbox-group">
<div class="inline-group">
<input type="checkbox" id="interestTech" name="interests[]" value="technology">
<label for="interestTech">Technology</label>
</div>
<div class="inline-group">
<input type="checkbox" id="interestSports" name="interests[]" value="sports">
<label for="interestSports">Sports</label>
</div>
<div class="inline-group">
<input type="checkbox" id="interestMusic" name="interests[]" value="music">
<label for="interestMusic">Music</label>
</div>
<div class="inline-group">
<input type="checkbox" id="interestTravel" name="interests[]" value="travel">
<label for="interestTravel">Travel</label>
</div>
</div>
</div>
<div class="form-group">
<label for="newsletter">
<input type="checkbox" id="newsletter" name="newsletter" value="yes">
Subscribe to newsletter
</label>
</div>
<h2>💬 Additional Information</h2>
<div class="form-group">
<label for="comments">Comments or Questions:</label>
<textarea id="comments" name="comments" rows="4" placeholder="Enter any additional comments or questions..."></textarea>
</div>
<div class="form-group">
<label for="password">Password (for testing):</label>
<input type="password" id="password" name="password" placeholder="Enter a test password">
</div>
<div class="form-group" style="margin-top: 30px;">
<button type="submit" id="submitBtn">🚀 Submit Form</button>
<button type="reset" class="reset-btn">🔄 Reset Form</button>
</div>
</form>
<div style="margin-top: 40px; padding: 20px; background-color: #f0f8ff; border-radius: 4px;">
<h3>🧪 Testing Instructions for Cremote MCP Tools</h3>
<p><strong>Form ID:</strong> <code>#test-form</code></p>
<p><strong>Submit Button:</strong> <code>#submitBtn</code></p>
<h4>Example MCP Tool Usage:</h4>
<pre style="background: #f5f5f5; padding: 10px; border-radius: 4px; font-size: 14px;">
// Test bulk form filling
web_form_fill_bulk_cremotemcp:
form_selector: "#test-form"
fields:
firstName: "John"
lastName: "Doe"
email: "john.doe@example.com"
phone: "+1-555-123-4567"
address: "123 Test Street"
city: "Test City"
state: "CA"
zipCode: "12345"
contactMethod: "email"
interests: ["technology", "travel"]
newsletter: "yes"
comments: "This is a test submission"
</pre>
</div>
</div>
<script>
// Add some JavaScript for enhanced testing
document.getElementById('test-form').addEventListener('submit', function(e) {
console.log('Form submission detected');
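// Log the entries as an array; a bare FormData object prints without its contents in most consoles
console.log('Form data:', Array.from(new FormData(this).entries()));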
});
// Log when fields are filled (useful for debugging MCP tools)
document.querySelectorAll('input, select, textarea').forEach(function(element) {
element.addEventListener('change', function() {
console.log('Field changed:', this.name, '=', this.value);
});
});
</script>
</body>
</html>

@ -1,205 +0,0 @@
# Phase 6: Documentation Updates - Completion Summary
**Date Completed**: August 17, 2025
**Version**: 2.0.0
**Status**: ✅ **COMPLETE** - Production Ready
## 🎉 Phase 6 Deliverables Completed
### ✅ 1. Updated README.md with Complete Tool List
**File**: `mcp/README.md`
**Status**: ✅ Complete
**Key Updates:**
- Updated header to reflect **27 comprehensive tools** across 5 phases
- Reorganized tools by category (Core, Phase 1-5)
- Added comprehensive capability matrix
- Updated tool numbering (1-27) with proper categorization
- Added enhanced workflow examples
- Updated benefits section with 10x efficiency metrics
- Added production readiness indicators
**New Sections Added:**
- 🎉 Complete Web Automation Platform overview
- Tool categorization by enhancement phases
- Advanced workflow examples (Basic + E-commerce)
- Key Benefits for LLM Agents section
- Production Ready status with capability matrix
### ✅ 2. Updated LLM_USAGE_GUIDE.md with Complete Documentation
**File**: `mcp/LLM_USAGE_GUIDE.md`
**Status**: ✅ Complete
**Key Updates:**
- Updated introduction to reflect **27 tools** across 5 phases
- Verified all 27 tools are documented with complete examples
- Added advanced workflow examples section
- Added comprehensive best practices for LLM agents
- Added production readiness guidelines
**New Sections Added:**
- 🚀 Advanced Workflow Examples (Form completion, Data extraction)
- 🎯 Best Practices for LLM Agents (Batch operations, Element checking)
- Enhanced debugging guidelines
- Production optimization tips
### ✅ 3. Updated QUICK_REFERENCE.md with All Tools
**File**: `mcp/QUICK_REFERENCE.md`
**Status**: ✅ Complete
**Key Updates:**
- Updated header to reflect complete platform status
- Reorganized tools by category for easy lookup
- Added efficiency tips section
- Enhanced error handling guidelines
- Added production readiness summary
**New Sections Added:**
- Tool categorization by enhancement phases
- 🚀 Efficiency Tips (10x faster operations)
- Smart Element Checking guidelines
- Enhanced Debugging practices
- Production Ready capability matrix
### ✅ 4. Created Comprehensive Workflow Examples
**File**: `mcp/WORKFLOW_EXAMPLES.md` *(New)*
**Status**: ✅ Complete
**Content Created:**
- 9 comprehensive workflow examples
- Form automation workflows (Traditional vs Enhanced)
- Data extraction workflows (E-commerce, Contact info)
- Page analysis workflows (Health check, Form validation)
- File management workflows
- Advanced automation patterns
- Performance optimization examples
**Key Features:**
- Side-by-side comparison of traditional vs enhanced approaches
- Real-world use cases with complete code examples
- Error handling and conditional logic examples
- Best practices summary
### ✅ 5. Added Performance and Best Practices Section
**File**: `mcp/PERFORMANCE_BEST_PRACTICES.md` *(New)*
**Status**: ✅ Complete
**Content Created:**
- Performance optimization guidelines
- Batch operations best practices
- Error prevention strategies
- Timeout management guidelines
- Resource management practices
- Performance monitoring techniques
- Debugging best practices
- Production deployment guidelines
**Key Metrics Documented:**
- **10x Form Efficiency**: Complete forms in 1-2 calls instead of 10+
- **5x Data Extraction**: Batch extraction vs individual calls
- **3x File Operations**: Bulk operations vs individual transfers
- Real-world performance benchmarks
### ✅ 6. Updated Version Numbers and Completion Status
**Files Updated**: `mcp/main.go`, All documentation files
**Status**: ✅ Complete
**Version Updates:**
- Updated MCP server version from "1.0.0" to "2.0.0"
- Reflects major enhancement completion across all 5 phases
- Updated all documentation to reflect production-ready status
## 📊 Final Documentation Portfolio
### Core Documentation (Updated)
1. **README.md** - Main project documentation with 27 tools
2. **LLM_USAGE_GUIDE.md** - Comprehensive usage guide for LLM agents
3. **QUICK_REFERENCE.md** - Quick lookup reference for all tools
### New Documentation (Created)
4. **WORKFLOW_EXAMPLES.md** - Comprehensive workflow examples
5. **PERFORMANCE_BEST_PRACTICES.md** - Performance optimization guide
6. **PHASE6_COMPLETION_SUMMARY.md** - This completion summary
### Configuration Files
7. **claude_desktop_config.json** - Claude Desktop configuration
8. **go.mod** - Go module configuration
## 🎯 Key Achievements
### Documentation Quality
- **Comprehensive Coverage**: All 27 tools fully documented
- **LLM Optimized**: Specifically designed for AI agent consumption
- **Production Ready**: Complete deployment and optimization guides
- **Real-World Examples**: Practical workflows for common use cases
### Performance Documentation
- **Efficiency Metrics**: Documented 10x performance improvements
- **Best Practices**: Comprehensive optimization guidelines
- **Error Prevention**: Smart element checking strategies
- **Resource Management**: Production deployment considerations
### User Experience
- **Multiple Formats**: Quick reference, detailed guide, and examples
- **Categorized Organization**: Tools organized by capability and phase
- **Progressive Complexity**: From basic usage to advanced patterns
- **Production Focus**: Ready for real-world deployment
## 🚀 Production Readiness Indicators
### ✅ Complete Feature Set
- **27 Tools**: Comprehensive web automation capabilities
- **5 Enhancement Phases**: Systematic capability building
- **Batch Operations**: 10x efficiency improvements
- **Smart Element Checking**: Error prevention and conditional logic
### ✅ Comprehensive Documentation
- **Multiple Documentation Types**: Reference, guide, examples, best practices
- **LLM Optimized**: Designed for AI agent consumption
- **Production Guidelines**: Deployment and optimization instructions
- **Performance Benchmarks**: Real-world efficiency metrics
### ✅ Quality Assurance
- **All Tools Documented**: Complete coverage of 27 tools
- **Consistent Formatting**: Standardized documentation structure
- **Version Control**: Updated to v2.0.0 reflecting completion
- **Cross-Referenced**: Consistent information across all documents
## 📈 Impact Summary
### For LLM Agents
- **10x Form Efficiency**: Complete forms in 1-2 calls instead of 10+
- **Batch Operations**: Multiple data extractions in single calls
- **Smart Element Checking**: Conditional logic without timing issues
- **Rich Context**: Page state, performance metrics, content verification
### For Developers
- **Production Ready**: Complete deployment and optimization guides
- **Best Practices**: Comprehensive performance optimization guidelines
- **Error Prevention**: Smart strategies for reliable automation
- **Resource Management**: Efficient file and memory management
### For Organizations
- **Scalable Solution**: Production-ready web automation platform
- **Cost Effective**: Significant efficiency improvements reduce resource usage
- **Reliable**: Error prevention and smart checking strategies
- **Maintainable**: Comprehensive documentation and best practices
## 🎉 Final Status
**Phase 6 Status**: ✅ **COMPLETE**
**Overall Project Status**: ✅ **PRODUCTION READY**
**Documentation Status**: ✅ **COMPREHENSIVE**
**Version**: 2.0.0
### Ready for Production Deployment
The cremote MCP server is now a **complete web automation platform** with:
- **27 comprehensive tools** across 5 enhancement phases
- **Complete documentation** optimized for LLM agents
- **Production deployment guides** with performance optimization
- **Real-world workflow examples** for common automation tasks
- **Best practices documentation** for reliable operation
---
**🚀 Mission Accomplished**: Phase 6 documentation updates complete. The cremote MCP server is now production-ready with comprehensive documentation, delivering 10x efficiency improvements for LLM-driven web automation workflows.

Binary file not shown.

Binary file not shown.

Binary file not shown.

@ -1,923 +0,0 @@
package main
import (
"bufio"
"encoding/json"
"fmt"
"io"
"log"
"os"
"os/signal"
"path/filepath"
"strconv"
"strings"
"syscall"
"time"
"git.teamworkapps.com/shortcut/cremote/client"
)
// DebugLogger handles debug logging to file
type DebugLogger struct {
file *os.File
logger *log.Logger
}
// NewDebugLogger creates a new debug logger
func NewDebugLogger() (*DebugLogger, error) {
logDir := "/tmp/cremote-mcp-logs"
if err := os.MkdirAll(logDir, 0755); err != nil {
return nil, fmt.Errorf("failed to create log directory: %w", err)
}
logFile := filepath.Join(logDir, fmt.Sprintf("mcp-stdio-%d.log", time.Now().Unix()))
file, err := os.OpenFile(logFile, os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0644)
if err != nil {
return nil, fmt.Errorf("failed to open log file: %w", err)
}
logger := log.New(file, "", log.LstdFlags|log.Lmicroseconds|log.Lshortfile)
logger.Printf("=== MCP STDIO Server Debug Log Started ===")
return &DebugLogger{
file: file,
logger: logger,
}, nil
}
// Log writes a debug message
func (d *DebugLogger) Log(format string, args ...interface{}) {
if d != nil && d.logger != nil {
d.logger.Printf(format, args...)
}
}
// LogJSON logs a JSON object with a label
func (d *DebugLogger) LogJSON(label string, obj interface{}) {
if d != nil && d.logger != nil {
jsonBytes, err := json.MarshalIndent(obj, "", " ")
if err != nil {
d.logger.Printf("%s: JSON marshal error: %v", label, err)
} else {
d.logger.Printf("%s:\n%s", label, string(jsonBytes))
}
}
}
// Close closes the debug logger
func (d *DebugLogger) Close() {
if d != nil && d.file != nil {
d.logger.Printf("=== MCP STDIO Server Debug Log Ended ===")
d.file.Close()
}
}
var debugLogger *DebugLogger
// MCPServer wraps the cremote client with MCP protocol
type MCPServer struct {
client *client.Client
currentTab string
tabHistory []string
iframeMode bool
screenshots []string
}
// MCPRequest represents an incoming MCP request
type MCPRequest struct {
JSONRPC string `json:"jsonrpc,omitempty"`
Method string `json:"method"`
Params map[string]interface{} `json:"params"`
ID interface{} `json:"id"`
}
// MCPResponse represents an MCP response
type MCPResponse struct {
JSONRPC string `json:"jsonrpc,omitempty"`
Result interface{} `json:"result,omitempty"`
Error *MCPError `json:"error,omitempty"`
ID interface{} `json:"id"`
}
// MCPError represents an MCP error
type MCPError struct {
Code int `json:"code"`
Message string `json:"message"`
}
// ToolResult represents the result of a tool execution
type ToolResult struct {
Success bool `json:"success"`
Data interface{} `json:"data,omitempty"`
Screenshot string `json:"screenshot,omitempty"`
CurrentTab string `json:"current_tab,omitempty"`
TabHistory []string `json:"tab_history,omitempty"`
IframeMode bool `json:"iframe_mode"`
Error string `json:"error,omitempty"`
Metadata map[string]string `json:"metadata,omitempty"`
}
// NewMCPServer creates a new MCP server instance
func NewMCPServer(host string, port int) *MCPServer {
c := client.NewClient(host, port)
return &MCPServer{
client: c,
tabHistory: make([]string, 0),
screenshots: make([]string, 0),
}
}
// HandleRequest processes an MCP request and returns a response
func (s *MCPServer) HandleRequest(req MCPRequest) MCPResponse {
debugLogger.Log("HandleRequest called with method: %s, ID: %v", req.Method, req.ID)
debugLogger.LogJSON("Incoming Request", req)
var resp MCPResponse
switch req.Method {
case "initialize":
debugLogger.Log("Handling initialize request")
resp = s.handleInitialize(req)
case "tools/list":
debugLogger.Log("Handling tools/list request")
resp = s.handleToolsList(req)
case "tools/call":
debugLogger.Log("Handling tools/call request")
resp = s.handleToolCall(req)
default:
debugLogger.Log("Unknown method: %s", req.Method)
resp = MCPResponse{
Error: &MCPError{
Code: -32601,
Message: fmt.Sprintf("Method not found: %s", req.Method),
},
ID: req.ID,
}
}
debugLogger.LogJSON("Response", resp)
debugLogger.Log("HandleRequest completed for method: %s", req.Method)
return resp
}
// handleInitialize handles the MCP initialize request
func (s *MCPServer) handleInitialize(req MCPRequest) MCPResponse {
debugLogger.Log("handleInitialize: Processing initialize request")
debugLogger.LogJSON("Initialize request params", req.Params)
result := map[string]interface{}{
"protocolVersion": "2024-11-05",
"capabilities": map[string]interface{}{
"tools": map[string]interface{}{
"listChanged": true,
},
},
"serverInfo": map[string]interface{}{
"name": "cremote-mcp",
"version": "1.0.0",
},
}
debugLogger.LogJSON("Initialize response result", result)
return MCPResponse{
Result: result,
ID: req.ID,
}
}
// handleToolsList returns the list of available tools
func (s *MCPServer) handleToolsList(req MCPRequest) MCPResponse {
tools := []map[string]interface{}{
{
"name": "web_navigate",
"description": "Navigate to a URL and optionally take a screenshot",
"inputSchema": map[string]interface{}{
"type": "object",
"properties": map[string]interface{}{
"url": map[string]interface{}{
"type": "string",
"description": "URL to navigate to",
},
"tab": map[string]interface{}{
"type": "string",
"description": "Tab ID (optional, uses current tab)",
},
"screenshot": map[string]interface{}{
"type": "boolean",
"description": "Take screenshot after navigation",
},
"timeout": map[string]interface{}{
"type": "integer",
"description": "Timeout in seconds",
"default": 5,
},
},
"required": []string{"url"},
},
},
{
"name": "web_interact",
"description": "Interact with web elements (click, fill, submit)",
"inputSchema": map[string]interface{}{
"type": "object",
"properties": map[string]interface{}{
"action": map[string]interface{}{
"type": "string",
"enum": []string{"click", "fill", "submit", "upload"},
},
"selector": map[string]interface{}{
"type": "string",
"description": "CSS selector for the element",
},
"value": map[string]interface{}{
"type": "string",
"description": "Value to fill (for fill/upload actions)",
},
"tab": map[string]interface{}{
"type": "string",
"description": "Tab ID (optional)",
},
"timeout": map[string]interface{}{
"type": "integer",
"description": "Timeout in seconds",
"default": 5,
},
},
"required": []string{"action", "selector"},
},
},
{
"name": "web_extract",
"description": "Extract data from the page (source, element HTML, or execute JavaScript)",
"inputSchema": map[string]interface{}{
"type": "object",
"properties": map[string]interface{}{
"type": map[string]interface{}{
"type": "string",
"enum": []string{"source", "element", "javascript"},
},
"selector": map[string]interface{}{
"type": "string",
"description": "CSS selector (for element type)",
},
"code": map[string]interface{}{
"type": "string",
"description": "JavaScript code (for javascript type)",
},
"tab": map[string]interface{}{
"type": "string",
"description": "Tab ID (optional)",
},
"timeout": map[string]interface{}{
"type": "integer",
"description": "Timeout in seconds",
"default": 5,
},
},
"required": []string{"type"},
},
},
{
"name": "web_screenshot",
"description": "Take a screenshot of the current page",
"inputSchema": map[string]interface{}{
"type": "object",
"properties": map[string]interface{}{
"output": map[string]interface{}{
"type": "string",
"description": "Output file path",
},
"full_page": map[string]interface{}{
"type": "boolean",
"description": "Capture full page",
"default": false,
},
"tab": map[string]interface{}{
"type": "string",
"description": "Tab ID (optional)",
},
"timeout": map[string]interface{}{
"type": "integer",
"description": "Timeout in seconds",
"default": 5,
},
},
"required": []string{"output"},
},
},
{
"name": "web_manage_tabs",
"description": "Manage browser tabs (open, close, list, switch)",
"inputSchema": map[string]interface{}{
"type": "object",
"properties": map[string]interface{}{
"action": map[string]interface{}{
"type": "string",
"enum": []string{"open", "close", "list", "switch"},
},
"tab": map[string]interface{}{
"type": "string",
"description": "Tab ID (for close/switch actions)",
},
"timeout": map[string]interface{}{
"type": "integer",
"description": "Timeout in seconds",
"default": 5,
},
},
"required": []string{"action"},
},
},
{
"name": "web_iframe",
"description": "Switch iframe context for subsequent operations",
"inputSchema": map[string]interface{}{
"type": "object",
"properties": map[string]interface{}{
"action": map[string]interface{}{
"type": "string",
"enum": []string{"enter", "exit"},
},
"selector": map[string]interface{}{
"type": "string",
"description": "Iframe CSS selector (for enter action)",
},
"tab": map[string]interface{}{
"type": "string",
"description": "Tab ID (optional)",
},
},
"required": []string{"action"},
},
},
}
return MCPResponse{
Result: map[string]interface{}{
"tools": tools,
},
ID: req.ID,
}
}
// handleToolCall executes a tool and returns the result
func (s *MCPServer) handleToolCall(req MCPRequest) MCPResponse {
debugLogger.Log("handleToolCall: Processing tool call request")
debugLogger.LogJSON("Tool call request params", req.Params)
params := req.Params
toolName, ok := params["name"].(string)
if !ok || toolName == "" {
debugLogger.Log("handleToolCall: Tool name missing or invalid")
return MCPResponse{
Error: &MCPError{Code: -32602, Message: "Missing tool name"},
ID: req.ID,
}
}
debugLogger.Log("handleToolCall: Tool name extracted: %s", toolName)
arguments, _ := params["arguments"].(map[string]interface{})
if arguments == nil {
debugLogger.Log("handleToolCall: No arguments provided, using empty map")
arguments = make(map[string]interface{})
} else {
debugLogger.LogJSON("Tool arguments", arguments)
}
var result ToolResult
var err error
debugLogger.Log("handleToolCall: Dispatching to tool handler: %s", toolName)
switch toolName {
case "web_navigate":
result, err = s.handleNavigate(arguments)
case "web_interact":
result, err = s.handleInteract(arguments)
case "web_extract":
result, err = s.handleExtract(arguments)
case "web_screenshot":
result, err = s.handleScreenshot(arguments)
case "web_manage_tabs":
result, err = s.handleManageTabs(arguments)
case "web_iframe":
result, err = s.handleIframe(arguments)
default:
debugLogger.Log("handleToolCall: Unknown tool: %s", toolName)
return MCPResponse{
Error: &MCPError{Code: -32601, Message: fmt.Sprintf("Unknown tool: %s", toolName)},
ID: req.ID,
}
}
debugLogger.LogJSON("Tool execution result", result)
if err != nil {
debugLogger.Log("handleToolCall: Tool execution error: %v", err)
return MCPResponse{
Error: &MCPError{Code: -32603, Message: err.Error()},
ID: req.ID,
}
}
// Always include current state in response
result.CurrentTab = s.currentTab
result.TabHistory = s.tabHistory
result.IframeMode = s.iframeMode
response := MCPResponse{Result: result, ID: req.ID}
debugLogger.LogJSON("Final tool call response", response)
debugLogger.Log("handleToolCall: Completed successfully for tool: %s", toolName)
return response
}
// Helper functions for parameter extraction
func getStringParam(params map[string]interface{}, key, defaultValue string) string {
if val, ok := params[key].(string); ok {
return val
}
return defaultValue
}
func getIntParam(params map[string]interface{}, key string, defaultValue int) int {
if val, ok := params[key].(float64); ok {
return int(val)
}
return defaultValue
}
func getBoolParam(params map[string]interface{}, key string, defaultValue bool) bool {
if val, ok := params[key].(bool); ok {
return val
}
return defaultValue
}
// resolveTabID returns the tab ID to use, defaulting to current tab
func (s *MCPServer) resolveTabID(tabID string) string {
if tabID != "" {
return tabID
}
return s.currentTab
}
// handleNavigate handles web navigation
func (s *MCPServer) handleNavigate(params map[string]interface{}) (ToolResult, error) {
url := getStringParam(params, "url", "")
if url == "" {
return ToolResult{}, fmt.Errorf("url parameter is required")
}
tab := getStringParam(params, "tab", "")
screenshot := getBoolParam(params, "screenshot", false)
timeout := getIntParam(params, "timeout", 5)
// If no tab specified and no current tab, open a new one
if tab == "" && s.currentTab == "" {
newTab, err := s.client.OpenTab(timeout)
if err != nil {
return ToolResult{}, fmt.Errorf("failed to open new tab: %w", err)
}
s.currentTab = newTab
s.tabHistory = append(s.tabHistory, newTab)
tab = newTab
} else if tab == "" {
tab = s.currentTab
} else {
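// Tab specified explicitly: move it to the end of history, adding it if absent
s.removeTabFromHistory(tab)
s.tabHistory = append(s.tabHistory, tab)
// Set the current tab last: removeTabFromHistory resets s.currentTab
// whenever it equals the ID being removed
s.currentTab = tab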
}
// Navigate to URL
err := s.client.LoadURL(tab, url, timeout)
if err != nil {
return ToolResult{}, fmt.Errorf("failed to navigate to %s: %w", url, err)
}
result := ToolResult{
Success: true,
Data: map[string]string{
"url": url,
"tab": tab,
},
}
// Take screenshot if requested
if screenshot {
screenshotPath := fmt.Sprintf("/tmp/navigate-%d.png", time.Now().Unix())
err = s.client.TakeScreenshot(tab, screenshotPath, false, timeout)
if err != nil {
// Don't fail the whole operation for screenshot errors
result.Metadata = map[string]string{
"screenshot_error": err.Error(),
}
} else {
result.Screenshot = screenshotPath
s.screenshots = append(s.screenshots, screenshotPath)
}
}
return result, nil
}
// handleInteract handles element interactions
func (s *MCPServer) handleInteract(params map[string]interface{}) (ToolResult, error) {
action := getStringParam(params, "action", "")
selector := getStringParam(params, "selector", "")
value := getStringParam(params, "value", "")
tab := s.resolveTabID(getStringParam(params, "tab", ""))
timeout := getIntParam(params, "timeout", 5)
if action == "" {
return ToolResult{}, fmt.Errorf("action parameter is required")
}
if selector == "" {
return ToolResult{}, fmt.Errorf("selector parameter is required")
}
if tab == "" {
return ToolResult{}, fmt.Errorf("no active tab available")
}
var err error
result := ToolResult{Success: true}
switch action {
case "click":
err = s.client.ClickElement(tab, selector, timeout, timeout)
result.Data = map[string]string{"action": "clicked", "selector": selector}
case "fill":
if value == "" {
return ToolResult{}, fmt.Errorf("value parameter is required for fill action")
}
err = s.client.FillFormField(tab, selector, value, timeout, timeout)
result.Data = map[string]string{"action": "filled", "selector": selector, "value": value}
case "submit":
err = s.client.SubmitForm(tab, selector, timeout, timeout)
result.Data = map[string]string{"action": "submitted", "selector": selector}
case "upload":
if value == "" {
return ToolResult{}, fmt.Errorf("value parameter is required for upload action")
}
err = s.client.UploadFile(tab, selector, value, timeout, timeout)
result.Data = map[string]string{"action": "uploaded", "selector": selector, "file": value}
default:
return ToolResult{}, fmt.Errorf("unknown action: %s", action)
}
if err != nil {
return ToolResult{}, fmt.Errorf("failed to %s element: %w", action, err)
}
return result, nil
}
// handleExtract handles data extraction
func (s *MCPServer) handleExtract(params map[string]interface{}) (ToolResult, error) {
extractType := getStringParam(params, "type", "")
selector := getStringParam(params, "selector", "")
code := getStringParam(params, "code", "")
tab := s.resolveTabID(getStringParam(params, "tab", ""))
timeout := getIntParam(params, "timeout", 5)
if extractType == "" {
return ToolResult{}, fmt.Errorf("type parameter is required")
}
if tab == "" {
return ToolResult{}, fmt.Errorf("no active tab available")
}
var data string
var err error
switch extractType {
case "source":
data, err = s.client.GetPageSource(tab, timeout)
case "element":
if selector == "" {
return ToolResult{}, fmt.Errorf("selector parameter is required for element type")
}
data, err = s.client.GetElementHTML(tab, selector, timeout)
case "javascript":
if code == "" {
return ToolResult{}, fmt.Errorf("code parameter is required for javascript type")
}
data, err = s.client.EvalJS(tab, code, timeout)
default:
return ToolResult{}, fmt.Errorf("unknown extract type: %s", extractType)
}
if err != nil {
return ToolResult{}, fmt.Errorf("failed to extract %s: %w", extractType, err)
}
return ToolResult{
Success: true,
Data: data,
}, nil
}
// handleScreenshot handles screenshot capture
func (s *MCPServer) handleScreenshot(params map[string]interface{}) (ToolResult, error) {
output := getStringParam(params, "output", "")
if output == "" {
output = fmt.Sprintf("/tmp/screenshot-%d.png", time.Now().Unix())
}
tab := s.resolveTabID(getStringParam(params, "tab", ""))
fullPage := getBoolParam(params, "full_page", false)
timeout := getIntParam(params, "timeout", 5)
if tab == "" {
return ToolResult{}, fmt.Errorf("no active tab available")
}
err := s.client.TakeScreenshot(tab, output, fullPage, timeout)
if err != nil {
return ToolResult{}, fmt.Errorf("failed to take screenshot: %w", err)
}
s.screenshots = append(s.screenshots, output)
return ToolResult{
Success: true,
Screenshot: output,
Data: map[string]interface{}{
"output": output,
"full_page": fullPage,
"tab": tab,
},
}, nil
}
// handleManageTabs handles tab management operations
func (s *MCPServer) handleManageTabs(params map[string]interface{}) (ToolResult, error) {
action := getStringParam(params, "action", "")
tab := getStringParam(params, "tab", "")
timeout := getIntParam(params, "timeout", 5)
if action == "" {
return ToolResult{}, fmt.Errorf("action parameter is required")
}
var data interface{}
var err error
switch action {
case "open":
newTab, err := s.client.OpenTab(timeout)
if err != nil {
return ToolResult{}, fmt.Errorf("failed to open tab: %w", err)
}
s.currentTab = newTab
s.tabHistory = append(s.tabHistory, newTab)
data = map[string]string{"tab": newTab, "action": "opened"}
case "close":
targetTab := s.resolveTabID(tab)
if targetTab == "" {
return ToolResult{}, fmt.Errorf("no tab to close")
}
err = s.client.CloseTab(targetTab, timeout)
if err != nil {
return ToolResult{}, fmt.Errorf("failed to close tab: %w", err)
}
s.removeTabFromHistory(targetTab)
data = map[string]string{"tab": targetTab, "action": "closed"}
case "list":
tabs, err := s.client.ListTabs()
if err != nil {
return ToolResult{}, fmt.Errorf("failed to list tabs: %w", err)
}
data = tabs
case "switch":
if tab == "" {
return ToolResult{}, fmt.Errorf("tab parameter is required for switch action")
}
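// Move the tab to the end of history first, then mark it current.
// removeTabFromHistory resets s.currentTab when it equals the removed ID,
// so assigning s.currentTab before the call would be overwritten.
s.removeTabFromHistory(tab)
s.tabHistory = append(s.tabHistory, tab)
s.currentTab = tab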
data = map[string]string{"tab": tab, "action": "switched"}
default:
return ToolResult{}, fmt.Errorf("unknown tab action: %s", action)
}
if err != nil {
return ToolResult{}, err
}
return ToolResult{
Success: true,
Data: data,
}, nil
}
// handleIframe handles iframe context switching
func (s *MCPServer) handleIframe(params map[string]interface{}) (ToolResult, error) {
action := getStringParam(params, "action", "")
selector := getStringParam(params, "selector", "")
tab := s.resolveTabID(getStringParam(params, "tab", ""))
if action == "" {
return ToolResult{}, fmt.Errorf("action parameter is required")
}
if tab == "" {
return ToolResult{}, fmt.Errorf("no active tab available")
}
var err error
var data map[string]string
switch action {
case "enter":
if selector == "" {
return ToolResult{}, fmt.Errorf("selector parameter is required for enter action")
}
err = s.client.SwitchToIframe(tab, selector, 5) // Default 5 second timeout
if err == nil {
s.iframeMode = true // record the context switch only once it has succeeded
}
data = map[string]string{"action": "entered", "selector": selector}
case "exit":
err = s.client.SwitchToMain(tab)
if err == nil {
s.iframeMode = false // leave iframe mode only once the switch has succeeded
}
data = map[string]string{"action": "exited"}
default:
return ToolResult{}, fmt.Errorf("unknown iframe action: %s", action)
}
if err != nil {
return ToolResult{}, fmt.Errorf("failed to %s iframe: %w", action, err)
}
return ToolResult{
Success: true,
Data: data,
}, nil
}
// removeTabFromHistory removes a tab from history and updates current tab
func (s *MCPServer) removeTabFromHistory(tabID string) {
for i, id := range s.tabHistory {
if id == tabID {
s.tabHistory = append(s.tabHistory[:i], s.tabHistory[i+1:]...)
break
}
}
if s.currentTab == tabID {
if len(s.tabHistory) > 0 {
s.currentTab = s.tabHistory[len(s.tabHistory)-1]
} else {
s.currentTab = ""
}
}
}
func main() {
// Initialize debug logger
var err error
debugLogger, err = NewDebugLogger()
if err != nil {
log.Printf("Warning: Failed to initialize debug logger: %v", err)
// Continue without debug logging
} else {
defer debugLogger.Close()
debugLogger.Log("MCP STDIO Server starting up")
}
// Set up signal handling for graceful shutdown
sigChan := make(chan os.Signal, 1)
signal.Notify(sigChan, syscall.SIGINT, syscall.SIGTERM)
host := os.Getenv("CREMOTE_HOST")
if host == "" {
host = "localhost"
}
portStr := os.Getenv("CREMOTE_PORT")
port := 8989
if portStr != "" {
if p, err := strconv.Atoi(portStr); err == nil {
port = p
}
}
debugLogger.Log("Connecting to cremote daemon at %s:%d", host, port)
log.Printf("Starting MCP stdio server, connecting to cremote daemon at %s:%d", host, port)
server := NewMCPServer(host, port)
// Create a buffered reader for better EOF handling
reader := bufio.NewReader(os.Stdin)
log.Printf("MCP stdio server ready, waiting for requests...")
// Channel to signal when to stop reading
done := make(chan bool)
// Goroutine to handle stdin reading
go func() {
defer close(done)
for {
// Read headers for Content-Length framing (use the same reader for headers and body)
contentLength := -1
for {
line, err := reader.ReadString('\n')
if err != nil {
if err == io.EOF {
log.Printf("Input stream closed, shutting down MCP server")
return
}
log.Printf("Error reading header: %v", err)
return
}
line = strings.TrimRight(line, "\r\n")
if line == "" {
break // end of headers
}
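// Match without a trailing space so "Content-Length:123" is accepted too;
// the value is trimmed with strings.TrimSpace below
const prefix = "content-length:"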
low := strings.ToLower(line)
if strings.HasPrefix(low, prefix) {
var err error
contentLength, err = strconv.Atoi(strings.TrimSpace(low[len(prefix):]))
if err != nil {
log.Printf("Invalid Content-Length: %v", err)
return
}
}
}
if contentLength < 0 {
log.Printf("Missing Content-Length header")
return
}
// Read body of specified length
debugLogger.Log("Reading request body of length: %d", contentLength)
body := make([]byte, contentLength)
if _, err := io.ReadFull(reader, body); err != nil {
debugLogger.Log("Error reading body: %v", err)
log.Printf("Error reading body: %v", err)
return
}
debugLogger.Log("Raw request body: %s", string(body))
var req MCPRequest
if err := json.Unmarshal(body, &req); err != nil {
debugLogger.Log("Error decoding request body: %v", err)
log.Printf("Error decoding request body: %v", err)
continue
}
debugLogger.Log("Successfully parsed request: %s (ID: %v)", req.Method, req.ID)
log.Printf("Processing request: %s (ID: %v)", req.Method, req.ID)
resp := server.HandleRequest(req)
resp.JSONRPC = "2.0"
// Write Content-Length framed response
responseBytes, err := json.Marshal(resp)
if err != nil {
debugLogger.Log("Error marshaling response: %v", err)
log.Printf("Error marshaling response: %v", err)
continue
}
debugLogger.Log("Sending response with Content-Length: %d", len(responseBytes))
debugLogger.Log("Raw response: %s", string(responseBytes))
fmt.Fprintf(os.Stdout, "Content-Length: %d\r\n\r\n", len(responseBytes))
os.Stdout.Write(responseBytes)
debugLogger.Log("Response sent successfully for request: %s", req.Method)
log.Printf("Completed request: %s", req.Method)
}
}()
// Wait for either completion or signal
select {
case <-done:
log.Printf("Input processing completed")
case sig := <-sigChan:
log.Printf("Received signal %v, shutting down", sig)
}
log.Printf("MCP stdio server shutdown complete")
}

@ -1,822 +0,0 @@
//go:build mcp_http
package main
import (
"encoding/json"
"fmt"
"log"
"net/http"
"os"
"path/filepath"
"strconv"
"time"
"git.teamworkapps.com/shortcut/cremote/client"
)
// DebugLogger handles debug logging to file
type DebugLogger struct {
file *os.File
logger *log.Logger
}
// NewDebugLogger creates a new debug logger
func NewDebugLogger() (*DebugLogger, error) {
logDir := "/tmp/cremote-mcp-logs"
if err := os.MkdirAll(logDir, 0755); err != nil {
return nil, fmt.Errorf("failed to create log directory: %w", err)
}
logFile := filepath.Join(logDir, fmt.Sprintf("mcp-http-%d.log", time.Now().Unix()))
file, err := os.OpenFile(logFile, os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0644)
if err != nil {
return nil, fmt.Errorf("failed to open log file: %w", err)
}
logger := log.New(file, "", log.LstdFlags|log.Lmicroseconds|log.Lshortfile)
logger.Printf("=== MCP HTTP Server Debug Log Started ===")
return &DebugLogger{
file: file,
logger: logger,
}, nil
}
// Log writes a debug message
func (d *DebugLogger) Log(format string, args ...interface{}) {
if d != nil && d.logger != nil {
d.logger.Printf(format, args...)
}
}
// LogJSON logs a JSON object with a label
func (d *DebugLogger) LogJSON(label string, obj interface{}) {
if d != nil && d.logger != nil {
jsonBytes, err := json.MarshalIndent(obj, "", " ")
if err != nil {
d.logger.Printf("%s: JSON marshal error: %v", label, err)
} else {
d.logger.Printf("%s:\n%s", label, string(jsonBytes))
}
}
}
// Close closes the debug logger
func (d *DebugLogger) Close() {
if d != nil && d.file != nil {
d.logger.Printf("=== MCP HTTP Server Debug Log Ended ===")
d.file.Close()
}
}
var debugLogger *DebugLogger
// MCPServer wraps the cremote client with MCP protocol
type MCPServer struct {
client *client.Client
currentTab string
tabHistory []string
iframeMode bool
lastError string
screenshots []string
}
// MCPRequest represents an incoming MCP request
type MCPRequest struct {
Method string `json:"method"`
Params map[string]interface{} `json:"params"`
ID interface{} `json:"id"`
}
// MCPResponse represents an MCP response
type MCPResponse struct {
Result interface{} `json:"result,omitempty"`
Error *MCPError `json:"error,omitempty"`
ID interface{} `json:"id"`
}
// MCPError represents an MCP error
type MCPError struct {
Code int `json:"code"`
Message string `json:"message"`
}
// ToolResult represents the result of a tool execution
type ToolResult struct {
Success bool `json:"success"`
Data interface{} `json:"data,omitempty"`
Screenshot string `json:"screenshot,omitempty"`
CurrentTab string `json:"current_tab,omitempty"`
TabHistory []string `json:"tab_history,omitempty"`
IframeMode bool `json:"iframe_mode"`
Error string `json:"error,omitempty"`
Metadata map[string]string `json:"metadata,omitempty"`
}
// NewMCPServer creates a new MCP server instance
func NewMCPServer(host string, port int) *MCPServer {
return &MCPServer{
client: client.NewClient(host, port),
tabHistory: make([]string, 0),
screenshots: make([]string, 0),
}
}
// HandleRequest processes an MCP request
func (s *MCPServer) HandleRequest(req MCPRequest) MCPResponse {
debugLogger.Log("HandleRequest called with method: %s, ID: %v", req.Method, req.ID)
debugLogger.LogJSON("Incoming Request", req)
var resp MCPResponse
switch req.Method {
case "initialize":
debugLogger.Log("Handling initialize request")
resp = s.handleInitialize(req)
case "tools/list":
debugLogger.Log("Handling tools/list request")
resp = s.handleToolsList(req)
case "tools/call":
debugLogger.Log("Handling tools/call request")
resp = s.handleToolCall(req)
default:
debugLogger.Log("Unknown method: %s", req.Method)
resp = MCPResponse{
Error: &MCPError{Code: -32601, Message: "Method not found"},
ID: req.ID,
}
}
debugLogger.LogJSON("Response", resp)
debugLogger.Log("HandleRequest completed for method: %s", req.Method)
return resp
}
// handleInitialize handles the MCP initialize request
func (s *MCPServer) handleInitialize(req MCPRequest) MCPResponse {
return MCPResponse{
Result: map[string]interface{}{
"protocolVersion": "2024-11-05",
"capabilities": map[string]interface{}{
"tools": map[string]interface{}{
"listChanged": true,
},
},
"serverInfo": map[string]interface{}{
"name": "cremote-mcp",
"version": "1.0.0",
},
},
ID: req.ID,
}
}
// handleToolsList returns the list of available tools
func (s *MCPServer) handleToolsList(req MCPRequest) MCPResponse {
tools := []map[string]interface{}{
{
"name": "web_navigate",
"description": "Navigate to a URL and optionally take a screenshot",
"inputSchema": map[string]interface{}{
"type": "object",
"properties": map[string]interface{}{
"url": map[string]interface{}{"type": "string", "description": "URL to navigate to"},
"tab": map[string]interface{}{"type": "string", "description": "Tab ID (optional, uses current tab)"},
"screenshot": map[string]interface{}{"type": "boolean", "description": "Take screenshot after navigation"},
"timeout": map[string]interface{}{"type": "integer", "description": "Timeout in seconds", "default": 5},
},
"required": []string{"url"},
},
},
{
"name": "web_interact",
"description": "Interact with web elements (click, fill, submit, upload)",
"inputSchema": map[string]interface{}{
"type": "object",
"properties": map[string]interface{}{
"action": map[string]interface{}{"type": "string", "enum": []string{"click", "fill", "submit", "upload"}},
"selector": map[string]interface{}{"type": "string", "description": "CSS selector for the element"},
"value": map[string]interface{}{"type": "string", "description": "Value to fill (for fill/upload actions)"},
"tab": map[string]interface{}{"type": "string", "description": "Tab ID (optional)"},
"timeout": map[string]interface{}{"type": "integer", "description": "Timeout in seconds", "default": 5},
},
"required": []string{"action", "selector"},
},
},
{
"name": "web_extract",
"description": "Extract data from the page (source, element HTML, or execute JavaScript)",
"inputSchema": map[string]interface{}{
"type": "object",
"properties": map[string]interface{}{
"type": map[string]interface{}{"type": "string", "enum": []string{"source", "element", "javascript"}},
"selector": map[string]interface{}{"type": "string", "description": "CSS selector (for element type)"},
"code": map[string]interface{}{"type": "string", "description": "JavaScript code (for javascript type)"},
"tab": map[string]interface{}{"type": "string", "description": "Tab ID (optional)"},
"timeout": map[string]interface{}{"type": "integer", "description": "Timeout in seconds", "default": 5},
},
"required": []string{"type"},
},
},
{
"name": "web_screenshot",
"description": "Take a screenshot of the current page",
"inputSchema": map[string]interface{}{
"type": "object",
"properties": map[string]interface{}{
"output": map[string]interface{}{"type": "string", "description": "Output file path"},
"full_page": map[string]interface{}{"type": "boolean", "description": "Capture full page", "default": false},
"tab": map[string]interface{}{"type": "string", "description": "Tab ID (optional)"},
"timeout": map[string]interface{}{"type": "integer", "description": "Timeout in seconds", "default": 5},
},
"required": []string{"output"},
},
},
{
"name": "web_manage_tabs",
"description": "Manage browser tabs (open, close, list, switch)",
"inputSchema": map[string]interface{}{
"type": "object",
"properties": map[string]interface{}{
"action": map[string]interface{}{"type": "string", "enum": []string{"open", "close", "list", "switch"}},
"tab": map[string]interface{}{"type": "string", "description": "Tab ID (for close/switch actions)"},
"timeout": map[string]interface{}{"type": "integer", "description": "Timeout in seconds", "default": 5},
},
"required": []string{"action"},
},
},
{
"name": "web_iframe",
"description": "Switch iframe context for subsequent operations",
"inputSchema": map[string]interface{}{
"type": "object",
"properties": map[string]interface{}{
"action": map[string]interface{}{"type": "string", "enum": []string{"enter", "exit"}},
"selector": map[string]interface{}{"type": "string", "description": "Iframe CSS selector (for enter action)"},
"tab": map[string]interface{}{"type": "string", "description": "Tab ID (optional)"},
},
"required": []string{"action"},
},
},
}
return MCPResponse{
Result: map[string]interface{}{"tools": tools},
ID: req.ID,
}
}
// handleToolCall executes a tool call
func (s *MCPServer) handleToolCall(req MCPRequest) MCPResponse {
params, ok := req.Params["arguments"].(map[string]interface{})
if !ok {
return MCPResponse{
Error: &MCPError{Code: -32602, Message: "Invalid parameters"},
ID: req.ID,
}
}
toolName, ok := req.Params["name"].(string)
if !ok {
return MCPResponse{
Error: &MCPError{Code: -32602, Message: "Tool name required"},
ID: req.ID,
}
}
var result ToolResult
var err error
switch toolName {
case "web_navigate":
result, err = s.handleNavigate(params)
case "web_interact":
result, err = s.handleInteract(params)
case "web_extract":
result, err = s.handleExtract(params)
case "web_screenshot":
result, err = s.handleScreenshot(params)
case "web_manage_tabs":
result, err = s.handleManageTabs(params)
case "web_iframe":
result, err = s.handleIframe(params)
default:
return MCPResponse{
Error: &MCPError{Code: -32601, Message: "Unknown tool: " + toolName},
ID: req.ID,
}
}
if err != nil {
result.Success = false
result.Error = err.Error()
s.lastError = err.Error()
}
// Always include current state in response
result.CurrentTab = s.currentTab
result.TabHistory = s.tabHistory
result.IframeMode = s.iframeMode
return MCPResponse{
Result: result,
ID: req.ID,
}
}
// Helper function to get string parameter with default
func getStringParam(params map[string]interface{}, key, defaultValue string) string {
if val, ok := params[key].(string); ok {
return val
}
return defaultValue
}
// Helper function to get int parameter with default
func getIntParam(params map[string]interface{}, key string, defaultValue int) int {
if val, ok := params[key].(float64); ok {
return int(val)
}
if val, ok := params[key].(int); ok {
return val
}
return defaultValue
}
// Helper function to get bool parameter with default
func getBoolParam(params map[string]interface{}, key string, defaultValue bool) bool {
if val, ok := params[key].(bool); ok {
return val
}
return defaultValue
}
// Helper function to resolve tab ID
func (s *MCPServer) resolveTabID(tabParam string) string {
if tabParam != "" {
return tabParam
}
return s.currentTab
}
// handleNavigate handles web navigation
func (s *MCPServer) handleNavigate(params map[string]interface{}) (ToolResult, error) {
url := getStringParam(params, "url", "")
if url == "" {
return ToolResult{}, fmt.Errorf("url parameter is required")
}
tab := getStringParam(params, "tab", "")
timeout := getIntParam(params, "timeout", 5)
takeScreenshot := getBoolParam(params, "screenshot", false)
// If no tab specified and we don't have a current tab, open one
if tab == "" && s.currentTab == "" {
newTab, err := s.client.OpenTab(timeout)
if err != nil {
return ToolResult{}, fmt.Errorf("failed to open new tab: %w", err)
}
s.currentTab = newTab
s.tabHistory = append(s.tabHistory, newTab)
tab = newTab
} else if tab == "" {
tab = s.currentTab
}
// Load the URL
err := s.client.LoadURL(tab, url, timeout)
if err != nil {
return ToolResult{}, fmt.Errorf("failed to load URL: %w", err)
}
result := ToolResult{
Success: true,
Data: map[string]string{"url": url, "tab": tab},
}
// Take screenshot if requested
if takeScreenshot {
screenshotPath := fmt.Sprintf("/tmp/navigate-%d.png", time.Now().Unix())
err = s.client.TakeScreenshot(tab, screenshotPath, false, timeout)
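if err == nil {
result.Screenshot = screenshotPath
s.screenshots = append(s.screenshots, screenshotPath)
} else {
// Navigation succeeded, so report the screenshot failure without failing the call
result.Metadata = map[string]string{"screenshot_error": err.Error()}
}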
}
return result, nil
}
// handleInteract handles web element interactions
func (s *MCPServer) handleInteract(params map[string]interface{}) (ToolResult, error) {
action := getStringParam(params, "action", "")
selector := getStringParam(params, "selector", "")
value := getStringParam(params, "value", "")
tab := s.resolveTabID(getStringParam(params, "tab", ""))
timeout := getIntParam(params, "timeout", 5)
if action == "" || selector == "" {
return ToolResult{}, fmt.Errorf("action and selector parameters are required")
}
if tab == "" {
return ToolResult{}, fmt.Errorf("no active tab available")
}
var err error
result := ToolResult{Success: true}
switch action {
case "click":
err = s.client.ClickElement(tab, selector, timeout, timeout)
result.Data = map[string]string{"action": "clicked", "selector": selector}
case "fill":
if value == "" {
return ToolResult{}, fmt.Errorf("value parameter is required for fill action")
}
err = s.client.FillFormField(tab, selector, value, timeout, timeout)
result.Data = map[string]string{"action": "filled", "selector": selector, "value": value}
case "submit":
err = s.client.SubmitForm(tab, selector, timeout, timeout)
result.Data = map[string]string{"action": "submitted", "selector": selector}
case "upload":
if value == "" {
return ToolResult{}, fmt.Errorf("value parameter (file path) is required for upload action")
}
err = s.client.UploadFile(tab, selector, value, timeout, timeout)
result.Data = map[string]string{"action": "uploaded", "selector": selector, "file": value}
default:
return ToolResult{}, fmt.Errorf("unknown action: %s", action)
}
if err != nil {
return ToolResult{}, fmt.Errorf("failed to %s element: %w", action, err)
}
return result, nil
}
// handleExtract handles data extraction from pages
func (s *MCPServer) handleExtract(params map[string]interface{}) (ToolResult, error) {
extractType := getStringParam(params, "type", "")
selector := getStringParam(params, "selector", "")
code := getStringParam(params, "code", "")
tab := s.resolveTabID(getStringParam(params, "tab", ""))
timeout := getIntParam(params, "timeout", 5)
if extractType == "" {
return ToolResult{}, fmt.Errorf("type parameter is required")
}
if tab == "" {
return ToolResult{}, fmt.Errorf("no active tab available")
}
var data interface{}
var err error
switch extractType {
case "source":
data, err = s.client.GetPageSource(tab, timeout)
case "element":
if selector == "" {
return ToolResult{}, fmt.Errorf("selector parameter is required for element extraction")
}
data, err = s.client.GetElementHTML(tab, selector, timeout)
case "javascript":
if code == "" {
return ToolResult{}, fmt.Errorf("code parameter is required for javascript extraction")
}
data, err = s.client.EvalJS(tab, code, timeout)
default:
return ToolResult{}, fmt.Errorf("unknown extraction type: %s", extractType)
}
if err != nil {
return ToolResult{}, fmt.Errorf("failed to extract %s: %w", extractType, err)
}
return ToolResult{
Success: true,
Data: data,
Metadata: map[string]string{
"type": extractType,
"selector": selector,
},
}, nil
}
// handleScreenshot handles screenshot capture
func (s *MCPServer) handleScreenshot(params map[string]interface{}) (ToolResult, error) {
output := getStringParam(params, "output", "")
if output == "" {
output = fmt.Sprintf("/tmp/screenshot-%d.png", time.Now().Unix())
}
tab := s.resolveTabID(getStringParam(params, "tab", ""))
fullPage := getBoolParam(params, "full_page", false)
timeout := getIntParam(params, "timeout", 5)
if tab == "" {
return ToolResult{}, fmt.Errorf("no active tab available")
}
err := s.client.TakeScreenshot(tab, output, fullPage, timeout)
if err != nil {
return ToolResult{}, fmt.Errorf("failed to take screenshot: %w", err)
}
s.screenshots = append(s.screenshots, output)
return ToolResult{
Success: true,
Screenshot: output,
Data: map[string]interface{}{
"output": output,
"full_page": fullPage,
"tab": tab,
},
}, nil
}
// handleManageTabs handles tab management operations
func (s *MCPServer) handleManageTabs(params map[string]interface{}) (ToolResult, error) {
action := getStringParam(params, "action", "")
tab := getStringParam(params, "tab", "")
timeout := getIntParam(params, "timeout", 5)
if action == "" {
return ToolResult{}, fmt.Errorf("action parameter is required")
}
var data interface{}
var err error
switch action {
case "open":
newTab, err := s.client.OpenTab(timeout)
if err != nil {
return ToolResult{}, fmt.Errorf("failed to open tab: %w", err)
}
s.currentTab = newTab
s.tabHistory = append(s.tabHistory, newTab)
data = map[string]string{"tab": newTab, "action": "opened"}
case "close":
targetTab := s.resolveTabID(tab)
if targetTab == "" {
return ToolResult{}, fmt.Errorf("no tab to close")
}
err = s.client.CloseTab(targetTab, timeout)
if err != nil {
return ToolResult{}, fmt.Errorf("failed to close tab: %w", err)
}
// Remove from history and update current tab
s.removeTabFromHistory(targetTab)
data = map[string]string{"tab": targetTab, "action": "closed"}
case "list":
tabs, err := s.client.ListTabs()
if err != nil {
return ToolResult{}, fmt.Errorf("failed to list tabs: %w", err)
}
data = tabs
case "switch":
if tab == "" {
return ToolResult{}, fmt.Errorf("tab parameter is required for switch action")
}
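// Move to end of history if it exists, otherwise add it
s.removeTabFromHistory(tab)
s.tabHistory = append(s.tabHistory, tab)
// Set the current tab last: removeTabFromHistory resets currentTab when it removes the current tab
s.currentTab = tab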
data = map[string]string{"tab": tab, "action": "switched"}
default:
return ToolResult{}, fmt.Errorf("unknown tab action: %s", action)
}
if err != nil {
return ToolResult{}, err
}
return ToolResult{
Success: true,
Data: data,
}, nil
}
// handleIframe handles iframe context switching
func (s *MCPServer) handleIframe(params map[string]interface{}) (ToolResult, error) {
action := getStringParam(params, "action", "")
selector := getStringParam(params, "selector", "")
tab := s.resolveTabID(getStringParam(params, "tab", ""))
if action == "" {
return ToolResult{}, fmt.Errorf("action parameter is required")
}
if tab == "" {
return ToolResult{}, fmt.Errorf("no active tab available")
}
var err error
var data map[string]string
switch action {
case "enter":
if selector == "" {
return ToolResult{}, fmt.Errorf("selector parameter is required for enter action")
}
err = s.client.SwitchToIframe(tab, selector, 5) // Default 5 second timeout
if err == nil {
// Only record iframe mode once the switch has actually succeeded
s.iframeMode = true
}
data = map[string]string{"action": "entered", "selector": selector}
case "exit":
err = s.client.SwitchToMain(tab)
if err == nil {
s.iframeMode = false
}
data = map[string]string{"action": "exited"}
default:
return ToolResult{}, fmt.Errorf("unknown iframe action: %s", action)
}
if err != nil {
return ToolResult{}, fmt.Errorf("failed to %s iframe: %w", action, err)
}
return ToolResult{
Success: true,
Data: data,
}, nil
}
// Helper function to remove tab from history
func (s *MCPServer) removeTabFromHistory(tabID string) {
for i, id := range s.tabHistory {
if id == tabID {
s.tabHistory = append(s.tabHistory[:i], s.tabHistory[i+1:]...)
break
}
}
// If we removed the current tab, set current to the last in history
if s.currentTab == tabID {
if len(s.tabHistory) > 0 {
s.currentTab = s.tabHistory[len(s.tabHistory)-1]
} else {
s.currentTab = ""
}
}
}
// HTTP handler for MCP requests
func (s *MCPServer) ServeHTTP(w http.ResponseWriter, r *http.Request) {
debugLogger.Log("ServeHTTP: Received %s request from %s", r.Method, r.RemoteAddr)
debugLogger.Log("ServeHTTP: Request URL: %s", r.URL.String())
debugLogger.Log("ServeHTTP: Request headers: %v", r.Header)
// Set CORS headers for browser compatibility
w.Header().Set("Access-Control-Allow-Origin", "*")
w.Header().Set("Access-Control-Allow-Methods", "POST, OPTIONS")
w.Header().Set("Access-Control-Allow-Headers", "Content-Type")
w.Header().Set("Content-Type", "application/json")
// Handle preflight requests
if r.Method == "OPTIONS" {
debugLogger.Log("ServeHTTP: Handling OPTIONS preflight request")
w.WriteHeader(http.StatusOK)
return
}
// Only accept POST requests
if r.Method != "POST" {
debugLogger.Log("ServeHTTP: Method not allowed: %s", r.Method)
http.Error(w, "Method not allowed", http.StatusMethodNotAllowed)
return
}
// Read and decode the request
var req MCPRequest
decoder := json.NewDecoder(r.Body)
if err := decoder.Decode(&req); err != nil {
debugLogger.Log("ServeHTTP: Error decoding request: %v", err)
log.Printf("Error decoding request: %v", err)
http.Error(w, "Invalid JSON", http.StatusBadRequest)
return
}
debugLogger.LogJSON("HTTP Request", req)
log.Printf("Processing HTTP request: %s (ID: %v)", req.Method, req.ID)
// Handle the request
resp := s.HandleRequest(req)
debugLogger.LogJSON("HTTP Response", resp)
// Encode and send the response
encoder := json.NewEncoder(w)
if err := encoder.Encode(resp); err != nil {
debugLogger.Log("ServeHTTP: Error encoding response: %v", err)
log.Printf("Error encoding response: %v", err)
http.Error(w, "Internal server error", http.StatusInternalServerError)
return
}
debugLogger.Log("ServeHTTP: Response sent successfully for request: %s", req.Method)
log.Printf("Completed HTTP request: %s", req.Method)
}
func main() {
// Initialize debug logger
var err error
debugLogger, err = NewDebugLogger()
if err != nil {
log.Printf("Warning: Failed to initialize debug logger: %v", err)
// Continue without debug logging
} else {
defer debugLogger.Close()
debugLogger.Log("MCP HTTP Server starting up")
}
// Get cremote daemon connection settings
cremoteHost := os.Getenv("CREMOTE_HOST")
if cremoteHost == "" {
cremoteHost = "localhost"
}
cremotePortStr := os.Getenv("CREMOTE_PORT")
cremotePort := 8989
if cremotePortStr != "" {
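if p, err := strconv.Atoi(cremotePortStr); err == nil {
cremotePort = p
} else {
log.Printf("Warning: invalid CREMOTE_PORT %q, using default %d", cremotePortStr, cremotePort)
}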
}
// Get HTTP server settings
httpHost := os.Getenv("MCP_HOST")
if httpHost == "" {
httpHost = "localhost"
}
httpPortStr := os.Getenv("MCP_PORT")
httpPort := 8990
if httpPortStr != "" {
if p, err := strconv.Atoi(httpPortStr); err == nil {
httpPort = p
} else {
log.Printf("Warning: invalid MCP_PORT %q, using default %d", httpPortStr, httpPort)
}
}
debugLogger.Log("HTTP server will listen on %s:%d", httpHost, httpPort)
debugLogger.Log("Connecting to cremote daemon at %s:%d", cremoteHost, cremotePort)
log.Printf("Starting MCP HTTP server on %s:%d", httpHost, httpPort)
log.Printf("Connecting to cremote daemon at %s:%d", cremoteHost, cremotePort)
// Create the MCP server
mcpServer := NewMCPServer(cremoteHost, cremotePort)
// Set up HTTP routes
http.Handle("/mcp", mcpServer)
// Health check endpoint
http.HandleFunc("/health", func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(map[string]string{
"status": "healthy",
"cremote_host": cremoteHost,
"cremote_port": strconv.Itoa(cremotePort),
})
})
// Root endpoint with basic info
http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(map[string]interface{}{
"service": "Cremote MCP Server",
"version": "1.0.0",
"endpoints": map[string]string{
"mcp": "/mcp",
"health": "/health",
},
"cremote_daemon": fmt.Sprintf("%s:%d", cremoteHost, cremotePort),
})
})
// Start the HTTP server
addr := fmt.Sprintf("%s:%d", httpHost, httpPort)
log.Printf("MCP HTTP server listening on http://%s", addr)
log.Printf("MCP endpoint: http://%s/mcp", addr)
log.Printf("Health check: http://%s/health", addr)
if err := http.ListenAndServe(addr, nil); err != nil {
log.Fatalf("HTTP server failed: %v", err)
}
}

@ -1,11 +0,0 @@
{
"mcpServers": {
"cremote": {
"command": "/path/to/cremote/mcp/cremote-mcp",
"env": {
"CREMOTE_HOST": "localhost",
"CREMOTE_PORT": "8989"
}
}
}
}

@ -1,5 +0,0 @@
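#!/bin/bash
# Assumes cremote-mcp is the stdio build that expects Content-Length framed messages
BODY='{"jsonrpc":"2.0","method":"tools/list","params":{},"id":1}'
(
printf 'Content-Length: %d\r\n\r\n%s' "${#BODY}" "$BODY"
sleep 1
) | ./cremote-mcp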

@ -1,53 +0,0 @@
<!DOCTYPE html>
<html>
<head>
<title>Checkbox Test</title>
</head>
<body>
<h1>Checkbox Test</h1>
<form id="testForm">
<div>
<input type="checkbox" id="checkbox1" name="checkbox1">
<label for="checkbox1">Checkbox 1</label>
</div>
<div>
<input type="checkbox" id="checkbox2" name="checkbox2">
<label for="checkbox2">Checkbox 2</label>
</div>
<div>
<input type="radio" id="radio1" name="radioGroup" value="option1">
<label for="radio1">Radio 1</label>
</div>
<div>
<input type="radio" id="radio2" name="radioGroup" value="option2">
<label for="radio2">Radio 2</label>
</div>
<div>
<input type="text" id="textInput" name="textInput" placeholder="Text input">
</div>
<div>
<button type="button" id="showValues" onclick="showValues()">Show Values</button>
</div>
<div id="result"></div>
</form>
<script>
function showValues() {
const checkbox1 = document.getElementById('checkbox1').checked;
const checkbox2 = document.getElementById('checkbox2').checked;
const radio1 = document.getElementById('radio1').checked;
const radio2 = document.getElementById('radio2').checked;
const textInput = document.getElementById('textInput').value;
const result = document.getElementById('result');
result.innerHTML = `
<p>Checkbox 1: ${checkbox1}</p>
<p>Checkbox 2: ${checkbox2}</p>
<p>Radio 1: ${radio1}</p>
<p>Radio 2: ${radio2}</p>
<p>Text Input: ${textInput}</p>
`;
}
</script>
</body>
</html>

@ -1,58 +0,0 @@
<!DOCTYPE html>
<html>
<head>
<title>Checkbox Test</title>
</head>
<body>
<h1>Checkbox Test</h1>
<form id="testForm">
<div>
<input type="checkbox" id="checkbox1" name="checkbox1" onchange="showValues()">
<label for="checkbox1">Checkbox 1</label>
</div>
<div>
<input type="checkbox" id="checkbox2" name="checkbox2" onchange="showValues()">
<label for="checkbox2">Checkbox 2</label>
</div>
<div>
<input type="radio" id="radio1" name="radioGroup" value="option1" onchange="showValues()">
<label for="radio1">Radio 1</label>
</div>
<div>
<input type="radio" id="radio2" name="radioGroup" value="option2" onchange="showValues()">
<label for="radio2">Radio 2</label>
</div>
<div>
<input type="text" id="textInput" name="textInput" placeholder="Text input" oninput="showValues()">
</div>
<div id="result">
<p>Checkbox 1: false</p>
<p>Checkbox 2: false</p>
<p>Radio 1: false</p>
<p>Radio 2: false</p>
<p>Text Input: </p>
</div>
</form>
<script>
function showValues() {
const checkbox1 = document.getElementById('checkbox1').checked;
const checkbox2 = document.getElementById('checkbox2').checked;
const radio1 = document.getElementById('radio1').checked;
const radio2 = document.getElementById('radio2').checked;
const textInput = document.getElementById('textInput').value;
const result = document.getElementById('result');
result.innerHTML = `
<p>Checkbox 1: ${checkbox1}</p>
<p>Checkbox 2: ${checkbox2}</p>
<p>Radio 1: ${radio1}</p>
<p>Radio 2: ${radio2}</p>
<p>Text Input: ${textInput}</p>
`;
console.log('Values updated:', { checkbox1, checkbox2, radio1, radio2, textInput });
}
</script>
</body>
</html>

@ -1,71 +0,0 @@
#!/bin/bash
# Simple test for new daemon commands
set -e
echo "=== Testing New Daemon Commands ==="
# Start daemon
echo "Starting daemon..."
./cremotedaemon --debug &
DAEMON_PID=$!
sleep 3
cleanup() {
echo "Cleaning up..."
if [ -n "$DAEMON_PID" ]; then
kill "$DAEMON_PID" 2>/dev/null || true
wait "$DAEMON_PID" 2>/dev/null || true
fi
}
trap cleanup EXIT
# Test using curl to send commands directly to daemon
echo "Testing daemon commands via HTTP..."
# Open tab
echo "Opening tab..."
TAB_RESPONSE=$(curl -s -X POST http://localhost:8989/command \
-H "Content-Type: application/json" \
-d '{"action": "open-tab", "params": {"timeout": "10"}}')
echo "Tab response: $TAB_RESPONSE"
# Extract tab ID (simple parsing)
TAB_ID=$(echo "$TAB_RESPONSE" | grep -o '"data":"[^"]*"' | cut -d'"' -f4)
echo "Tab ID: $TAB_ID"
if [ -z "$TAB_ID" ]; then
echo "Failed to get tab ID"
exit 1
fi
# Load a simple page
echo "Loading Google..."
curl -s -X POST http://localhost:8989/command \
-H "Content-Type: application/json" \
-d "{\"action\": \"load-url\", \"params\": {\"tab\": \"$TAB_ID\", \"url\": \"https://www.google.com\", \"timeout\": \"10\"}}"
sleep 3
# Test check-element command
echo "Testing check-element command..."
CHECK_RESPONSE=$(curl -s -X POST http://localhost:8989/command \
-H "Content-Type: application/json" \
-d "{\"action\": \"check-element\", \"params\": {\"tab\": \"$TAB_ID\", \"selector\": \"input[name=q]\", \"type\": \"exists\", \"timeout\": \"5\"}}")
echo "Check element response: $CHECK_RESPONSE"
# Test count-elements command
echo "Testing count-elements command..."
COUNT_RESPONSE=$(curl -s -X POST http://localhost:8989/command \
-H "Content-Type: application/json" \
-d "{\"action\": \"count-elements\", \"params\": {\"tab\": \"$TAB_ID\", \"selector\": \"input\", \"timeout\": \"5\"}}")
echo "Count elements response: $COUNT_RESPONSE"
# Test get-element-attributes command
echo "Testing get-element-attributes command..."
ATTR_RESPONSE=$(curl -s -X POST http://localhost:8989/command \
-H "Content-Type: application/json" \
-d "{\"action\": \"get-element-attributes\", \"params\": {\"tab\": \"$TAB_ID\", \"selector\": \"input[name=q]\", \"attributes\": \"name,type,placeholder\", \"timeout\": \"5\"}}")
echo "Get attributes response: $ATTR_RESPONSE"
echo "All daemon command tests completed!"
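
The script above pulls the tab ID out of the response with `grep` and `cut`, which breaks if the field order or spacing changes. A minimal sketch of the same extraction with a JSON decoder follows. It assumes, based only on the `"data":"..."` pattern the script greps for, that the daemon replies with a JSON object whose `data` field is a string. `tabIDFromResponse` is a hypothetical helper and the sample payload is made up.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// tabIDFromResponse extracts the tab ID from an open-tab response.
// Assumption: the daemon replies with a JSON object whose "data" field is a string.
func tabIDFromResponse(raw []byte) (string, error) {
	var resp struct {
		Data string `json:"data"`
	}
	if err := json.Unmarshal(raw, &resp); err != nil {
		return "", fmt.Errorf("decode response: %w", err)
	}
	if resp.Data == "" {
		return "", errors.New(`response has no tab ID in "data"`)
	}
	return resp.Data, nil
}

func main() {
	// Sample payload made up for illustration; a real response may carry other fields.
	id, err := tabIDFromResponse([]byte(`{"success":true,"data":"tab-123"}`))
	fmt.Println(id, err)
}
```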

@ -1,71 +0,0 @@
#!/bin/bash
# Minimal test to check if daemon recognizes new commands
set -e
echo "=== Minimal Daemon Command Test ==="
# Start Chrome first
echo "Starting Chrome..."
chromium --remote-debugging-port=9222 --user-data-dir=/tmp/chromium-debug --no-sandbox --disable-dev-shm-usage --headless &
CHROME_PID=$!
sleep 5
# Start daemon
echo "Starting daemon..."
./cremotedaemon --debug &
DAEMON_PID=$!
sleep 3
cleanup() {
echo "Cleaning up..."
if [ -n "$DAEMON_PID" ]; then
kill "$DAEMON_PID" 2>/dev/null || true
fi
if [ -n "$CHROME_PID" ]; then
kill "$CHROME_PID" 2>/dev/null || true
fi
}
trap cleanup EXIT
# Test if daemon recognizes the new commands (should not return "Unknown action")
echo "Testing if daemon recognizes check-element command..."
RESPONSE=$(curl -s -X POST http://localhost:8989/command \
-H "Content-Type: application/json" \
-d '{"action": "check-element", "params": {"selector": "body", "type": "exists"}}')
echo "Response: $RESPONSE"
if echo "$RESPONSE" | grep -q "Unknown action"; then
echo "ERROR: Daemon does not recognize check-element command!"
exit 1
else
echo "SUCCESS: Daemon recognizes check-element command"
fi
echo "Testing if daemon recognizes count-elements command..."
RESPONSE=$(curl -s -X POST http://localhost:8989/command \
-H "Content-Type: application/json" \
-d '{"action": "count-elements", "params": {"selector": "body"}}')
echo "Response: $RESPONSE"
if echo "$RESPONSE" | grep -q "Unknown action"; then
echo "ERROR: Daemon does not recognize count-elements command!"
exit 1
else
echo "SUCCESS: Daemon recognizes count-elements command"
fi
echo "Testing if daemon recognizes get-element-attributes command..."
RESPONSE=$(curl -s -X POST http://localhost:8989/command \
-H "Content-Type: application/json" \
-d '{"action": "get-element-attributes", "params": {"selector": "body", "attributes": "all"}}')
echo "Response: $RESPONSE"
if echo "$RESPONSE" | grep -q "Unknown action"; then
echo "ERROR: Daemon does not recognize get-element-attributes command!"
exit 1
else
echo "SUCCESS: Daemon recognizes get-element-attributes command"
fi
echo "All commands are recognized by the daemon!"

@ -1,44 +0,0 @@
#!/bin/bash
# Simple test to see if our debug message appears
set -e
echo "=== Debug Test ==="
# Kill existing processes
pkill -f chromium || true
pkill -f cremotedaemon || true
sleep 2
# Start Chrome
chromium --remote-debugging-port=9222 --user-data-dir=/tmp/chromium-debug --no-sandbox --disable-dev-shm-usage --headless &
CHROME_PID=$!
sleep 5
# Start daemon with debug output
echo "Starting daemon with debug..."
./cremotedaemon --debug &
DAEMON_PID=$!
sleep 3
cleanup() {
echo "Cleaning up..."
if [ -n "$DAEMON_PID" ]; then
kill "$DAEMON_PID" 2>/dev/null || true
fi
if [ -n "$CHROME_PID" ]; then
kill "$CHROME_PID" 2>/dev/null || true
fi
}
trap cleanup EXIT
# Test our new command and look for debug output
echo "Testing check-element command..."
curl -s -X POST http://localhost:8989/command \
-H "Content-Type: application/json" \
-d '{"action": "check-element", "params": {"selector": "body", "type": "exists"}}' &
# Wait a moment for the request to process
sleep 2
echo "Test completed - check daemon output above for debug messages"

@ -1,48 +0,0 @@
#!/bin/bash
# Test using a different port to avoid conflict
set -e
echo "=== Testing on Different Port ==="
# Kill our processes (not the system one)
pkill -f chromium || true
sleep 2
# Start Chrome
chromium --remote-debugging-port=9222 --user-data-dir=/tmp/chromium-debug --no-sandbox --disable-dev-shm-usage --headless &
CHROME_PID=$!
sleep 5
# Start daemon on different port
echo "Starting daemon on port 8990..."
./cremotedaemon --listen localhost --port 8990 --debug &
DAEMON_PID=$!
sleep 3
cleanup() {
echo "Cleaning up..."
if [ ! -z "$DAEMON_PID" ]; then
kill $DAEMON_PID 2>/dev/null || true
fi
if [ ! -z "$CHROME_PID" ]; then
kill $CHROME_PID 2>/dev/null || true
fi
}
trap cleanup EXIT
# Test our new command on the different port
echo "Testing check-element command on port 8990..."
RESPONSE=$(curl -s -X POST http://localhost:8990/command \
-H "Content-Type: application/json" \
-d '{"action": "check-element", "params": {"selector": "body", "type": "exists"}}')
echo "Response: $RESPONSE"
if echo "$RESPONSE" | grep -q "Unknown action"; then
echo "ERROR: New command still not recognized"
exit 1
else
echo "SUCCESS: New command recognized!"
fi
echo "Test completed successfully!"

@ -1,96 +0,0 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Element Checking Test Page</title>
<style>
.hidden { display: none; }
.invisible { visibility: hidden; }
.container { margin: 20px; padding: 10px; border: 1px solid #ccc; }
.red { color: red; }
.blue { background-color: blue; color: white; }
</style>
</head>
<body>
<h1 id="main-title" class="red">Element Checking Test Page</h1>
<div class="container">
<h2>Visibility Tests</h2>
<p id="visible-paragraph">This paragraph is visible</p>
<p id="hidden-paragraph" class="hidden">This paragraph is hidden with display:none</p>
<p id="invisible-paragraph" class="invisible">This paragraph is invisible with visibility:hidden</p>
</div>
<div class="container">
<h2>Form Elements</h2>
<form id="test-form">
<label for="text-input">Text Input:</label>
<input type="text" id="text-input" name="text-input" value="default value" placeholder="Enter text">
<label for="disabled-input">Disabled Input:</label>
<input type="text" id="disabled-input" name="disabled-input" disabled value="disabled">
<label for="checkbox1">Checkbox 1 (checked):</label>
<input type="checkbox" id="checkbox1" name="checkbox1" checked>
<label for="checkbox2">Checkbox 2 (unchecked):</label>
<input type="checkbox" id="checkbox2" name="checkbox2">
<label for="radio1">Radio 1 (selected):</label>
<input type="radio" id="radio1" name="radio-group" value="option1" checked>
<label for="radio2">Radio 2 (not selected):</label>
<input type="radio" id="radio2" name="radio-group" value="option2">
<select id="dropdown" name="dropdown">
<option value="option1">Option 1</option>
<option value="option2" selected>Option 2 (selected)</option>
<option value="option3">Option 3</option>
</select>
<button type="button" id="test-button" class="blue">Test Button</button>
<button type="submit" id="submit-button" disabled>Submit (disabled)</button>
</form>
</div>
<div class="container">
<h2>Multiple Elements</h2>
<div class="item" data-id="1">Item 1</div>
<div class="item" data-id="2">Item 2</div>
<div class="item" data-id="3">Item 3</div>
<span class="item" data-id="4">Item 4 (span)</span>
</div>
<div class="container">
<h2>Custom Attributes</h2>
<div id="custom-element"
data-test="test-value"
data-number="42"
aria-label="Custom element"
title="This is a tooltip"
custom-attr="custom-value">
Element with custom attributes
</div>
</div>
<script>
// Add some dynamic behavior for testing
document.getElementById('test-button').addEventListener('click', function() {
this.focus();
console.log('Button clicked and focused');
});
// Function to toggle visibility for testing
function toggleVisibility(elementId) {
const element = document.getElementById(elementId);
element.classList.toggle('hidden');
}
// Function to focus an element for testing
function focusElement(elementId) {
document.getElementById(elementId).focus();
}
</script>
</body>
</html>

@ -1,69 +0,0 @@
#!/bin/bash
# Test existing commands to make sure basic setup works
set -e
echo "=== Testing Existing Commands ==="
# Kill any existing processes
pkill -f chromium || true
pkill -f cremotedaemon || true
sleep 2
# Start Chrome
echo "Starting Chrome..."
chromium --remote-debugging-port=9222 --user-data-dir=/tmp/chromium-debug --no-sandbox --disable-dev-shm-usage --headless &
CHROME_PID=$!
sleep 5
# Verify Chrome is responding
echo "Checking Chrome DevTools..."
curl -s http://localhost:9222/json/version || {
echo "Chrome DevTools not responding"
exit 1
}
# Start daemon
echo "Starting daemon..."
./cremotedaemon --debug &
DAEMON_PID=$!
sleep 3
cleanup() {
echo "Cleaning up..."
if [ ! -z "$DAEMON_PID" ]; then
kill $DAEMON_PID 2>/dev/null || true
fi
if [ ! -z "$CHROME_PID" ]; then
kill $CHROME_PID 2>/dev/null || true
fi
}
trap cleanup EXIT
# Test existing command
echo "Testing open-tab command..."
RESPONSE=$(curl -s -X POST http://localhost:8989/command \
-H "Content-Type: application/json" \
-d '{"action": "open-tab", "params": {"timeout": "10"}}')
echo "Open tab response: $RESPONSE"
if echo "$RESPONSE" | grep -q "Unknown action"; then
echo "ERROR: Even existing commands don't work!"
exit 1
fi
# Test our new command
echo "Testing check-element command..."
RESPONSE=$(curl -s -X POST http://localhost:8989/command \
-H "Content-Type: application/json" \
-d '{"action": "check-element", "params": {"selector": "body", "type": "exists"}}')
echo "Check element response: $RESPONSE"
if echo "$RESPONSE" | grep -q "Unknown action"; then
echo "ERROR: New command not recognized"
exit 1
else
echo "SUCCESS: New command recognized!"
fi
echo "Test completed successfully!"
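
The fixed `sleep 5` and `sleep 3` waits in this script can be either too short on a slow machine or needlessly long on a fast one. A minimal Python sketch of a polling wait follows. It is an illustration under stated assumptions, not part of cremote. The helper names `wait_until` and `chrome_devtools_ready` are hypothetical; the DevTools URL is the one the script already probes with curl.

```python
import time
import urllib.request


def wait_until(probe, timeout=10.0, interval=0.2):
    """Call probe() until it returns a truthy value or timeout seconds elapse.

    Exceptions raised by probe() are treated as "not ready yet".
    Returns True on success and False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            if probe():
                return True
        except Exception:
            pass  # e.g. connection refused while the process is still starting
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def chrome_devtools_ready(url="http://localhost:9222/json/version"):
    """Probe the DevTools endpoint that the script checks with curl."""
    with urllib.request.urlopen(url, timeout=1) as response:
        return response.status == 200
```

A caller would use `wait_until(chrome_devtools_ready, timeout=15)` in place of the fixed sleep and abort if it returns `False`.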

@ -1,71 +0,0 @@
#!/bin/bash
# Test script for iframe timeout functionality
set -e
echo "Starting iframe timeout test..."
# Start the daemon in background
echo "Starting cremotedaemon..."
./cremotedaemon --debug &
DAEMON_PID=$!
# Wait for daemon to start
sleep 2
# Function to cleanup
cleanup() {
echo "Cleaning up..."
kill $DAEMON_PID 2>/dev/null || true
wait $DAEMON_PID 2>/dev/null || true
}
# Set trap for cleanup
trap cleanup EXIT
# Test 1: Basic iframe switching with timeout
echo "Test 1: Basic iframe switching with timeout"
TAB_ID=$(./cremote open-tab --timeout 5 | grep -o '"[^"]*"' | tr -d '"')
echo "Created tab: $TAB_ID"
# Load test page
./cremote load-url --tab "$TAB_ID" --url "file://$(pwd)/test-iframe.html" --timeout 10
echo "Loaded test page"
# Switch to iframe with timeout
echo "Switching to iframe with 5 second timeout..."
./cremote switch-iframe --tab "$TAB_ID" --selector "#test-iframe" --timeout 5
echo "Successfully switched to iframe"
# Try to click button in iframe
echo "Clicking button in iframe..."
./cremote click-element --tab "$TAB_ID" --selector "#iframe-button" --selection-timeout 5 --action-timeout 5
echo "Successfully clicked iframe button"
# Switch back to main
echo "Switching back to main context..."
./cremote switch-main --tab "$TAB_ID"
echo "Successfully switched back to main"
# Try to click main button
echo "Clicking main page button..."
./cremote click-element --tab "$TAB_ID" --selector "#main-button" --selection-timeout 5 --action-timeout 5
echo "Successfully clicked main button"
# Test 2: Test timeout with non-existent iframe
echo ""
echo "Test 2: Testing timeout with non-existent iframe"
set +e # Allow command to fail
./cremote switch-iframe --tab "$TAB_ID" --selector "#non-existent-iframe" --timeout 2
RESULT=$?
set -e
if [ $RESULT -eq 0 ]; then
echo "ERROR: Expected timeout failure but command succeeded"
exit 1
else
echo "SUCCESS: Timeout correctly handled for non-existent iframe"
fi
echo ""
echo "All iframe timeout tests passed!"

@ -1,20 +0,0 @@
<!DOCTYPE html>
<html>
<head>
<title>Iframe Test</title>
</head>
<body>
<h1>Main Page</h1>
<p>This is the main page content.</p>
<iframe id="test-iframe" src="data:text/html,<html><body><h2>Iframe Content</h2><button id='iframe-button'>Click Me</button><script>document.getElementById('iframe-button').onclick = function() { alert('Iframe button clicked!'); }</script></body></html>" width="400" height="200"></iframe>
<button id="main-button">Main Page Button</button>
<script>
document.getElementById('main-button').onclick = function() {
alert('Main page button clicked!');
};
</script>
</body>
</html>

@ -1,226 +0,0 @@
#!/bin/bash
# Test script for Phase 1 Element Checking functionality
# This script tests the new element checking commands in cremote
set -e
echo "=== Phase 1 Element Checking Test ==="
echo "Starting cremote daemon..."
# Start the daemon in the background
./cremotedaemon --debug &
DAEMON_PID=$!
# Wait for daemon to start
sleep 3
echo "Daemon started with PID: $DAEMON_PID"
# Function to cleanup on exit
cleanup() {
echo "Cleaning up..."
if [ ! -z "$DAEMON_PID" ]; then
kill $DAEMON_PID 2>/dev/null || true
wait $DAEMON_PID 2>/dev/null || true
fi
echo "Cleanup complete"
}
trap cleanup EXIT
# Test the new functionality using cremote client
echo ""
echo "=== Testing Element Checking Commands ==="
# Open a tab and load our test page
echo "Opening tab and loading test page..."
TAB_ID=$(./cremote open-tab)
echo "Tab ID: $TAB_ID"
# Get the absolute path to the test file
TEST_FILE="file://$(pwd)/test-element-checking.html"
echo "Loading: $TEST_FILE"
./cremote load-url --tab "$TAB_ID" --url "$TEST_FILE"
# Wait for page to load
sleep 2
echo ""
echo "=== Test 1: Check if elements exist ==="
# Test element existence
echo "Checking if main title exists..."
./cremote eval-js --tab "$TAB_ID" --code "
const result = {
exists: document.querySelector('#main-title') !== null,
count: document.querySelectorAll('#main-title').length
};
console.log('Element check result:', JSON.stringify(result));
result;
"
echo ""
echo "=== Test 2: Check visibility states ==="
# Test visible element
echo "Checking visible paragraph..."
./cremote eval-js --tab "$TAB_ID" --code "
const element = document.querySelector('#visible-paragraph');
const result = {
exists: element !== null,
visible: element ? window.getComputedStyle(element).display !== 'none' : false,
visibilityStyle: element ? window.getComputedStyle(element).visibility : null
};
console.log('Visible paragraph check:', JSON.stringify(result));
result;
"
# Test hidden element
echo "Checking hidden paragraph..."
./cremote eval-js --tab "$TAB_ID" --code "
const element = document.querySelector('#hidden-paragraph');
const result = {
exists: element !== null,
visible: element ? window.getComputedStyle(element).display !== 'none' : false,
displayStyle: element ? window.getComputedStyle(element).display : null
};
console.log('Hidden paragraph check:', JSON.stringify(result));
result;
"
echo ""
echo "=== Test 3: Check form element states ==="
# Test enabled vs disabled inputs
echo "Checking enabled input..."
./cremote eval-js --tab "$TAB_ID" --code "
const element = document.querySelector('#text-input');
const result = {
exists: element !== null,
enabled: element ? !element.disabled : false,
value: element ? element.value : null
};
console.log('Enabled input check:', JSON.stringify(result));
result;
"
echo "Checking disabled input..."
./cremote eval-js --tab "$TAB_ID" --code "
const element = document.querySelector('#disabled-input');
const result = {
exists: element !== null,
enabled: element ? !element.disabled : false,
disabled: element ? element.disabled : null
};
console.log('Disabled input check:', JSON.stringify(result));
result;
"
echo ""
echo "=== Test 4: Check selected/checked states ==="
# Test checked checkbox
echo "Checking checked checkbox..."
./cremote eval-js --tab "$TAB_ID" --code "
const element = document.querySelector('#checkbox1');
const result = {
exists: element !== null,
checked: element ? element.checked : false,
type: element ? element.type : null
};
console.log('Checked checkbox:', JSON.stringify(result));
result;
"
# Test unchecked checkbox
echo "Checking unchecked checkbox..."
./cremote eval-js --tab "$TAB_ID" --code "
const element = document.querySelector('#checkbox2');
const result = {
exists: element !== null,
checked: element ? element.checked : false
};
console.log('Unchecked checkbox:', JSON.stringify(result));
result;
"
echo ""
echo "=== Test 5: Count multiple elements ==="
# Count elements with class 'item'
echo "Counting elements with class 'item'..."
./cremote eval-js --tab "$TAB_ID" --code "
const elements = document.querySelectorAll('.item');
const result = {
count: elements.length,
elements: Array.from(elements).map(el => ({
tagName: el.tagName,
textContent: el.textContent.trim(),
dataId: el.getAttribute('data-id')
}))
};
console.log('Item count result:', JSON.stringify(result));
result;
"
echo ""
echo "=== Test 6: Get element attributes ==="
# Get attributes of custom element
echo "Getting attributes of custom element..."
./cremote eval-js --tab "$TAB_ID" --code "
const element = document.querySelector('#custom-element');
const result = {
exists: element !== null,
attributes: {}
};
if (element) {
// Get all attributes
for (let attr of element.attributes) {
result.attributes[attr.name] = attr.value;
}
// Get some computed styles
const styles = window.getComputedStyle(element);
result.computedStyles = {
display: styles.display,
color: styles.color,
fontSize: styles.fontSize
};
}
console.log('Custom element attributes:', JSON.stringify(result));
result;
"
echo ""
echo "=== Test 7: Focus testing ==="
# Test focus state
echo "Testing focus state..."
./cremote eval-js --tab "$TAB_ID" --code "
// Focus the test button
const button = document.querySelector('#test-button');
button.focus();
// Check if it's focused
const result = {
buttonExists: button !== null,
activeElement: document.activeElement ? document.activeElement.id : null,
isFocused: document.activeElement === button
};
console.log('Focus test result:', JSON.stringify(result));
result;
"
echo ""
echo "=== All tests completed successfully! ==="
echo "The element checking functionality appears to be working correctly."
echo "Ready for Phase 1 MCP tool testing."
# Take a screenshot for verification
echo "Taking screenshot for verification..."
./cremote screenshot --tab "$TAB_ID" --output "test-phase1-screenshot.png"
echo "Screenshot saved as test-phase1-screenshot.png"
echo ""
echo "Test completed successfully!"

@ -1,205 +0,0 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Phase 3 Form Testing</title>
<style>
body {
font-family: Arial, sans-serif;
max-width: 800px;
margin: 0 auto;
padding: 20px;
}
.form-section {
margin: 30px 0;
padding: 20px;
border: 1px solid #ccc;
border-radius: 5px;
}
.form-group {
margin: 15px 0;
}
label {
display: block;
margin-bottom: 5px;
font-weight: bold;
}
input, select, textarea {
width: 100%;
padding: 8px;
margin-bottom: 10px;
border: 1px solid #ddd;
border-radius: 3px;
}
button {
background-color: #007bff;
color: white;
padding: 10px 20px;
border: none;
border-radius: 3px;
cursor: pointer;
}
button:hover {
background-color: #0056b3;
}
.checkbox-group {
display: flex;
align-items: center;
margin: 10px 0;
}
.checkbox-group input[type="checkbox"] {
width: auto;
margin-right: 10px;
}
.radio-group {
margin: 10px 0;
}
.radio-group input[type="radio"] {
width: auto;
margin-right: 5px;
}
.result {
margin-top: 20px;
padding: 10px;
background-color: #f8f9fa;
border-radius: 3px;
}
</style>
</head>
<body>
<h1>Phase 3 Form Testing</h1>
<div class="form-section">
<h2>User Registration Form</h2>
<form id="registration-form" action="#" method="post">
<div class="form-group">
<label for="username">Username:</label>
<input type="text" id="username" name="username" required placeholder="Enter username">
</div>
<div class="form-group">
<label for="email">Email:</label>
<input type="email" id="email" name="email" required placeholder="Enter email">
</div>
<div class="form-group">
<label for="password">Password:</label>
<input type="password" id="password" name="password" required placeholder="Enter password">
</div>
<div class="form-group">
<label for="country">Country:</label>
<select id="country" name="country" required>
<option value="">Select Country</option>
<option value="us">United States</option>
<option value="ca">Canada</option>
<option value="uk">United Kingdom</option>
<option value="de">Germany</option>
<option value="fr">France</option>
</select>
</div>
<div class="form-group">
<label for="bio">Bio:</label>
<textarea id="bio" name="bio" rows="4" placeholder="Tell us about yourself"></textarea>
</div>
<div class="checkbox-group">
<input type="checkbox" id="newsletter" name="newsletter" value="yes">
<label for="newsletter">Subscribe to newsletter</label>
</div>
<div class="checkbox-group">
<input type="checkbox" id="terms" name="terms" value="agreed" required>
<label for="terms">I agree to the terms and conditions</label>
</div>
<div class="form-group">
<label>Preferred Contact Method:</label>
<div class="radio-group">
<input type="radio" id="contact-email" name="contact-method" value="email">
<label for="contact-email">Email</label>
</div>
<div class="radio-group">
<input type="radio" id="contact-phone" name="contact-method" value="phone">
<label for="contact-phone">Phone</label>
</div>
<div class="radio-group">
<input type="radio" id="contact-sms" name="contact-method" value="sms">
<label for="contact-sms">SMS</label>
</div>
</div>
<button type="submit" id="register-btn">Register</button>
</form>
</div>
<div class="form-section">
<h2>Quick Contact Form</h2>
<form id="contact-form" action="#" method="post">
<div class="form-group">
<label for="contact-name">Name:</label>
<input type="text" id="contact-name" name="name" required>
</div>
<div class="form-group">
<label for="contact-form-email">Email:</label>
<input type="email" id="contact-form-email" name="email" required>
</div>
<div class="form-group">
<label for="message">Message:</label>
<textarea id="message" name="message" rows="3" required></textarea>
</div>
<button type="submit" id="contact-submit">Send Message</button>
</form>
</div>
<div class="form-section">
<h2>Interactive Elements</h2>
<div class="form-group">
<button id="test-button" onclick="showResult('Button clicked!')">Test Button</button>
<button id="toggle-button" onclick="toggleVisibility()">Toggle Visibility</button>
</div>
<div id="hidden-content" style="display: none;">
<p>This content was hidden and is now visible!</p>
<input type="text" id="hidden-input" placeholder="Hidden input field">
</div>
<div class="result" id="result-area">
Results will appear here...
</div>
</div>
<script>
function showResult(message) {
document.getElementById('result-area').textContent = message;
}
function toggleVisibility() {
const hiddenContent = document.getElementById('hidden-content');
if (hiddenContent.style.display === 'none') {
hiddenContent.style.display = 'block';
showResult('Content is now visible');
} else {
hiddenContent.style.display = 'none';
showResult('Content is now hidden');
}
}
// Form submission handlers
document.getElementById('registration-form').addEventListener('submit', function(e) {
e.preventDefault();
showResult('Registration form submitted successfully!');
});
document.getElementById('contact-form').addEventListener('submit', function(e) {
e.preventDefault();
showResult('Contact form submitted successfully!');
});
</script>
</body>
</html>

@ -1,242 +0,0 @@
#!/bin/bash
# Phase 3 Functionality Test Script
# Tests form analysis, multiple interactions, and bulk form filling
set -e
echo "=== Phase 3 Functionality Test ==="
echo "Testing form analysis, multiple interactions, and bulk form filling"
# Configuration
DAEMON_PORT=9223
CLIENT_PORT=9223
TEST_FILE="test-phase3-forms.html"
DAEMON_PID=""
CHROME_PID=""
# Cleanup function
cleanup() {
echo "Cleaning up..."
if [ ! -z "$DAEMON_PID" ]; then
echo "Stopping daemon (PID: $DAEMON_PID)"
kill $DAEMON_PID 2>/dev/null || true
wait $DAEMON_PID 2>/dev/null || true
fi
if [ ! -z "$CHROME_PID" ]; then
echo "Stopping Chrome (PID: $CHROME_PID)"
kill $CHROME_PID 2>/dev/null || true
wait $CHROME_PID 2>/dev/null || true
fi
# Screenshots in /tmp/phase3-*.png are kept so the paths printed in the summary stay valid
}
# Set up cleanup trap
trap cleanup EXIT
# Start daemon
echo "Starting daemon on port $DAEMON_PORT..."
./cremotedaemon --port=$DAEMON_PORT --debug &
DAEMON_PID=$!
# Wait for daemon to start
echo "Waiting for daemon to start..."
sleep 3
# Check if daemon is running
if ! kill -0 $DAEMON_PID 2>/dev/null; then
echo "ERROR: Daemon failed to start"
exit 1
fi
echo "Daemon started successfully (PID: $DAEMON_PID)"
# Test 1: Open tab and navigate to test page
echo ""
echo "=== Test 1: Navigation ==="
TAB_ID=$(./cremote open-tab --port=$CLIENT_PORT)
echo "Opened tab: $TAB_ID"
# Get absolute path to test file
TEST_PATH="file://$(pwd)/$TEST_FILE"
echo "Navigating to: $TEST_PATH"
./cremote load-url --tab="$TAB_ID" --url="$TEST_PATH" --port=$CLIENT_PORT
# Take initial screenshot
./cremote screenshot --tab="$TAB_ID" --output="/tmp/phase3-initial.png" --port=$CLIENT_PORT
echo "Initial screenshot saved to /tmp/phase3-initial.png"
# Test 2: Form Analysis
echo ""
echo "=== Test 2: Form Analysis ==="
echo "Analyzing registration form..."
# Test the daemon command directly
echo "Testing analyze-form daemon command..."
FORM_ANALYSIS=$(curl -s -X POST http://localhost:$DAEMON_PORT/command \
-H "Content-Type: application/json" \
-d '{
"action": "analyze-form",
"params": {
"tab": "'$TAB_ID'",
"selector": "#registration-form",
"timeout": "10"
}
}')
echo "Form analysis result:"
echo "$FORM_ANALYSIS" | jq '.'
# Check if analysis was successful
if echo "$FORM_ANALYSIS" | jq -e '.success' > /dev/null; then
echo "✓ Form analysis successful"
# Extract field count
FIELD_COUNT=$(echo "$FORM_ANALYSIS" | jq -r '.data.field_count')
echo "Found $FIELD_COUNT form fields"
# Check if we found expected fields
if [ "$FIELD_COUNT" -gt 5 ]; then
echo "✓ Expected number of fields found"
else
echo "✗ Unexpected field count: $FIELD_COUNT"
fi
else
echo "✗ Form analysis failed"
echo "$FORM_ANALYSIS"
fi
# Test 3: Multiple Interactions
echo ""
echo "=== Test 3: Multiple Interactions ==="
echo "Testing multiple interactions..."
# The interactions value is a JSON string, so it stays on one line:
# raw newlines are not allowed inside JSON strings.
INTERACTIONS_RESULT=$(curl -s -X POST http://localhost:$DAEMON_PORT/command \
-H "Content-Type: application/json" \
-d '{
"action": "interact-multiple",
"params": {
"tab": "'"$TAB_ID"'",
"interactions": "[{\"selector\": \"#test-button\", \"action\": \"click\"}, {\"selector\": \"#toggle-button\", \"action\": \"click\"}, {\"selector\": \"#hidden-input\", \"action\": \"fill\", \"value\": \"Test input\"}]",
"timeout": "10"
}
}')
echo "Multiple interactions result:"
echo "$INTERACTIONS_RESULT" | jq '.'
# Check if interactions were successful
if echo "$INTERACTIONS_RESULT" | jq -e '.success' > /dev/null; then
echo "✓ Multiple interactions successful"
SUCCESS_COUNT=$(echo "$INTERACTIONS_RESULT" | jq -r '.data.success_count')
TOTAL_COUNT=$(echo "$INTERACTIONS_RESULT" | jq -r '.data.total_count')
echo "Successful interactions: $SUCCESS_COUNT/$TOTAL_COUNT"
if [ "$SUCCESS_COUNT" -eq "$TOTAL_COUNT" ]; then
echo "✓ All interactions successful"
else
echo "✗ Some interactions failed"
fi
else
echo "✗ Multiple interactions failed"
echo "$INTERACTIONS_RESULT"
fi
# Take screenshot after interactions
./cremote screenshot --tab="$TAB_ID" --output="/tmp/phase3-after-interactions.png" --port=$CLIENT_PORT
echo "Screenshot after interactions saved to /tmp/phase3-after-interactions.png"
# Test 4: Bulk Form Filling
echo ""
echo "=== Test 4: Bulk Form Filling ==="
echo "Testing bulk form filling..."
# The fields value is a JSON string, so it stays on one line:
# raw newlines are not allowed inside JSON strings.
BULK_FILL_RESULT=$(curl -s -X POST http://localhost:$DAEMON_PORT/command \
-H "Content-Type: application/json" \
-d '{
"action": "fill-form-bulk",
"params": {
"tab": "'"$TAB_ID"'",
"form-selector": "#registration-form",
"fields": "{\"username\": \"testuser123\", \"email\": \"test@example.com\", \"password\": \"testpass123\", \"bio\": \"This is a test bio for Phase 3 testing.\"}",
"timeout": "10"
}
}')
echo "Bulk form filling result:"
echo "$BULK_FILL_RESULT" | jq '.'
# Check if bulk filling was successful
if echo "$BULK_FILL_RESULT" | jq -e '.success' > /dev/null; then
echo "✓ Bulk form filling successful"
SUCCESS_COUNT=$(echo "$BULK_FILL_RESULT" | jq -r '.data.success_count')
TOTAL_COUNT=$(echo "$BULK_FILL_RESULT" | jq -r '.data.total_count')
echo "Successfully filled fields: $SUCCESS_COUNT/$TOTAL_COUNT"
if [ "$SUCCESS_COUNT" -eq "$TOTAL_COUNT" ]; then
echo "✓ All fields filled successfully"
else
echo "✗ Some fields failed to fill"
fi
else
echo "✗ Bulk form filling failed"
echo "$BULK_FILL_RESULT"
fi
# Take final screenshot
./cremote screenshot --tab="$TAB_ID" --output="/tmp/phase3-final.png" --port=$CLIENT_PORT
echo "Final screenshot saved to /tmp/phase3-final.png"
# Test 5: Contact Form Bulk Fill
echo ""
echo "=== Test 5: Contact Form Bulk Fill ==="
echo "Testing bulk fill on contact form..."
# The fields value is a JSON string, so it stays on one line:
# raw newlines are not allowed inside JSON strings.
CONTACT_FILL_RESULT=$(curl -s -X POST http://localhost:$DAEMON_PORT/command \
-H "Content-Type: application/json" \
-d '{
"action": "fill-form-bulk",
"params": {
"tab": "'"$TAB_ID"'",
"form-selector": "#contact-form",
"fields": "{\"name\": \"John Doe\", \"email\": \"john@example.com\", \"message\": \"This is a test message for the contact form.\"}",
"timeout": "10"
}
}')
echo "Contact form bulk filling result:"
echo "$CONTACT_FILL_RESULT" | jq '.'
if echo "$CONTACT_FILL_RESULT" | jq -e '.success' > /dev/null; then
echo "✓ Contact form bulk filling successful"
else
echo "✗ Contact form bulk filling failed"
fi
# Summary
echo ""
echo "=== Test Summary ==="
echo "Phase 3 functionality tests completed."
echo "Screenshots saved:"
echo " - Initial: /tmp/phase3-initial.png"
echo " - After interactions: /tmp/phase3-after-interactions.png"
echo " - Final: /tmp/phase3-final.png"
echo ""
echo "Phase 3 test run finished. Review any ✗ lines above."
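
Hand-escaping nested JSON inside a shell string, as the curl calls in this script do, is easy to get wrong. A minimal Python sketch of building the same request body programmatically follows. It assumes, based only on the script above, that the daemon expects `interactions` and `fields` as JSON-encoded strings and all other parameters as strings. `build_command` and the tab placeholder are hypothetical names.

```python
import json


def build_command(action, params):
    """Return the JSON request body for one daemon command.

    Lists and dicts are serialized to JSON strings first, because the
    script passes "interactions" and "fields" as strings (an assumption
    about what the daemon expects). Other values are converted with str().
    """
    flat = {}
    for key, value in params.items():
        if isinstance(value, (list, dict)):
            flat[key] = json.dumps(value)
        else:
            flat[key] = str(value)
    return json.dumps({"action": action, "params": flat})


body = build_command("interact-multiple", {
    "tab": "TAB_ID_PLACEHOLDER",  # hypothetical; use the ID returned by open-tab
    "interactions": [
        {"selector": "#test-button", "action": "click"},
        {"selector": "#hidden-input", "action": "fill", "value": "Test input"},
    ],
    "timeout": 10,
})
print(body)
```

Because `json.dumps` escapes the inner quotes and never emits raw newlines by default, the resulting body is always valid JSON regardless of the field values.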

@ -1,36 +0,0 @@
<!DOCTYPE html>
<html>
<head>
<title>Timeout Test</title>
<style>
.delayed-element {
display: none;
}
</style>
</head>
<body>
<h1>Timeout Test</h1>
<div id="immediate">This element is immediately available</div>
<div id="delayed" class="delayed-element">This element appears after 3 seconds</div>
<button id="slow-button">Slow Button (3s delay)</button>
<div id="result"></div>
<script>
// Show the delayed element after 3 seconds
setTimeout(() => {
document.getElementById('delayed').style.display = 'block';
}, 3000);
// Add a click handler to the slow button that takes 3 seconds to complete
document.getElementById('slow-button').addEventListener('click', function() {
const result = document.getElementById('result');
result.textContent = 'Processing...';
// Simulate a slow operation
setTimeout(() => {
result.textContent = 'Button click processed!';
}, 3000);
});
</script>
</body>
</html>

@ -1,36 +0,0 @@
<!DOCTYPE html>
<html>
<head>
<title>Timeout Test</title>
</head>
<body>
<h1>Timeout Test</h1>
<div id="immediate">This element is immediately available</div>
<div id="result"></div>
<script>
// Create the delayed element after 3 seconds
setTimeout(() => {
const delayed = document.createElement('div');
delayed.id = 'delayed';
delayed.textContent = 'This element appears after 3 seconds';
document.body.appendChild(delayed);
}, 3000);
// Create the slow button
const button = document.createElement('button');
button.id = 'slow-button';
button.textContent = 'Slow Button (3s delay)';
button.addEventListener('click', function() {
const result = document.getElementById('result');
result.textContent = 'Processing...';
// Simulate a slow operation
setTimeout(() => {
result.textContent = 'Button click processed!';
}, 3000);
});
document.body.appendChild(button);
</script>
</body>
</html>

@ -1 +0,0 @@
This is a test file for cremote file transfer

@ -1,53 +0,0 @@
#!/usr/bin/env python3
import requests
import json


# Test the dropdown selection fix
def test_dropdown_selection():
    url = "http://localhost:8080/interact-multiple"

    # Test data - select by value "CA"
    data = {
        "interactions": [
            {
                "selector": "[name='state']",
                "action": "select",
                "value": "CA"
            }
        ],
        "timeout": 15
    }

    print("Testing dropdown selection by value 'CA'...")
    response = requests.post(url, json=data)

    if response.status_code == 200:
        result = response.json()
        print(f"Response: {json.dumps(result, indent=2)}")

        # Check if it was successful
        if result.get('success_count', 0) > 0:
            print("✅ SUCCESS: Dropdown selection worked!")
        else:
            print("❌ FAILED: Dropdown selection failed")

        # Verify the actual value was set
        verify_url = "http://localhost:8080/eval-js"
        verify_data = {"code": "document.querySelector('[name=\"state\"]').value"}
        verify_response = requests.post(verify_url, json=verify_data)
        if verify_response.status_code == 200:
            actual_value = verify_response.json().get('result', '')
            print(f"Actual dropdown value: '{actual_value}'")
            if actual_value == 'CA':
                print("✅ VERIFICATION: Value correctly set to 'CA'")
            else:
                print(f"❌ VERIFICATION: Expected 'CA' but got '{actual_value}'")
        else:
            print("❌ Could not verify dropdown value")
    else:
        print(f"❌ HTTP Error: {response.status_code}")
        print(response.text)


if __name__ == "__main__":
    test_dropdown_selection()

@ -1,128 +0,0 @@
#!/usr/bin/env python3
"""
Test script to verify the select element fix in cremote MCP system.
This script demonstrates that select dropdowns now work correctly with:
1. Single web_interact_cremotemcp with "select" action (after server restart)
2. Multiple web_interact_multiple_cremotemcp with "select" action (works now)
3. Bulk form fill web_form_fill_bulk_cremotemcp (after server restart)
The fix includes:
- Added "select" action to web_interact_cremotemcp
- Added SelectElement method to client
- Added select-element endpoint to daemon
- Modified fillFormBulk to detect select elements and use appropriate action
"""
import requests
import json
def test_multiple_interactions_select():
    """Test select functionality using web_interact_multiple (works immediately)"""
    print("Testing select with web_interact_multiple...")

    url = "http://localhost:8080/interact-multiple"
    data = {
        "interactions": [
            {
                "selector": "#state",
                "action": "select",
                "value": "TX"
            }
        ],
        "timeout": 5
    }

    response = requests.post(url, json=data)
    if response.status_code == 200:
        result = response.json()
        print(f"✅ Multiple interactions select: {json.dumps(result, indent=2)}")

        # Verify the value was set
        verify_url = "http://localhost:8080/eval-js"
        verify_data = {"code": "document.querySelector('#state').value"}
        verify_response = requests.post(verify_url, json=verify_data)
        if verify_response.status_code == 200:
            actual_value = verify_response.json().get('result', '')
            print(f"✅ Verified dropdown value: '{actual_value}'")
            return actual_value == 'TX'

        # The verification request failed, so the selection cannot be confirmed
        print(f"❌ Verification HTTP Error: {verify_response.status_code}")
        return False
    else:
        print(f"❌ HTTP Error: {response.status_code}")
        return False


def test_form_completion():
    """Test complete form filling with mixed field types"""
    print("\nTesting complete form with mixed field types...")

    url = "http://localhost:8080/interact-multiple"
    data = {
        "interactions": [
            {"selector": "#firstName", "action": "fill", "value": "Jane"},
            {"selector": "#lastName", "action": "fill", "value": "Smith"},
            {"selector": "#email", "action": "fill", "value": "jane.smith@test.com"},
            {"selector": "#state", "action": "select", "value": "Florida"},
            {"selector": "#contactPhone", "action": "click"},
            {"selector": "#interestMusic", "action": "check"},
            {"selector": "#newsletter", "action": "check"}
        ],
        "timeout": 10
    }

    response = requests.post(url, json=data)
    if response.status_code == 200:
        result = response.json()
        success_count = result.get('success_count', 0)
        total_count = result.get('total_count', 0)
        print(f"✅ Form completion: {success_count}/{total_count} fields successful")

        # Extract all values to verify
        extract_url = "http://localhost:8080/extract-multiple"
        extract_data = {
            "selectors": {
                "firstName": "#firstName",
                "lastName": "#lastName",
                "email": "#email",
                "state": "#state",
                "contactMethod": "input[name='contactMethod']:checked",
                "musicInterest": "#interestMusic",
                "newsletter": "#newsletter"
            }
        }
        extract_response = requests.post(extract_url, json=extract_data)
        if extract_response.status_code == 200:
            values = extract_response.json().get('results', {})
            print(f"✅ Form values: {json.dumps(values, indent=2)}")

        # Require at least one reported interaction so that a response
        # without counts (0/0) is not treated as a pass
        return total_count > 0 and success_count == total_count
    else:
        print(f"❌ HTTP Error: {response.status_code}")
        return False


def main():
    print("🧪 Testing cremote select element fixes")
    print("=" * 50)

    # Test 1: Multiple interactions select (works immediately)
    test1_passed = test_multiple_interactions_select()

    # Test 2: Complete form with mixed field types
    test2_passed = test_form_completion()

    print("\n" + "=" * 50)
    print("📋 Test Results Summary:")
    # Show a mark that matches the actual outcome of each test
    print(f"{'✅' if test1_passed else '❌'} Multiple interactions select: {'PASS' if test1_passed else 'FAIL'}")
    print(f"{'✅' if test2_passed else '❌'} Complete form filling: {'PASS' if test2_passed else 'FAIL'}")

    if test1_passed and test2_passed:
        print("\n🎉 All tests passed! Select elements are working correctly.")
        print("\n📝 Note: After server restart, these will also work:")
        print(" - Single web_interact_cremotemcp with 'select' action")
        print(" - Bulk form fill web_form_fill_bulk_cremotemcp with select detection")
    else:
        print("\n❌ Some tests failed. Check the cremote daemon status.")


if __name__ == "__main__":
    main()