cremote/feedback/CREMOTE_REALITY_CHECK.md
Josh at WLTechBlog 34a512e278 bump
2025-12-16 12:26:36 -07:00


# Cremote Extraction - Reality Check
## The Correct Understanding
### Cremote Boundary (Browser Only)
```
┌─────────────────────────────────────┐
│ CREMOTE TOOLS │
│ - Rendered HTML/CSS/JS only │
│ - Browser DOM access │
│ - Execute JavaScript in console │
│ - Download visible assets │
│ │
│ ❌ NO WordPress API access │
│ ❌ NO server-side data │
│ ❌ NO database access │
└─────────────────────────────────────┘
```
### WordPress MCP Boundary (Server Side)
```
┌─────────────────────────────────────┐
│ WORDPRESS MCP TOOLS │
│ - WordPress REST API │
│ - Database/post meta │
│ - Original shortcode/JSON │
│ - All builder settings │
│ │
│ ❌ NO access to external sites │
│ ❌ Requires WordPress credentials │
└─────────────────────────────────────┘
```
---
## What Cremote Can ACTUALLY Extract
### From Rendered HTML Classes
```javascript
// Class name → meaning, as found on rendered Divi elements
const DIVI_CLASSES = {
  // Section types
  et_section_regular: "regular section",
  et_section_specialty: "specialty section",
  et_pb_fullwidth_section: "fullwidth section",
  et_pb_section_parallax: "has parallax",
  // Column layouts
  et_pb_column_4_4: "full width",
  et_pb_column_1_2: "half width",
  et_pb_column_1_3: "one third",
  et_pb_column_2_3: "two thirds",
  // Module types
  et_pb_text: "text module",
  et_pb_image: "image module",
  et_pb_button: "button module",
  et_pb_blurb: "blurb module",
};
```
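The class names above can be turned into data with a few string checks. The following is a minimal sketch, not cremote's implementation: `parseColumnClass` and `describeElement` are hypothetical helper names, and the module list covers only the four modules named above.

```javascript
// Module classes listed above; a real page will use more than these four.
const KNOWN_MODULES = ["text", "image", "button", "blurb"];

// Turn "et_pb_column_1_2" into { layout: "1_2", fraction: 0.5 }.
// Returns null for any class without a numerator_denominator suffix.
function parseColumnClass(className) {
  const match = /^et_pb_column_(\d+)_(\d+)$/.exec(className);
  if (!match) return null;
  const numerator = Number(match[1]);
  const denominator = Number(match[2]);
  if (denominator === 0) return null;
  return {
    layout: `${numerator}_${denominator}`,
    fraction: numerator / denominator,
  };
}

// Summarize one element from the string value of its class attribute.
function describeElement(classAttr) {
  const classes = classAttr.split(/\s+/).filter(Boolean);
  const column =
    classes.map((name) => parseColumnClass(name)).find(Boolean) || null;
  const moduleType =
    KNOWN_MODULES.find((name) => classes.includes(`et_pb_${name}`)) || null;
  return {
    column,
    moduleType,
    hasParallax: classes.includes("et_pb_section_parallax"),
  };
}
```

In the browser, `describeElement(element.className)` would be the typical call; anything the helper does not recognize comes back as `null` rather than a guess.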
### From Computed Styles
```javascript
// Background colors
window.getComputedStyle(element).backgroundColor
// → "rgb(255, 255, 255)"
// Background images
window.getComputedStyle(element).backgroundImage
// → 'url("https://site.com/image.jpg")'
// → "linear-gradient(...), url(...)"
// Padding, margins, colors
window.getComputedStyle(element).padding
window.getComputedStyle(element).color
```
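Because a computed `backgroundImage` can mix gradients and one or more `url(...)` layers, the image URLs have to be picked out of the string. A small sketch of that step, assuming the value is passed in as a plain string (`extractBackgroundUrls` is a hypothetical helper name):

```javascript
// Pull every url(...) target out of a computed background-image value.
// Handles double-quoted, single-quoted, and unquoted forms; gradients are skipped.
function extractBackgroundUrls(backgroundImage) {
  if (!backgroundImage || backgroundImage === "none") return [];
  const urls = [];
  const pattern = /url\(\s*(?:"([^"]*)"|'([^']*)'|([^)\s]*))\s*\)/g;
  let match;
  while ((match = pattern.exec(backgroundImage)) !== null) {
    // Exactly one of the three capture groups is set per match.
    urls.push(match[1] ?? match[2] ?? match[3]);
  }
  return urls;
}
```

A typical call would be `extractBackgroundUrls(window.getComputedStyle(element).backgroundImage)`.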
### From DOM Content
```javascript
// Text content
element.innerHTML
element.textContent
// Image sources
img.src
img.alt
img.width
img.height
// Button URLs (assumes the button is rendered as an <a> element)
button.href
button.textContent
// Icon data
element.getAttribute('data-icon')
```
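The properties above differ by module type, so an extractor needs one small branch per type. A hedged sketch, where `extractContent` is a hypothetical helper and `el` is anything exposing the DOM property names used above (a real element in the browser, a plain object in a dry run):

```javascript
// Trimmed text of an element-like object; tolerates a missing textContent.
const textOf = (el) => (el.textContent || "").trim();

// Return the content worth keeping for one module, keyed by module type.
function extractContent(moduleType, el) {
  switch (moduleType) {
    case "image":
      return { src: el.src, alt: el.alt || "", width: el.width, height: el.height };
    case "button":
      return { href: el.href, label: textOf(el) };
    case "text":
      return { html: el.innerHTML, text: textOf(el) };
    default:
      // Unknown module: keep only the visible text.
      return { text: textOf(el) };
  }
}
```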
---
## What Cremote CANNOT Extract
### ❌ Builder Settings (Not in HTML)
- Animation settings (entrance, duration, delay)
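- Custom CSS IDs added in builder (a rendered `id` is visible, but not whether it came from the builder)
- Custom CSS classes added in builder (mixed in with Divi's generated classes)
- Module-specific IDs
- Z-index values set in builder (only the final computed value is visible)
- Border radius set in builder (only the final computed value is visible)
- Box shadows set in builder (only the final computed value is visible)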
### ❌ Responsive Settings (Not in Desktop HTML)
- Tablet-specific layouts
- Phone-specific layouts
- Responsive font sizes
- Responsive padding/margins
- Responsive visibility settings
### ❌ Original Divi Data (Server Side)
- Original shortcode
- Original JSON structure
- Post meta data
- Module settings stored in database
- Dynamic content sources (ACF fields)
### ❌ Complex Module Configurations
- Contact form field structure (only see rendered form)
- Gallery image IDs (only see rendered images)
- Slider settings (only see first slide)
- Blog module query parameters
- Social follow network configurations
---
## The Real Workflow
### What We Can Do
```
1. CREMOTE: Extract visible structure from rendered HTML
2. CREMOTE: Extract visible content (text, images, buttons)
3. CREMOTE: Extract computed styles (colors, backgrounds)
4. CREMOTE: Download images via browser
5. WORDPRESS MCP: Upload images to target site
6. WORDPRESS MCP: REBUILD page from scratch using extracted data
```
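The six steps above can be sketched as one orchestration function. All four dependencies are hypothetical stand-ins passed in by the caller: `extractPage` and `downloadImage` represent cremote-side tools, `uploadImage` and `createPage` represent WordPress MCP tools. None of these names come from either tool set.

```javascript
// Sketch of the extract-then-rebuild flow with injected, hypothetical tools.
async function rebuildFromExternalPage(url, deps) {
  const { extractPage, downloadImage, uploadImage, createPage } = deps;

  // Steps 1-3: structure, content, and computed styles from the rendered page.
  const extracted = await extractPage(url);

  // Steps 4-5: move each image from the source site to the target site,
  // remembering which uploaded media item replaces which source URL.
  const imageMap = {};
  for (const image of extracted.images) {
    const file = await downloadImage(image.url);
    imageMap[image.url] = await uploadImage(file);
  }

  // Step 6: rebuild from scratch, pointing at the newly uploaded images.
  const pageId = await createPage({ structure: extracted.structure, imageMap });
  return { pageId, imageMap };
}
```

Passing the tools in as arguments keeps the browser side and the WordPress side separate, which matches the boundary described earlier: nothing in this function reads server-side data from the source site.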
### What We CANNOT Do
```
❌ Extract original Divi shortcode/JSON
❌ Get exact builder settings
❌ Recreate responsive configurations
❌ Get animation settings
❌ Access any WordPress API data
```
---
## Corrected Tool Proposal
### Tool 1: `extract_divi_visual_structure_cremote`
**What it does:** Extract VISIBLE structure from rendered HTML
**Input:** URL
**Output:** Approximated structure based on CSS classes
**Accuracy:** 60-70% (approximation only)
```jsonc
{
  "sections": [
    {
      "type": "regular",                        // from .et_section_regular
      "hasParallax": true,                      // from .et_pb_section_parallax
      "backgroundColor": "rgb(255, 255, 255)",  // computed
      "backgroundImage": "url(...)",            // computed
      "rows": [
        {
          "columns": [
            {
              "type": "1_2",                    // from .et_pb_column_1_2
              "modules": [
                {
                  "type": "text",               // from .et_pb_text
                  "content": "<p>...</p>"       // innerHTML
                }
              ]
            }
          ]
        }
      ]
    }
  ]
}
```
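A structure of this shape could be produced by walking the rendered element tree and grouping by marker classes. The sketch below is one possible approach, not the tool's implementation. It assumes `et_pb_section`, `et_pb_row`, `et_pb_column`, and `et_pb_module` mark the four levels, treats a section as "regular" when no other type class is present, and works on any tree of objects with `className`, `children`, and `innerHTML` (specialty-section nesting is not handled).

```javascript
const MODULES = ["text", "image", "button", "blurb"];

const classesOf = (node) => String(node.className || "").split(/\s+/);
const hasClass = (node, name) => classesOf(node).includes(name);

// Collect descendants carrying `name`, without descending into a match,
// so nested structures are not counted twice.
function findAll(node, name) {
  const found = [];
  for (const child of node.children || []) {
    if (hasClass(child, name)) found.push(child);
    else found.push(...findAll(child, name));
  }
  return found;
}

function sectionType(node) {
  if (hasClass(node, "et_pb_fullwidth_section")) return "fullwidth";
  if (hasClass(node, "et_section_specialty")) return "specialty";
  return "regular"; // fallback when neither marker class is present
}

function columnType(node) {
  const cls = classesOf(node).find((c) => /^et_pb_column_\d+_\d+$/.test(c));
  return cls ? cls.replace("et_pb_column_", "") : "unknown";
}

function moduleType(node) {
  return MODULES.find((m) => hasClass(node, `et_pb_${m}`)) || "unknown";
}

function buildStructure(root) {
  return {
    sections: findAll(root, "et_pb_section").map((section) => ({
      type: sectionType(section),
      hasParallax: hasClass(section, "et_pb_section_parallax"),
      rows: findAll(section, "et_pb_row").map((row) => ({
        columns: findAll(row, "et_pb_column").map((column) => ({
          type: columnType(column),
          modules: findAll(column, "et_pb_module").map((mod) => ({
            type: moduleType(mod),
            content: mod.innerHTML,
          })),
        })),
      })),
    })),
  };
}
```

Computed values such as `backgroundColor` are left out here because they need a live browser; they would be read per section with `window.getComputedStyle` and merged into the same objects.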
### Tool 2: `extract_divi_images_cremote`
**What it does:** Extract all visible images
**Input:** URL
**Output:** Array of image URLs with metadata
**Accuracy:** 100% (for visible images)
```json
{
  "images": [
    {
      "url": "https://site.com/image.jpg",
      "alt": "Image description",
      "width": 1920,
      "height": 1080,
      "context": "section 0, row 0, column 0, module 2"
    }
  ]
}
```
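As an illustration of how such records might be assembled, the sketch below takes images already located in the page, each paired with its position, and emits the output shape shown above. `buildImageList` and the `{ img, path }` input shape are hypothetical; skipping repeated URLs is a design choice of this sketch, not a stated requirement.

```javascript
// Build the "images" output from located <img>-like objects.
// Each entry: { img: { src, alt, width, height }, path: { section, row, column, module } }
function buildImageList(found) {
  const seen = new Set();
  const images = [];
  for (const { img, path } of found) {
    // Skip images without a source and URLs that were already recorded.
    if (!img.src || seen.has(img.src)) continue;
    seen.add(img.src);
    images.push({
      url: img.src,
      alt: img.alt || "",
      width: img.width,
      height: img.height,
      context: `section ${path.section}, row ${path.row}, column ${path.column}, module ${path.module}`,
    });
  }
  return { images };
}
```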
### Tool 3: `rebuild_page_from_visual_data_wordpress`
**What it does:** REBUILD page on target site using extracted visual data
**Input:** Extracted structure + target site
**Output:** New page ID
**Accuracy:** 60-70% (missing builder settings)
**Important:** This tool REBUILDS the page from scratch; it does not recreate the original exactly.
---
## Key Limitations
### 1. No Original Shortcode/JSON
We cannot extract the original Divi shortcode or JSON. We can only approximate the structure from CSS classes.
### 2. No Builder Settings
We cannot get animation settings, custom CSS IDs, responsive configs, or any builder-specific settings.
### 3. Approximation Only
The extracted structure is an APPROXIMATION based on visible HTML. It will not be pixel-perfect.
### 4. Manual Work Required
After rebuilding, the user must manually:
- Add animations
- Configure responsive settings
- Add custom CSS
- Configure complex modules (forms, sliders)
- Adjust spacing/styling to match
---
## Realistic Expectations
### What We Can Achieve
- ✅ Extract basic structure (sections, rows, columns)
- ✅ Extract content (text, images, buttons)
- ✅ Extract visible styling (colors, backgrounds)
- ✅ Download and upload images
- ✅ REBUILD page with basic structure
### What We Cannot Achieve
- ❌ Exact recreation of original page
- ❌ Builder settings and configurations
- ❌ Responsive layouts
- ❌ Animations and effects
- ❌ Complex module configurations
### Accuracy Estimate
- **Structure:** 60-70% (approximation from classes)
- **Content:** 90-100% (visible content)
- **Styling:** 50-60% (computed styles only)
- **Overall:** 60-70% (requires significant manual work)
---
## Recommendation
### Should We Build These Tools?
**YES, but with correct expectations:**
1. These tools enable BASIC page recreation from external sites
2. They provide a STARTING POINT, not a finished product
3. They save time on manual content extraction
4. Expect roughly 30-40% of the final page to need manual work after extraction
### Use Cases
- ✅ Competitive analysis (get basic structure)
- ✅ Quick prototyping (approximate layout)
- ✅ Content extraction (text, images)
- ❌ Production migrations (too inaccurate)
- ❌ Exact recreations (impossible without API)
### Alternative Approach
For sites you control, ALWAYS use WordPress MCP tools directly. Only use cremote for external sites where you have no other option.
---
## Corrected Conclusion
**Can we extract Divi pages with cremote?**
- YES, but only APPROXIMATE structure from rendered HTML
- NO original shortcode/JSON
- NO builder settings
- 60-70% accuracy
- Requires significant manual work
**Do we need additional tools?**
- YES, if you need to analyze external sites
- NO, if you only work with sites you control (use WordPress MCP)
**Should we build them?**
- YES, for competitive analysis and basic extraction
- Set correct expectations: approximation, not recreation