Josh at WLTechBlog
2025-12-16 12:26:36 -07:00
parent 051b912122
commit 34a512e278
9 changed files with 2450 additions and 0 deletions

@@ -0,0 +1,281 @@
# Divi Extraction Tools - User Guide
## Overview
The Divi extraction tools enable you to extract page structure, images, and content from any Divi-powered website using browser automation. These tools are designed for competitive analysis, external site recreation, and quick prototyping.
## Important Limitations
⚠️ **These tools read the rendered HTML only, so expect results that are roughly 60-70% accurate.**
### What You CAN Extract
- ✅ Section, row, and column structure (from CSS classes)
- ✅ Module types and visible content
- ✅ Images with metadata (URLs, dimensions, alt text)
- ✅ Background colors and images (computed styles)
- ✅ Text content and button URLs
### What You CANNOT Extract
- ❌ Original Divi shortcode/JSON
- ❌ Builder settings (animations, responsive settings, custom CSS)
- ❌ Advanced module configurations
- ❌ Dynamic content sources (ACF fields)
- ❌ Exact responsive layouts
## Tools Available
### 1. web_extract_divi_structure_cremotemcp
Extracts the complete page structure including sections, rows, columns, and modules.
**Parameters:**
- `url` (optional): URL to navigate to before extraction
- `tab` (optional): Tab ID to use (uses current tab if not specified)
- `clear_cache` (optional): Clear browser cache before extraction (default: false)
- `timeout` (optional): Timeout in seconds (default: 30)
**Example:**
```javascript
{
  "tool": "web_extract_divi_structure_cremotemcp",
  "arguments": {
    "url": "https://example.com/divi-page",
    "clear_cache": true,
    "timeout": 30
  }
}
```
**Output Structure:**
```json
{
  "url": "https://example.com/page",
  "sections": [
    {
      "type": "regular",
      "has_parallax": false,
      "background_color": "rgb(255,255,255)",
      "background_image": "url(...)",
      "background_style": "image",
      "rows": [
        {
          "column_structure": "1_2,1_2",
          "columns": [
            {
              "type": "1_2",
              "modules": [
                {
                  "type": "text",
                  "content": "<p>...</p>",
                  "attributes": {},
                  "css_classes": ["et_pb_text", "et_pb_module"]
                }
              ],
              "css_classes": ["et_pb_column", "et_pb_column_1_2"]
            }
          ],
          "css_classes": ["et_pb_row"]
        }
      ],
      "css_classes": ["et_pb_section"]
    }
  ],
  "metadata": {
    "extraction_date": "2025-01-16T...",
    "accuracy": "60-70% (approximation from CSS classes)",
    "limitations": "Cannot access original Divi shortcode/JSON or builder settings"
  }
}
```
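A result shaped like the sample above can be condensed into a quick overview before any recreation work starts. The following is a minimal sketch, assuming the field names shown in the sample (`sections`, `rows`, `columns`, `modules`, `type`, `column_structure`); the helper name `summarizeStructure` is hypothetical and not part of the tools.

```javascript
// Hypothetical helper: condense a structure-extraction result into counts.
// Assumes the field names from the sample output; missing arrays count as empty.
function summarizeStructure(result) {
  const summary = { sections: 0, rows: 0, columnLayouts: [], modulesByType: {} };
  for (const section of result.sections ?? []) {
    summary.sections += 1;
    for (const row of section.rows ?? []) {
      summary.rows += 1;
      // Record the layout string as-is, e.g. "1_2,1_2" for two half-width columns.
      summary.columnLayouts.push(row.column_structure ?? "unknown");
      for (const column of row.columns ?? []) {
        for (const mod of column.modules ?? []) {
          const type = mod.type ?? "unknown";
          summary.modulesByType[type] = (summary.modulesByType[type] ?? 0) + 1;
        }
      }
    }
  }
  return summary;
}
```

Calling `summarizeStructure(result)` on a parsed response returns the section and row totals, the list of column layouts, and a per-type module count, which is usually enough to judge how much manual rebuilding a page will need.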
### 2. web_extract_divi_images_cremotemcp
Extracts all images from the page including regular images and background images.
**Parameters:**
- `url` (optional): URL to navigate to before extraction
- `tab` (optional): Tab ID to use
- `clear_cache` (optional): Clear browser cache (default: false)
- `timeout` (optional): Timeout in seconds (default: 30)
**Example:**
```javascript
{
  "tool": "web_extract_divi_images_cremotemcp",
  "arguments": {
    "url": "https://example.com/divi-page",
    "timeout": 30
  }
}
```
**Output Structure:**
```json
[
  {
    "url": "https://example.com/image.jpg",
    "alt": "Image description",
    "title": "Image title",
    "width": 1920,
    "height": 1080,
    "context": "image 0",
    "is_background": false
  },
  {
    "url": "https://example.com/bg.jpg",
    "alt": "",
    "title": "",
    "width": 0,
    "height": 0,
    "context": "background 1",
    "is_background": true
  }
]
```
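In the sample above, the background entry carries no alt text and reports zero dimensions, so it can help to handle the two kinds of image separately. This is a small sketch, assuming the `url` and `is_background` fields shown in the sample; `partitionImages` is a hypothetical helper name.

```javascript
// Hypothetical helper: split extracted images into regular and background
// lists, skipping entries without a URL and dropping repeated URLs.
function partitionImages(images) {
  const seen = new Set();
  const result = { regular: [], backgrounds: [] };
  for (const image of images) {
    if (!image.url || seen.has(image.url)) continue;
    seen.add(image.url);
    if (image.is_background) {
      result.backgrounds.push(image);
    } else {
      result.regular.push(image);
    }
  }
  return result;
}
```

The first occurrence of each URL wins, so an image that appears both inline and as a background is filed under whichever entry came first in the array.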
### 3. web_extract_divi_content_cremotemcp
Extracts all module content and images with comprehensive metadata.
**Parameters:**
- `url` (optional): URL to navigate to before extraction
- `tab` (optional): Tab ID to use
- `clear_cache` (optional): Clear browser cache (default: false)
- `timeout` (optional): Timeout in seconds (default: 30)
**Example:**
```javascript
{
  "tool": "web_extract_divi_content_cremotemcp",
  "arguments": {
    "url": "https://example.com/divi-page",
    "timeout": 30
  }
}
```
**Output Structure:**
```json
{
  "url": "https://example.com/page",
  "modules": [
    {
      "type": "text",
      "content": "<p>Text content</p>",
      "attributes": {},
      "css_classes": ["et_pb_text", "et_pb_module"]
    },
    {
      "type": "button",
      "content": "Click Here",
      "attributes": {
        "href": "https://example.com/link",
        "target": "_blank"
      },
      "css_classes": ["et_pb_button", "et_pb_module"]
    }
  ],
  "images": [...],
  "metadata": {
    "extraction_date": "2025-01-16T...",
    "total_modules": 15,
    "total_images": 8
  }
}
```
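One common use of the content output is to list every button and where it points. The sketch below assumes button modules look like the sample above, with the label in `content` and the link in `attributes.href`; `collectButtonLinks` is a hypothetical helper name.

```javascript
// Hypothetical helper: list the label and target of each button module.
// Assumes the "button" module shape from the sample output.
function collectButtonLinks(result) {
  return (result.modules ?? [])
    .filter((mod) => mod.type === "button" && mod.attributes?.href)
    .map((mod) => ({
      label: mod.content,
      href: mod.attributes.href,
      opensNewTab: mod.attributes.target === "_blank",
    }));
}
```

Buttons without an `href` are skipped rather than reported with an empty link.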
## Workflow Examples
### Extract Complete Page Data
```javascript
// 1. Navigate to page
{
  "tool": "web_navigate_cremotemcp",
  "arguments": {
    "url": "https://example.com/divi-page",
    "clear_cache": true
  }
}
// 2. Extract structure
{
  "tool": "web_extract_divi_structure_cremotemcp",
  "arguments": {
    "timeout": 30
  }
}
// 3. Extract images
{
  "tool": "web_extract_divi_images_cremotemcp",
  "arguments": {
    "timeout": 30
  }
}
// 4. Extract content
{
  "tool": "web_extract_divi_content_cremotemcp",
  "arguments": {
    "timeout": 30
  }
}
```
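The four requests above can also be generated for any URL. This is a sketch only: `buildExtractionPlan` is a hypothetical helper, and how each request is actually sent depends on your MCP client, which is not shown here.

```javascript
// Hypothetical helper: build the navigate-then-extract request sequence
// for one page. The tool names and arguments mirror the workflow above.
function buildExtractionPlan(url, timeout = 30) {
  const extractors = [
    "web_extract_divi_structure_cremotemcp",
    "web_extract_divi_images_cremotemcp",
    "web_extract_divi_content_cremotemcp",
  ];
  return [
    // Navigate once; the extractors then run against the current tab.
    { tool: "web_navigate_cremotemcp", arguments: { url, clear_cache: true } },
    ...extractors.map((tool) => ({ tool, arguments: { timeout } })),
  ];
}
```

Navigating once and omitting `url` from the extraction calls avoids reloading the page three times.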
### Quick Single-Call Extraction
```javascript
// Extract structure with automatic navigation
{
  "tool": "web_extract_divi_structure_cremotemcp",
  "arguments": {
    "url": "https://example.com/divi-page",
    "clear_cache": true,
    "timeout": 30
  }
}
```
## Module Types Detected
The tools can identify the following Divi module types:
- `text` - Text modules
- `image` - Image modules
- `button` - Button modules
- `blurb` - Blurb modules
- `cta` - Call-to-action modules
- `slider` - Slider modules
- `gallery` - Gallery modules
- `video` - Video modules
- `unknown` - Unrecognized modules
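The guide does not describe how the tools map CSS classes to these types. As an illustration only, the sketch below assumes a type can be read by stripping the `et_pb_` prefix from a class name, which matches the `et_pb_text` and `et_pb_button` classes in the sample outputs; real Divi markup may name some modules differently, and `guessModuleType` is a hypothetical helper.

```javascript
// Assumption: a module's type is the part of a class name after "et_pb_".
// This is an illustrative guess, not the tools' actual detection logic.
const KNOWN_MODULE_TYPES = new Set([
  "text", "image", "button", "blurb", "cta", "slider", "gallery", "video",
]);

function guessModuleType(cssClasses) {
  const prefix = "et_pb_";
  for (const cls of cssClasses) {
    if (!cls.startsWith(prefix)) continue;
    const candidate = cls.slice(prefix.length);
    if (KNOWN_MODULE_TYPES.has(candidate)) return candidate;
  }
  // Generic classes such as "et_pb_module" fall through to here.
  return "unknown";
}
```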
## Best Practices
1. **Raise the `timeout`** above the 30-second default for slow-loading pages
2. **Clear the cache** (`clear_cache: true`) when extracting from a new site
3. **Use structure extraction first** to understand page layout
4. **Extract images separately** if you need detailed image metadata
5. **Combine with WordPress MCP tools** for page recreation on your own sites
## Troubleshooting
### Timeout Errors
- Increase the `timeout` parameter
- Check if the page is loading slowly
- Verify the URL is accessible
### Empty Results
- Verify the page uses Divi (check for `et_pb_` CSS classes)
- Check that JavaScript is enabled in the browser
- Try navigating to the page first with `web_navigate_cremotemcp`
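The first check can be automated if you already have the page's HTML as a string. This is a minimal sketch using a regular expression; `looksLikeDivi` is a hypothetical helper, and a match only shows that `et_pb_` classes are present, not that extraction will succeed.

```javascript
// Hypothetical helper: true if any class attribute in the HTML contains
// a class name starting with "et_pb_".
function looksLikeDivi(html) {
  return /class\s*=\s*["'][^"']*\bet_pb_/.test(html);
}
```

For example, `looksLikeDivi('<div class="et_pb_section et_pb_section_0">')` returns `true`, while markup with no `et_pb_` classes returns `false`.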
### Incomplete Data
- This is expected: the tools are only about 60-70% accurate
- Manual refinement will be required
- Use the output as a starting point, not an exact recreation
## See Also
- [Implementation Summary](../DIVI_EXTRACTION_IMPLEMENTATION.md)
- [Implementation Plan](../IMPLEMENTATION_PLAN.md)
- [Feedback Analysis](../feedback/)