# Divi Extraction Tools - User Guide

## Overview

The Divi extraction tools enable you to extract page structure, images, and content from any Divi-powered website using browser automation. These tools are designed for competitive analysis, external site recreation, and quick prototyping.

## Important Limitations

⚠️ **These tools read the rendered HTML only, so results are an approximation (roughly 60-70% accuracy).**

### What You CAN Extract
- ✅ Section, row, and column structure (from CSS classes)
- ✅ Module types and visible content
- ✅ Images with metadata (URLs, dimensions, alt text)
- ✅ Background colors and images (computed styles)
- ✅ Text content and button URLs

### What You CANNOT Extract
- ❌ Original Divi shortcode/JSON
- ❌ Builder settings (animations, responsive, custom CSS)
- ❌ Advanced module configurations
- ❌ Dynamic content sources (ACF fields)
- ❌ Exact responsive layouts

## Tools Available

### 1. web_extract_divi_structure_cremotemcp

Extracts the complete page structure including sections, rows, columns, and modules.

**Parameters:**
- `url` (optional): URL to navigate to before extraction
- `tab` (optional): Tab ID to use (uses current tab if not specified)
- `clear_cache` (optional): Clear browser cache before extraction (default: false)
- `timeout` (optional): Timeout in seconds (default: 30)

**Example:**
```javascript
{
  "tool": "web_extract_divi_structure_cremotemcp",
  "arguments": {
    "url": "https://example.com/divi-page",
    "clear_cache": true,
    "timeout": 30
  }
}
```

**Output Structure:**
```json
{
  "url": "https://example.com/page",
  "sections": [
    {
      "type": "regular",
      "has_parallax": false,
      "background_color": "rgb(255,255,255)",
      "background_image": "url(...)",
      "background_style": "image",
      "rows": [
        {
          "column_structure": "1_2,1_2",
          "columns": [
            {
              "type": "1_2",
              "modules": [
                {
                  "type": "text",
                  "content": "<p>...</p>",
                  "attributes": {},
                  "css_classes": ["et_pb_text", "et_pb_module"]
                }
              ],
              "css_classes": ["et_pb_column", "et_pb_column_1_2"]
            }
          ],
          "css_classes": ["et_pb_row"]
        }
      ],
      "css_classes": ["et_pb_section"]
    }
  ],
  "metadata": {
    "extraction_date": "2025-01-16T...",
    "accuracy": "60-70% (approximation from CSS classes)",
    "limitations": "Cannot access original Divi shortcode/JSON or builder settings"
  }
}
```
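
The nested output above can be flattened into a quick layout summary. The following is a minimal sketch, not part of the tools themselves: `column_widths` and `summarize_structure` are hypothetical helper names, and the sketch assumes that column strings such as `1_2` always follow a `numerator_denominator` pattern.

```python
from fractions import Fraction


def column_widths(column_structure):
    """Turn a string such as "1_2,1_2" into a list of Fractions.

    Assumption: every entry has the form "<numerator>_<denominator>".
    """
    widths = []
    for part in column_structure.split(","):
        numerator, denominator = part.strip().split("_")
        widths.append(Fraction(int(numerator), int(denominator)))
    return widths


def summarize_structure(result):
    """Return one line per row: section index, row index, widths, module types."""
    lines = []
    for s_idx, section in enumerate(result.get("sections", [])):
        for r_idx, row in enumerate(section.get("rows", [])):
            widths = column_widths(row["column_structure"])
            module_types = [
                module["type"]
                for column in row.get("columns", [])
                for module in column.get("modules", [])
            ]
            lines.append(
                f"section {s_idx}, row {r_idx}: "
                f"{' + '.join(str(w) for w in widths)} -> {', '.join(module_types)}"
            )
    return lines
```

Calling `summarize_structure` on a parsed structure result gives a compact overview of the page before you look at individual modules.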

### 2. web_extract_divi_images_cremotemcp

Extracts all images from the page including regular images and background images.

**Parameters:**
- `url` (optional): URL to navigate to before extraction
- `tab` (optional): Tab ID to use
- `clear_cache` (optional): Clear browser cache (default: false)
- `timeout` (optional): Timeout in seconds (default: 30)

**Example:**
```javascript
{
  "tool": "web_extract_divi_images_cremotemcp",
  "arguments": {
    "url": "https://example.com/divi-page",
    "timeout": 30
  }
}
```

**Output Structure:**
```json
[
  {
    "url": "https://example.com/image.jpg",
    "alt": "Image description",
    "title": "Image title",
    "width": 1920,
    "height": 1080,
    "context": "image 0",
    "is_background": false
  },
  {
    "url": "https://example.com/bg.jpg",
    "alt": "",
    "title": "",
    "width": 0,
    "height": 0,
    "context": "background 1",
    "is_background": true
  }
]
```
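
In the sample output above, the background image reports a width and height of 0. Assuming that pattern holds for real results, a small post-processing step can separate regular images from backgrounds and list the entries whose dimensions still need to be looked up. `split_images` is a hypothetical helper, sketched here for illustration only.

```python
def split_images(images):
    """Separate regular images from backgrounds and flag missing dimensions."""
    regular = [img for img in images if not img.get("is_background")]
    backgrounds = [img for img in images if img.get("is_background")]
    # Entries with a zero width or height carry no usable size information.
    missing_size = [
        img["url"]
        for img in images
        if img.get("width", 0) == 0 or img.get("height", 0) == 0
    ]
    return {
        "regular": regular,
        "backgrounds": backgrounds,
        "missing_size": missing_size,
    }
```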

### 3. web_extract_divi_content_cremotemcp

Extracts all module content and images with comprehensive metadata.

**Parameters:**
- `url` (optional): URL to navigate to before extraction
- `tab` (optional): Tab ID to use
- `clear_cache` (optional): Clear browser cache (default: false)
- `timeout` (optional): Timeout in seconds (default: 30)

**Example:**
```javascript
{
  "tool": "web_extract_divi_content_cremotemcp",
  "arguments": {
    "url": "https://example.com/divi-page",
    "timeout": 30
  }
}
```

**Output Structure:**
```json
{
  "url": "https://example.com/page",
  "modules": [
    {
      "type": "text",
      "content": "<p>Text content</p>",
      "attributes": {},
      "css_classes": ["et_pb_text", "et_pb_module"]
    },
    {
      "type": "button",
      "content": "Click Here",
      "attributes": {
        "href": "https://example.com/link",
        "target": "_blank"
      },
      "css_classes": ["et_pb_button", "et_pb_module"]
    }
  ],
  "images": [...],
  "metadata": {
    "extraction_date": "2025-01-16T...",
    "total_modules": 15,
    "total_images": 8
  }
}
```
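
Since button URLs are among the things the tools can recover, one common follow-up is to pull every button label and link out of a content result. The sketch below assumes the result has already been parsed into a Python dict shaped like the output above; `button_links` is a hypothetical helper name.

```python
def button_links(content_result):
    """Return (label, href) pairs for button modules that carry an href."""
    links = []
    for module in content_result.get("modules", []):
        href = module.get("attributes", {}).get("href")
        if module.get("type") == "button" and href:
            links.append((module.get("content", ""), href))
    return links
```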

## Workflow Examples

### Extract Complete Page Data
```javascript
// 1. Navigate to page
{
  "tool": "web_navigate_cremotemcp",
  "arguments": {
    "url": "https://example.com/divi-page",
    "clear_cache": true
  }
}

// 2. Extract structure
{
  "tool": "web_extract_divi_structure_cremotemcp",
  "arguments": {
    "timeout": 30
  }
}

// 3. Extract images
{
  "tool": "web_extract_divi_images_cremotemcp",
  "arguments": {
    "timeout": 30
  }
}

// 4. Extract content
{
  "tool": "web_extract_divi_content_cremotemcp",
  "arguments": {
    "timeout": 30
  }
}
```
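
If you drive these calls from a script, the four steps above can be generated rather than written by hand. This is only a sketch: `build_extraction_plan` and `run_plan` are hypothetical helpers, and `call_tool(tool, arguments)` stands in for whatever function your MCP client actually provides to invoke a tool.

```python
def build_extraction_plan(url, timeout=30, clear_cache=True):
    """Return the four tool calls from the workflow above as plain dicts."""
    plan = [
        {
            "tool": "web_navigate_cremotemcp",
            "arguments": {"url": url, "clear_cache": clear_cache},
        }
    ]
    # The extraction steps reuse the tab that the navigation step loaded.
    for name in ("structure", "images", "content"):
        plan.append(
            {
                "tool": f"web_extract_divi_{name}_cremotemcp",
                "arguments": {"timeout": timeout},
            }
        )
    return plan


def run_plan(plan, call_tool):
    """Run each step in order; `call_tool` is a hypothetical client hook."""
    return {step["tool"]: call_tool(step["tool"], step["arguments"]) for step in plan}
```

Keeping the plan as data makes it easy to log or inspect the exact payloads before anything is sent.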

### Quick Single-Call Extraction
```javascript
// Extract structure with automatic navigation
{
  "tool": "web_extract_divi_structure_cremotemcp",
  "arguments": {
    "url": "https://example.com/divi-page",
    "clear_cache": true,
    "timeout": 30
  }
}
```

## Module Types Detected

The tools can identify the following Divi module types:
- `text` - Text modules
- `image` - Image modules
- `button` - Button modules
- `blurb` - Blurb modules
- `cta` - Call-to-action modules
- `slider` - Slider modules
- `gallery` - Gallery modules
- `video` - Video modules
- `unknown` - Unrecognized modules
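
To see how much of a page falls into the `unknown` bucket, you can tally module types from a content result. A minimal sketch, assuming the result is a parsed dict with a `modules` list as in the content tool's output; both function names are hypothetical.

```python
from collections import Counter


def module_type_counts(content_result):
    """Count modules per type; a module without a type is counted as "unknown"."""
    return Counter(
        module.get("type", "unknown") for module in content_result.get("modules", [])
    )


def unknown_modules(content_result):
    """Return the CSS class lists of modules the tools could not classify."""
    return [
        module.get("css_classes", [])
        for module in content_result.get("modules", [])
        if module.get("type", "unknown") == "unknown"
    ]
```

The class lists returned by `unknown_modules` are a reasonable place to start when refining unrecognized modules by hand.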

## Best Practices

1. **Always set appropriate timeouts** for slow-loading pages
2. **Clear cache** when extracting from a new site
3. **Use structure extraction first** to understand page layout
4. **Extract images separately** if you need detailed image metadata
5. **Combine with WordPress MCP tools** for page recreation on your own sites

## Troubleshooting

### Timeout Errors
- Increase the `timeout` parameter
- Check if the page is loading slowly
- Verify the URL is accessible
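
Raising the timeout can be automated by retrying with progressively larger values. The sketch below rests on two assumptions that this guide does not confirm: `call_tool(tool, arguments)` is a hypothetical hook into your MCP client, and it raises Python's built-in `TimeoutError` when a tool call times out. Adapt the exception type to whatever your client really reports.

```python
def extract_with_retry(call_tool, tool, url, timeouts=(30, 60, 120)):
    """Try the same extraction with progressively longer timeouts.

    Assumption: `call_tool` raises TimeoutError when the tool times out.
    """
    last_error = None
    for timeout in timeouts:
        try:
            return call_tool(tool, {"url": url, "timeout": timeout})
        except TimeoutError as error:
            last_error = error
    raise TimeoutError(f"{tool} timed out after {len(timeouts)} attempts") from last_error
```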

### Empty Results
- Verify the page uses Divi (check for `et_pb_` CSS classes)
- Check that JavaScript is enabled in the browser
- Try navigating to the page first with `web_navigate_cremotemcp`
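
The first check can be scripted if you already have the page's HTML as a string (how you obtain it is up to you and is not covered here). This is a rough sketch: `uses_divi` is a hypothetical helper, and a regular expression is only a quick heuristic, not a full HTML parser.

```python
import re

# Matches a class attribute whose value contains a token starting with "et_pb_".
DIVI_CLASS = re.compile(r'class\s*=\s*["\'][^"\']*\bet_pb_[\w-]+', re.IGNORECASE)


def uses_divi(html):
    """Return True if the markup contains at least one et_pb_ CSS class."""
    return DIVI_CLASS.search(html) is not None
```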

### Incomplete Data
- This is expected: the tools reach roughly 60-70% accuracy
- Manual refinement will be required
- Use the output as a starting point, not an exact recreation

## See Also

- [Implementation Summary](../DIVI_EXTRACTION_IMPLEMENTATION.md)
- [Implementation Plan](../IMPLEMENTATION_PLAN.md)
- [Feedback Analysis](../feedback/)