bump
281
docs/DIVI_EXTRACTION_TOOLS.md
Normal file
@@ -0,0 +1,281 @@
# Divi Extraction Tools - User Guide

## Overview

The Divi extraction tools enable you to extract page structure, images, and content from any Divi-powered website using browser automation. These tools are designed for competitive analysis, external site recreation, and quick prototyping.

## Important Limitations

⚠️ **These tools extract from rendered HTML only (60-70% accuracy)**

### What You CAN Extract
- ✅ Section, row, and column structure (from CSS classes)
- ✅ Module types and visible content
- ✅ Images with metadata (URLs, dimensions, alt text)
- ✅ Background colors and images (computed styles)
- ✅ Text content and button URLs
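
To illustrate why structure can be recovered from CSS classes alone, here is a minimal sketch that reads a column's layout type from its class list. The helper name `columnTypeFromClasses` is hypothetical, and the `et_pb_column_<n>_<m>` pattern is an assumption based on the sample output in this guide; it is not the tools' actual implementation.

```javascript
// Hypothetical helper (not part of the tools): derive a column layout type
// such as "1_2" from a class like "et_pb_column_1_2".
function columnTypeFromClasses(cssClasses) {
  for (const cls of cssClasses) {
    // Assumed pattern: et_pb_column_<numerator>_<denominator>
    const match = /^et_pb_column_(\d+_\d+)$/.exec(cls);
    if (match) return match[1];
  }
  return null; // no layout class found
}

console.log(columnTypeFromClasses(["et_pb_column", "et_pb_column_1_2"]));
```

Anything that leaves no trace in the rendered markup, such as the builder settings listed in the next section, cannot be recovered this way.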

### What You CANNOT Extract
- ❌ Original Divi shortcode/JSON
- ❌ Builder settings (animations, responsive settings, custom CSS)
- ❌ Advanced module configurations
- ❌ Dynamic content sources (e.g., ACF fields)
- ❌ Exact responsive layouts

## Tools Available

### 1. web_extract_divi_structure_cremotemcp

Extracts the complete page structure including sections, rows, columns, and modules.

**Parameters:**
- `url` (optional): URL to navigate to before extraction
- `tab` (optional): Tab ID to use (uses current tab if not specified)
- `clear_cache` (optional): Clear browser cache before extraction (default: false)
- `timeout` (optional): Timeout in seconds (default: 30)
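
Since all four parameters are optional, a caller may want to fill in the documented defaults explicitly. The sketch below builds a request object in the shape used by the examples in this guide; `buildExtractionRequest` is a hypothetical helper, and the validation rule for `timeout` is an assumption.

```javascript
// Hypothetical helper: build a tool request with the documented defaults
// (clear_cache: false, timeout: 30). `url` and `tab` stay unset unless given.
function buildExtractionRequest(tool, args = {}) {
  const merged = { clear_cache: false, timeout: 30, ...args };
  // Assumed sanity check: the timeout is a positive number of seconds.
  if (typeof merged.timeout !== "number" || merged.timeout <= 0) {
    throw new Error("timeout must be a positive number of seconds");
  }
  return { tool, arguments: merged };
}

const request = buildExtractionRequest("web_extract_divi_structure_cremotemcp", {
  url: "https://example.com/divi-page",
});
console.log(JSON.stringify(request, null, 2));
```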

**Example:**
```javascript
{
  "tool": "web_extract_divi_structure_cremotemcp",
  "arguments": {
    "url": "https://example.com/divi-page",
    "clear_cache": true,
    "timeout": 30
  }
}
```

**Output Structure:**
```json
{
  "url": "https://example.com/page",
  "sections": [
    {
      "type": "regular",
      "has_parallax": false,
      "background_color": "rgb(255,255,255)",
      "background_image": "url(...)",
      "background_style": "image",
      "rows": [
        {
          "column_structure": "1_2,1_2",
          "columns": [
            {
              "type": "1_2",
              "modules": [
                {
                  "type": "text",
                  "content": "<p>...</p>",
                  "attributes": {},
                  "css_classes": ["et_pb_text", "et_pb_module"]
                }
              ],
              "css_classes": ["et_pb_column", "et_pb_column_1_2"]
            }
          ],
          "css_classes": ["et_pb_row"]
        }
      ],
      "css_classes": ["et_pb_section"]
    }
  ],
  "metadata": {
    "extraction_date": "2025-01-16T...",
    "accuracy": "60-70% (approximation from CSS classes)",
    "limitations": "Cannot access original Divi shortcode/JSON or builder settings"
  }
}
```
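
The nesting above (sections, rows, columns, modules) can be walked with plain loops. As a minimal sketch, assuming the result has exactly the shape shown, the hypothetical helper `countModulesByType` tallies how many modules of each type a page contains:

```javascript
// Hypothetical helper: walk sections -> rows -> columns -> modules
// and count modules per type. Missing arrays are treated as empty.
function countModulesByType(structure) {
  const counts = {};
  for (const section of structure.sections || []) {
    for (const row of section.rows || []) {
      for (const column of row.columns || []) {
        for (const mod of column.modules || []) {
          counts[mod.type] = (counts[mod.type] || 0) + 1;
        }
      }
    }
  }
  return counts;
}

// Trimmed-down sample in the shape of the output above.
const sample = {
  sections: [
    { rows: [{ columns: [{ modules: [{ type: "text" }, { type: "button" }] }, { modules: [{ type: "text" }] }] }] },
  ],
};
console.log(countModulesByType(sample));
```

A tally like this gives a quick overview of a page before deciding which modules need manual refinement.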

### 2. web_extract_divi_images_cremotemcp

Extracts all images from the page including regular images and background images.

**Parameters:**
- `url` (optional): URL to navigate to before extraction
- `tab` (optional): Tab ID to use
- `clear_cache` (optional): Clear browser cache (default: false)
- `timeout` (optional): Timeout in seconds (default: 30)

**Example:**
```javascript
{
  "tool": "web_extract_divi_images_cremotemcp",
  "arguments": {
    "url": "https://example.com/divi-page",
    "timeout": 30
  }
}
```

**Output Structure:**
```json
[
  {
    "url": "https://example.com/image.jpg",
    "alt": "Image description",
    "title": "Image title",
    "width": 1920,
    "height": 1080,
    "context": "image 0",
    "is_background": false
  },
  {
    "url": "https://example.com/bg.jpg",
    "alt": "",
    "title": "",
    "width": 0,
    "height": 0,
    "context": "background 1",
    "is_background": true
  }
]
```
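
Because regular and background images arrive in one flat array, a consumer will often separate them first. The following sketch assumes the array shape shown above; `splitImages` is a hypothetical helper. In the sample output the background entry reports zero width and height, so dimension-based filtering is only applied to regular images here.

```javascript
// Hypothetical helper: separate regular images from background images
// using the `is_background` flag from the output above.
function splitImages(images) {
  const regular = images.filter((img) => !img.is_background);
  const backgrounds = images.filter((img) => img.is_background);
  return { regular, backgrounds };
}

const images = [
  { url: "https://example.com/image.jpg", width: 1920, height: 1080, is_background: false },
  { url: "https://example.com/bg.jpg", width: 0, height: 0, is_background: true },
];
const { regular, backgrounds } = splitImages(images);
// Keep only regular images that report real dimensions.
const sized = regular.filter((img) => img.width > 0 && img.height > 0);
console.log(sized.length, backgrounds.length);
```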

### 3. web_extract_divi_content_cremotemcp

Extracts all module content and images with comprehensive metadata.

**Parameters:**
- `url` (optional): URL to navigate to before extraction
- `tab` (optional): Tab ID to use
- `clear_cache` (optional): Clear browser cache (default: false)
- `timeout` (optional): Timeout in seconds (default: 30)

**Example:**
```javascript
{
  "tool": "web_extract_divi_content_cremotemcp",
  "arguments": {
    "url": "https://example.com/divi-page",
    "timeout": 30
  }
}
```

**Output Structure:**
```json
{
  "url": "https://example.com/page",
  "modules": [
    {
      "type": "text",
      "content": "<p>Text content</p>",
      "attributes": {},
      "css_classes": ["et_pb_text", "et_pb_module"]
    },
    {
      "type": "button",
      "content": "Click Here",
      "attributes": {
        "href": "https://example.com/link",
        "target": "_blank"
      },
      "css_classes": ["et_pb_button", "et_pb_module"]
    }
  ],
  "images": [...],
  "metadata": {
    "extraction_date": "2025-01-16T...",
    "total_modules": 15,
    "total_images": 8
  }
}
```
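
Button URLs live in each module's `attributes` object. As a sketch, assuming the result shape above, the hypothetical helper `collectButtonLinks` pulls out every button's label and target:

```javascript
// Hypothetical helper: list the label and link of every button module.
function collectButtonLinks(result) {
  return (result.modules || [])
    .filter((m) => m.type === "button" && m.attributes && m.attributes.href)
    .map((m) => ({
      label: m.content,
      href: m.attributes.href,
      opensNewTab: m.attributes.target === "_blank",
    }));
}

const contentResult = {
  modules: [
    { type: "text", content: "<p>Text content</p>", attributes: {} },
    { type: "button", content: "Click Here", attributes: { href: "https://example.com/link", target: "_blank" } },
  ],
};
console.log(collectButtonLinks(contentResult));
```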

## Workflow Examples

### Extract Complete Page Data
```javascript
// 1. Navigate to page
{
  "tool": "web_navigate_cremotemcp",
  "arguments": {
    "url": "https://example.com/divi-page",
    "clear_cache": true
  }
}

// 2. Extract structure
{
  "tool": "web_extract_divi_structure_cremotemcp",
  "arguments": {
    "timeout": 30
  }
}

// 3. Extract images
{
  "tool": "web_extract_divi_images_cremotemcp",
  "arguments": {
    "timeout": 30
  }
}

// 4. Extract content
{
  "tool": "web_extract_divi_content_cremotemcp",
  "arguments": {
    "timeout": 30
  }
}
```
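
The four steps above can also be generated programmatically. This is a minimal sketch that only builds the ordered request objects; `buildWorkflowRequests` is a hypothetical helper, and how the requests are actually sent depends on your MCP client.

```javascript
// Hypothetical helper: build the four requests of the workflow in order.
// Navigation happens once; the extraction calls reuse the current tab.
function buildWorkflowRequests(url, timeout = 30) {
  const extractionTools = [
    "web_extract_divi_structure_cremotemcp",
    "web_extract_divi_images_cremotemcp",
    "web_extract_divi_content_cremotemcp",
  ];
  return [
    { tool: "web_navigate_cremotemcp", arguments: { url, clear_cache: true } },
    ...extractionTools.map((tool) => ({ tool, arguments: { timeout } })),
  ];
}

const workflow = buildWorkflowRequests("https://example.com/divi-page");
workflow.forEach((req, i) => console.log(i + 1, req.tool));
```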

### Quick Single-Call Extraction
```javascript
// Extract structure with automatic navigation
{
  "tool": "web_extract_divi_structure_cremotemcp",
  "arguments": {
    "url": "https://example.com/divi-page",
    "clear_cache": true,
    "timeout": 30
  }
}
```

## Module Types Detected

The tools can identify the following Divi module types:
- `text` - Text modules
- `image` - Image modules
- `button` - Button modules
- `blurb` - Blurb modules
- `cta` - Call-to-action modules
- `slider` - Slider modules
- `gallery` - Gallery modules
- `video` - Video modules
- `unknown` - Unrecognized modules
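
One plausible way to arrive at these type names is to strip the `et_pb_` prefix from a module's classes and fall back to `unknown`. The sketch below assumes exactly that mapping; the tools' real detection logic is not documented here, and real Divi markup may use different class names for some modules.

```javascript
// Assumed mapping: a class "et_pb_<type>" identifies the module type.
const KNOWN_MODULE_TYPES = ["text", "image", "button", "blurb", "cta", "slider", "gallery", "video"];

// Hypothetical helper: return the first recognized type, else "unknown".
function moduleTypeFromClasses(cssClasses) {
  for (const cls of cssClasses) {
    if (!cls.startsWith("et_pb_")) continue;
    const candidate = cls.slice("et_pb_".length);
    if (KNOWN_MODULE_TYPES.includes(candidate)) return candidate;
  }
  return "unknown";
}

console.log(moduleTypeFromClasses(["et_pb_text", "et_pb_module"]));
console.log(moduleTypeFromClasses(["et_pb_accordion", "et_pb_module"]));
```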

## Best Practices

1. **Always set appropriate timeouts** for slow-loading pages
2. **Clear cache** when extracting from a new site
3. **Use structure extraction first** to understand page layout
4. **Extract images separately** if you need detailed image metadata
5. **Combine with WordPress MCP tools** for page recreation on your own sites

## Troubleshooting

### Timeout Errors
- Increase the `timeout` parameter
- Check if the page is loading slowly
- Verify the URL is accessible
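
When retrying after a timeout, one option is to raise the `timeout` value step by step rather than guessing a single large number. The sketch below computes such a schedule; `timeoutSchedule`, the doubling strategy, and the 300-second cap are all assumptions, not behavior of the tools.

```javascript
// Hypothetical helper: timeouts (in seconds) to try on successive attempts,
// doubling each time and never exceeding `cap`.
function timeoutSchedule(base = 30, attempts = 3, cap = 300) {
  const schedule = [];
  let current = base;
  for (let i = 0; i < attempts; i++) {
    schedule.push(Math.min(current, cap));
    current *= 2;
  }
  return schedule;
}

console.log(timeoutSchedule()); // [ 30, 60, 120 ]
```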

### Empty Results
- Verify the page uses Divi (check for `et_pb_` CSS classes)
- Check if JavaScript is enabled
- Try navigating to the page first with `web_navigate_cremotemcp`
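
The first check can be automated if you already have the page's HTML as a string. This is a rough sketch: `looksLikeDivi` is a hypothetical helper that only searches `class` attributes for an `et_pb_` prefix, so treat a negative result as a hint rather than proof.

```javascript
// Hypothetical helper: true if any class attribute contains an et_pb_ class.
function looksLikeDivi(html) {
  return /class\s*=\s*["'][^"']*\bet_pb_/.test(html);
}

console.log(looksLikeDivi('<div class="et_pb_section et_pb_section_0"></div>'));
console.log(looksLikeDivi('<div class="wp-block-group"></div>'));
```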

### Incomplete Data
- This is expected: the tools are only 60-70% accurate
- Manual refinement will be required
- Use the output as a starting point, not an exact recreation

## See Also

- [Implementation Summary](../DIVI_EXTRACTION_IMPLEMENTATION.md)
- [Implementation Plan](../IMPLEMENTATION_PLAN.md)
- [Feedback Analysis](../feedback/)