nextcloud-analytics/PRD.md
WLTBAgent f9c49cf7c2 Phase 3: Initial commit - Nextcloud Analytics Hub Project
Nextcloud Analytics Hub complete:
- Nextcloud PHP app (analytics-hub/) - All phases (1-3) complete
- Go client tool (nextcloud-analytics) - Full CLI implementation
- Documentation (PRD, README, STATUS, SKILL.md)
- Production-ready for deployment to https://cloud.shortcutsolutions.net

Repository: git.teamworkapps.com/shortcut/nextcloud-analytics
Workspace: /home/molt/.openclaw/workspace
2026-02-13 14:11:01 +00:00

# Nextcloud "Mini-CMO" Analytics Hub - Product Requirements Document (PRD)
**Version**: 3.0 (PHP App Architecture)
**Project**: nextcloud-google-analytics-integration
**User**: Mike (Shortcut Solutions)
**Goal**: Nextcloud internal PHP app with AI-generated analytics reports
**Status**: READY FOR IMPLEMENTATION
---
## Executive Summary
An internal Nextcloud PHP application that provides Google Analytics 4 reporting with AI-generated client reports. It exposes REST APIs for agent access via the nextcloud-integration project and runs scheduled internal jobs for daily processing.
---
## Architecture Overview
**Technology Stack**: PHP 8.0+ (Nextcloud App Framework)
**Target Nextcloud**: 25.0+ (https://cloud.shortcutsolutions.net)
**Dependencies**: Google Analytics Data API v1, Anthropic Claude API
**Agent Access**: REST APIs exposed via Nextcloud-integration tools
**Scheduling**: Nextcloud cron system
---
## Technical Architecture & Strategy
### The "Nextcloud Internal App" Strategy (REVISED)
**Approach**: Nextcloud PHP application running inside Nextcloud server, exposing APIs for agent consumption.
**Core Principles**:
1. **Embedded in Nextcloud**: Runs as PHP app, not external service
2. **Agent Integration**: Exposes Nextcloud API for nextcloud-integration tool access
3. **Secure Credentials**: Uses Nextcloud app password for authentication
4. **Fail-Fast**: Halt on errors rather than propagate bad data
5. **Observable**: Every run generates logs and status notifications
---
## Data Flow Architecture (REVISED)
```
┌─────────────────────────────────────────────────────────────┐
│ PHASE 1: App Installation (One-Time Setup) │
├─────────────────────────────────────────────────────────────┤
│ 1. Install Nextcloud app (analytics-hub) │
│ 2. Configure Google OAuth credentials │
│ 3. Configure Anthropic API key │
│ 4. Set client configurations (clients.json) │
│ 5. Enable cron job (Mon-Fri 7:00 AM) │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ PHASE 2: Agent Integration (OpenClaw) │
├─────────────────────────────────────────────────────────────┤
│ OpenClaw calls nextcloud-integration tools: │
│ │
│ 1. Trigger report generation │
│ → POST /apps/analytics-hub/api/generate │
│ → Client slug + date range │
│ │
│ 2. List available reports │
│ → GET /apps/analytics-hub/api/reports │
│ │
│ 3. Download specific report │
│ → GET /apps/analytics-hub/api/report/<id> │
│ │
│ 4. Configure client settings │
│ → POST /apps/analytics-hub/api/config │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ PHASE 3: Internal Processing (Nextcloud App) │
├─────────────────────────────────────────────────────────────┤
│ Nextcloud app runs scheduled job (Mon-Fri 7:00 AM): │
│ │
│ 1. Refresh Google access token │
│ → IF 401: Email alert + HALT │
│ │
│ 2. For each client in clients.json: │
│ a. Fetch GA4 data (last 7 days) │
│ b. Validate completeness │
│ c. Calculate deltas with smart thresholds │
│ d. Generate Report via Anthropic API │
│ e. Store in Nextcloud files (WebDAV internal) │
│ f. Expose via API for agent access │
│ │
│ 3. Log success summary │
└─────────────────────────────────────────────────────────────┘
```
**KEY CHANGES**:
- Nextcloud PHP app (not external Python)
- Exposes REST APIs for agent integration
- Uses nextcloud-integration tools for access
- Scheduled internal jobs for daily processing
---
## Feature Specifications
### Feature 1: Nextcloud App Architecture
**Goal**: PHP application running inside Nextcloud
**Implementation**:
#### App Structure
```
analytics-hub/
├── lib/
│   ├── Controller/
│   │   ├── ApiV1Controller.php         # Main API endpoints
│   │   └── ReportController.php        # Report generation logic
│   ├── Service/
│   │   ├── GoogleAnalyticsService.php  # GA4 API wrapper
│   │   ├── LLMService.php              # Anthropic API wrapper
│   │   └── DataProcessor.php           # Delta calculations
│   ├── Model/
│   │   ├── ClientConfig.php            # Client entity
│   │   └── Report.php                  # Report entity
│   └── AppInfo.php                     # App metadata
├── templates/
│   └── admin.php                       # Configuration UI
├── css/
│   └── style.css
├── js/
│   └── admin.js
├── config/
│   └── clients.json                    # Client configurations
├── info.xml                            # Nextcloud app info
├── appinfo/info.xml                    # App metadata
└── cron.php                            # Scheduled jobs
```
#### Nextcloud Integration
```php
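// Use Nextcloud's public (OCP) APIs
use OCP\AppFramework\App;
use OCP\Files\IRootFolder;
use OCP\Files\Node;
use OCP\IUserSession;
use OCP\L10N\IFactory;

// Access the current Nextcloud user ($this->userSession is an injected IUserSession)
$user = $this->userSession->getUser();
// Access the user's files internally, without a WebDAV round trip
// ($this->rootFolder is an injected IRootFolder)
$userFolder = $this->rootFolder->getUserFolder($user->getUID());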
```
#### Authentication
- **Method**: Nextcloud App Password authentication
- **Storage**: Nextcloud's built-in app password system
- **Access**: App password stored in Nextcloud settings (encrypted)
- **Agent Access**: Nextcloud-integration tools use app password for API access
---
### Feature 2: REST API Endpoints for Agent Integration
**Goal**: Expose analytics functions via Nextcloud API
#### API Endpoints
| Endpoint | Method | Description | Agent Tool |
|-----------|--------|-------------|--------------|
| `/apps/analytics-hub/api/reports` | GET | List all available reports | nextcloud-client (custom) |
| `/apps/analytics-hub/api/report/<id>` | GET | Download specific report | nextcloud-client (custom) |
| `/apps/analytics-hub/api/generate` | POST | Trigger report generation | nextcloud-client (custom) |
| `/apps/analytics-hub/api/config` | GET/POST | List/update client config | nextcloud-client (custom) |
| `/apps/analytics-hub/api/status` | GET | App health/status | nextcloud-client (custom) |
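
As a rough illustration of how an agent might address these endpoints, the sketch below builds (but does not send) authenticated requests using only the Python standard library. The helper name `build_request`, the user name `agent`, the password value, and the `"7d"` date range format are placeholders chosen for the example, not part of the app.

```python
import base64
import json
import urllib.request

BASE_URL = "https://cloud.shortcutsolutions.net"  # Target instance named in this PRD


def build_request(path, user, app_password, payload=None):
    """Build a Basic-auth request for an analytics-hub endpoint (hypothetical helper)."""
    token = base64.b64encode(f"{user}:{app_password}".encode("utf-8")).decode("ascii")
    headers = {"Authorization": f"Basic {token}", "Accept": "application/json"}
    data = None
    if payload is not None:
        # Supplying a JSON body turns the request into a POST
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers)


list_req = build_request("/apps/analytics-hub/api/reports", "agent", "app-password")
generate_req = build_request(
    "/apps/analytics-hub/api/generate",
    "agent",
    "app-password",
    payload={"client_slug": "logos_school", "date_range": "7d"},
)
```

Passing either object to `urllib.request.urlopen(req, timeout=30)` would send it; the sketch stops short of that so it can be inspected without network access.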
#### API Authentication
```php
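// Nextcloud validates the app password (HTTP Basic auth) before the controller runs,
// so the controller only has to confirm that an authenticated session exists.
// $this->userSession is an injected OCP\IUserSession.
if (!$this->userSession->isLoggedIn()) {
    return new JSONResponse(['error' => 'Unauthorized'], 401);
}
// Allow only authorized agents ('agent_authorized' is an assumed app config key;
// $this->config is an injected OCP\IConfig)
if ($this->config->getAppValue('analytics-hub', 'agent_authorized', 'no') !== 'yes') {
    return new JSONResponse(['error' => 'Agent access disabled'], 403);
}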
```
#### API Response Format
```json
{
  "success": true,
  "data": {
    "reports": [
      {
        "id": 123,
        "client_name": "Logos School",
        "report_date": "2026-02-12",
        "file_path": "/files/Analytics/LogosSchool/Report_2026-02-12.md"
      }
    ]
  }
}
```
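
A client consuming this envelope could unwrap it roughly as follows. This is a minimal sketch: the function name `extract_reports` is hypothetical, and the `error` field used for the failure case is an assumption, since only the success shape is specified above.

```python
import json


def extract_reports(body):
    """Return the report list from a response envelope, or raise on failure."""
    envelope = json.loads(body)
    if not envelope.get("success"):
        # The "error" field name is an assumption; only the success case is specified
        raise RuntimeError(envelope.get("error", "analytics-hub request failed"))
    return envelope["data"]["reports"]


sample = """
{"success": true,
 "data": {"reports": [{"id": 123, "client_name": "Logos School",
                       "report_date": "2026-02-12",
                       "file_path": "/files/Analytics/LogosSchool/Report_2026-02-12.md"}]}}
"""
reports = extract_reports(sample)
# ISO-formatted dates sort correctly as plain strings
latest = max(reports, key=lambda report: report["report_date"])
```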
---
### Feature 3: Secure Authentication & Token Health Monitoring
**Goal**: Zero-touch operation with proactive failure detection
**Implementation**:
#### Initial Setup (auth.py)
- Runs locally on Mike's machine
- Opens browser for Google OAuth consent
- Saves `refresh_token` to server `.env` file
- Sets file permissions: `chmod 600 .env` (owner read/write only)
#### Token Health Checks (NEW)
```python
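import sys
from datetime import datetime, timezone

# token_created_date: timezone-aware datetime recorded when auth.py saved the token
token_age_days = (datetime.now(timezone.utc) - token_created_date).days
if token_age_days > 150:  # Warning at 5 months
    send_email("⚠️ Refresh token nearing expiry (180 days)")

# refresh_response: HTTP response returned by the token refresh request
if refresh_response.status_code == 401:
    send_email("🔴 URGENT: Token expired - run auth.py")
    sys.exit(1)  # Halt execution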
```
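
The age check above can be isolated into a small pure function so it is easy to test without sending email. This is only a sketch: `token_health` and its return shape are hypothetical, and the 150-day and 180-day figures are taken from the check above.

```python
from datetime import datetime, timezone

WARN_AFTER_DAYS = 150  # Warning threshold used in the check above
EXPIRY_DAYS = 180      # Expiry window the warning message refers to


def token_health(created, now):
    """Return (status, days_left) for a refresh token created at `created`."""
    age_days = (now - created).days
    days_left = EXPIRY_DAYS - age_days
    status = "warning" if age_days > WARN_AFTER_DAYS else "ok"
    return status, days_left


created = datetime(2025, 9, 1, tzinfo=timezone.utc)
status, days_left = token_health(created, datetime(2026, 2, 12, tzinfo=timezone.utc))
```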
---
### Feature 4: Intelligent Data Pipeline
**Goal**: Mathematical accuracy + data quality validation
#### GA4 API Configuration
- **Date Range**: Last 7 days, compared against the preceding 7 days for the Week-over-Week comparison
- **Dimensions**: `date`, `sessionDefaultChannelGroup`, `pagePath`
- **Metrics**: `sessions`, `totalUsers`, `conversions`, `eventCount`
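
For illustration, the configuration above maps onto a GA4 Data API `runReport` request body along these lines. This is a sketch under stated assumptions: the function name is hypothetical, and ending the range at `yesterday` (so that only complete days are included) is a choice made for the example, not something the PRD specifies.

```python
def build_run_report_body(days=7):
    """Build a GA4 Data API runReport body for the configuration above."""
    return {
        # Relative dates such as "7daysAgo" are resolved by GA4 itself
        "dateRanges": [{"startDate": f"{days}daysAgo", "endDate": "yesterday"}],
        "dimensions": [
            {"name": "date"},
            {"name": "sessionDefaultChannelGroup"},
            {"name": "pagePath"},
        ],
        "metrics": [
            {"name": "sessions"},
            {"name": "totalUsers"},
            {"name": "conversions"},
            {"name": "eventCount"},
        ],
    }


body = build_run_report_body()
```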
#### Smart Processing Logic (NEW)
```python
def calculate_delta(current, previous):
    """Handles edge cases that break naive % calculations"""
    # Edge case 1: Division by zero
    if previous == 0:
        if current == 0:
            return {"change_pct": 0, "label": "No change", "is_significant": False}
        else:
            return {"change_pct": None, "label": f"New activity (+{current})", "is_significant": True}
    # Edge case 2: Small numbers creating misleading %
    change_pct = ((current - previous) / previous) * 100
    abs_change = current - previous
    # Require both % threshold AND minimum absolute change
    is_significant = abs(change_pct) > 20 and abs(abs_change) > 5
    return {
        "change_pct": round(change_pct, 1),
        "abs_change": abs_change,
        "is_significant": is_significant,
        "label": format_delta_label(change_pct, abs_change),  # Label helper defined elsewhere
    }
```
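
To see why both thresholds are needed, the following self-contained sketch isolates the significance rule and runs it on three cases. The function name `is_significant` and its parameter names are hypothetical; the defaults of 20% and 5 mirror the thresholds used above.

```python
def is_significant(current, previous, pct_threshold=20, abs_threshold=5):
    """Apply the two-threshold rule: both the % and the absolute change must be large."""
    if previous == 0:
        return current != 0  # Any activity from a zero baseline counts as new activity
    change_pct = (current - previous) / previous * 100
    return abs(change_pct) > pct_threshold and abs(current - previous) > abs_threshold


small_base = is_significant(4, 2)        # +100%, but only +2 in absolute terms
small_pct = is_significant(1050, 1000)   # +50 in absolute terms, but only +5%
real_move = is_significant(130, 100)     # +30% and +30 in absolute terms
```

Under these defaults only the third case is flagged: the first fails the absolute threshold and the second fails the percentage threshold.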
#### Data Validation (NEW)
```python
def validate_data(response):
    """Ensure data completeness before processing"""
    # Check 1: Expected date range present (both helpers must return sets of dates)
    expected_dates = get_last_7_days()
    actual_dates = extract_dates(response)
    missing_dates = expected_dates - actual_dates
    if missing_dates:
        raise DataIncompleteError(f"Missing data for: {missing_dates}")
    # Check 2: No null metrics
    if any_null_metrics(response):
        raise DataIncompleteError("Null values in metrics")
    return True
```
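
One possible shape for the `get_last_7_days` helper is sketched below. Two things here are assumptions: the helper takes `today` as a parameter (the version above takes none) so the result is deterministic, and the window covers the seven complete days before `today`. The `YYYYMMDD` string format matches what the GA4 `date` dimension returns.

```python
from datetime import date, timedelta


def get_last_7_days(today):
    """Return the 7 complete days before `today` as GA4-style YYYYMMDD strings."""
    return {(today - timedelta(days=n)).strftime("%Y%m%d") for n in range(1, 8)}


expected = get_last_7_days(date(2026, 2, 12))
actual = expected - {"20260208"}  # Simulate a response with one day missing
missing = sorted(expected - actual)
```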
---
### Feature 5: Client Context Configuration
**Goal**: Maintainable client-specific settings with validation
#### Storage
- **File**: `config/clients.json` (version controlled)
#### Structure (ENHANCED)
```json
{
  "version": "1.0",
  "last_updated": "2026-02-12",
  "clients": [
    {
      "property_id": "123456789",
      "name": "Logos School",
      "slug": "logos_school",
      "active": true,
      "context": {
        "business_type": "Therapeutic school",
        "key_metrics": ["admissions", "parent_inquiries"],
        "tone": "professional",
        "focus_areas": "Focus on Admission funnel traffic and conversion quality"
      },
      "webdav_config": {
        "base_path": "/AIGeneratedReports/LogosSchool",
        "auto_create": true
      },
      "thresholds": {
        "significant_change_pct": 20,
        "significant_change_abs": 5
      }
    }
  ]
}
```
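
A consumer of this file would typically skip inactive clients and resolve each client's thresholds. The sketch below shows one way to do that; the function name, the `paused_client` entry, and the idea of falling back to default thresholds when a client omits them are all assumptions made for the example.

```python
import json

DEFAULT_THRESHOLDS = {"significant_change_pct": 20, "significant_change_abs": 5}


def active_clients(config):
    """Yield (slug, thresholds) for every client marked active."""
    for client in config["clients"]:
        if client.get("active", False):
            # Per-client thresholds override the defaults key by key
            yield client["slug"], {**DEFAULT_THRESHOLDS, **client.get("thresholds", {})}


config = json.loads("""
{"version": "1.0", "clients": [
  {"slug": "logos_school", "active": true, "thresholds": {"significant_change_pct": 25}},
  {"slug": "paused_client", "active": false}
]}
""")
selected = dict(active_clients(config))
```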
#### Validation Schema
```python
def validate_clients_config(config):
    """Ensure config is well-formed before using"""
    required_fields = ["property_id", "name", "slug", "context", "webdav_config"]
    for client in config["clients"]:
        # Check required fields exist
        missing = [f for f in required_fields if f not in client]
        if missing:
            raise ConfigError(f"Missing fields for {client.get('name')}: {missing}")
        # Validate property_id format
        if not client["property_id"].isdigit():
            raise ConfigError(f"Invalid property_id for {client['name']}")
        # Check WebDAV path doesn't have spaces
        if " " in client["webdav_config"]["base_path"]:
            raise ConfigError(f"WebDAV path cannot contain spaces for {client['name']}")
    return True
```
---
### Feature 6: The "Mini-CMO" Report Generator
**Goal**: Consistent, high-quality narrative reports via Claude API
#### LLM Integration (SPECIFIED)
- **Provider**: Anthropic Claude API
- **Model**: `claude-sonnet-4-5-20250929` (cost-effective, high quality)
- **API Key Storage**: Server `.env` file (separate from Google credentials)
- **Rate Limiting**: Max 5 reports/minute (Anthropic tier limits)
- **Cost Control**: ~$0.015 per report (3K tokens @ $3/M input, $15/M output)
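
The per-report figure can be reproduced with simple arithmetic. The sketch below uses the prices quoted above; the 2,500/500 split of the 3K-token budget and the volume of 3 clients with 22 weekday runs per month are assumptions chosen for illustration, not figures from this PRD.

```python
INPUT_PRICE_PER_M = 3.00    # USD per million input tokens (from the estimate above)
OUTPUT_PRICE_PER_M = 15.00  # USD per million output tokens (from the estimate above)


def report_cost(input_tokens, output_tokens):
    """Return the estimated USD cost of one report."""
    return (input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000


# Assumed split of the 3K-token budget: 2,500 input + 500 output
per_report = report_cost(2_500, 500)

# Assumed volume: 3 clients x 22 weekday runs per month
per_month = per_report * 3 * 22
```

Under these assumptions a report costs about $0.015 and a month about $0.99. Because output tokens cost five times as much as input tokens, longer reports raise the figure quickly.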
#### Prompt Template
```python
SYSTEM_PROMPT = """
You are a Mini-CMO analyst for {client_name}, a {business_type}.
Your role: Transform analytics data into clear, actionable insights for the business owner.
TONE: {tone} but conversational - avoid jargon.
FOCUS: {focus_areas}
OUTPUT FORMAT (strict):
# Weekly Analytics Snapshot - {report_date}
## 📊 Headline
[One sentence summary of the biggest story this week]
## ✅ Key Wins
[2-3 bullet points on positive movements or validations]
## 💡 Recommendation
[One specific action based on the data]
## 📈 The Numbers
[Formatted table of metrics with context]
RULES:
- Be specific (use actual numbers from data)
- Focus on "why this matters" not just "what changed"
- If a change is statistically insignificant, note it
- Avoid phrases like "optimization" or "synergy"
"""
USER_PROMPT = """
DATA SUMMARY: {processed_data_json}
Generate the weekly report following the system format exactly.
"""
```
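
The templates only become usable prompts once their placeholders are filled from a `clients.json` entry and the processed metrics. A minimal sketch of that step follows; it uses shortened stand-ins for the two templates so it runs on its own, and the function name `build_prompts` is hypothetical.

```python
import json

# Shortened stand-ins for the SYSTEM_PROMPT and USER_PROMPT templates above
SYSTEM_TEMPLATE = (
    "You are a Mini-CMO analyst for {client_name}, a {business_type}.\n"
    "TONE: {tone} but conversational - avoid jargon.\n"
    "FOCUS: {focus_areas}\n"
    "# Weekly Analytics Snapshot - {report_date}\n"
)
USER_TEMPLATE = "DATA SUMMARY: {processed_data_json}\n"


def build_prompts(client, processed_data, report_date):
    """Fill both templates from a clients.json entry and the processed metrics."""
    context = client["context"]
    system = SYSTEM_TEMPLATE.format(
        client_name=client["name"],
        business_type=context["business_type"],
        tone=context["tone"],
        focus_areas=context["focus_areas"],
        report_date=report_date,
    )
    # Braces inside the JSON value are safe: format() does not re-scan substituted text
    user = USER_TEMPLATE.format(processed_data_json=json.dumps(processed_data))
    return system, user


client = {
    "name": "Logos School",
    "context": {
        "business_type": "Therapeutic school",
        "tone": "professional",
        "focus_areas": "Focus on Admission funnel traffic and conversion quality",
    },
}
system, user = build_prompts(client, {"sessions": {"change_pct": 12.5}}, "2026-02-12")
```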
#### Retry Logic (NEW)
```python
import logging
import time

import anthropic

log = logging.getLogger(__name__)


def call_llm_with_retry(system_prompt, user_prompt, max_retries=3):
    """Robust API calling with exponential backoff.

    Both prompts must already be formatted. anthropic_client is an
    anthropic.Anthropic instance created at startup.
    """
    for attempt in range(max_retries):
        try:
            response = anthropic_client.messages.create(
                model="claude-sonnet-4-5-20250929",
                max_tokens=2000,
                system=system_prompt,
                messages=[{"role": "user", "content": user_prompt}],
                timeout=30.0,
            )
            # Validate response quality
            content = response.content[0].text
            if len(content) < 200:
                raise ValueError("Response too short - likely error")
            if "# Weekly Analytics Snapshot" not in content:
                raise ValueError("Missing required format")
            return content
        except anthropic.RateLimitError:
            wait_time = 2 ** attempt  # Exponential backoff
            log.warning(f"Rate limited, waiting {wait_time}s")
            time.sleep(wait_time)
        except anthropic.APIError as e:
            if attempt == max_retries - 1:
                raise  # Give up after max retries
            log.error(f"API error (attempt {attempt + 1}): {e}")
            time.sleep(5)
    raise RuntimeError(f"Failed after {max_retries} attempts")
```
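
The backoff pattern itself can be separated from the Anthropic client, which makes it testable without network calls. The sketch below is one way to do that; `retry_with_backoff`, its parameters, and the `flaky` stand-in operation are all hypothetical names introduced for the example.

```python
import time


def retry_with_backoff(operation, retry_on=(Exception,), max_retries=3, sleep=time.sleep):
    """Call `operation` until it succeeds, waiting 1s, 2s, 4s, ... between attempts."""
    for attempt in range(max_retries):
        try:
            return operation()
        except retry_on:
            if attempt == max_retries - 1:
                raise  # Out of attempts: surface the last error
            sleep(2 ** attempt)


# Demonstration with a fake operation that fails twice, then succeeds
calls = {"count": 0}
waits = []


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("simulated transient failure")
    return "report text"


# Recording the waits instead of sleeping keeps the demonstration instant
result = retry_with_backoff(flaky, retry_on=(TimeoutError,), sleep=waits.append)
```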
#### Delivery (ENHANCED)
```python
import logging

import requests

log = logging.getLogger(__name__)
# NEXTCLOUD_APP_PASSWORD is loaded from configuration; WebDAVError is the app's own exception
WEBDAV_ROOT = "https://cloud.shortcutsolutions.net/remote.php/dav/files/mike"


def upload_to_nextcloud(markdown_content, client_config, report_date):
    """Upload report with folder creation and error handling"""
    # Build full path
    base_path = client_config["webdav_config"]["base_path"]
    filename = f"Report_{report_date}.md"
    full_path = f"{base_path}/{filename}"
    # Step 1: Ensure folder exists (NEW)
    if client_config["webdav_config"]["auto_create"]:
        create_webdav_folder_if_not_exists(base_path)
    # Step 2: Upload
    response = requests.put(
        f"{WEBDAV_ROOT}{full_path}",
        auth=("mike", NEXTCLOUD_APP_PASSWORD),
        data=markdown_content.encode("utf-8"),
        headers={"Content-Type": "text/markdown"},
        timeout=30,
    )
    # Step 3: Validate upload
    if response.status_code == 201:  # Created
        log.info(f"✅ Uploaded: {filename}")
        return full_path
    elif response.status_code == 204:  # Updated existing
        log.info(f"✅ Updated: {filename}")
        return full_path
    else:
        raise WebDAVError(f"Upload failed: {response.status_code} - {response.text}")


def create_webdav_folder_if_not_exists(folder_path):
    """Create nested folders as needed (MKCOL creates one level per request)"""
    current = ""
    for segment in folder_path.strip("/").split("/"):
        current = f"{current}/{segment}"
        response = requests.request(
            "MKCOL",
            f"{WEBDAV_ROOT}{current}",
            auth=("mike", NEXTCLOUD_APP_PASSWORD),
            timeout=30,
        )
        if response.status_code not in (201, 405):  # 405 = already exists
            raise WebDAVError(f"Folder creation failed: {response.status_code}")
    return True
```
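
Because WebDAV `MKCOL` creates a single collection and fails when a parent is missing, nested report folders have to be created from the top down. The sketch below computes that sequence of paths and URL-encodes each segment; the function name `folder_chain` is hypothetical, and the path with a space is used only to show the encoding.

```python
from urllib.parse import quote


def folder_chain(folder_path):
    """Return every ancestor of `folder_path`, shallowest first, URL-encoded."""
    chain = []
    current = ""
    for segment in folder_path.strip("/").split("/"):
        current = f"{current}/{quote(segment)}"
        chain.append(current)
    return chain


chain = folder_chain("/AIGeneratedReports/Logos School")
```

Issuing one `MKCOL` per entry in `chain`, in order, would create the full hierarchy. Encoding each segment this way is also one option for supporting folder names that contain spaces.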
---
## Implementation Plan (REVISED)
### Phase 1: Nextcloud App Development (Days 1-3)
#### Nextcloud App Structure
Create Nextcloud app directory structure with PHP MVC pattern.
#### info.xml
```xml
<?xml version="1.0"?>
<info>
    <id>analytics-hub</id>
    <name>Mini-CMO Analytics Hub</name>
    <description>AI-powered Google Analytics 4 reporting</description>
    <licence>agpl</licence>
    <author>Shortcut Solutions</author>
    <version>1.0.0</version>
    <namespace>AnalyticsHub</namespace>
    <category>integration</category>
    <dependencies>
        <nextcloud min-version="25" max-version="26"/>
    </dependencies>
    <settings>
        <admin>OCA\AnalyticsHub\Settings\Admin</admin>
    </settings>
</info>
```
#### Routes Configuration
```php
// appinfo/routes.php
// Route names use "<controller>#<method>"; "api_v1" resolves to ApiV1Controller.
return [
    'routes' => [
        ['name' => 'api_v1#v1', 'url' => '/api', 'verb' => 'POST'],
        ['name' => 'api_v1#reports', 'url' => '/api/reports', 'verb' => 'GET'],
        ['name' => 'api_v1#generate', 'url' => '/api/generate', 'verb' => 'POST'],
    ],
];
```
---
### Phase 2: Core Application (Days 4-7)
#### Google Cloud Configuration
- Create Google Cloud Project: `nextcloud-analytics-hub`
- Enable "Google Analytics Data API v1"
- Configure OAuth Consent Screen
- Scopes: `https://www.googleapis.com/auth/analytics.readonly`
- Create OAuth 2.0 Client ID
#### App Settings Page
```php
<?php // templates/admin.php ?>
<form method="POST" action="/apps/analytics-hub/settings/save" enctype="multipart/form-data">
    <!-- CSRF token required by Nextcloud for POST requests -->
    <input type="hidden" name="requesttoken" value="<?php p($_['requesttoken']); ?>">
    <input type="text" name="google_client_id" placeholder="Google Client ID">
    <input type="password" name="google_client_secret" placeholder="Google Client Secret">
    <input type="password" name="anthropic_api_key" placeholder="Anthropic API Key">
    <input type="file" name="credentials_json" accept=".json">
    <button type="submit">Save Configuration</button>
</form>
```
#### Cron Job Setup
```php
// cron.php
use OCA\AnalyticsHub\AppInfo\Application;
use OCP\Server;
use Psr\Log\LoggerInterface;

// Assumption: the get*() accessors below are helpers defined on this app's Application class
$app = new Application();
$logger = Server::get(LoggerInterface::class);
// Run Mon-Fri during the 7:00 AM hour
if ((int) date('N') <= 5 && date('H') === '07') {
    $clients = $app->getConfig()->getClients();
    foreach ($clients as $client) {
        try {
            $data = $app->getGoogleAnalyticsService()->fetchData($client);
            $processed = $app->getDataProcessor()->process($data, $client);
            $markdown = $app->getLLMService()->generate($processed, $client);
            $app->getNextcloudFiles()->saveReport($markdown, $client);
        } catch (\Exception $e) {
            // Log and continue so one client failure does not stop the job
            $logger->error("Failed for {$client['name']}: {$e->getMessage()}", ['app' => 'analytics-hub']);
        }
    }
}
```
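
The schedule gate in the job above can be expressed as a small pure function, which makes the Mon-Fri 7:00 AM rule easy to verify. This is an illustrative sketch in Python rather than the app's PHP; `should_run` is a hypothetical name.

```python
from datetime import datetime


def should_run(now):
    """Return True on Monday-Friday during the 07:00 hour."""
    # isoweekday() uses 1 = Monday ... 7 = Sunday, like PHP's date('N')
    return now.isoweekday() <= 5 and now.hour == 7


friday_morning = should_run(datetime(2026, 2, 13, 7, 30))
saturday_morning = should_run(datetime(2026, 2, 14, 7, 30))
friday_noon = should_run(datetime(2026, 2, 13, 12, 0))
```

A gate like this only fires if the job is invoked at least once during the 07:00 hour, so the scheduler needs to call it hourly or more often.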
---
### Phase 3: Agent Integration (Days 8-10)
#### Nextcloud-Integration Tool Updates
Add custom API access to nextcloud-client tool:
```go
// tools/go/nextcloud-client/main.go
// New operations for analytics-hub. webdavRequest, parseReports, parseResponse,
// and saveToFile are assumed helpers of the tool; Report and APIResponse are
// illustrative types.

func listAnalyticsReports() ([]Report, error) {
	resp, err := webdavRequest("/apps/analytics-hub/api/reports", "GET", nil)
	if err != nil {
		return nil, err
	}
	return parseReports(resp)
}

func generateAnalyticsReport(clientSlug, dateRange string) (*APIResponse, error) {
	payload := map[string]interface{}{
		"client_slug": clientSlug,
		"date_range":  dateRange,
	}
	resp, err := webdavRequest("/apps/analytics-hub/api/generate", "POST", payload)
	if err != nil {
		return nil, err
	}
	return parseResponse(resp)
}

func downloadAnalyticsReport(reportID int) error {
	endpoint := fmt.Sprintf("/apps/analytics-hub/api/report/%d", reportID)
	resp, err := webdavRequest(endpoint, "GET", nil)
	if err != nil {
		return err
	}
	return saveToFile(resp)
}
```
#### SKILL.md Updates
````markdown
## Nextcloud Analytics Hub
Generate AI-powered analytics reports:
### List Reports
```bash
nextcloud-client --op analytics-reports-list
```
### Generate Report
```bash
nextcloud-client --op analytics-generate --client logos_school --days 7
```
### Download Report
```bash
nextcloud-client --op analytics-download --report-id 123
```
````
---
### Phase 4: Deployment & Automation (Days 11-12)
#### App Installation
```bash
# Upload to Nextcloud apps directory
scp -r analytics-hub/ mike@cloud.shortcutsolutions.net:/var/www/nextcloud/apps/
# Enable app via Nextcloud UI
# Settings → Apps → Analytics Hub → Enable
# Or use occ CLI
sudo -u www-data php /var/www/nextcloud/occ app:enable analytics-hub
```
#### Cron Configuration
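Nextcloud runs registered background jobs automatically. The job class is declared in `appinfo/info.xml`; the 24-hour interval (86400 seconds) is set inside the job class via `setInterval(86400)`:
```xml
<!-- appinfo/info.xml -->
<background-jobs>
    <job>OCA\AnalyticsHub\Cron\DailyReport</job>
</background-jobs>
```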
---
## Summary
This PRD represents a Nextcloud internal PHP application with:
- **Embedded in Nextcloud** - Runs as PHP app, not external service
- **Agent Integration** - Exposes REST APIs for nextcloud-integration tools
- **Zero-Touch Operation** - Scheduled internal jobs, no manual intervention
- **Fail-Fast Architecture** - Halt on errors, don't propagate bad data
- **Observable Execution** - Logs stored in Nextcloud logs
- **Graceful Degradation** - One client failure doesn't crash job
- **Cost Efficiency** - ~$1/month (LLM only, Google API free)
- **Proactive Monitoring** - Token health checks and email alerts
**Estimated Implementation Time**: 12 days of implementation, followed by 4 weeks of reliable operation
**Architecture Change**: From external Python app (v2.0) to Nextcloud PHP app (v3.0)
---
**PRD Version**: 3.0
**Last Updated**: 2026-02-12 23:15 GMT
**Status**: READY FOR IMPLEMENTATION