Cloud Sayings Architecture

Analytics & Prompting

The analytics dashboard provides near real-time insights into model performance and user feedback, while the prompting strategy ensures consistent, high-quality analogies optimized for each LLM provider.

Dashboard Analytics

Architecture

The dashboard aggregates feedback data from DynamoDB and CloudWatch to provide near real-time analytics:

┌─────────────────────────────────────────────────────────┐
│              getDashboardStats Lambda                    │
│                                                          │
│  1. Check 2-minute in-memory cache                      │
│  2. If cache miss:                                       │
│     a. Scan saying_feedback table (DynamoDB)            │
│     b. Query CloudWatch for total invocations           │
│     c. Aggregate by provider/model                      │
│     d. Extract AWS services from saying text             │
│     e. Calculate success rates, scores, response times  │
│     f. Update cache                                      │
│  3. Return JSON response                                │
└─────────────────────────────────────────────────────────┘
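
A minimal sketch of this flow in the Lambda handler, assuming boto3 and the saying_feedback table; get_cached_stats, put_cached_stats, and aggregate_provider_stats are illustrative helpers (sketched in the sections that follow), and the CloudWatch invocation query is omitted for brevity:

import json
import boto3

dynamodb = boto3.resource("dynamodb")
feedback_table = dynamodb.Table("saying_feedback")

def lambda_handler(event, context):
    # 1. Serve from the 2-minute in-memory cache when possible
    cached = get_cached_stats()
    if cached is not None:
        return {"statusCode": 200, "body": json.dumps(cached)}

    # 2a. Cache miss: paginated scan of the feedback table
    items, kwargs = [], {}
    while True:
        page = feedback_table.scan(**kwargs)
        items.extend(page.get("Items", []))
        if "LastEvaluatedKey" not in page:
            break
        kwargs["ExclusiveStartKey"] = page["LastEvaluatedKey"]

    # 2b-2e. Aggregate by provider/model (the CloudWatch totals and
    # service extraction also happen here; omitted in this sketch)
    stats = aggregate_provider_stats(items)

    # 2f. Refresh the cache, then 3. return JSON
    put_cached_stats(stats)
    return {"statusCode": 200, "body": json.dumps(stats)}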

Data Aggregation

Provider-Level Metrics:

For each feedback record (see the aggregation sketch after this list):

  • Count positive/negative feedback
  • Track execution times (filtering cached responses)
  • Calculate success rate: (positive / total) * 100
  • Calculate score: positive - negative
  • Calculate average response time: sum(execution_times) / len(execution_times)
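
A condensed sketch of this aggregation; field names such as vendor and feedback are assumptions about the record shape:

from collections import defaultdict

def aggregate_provider_stats(feedback_items):
    provider_stats = defaultdict(lambda: {"positive": 0, "negative": 0, "execution_times": []})

    for item in feedback_items:
        vendor = item.get("vendor", "unknown")
        key = "positive" if item.get("feedback") == "positive" else "negative"
        provider_stats[vendor][key] += 1
        # Response-time collection (with cache filtering) is covered in the next section

    results = {}
    for vendor, s in provider_stats.items():
        total = s["positive"] + s["negative"]
        times = s["execution_times"]
        results[vendor] = {
            "success_rate": (s["positive"] / total) * 100 if total else 0,
            "score": s["positive"] - s["negative"],
            "avg_response_time_ms": sum(times) / len(times) if times else None,
        }
    return results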

Response Time Calculation

The system uses a priority-based approach for response time tracking:

  1. Priority 1: original_llm_time (most accurate - from new data)
    • Stored when feedback is submitted for a cached response
    • Represents actual LLM API call time
  2. Priority 2: is_cached flag (explicit marking)
    • If is_cached=True and no original_llm_time, skip (likely cache lookup)
  3. Priority 3: Heuristic for historical data
    • If execution_time < 100ms, assume cached (skip)
    • If execution_time ≥ 100ms, assume direct LLM call (use it)

CACHE_THRESHOLD_MS = 100

if original_llm_time is not None:
    # Use original LLM API time (most accurate)
    exec_time_float = float(original_llm_time)
    provider_stats[vendor]['execution_times'].append(exec_time_float)
elif execution_time is not None:
    exec_time_float = float(execution_time)
    if is_cached:
        # Skip cached responses without original LLM time
        pass
    elif exec_time_float < CACHE_THRESHOLD_MS:
        # Likely cached - skip it (heuristic)
        pass
    else:
        # Likely direct LLM call - use it (heuristic for historical data)
        provider_stats[vendor]['execution_times'].append(exec_time_float)

AWS Service Extraction

The dashboard extracts AWS service names from saying text using regex patterns:

  1. Try full service name match (Amazon X or AWS X)
  2. Try abbreviation match (S3, Lambda, EC2, etc.)
  3. Try canonical name matching
  4. Fallback to "Unknown"

Service mappings handle variations (a simplified extraction sketch follows this list):

  • "DynamoDB" → "Amazon DynamoDB"
  • "Lambda" → "AWS Lambda"
  • "EC2" → "Amazon EC2"

Caching Strategy

  • Cache TTL: 2 minutes
  • Cache Storage: Module-level in-memory variable (per Lambda instance)
  • Cache Invalidation: Automatic after TTL expires
  • Purpose: Near real-time updates while reducing DynamoDB scan costs
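
A minimal sketch of the per-instance cache, assuming module-level variables that persist across warm invocations (names are illustrative and pair with the handler sketch above):

import time

CACHE_TTL_SECONDS = 120     # 2-minute TTL
_cached_stats = None        # lives in the module scope of one Lambda instance
_cache_expires_at = 0.0

def get_cached_stats():
    if _cached_stats is not None and time.time() < _cache_expires_at:
        return _cached_stats                 # cache hit: skip the DynamoDB scan
    return None                              # cold start or expired TTL

def put_cached_stats(stats):
    global _cached_stats, _cache_expires_at
    _cached_stats = stats
    _cache_expires_at = time.time() + CACHE_TTL_SECONDS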

LLM Prompting Strategy

Prompt Architecture

The system uses two distinct prompt types:

  1. Single Saying Prompt: For live LLM calls (request #4)
  2. Cache Building Prompt: For batch generation (3 sayings at once)

Provider-Specific Prompt Differences

Anthropic (Claude) Prompts

Philosophy: Detailed, structured, context-rich

Single Saying Prompt Structure:

### ROLE
You are a witty cloud computing expert...

### TASK
Create exactly ONE clever analogy for the AWS service: {service_name}

### CONTEXT
Category: {category_name}
Related services in this category:
- Service 1
- Service 2
...

### OUTPUT FORMAT
The analogy MUST follow this exact format:
"[Service Name] is like [clever comparison] - [technical insight]!"

### REQUIREMENTS
1. Use the exact service name: {service_name}
2. Maximum 120 characters total
3. Must end with an exclamation mark (!)
...

### STYLE GUIDELINES
- Tone: Upbeat, constructive, and positive
- Avoid: ninjas, martial arts, butlers, chefs
...

### EXAMPLES
Example 1: "Amazon S3 is like..."
Example 2: "AWS Lambda is like..."
...

### YOUR TASK
Now create an analogy for {service_name}...
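
For reference, a hedged sketch of sending the assembled template with the Anthropic Python SDK; the model identifier is a placeholder:

from anthropic import Anthropic

client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def generate_saying_claude(prompt: str) -> str:
    # `prompt` is the full ### ROLE / ### TASK / ... template shown above
    response = client.messages.create(
        model="claude-3-5-sonnet-latest",   # placeholder model identifier
        max_tokens=200,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.content[0].text.strip()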

OpenAI (GPT) Prompts

Philosophy: Concise, direct, optimized for single responses

Single Saying Prompt Structure:

System: You are a witty cloud computing expert who creates brand-positive,
        upbeat analogies that make AWS services relatable and memorable.

User: Generate exactly one analogy for {service_name}.

Requirements:
- Format: "[{service_name}] is like [comparison] - [insight]!"
- Under 120 characters total
- End with exclamation mark
- Output ONLY the analogy, nothing else
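
A comparable sketch for the OpenAI Python SDK, wiring these system/user messages into a chat completion; the model name and max_tokens value are placeholders:

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def generate_saying_gpt(service_name: str) -> str:
    system_msg = (
        "You are a witty cloud computing expert who creates brand-positive, "
        "upbeat analogies that make AWS services relatable and memorable."
    )
    user_msg = (
        f"Generate exactly one analogy for {service_name}.\n\n"
        "Requirements:\n"
        f'- Format: "[{service_name}] is like [comparison] - [insight]!"\n'
        "- Under 120 characters total\n"
        "- End with exclamation mark\n"
        "- Output ONLY the analogy, nothing else"
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",                 # placeholder model name
        max_tokens=100,
        messages=[
            {"role": "system", "content": system_msg},
            {"role": "user", "content": user_msg},
        ],
    )
    return response.choices[0].message.content.strip()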

Key Differences

Aspect              | Anthropic (Claude)                                | OpenAI (GPT)
--------------------|---------------------------------------------------|-------------------------------
Structure           | Hierarchical sections (ROLE, TASK, CONTEXT, etc.) | Flat system/user messages
Detail Level        | Highly detailed with examples                     | Concise and direct
Context             | Full category context                             | Minimal context
Examples            | Multiple examples provided                        | No examples in prompt
Format Instructions | Repeated in multiple sections                     | Single clear instruction block
Length              | ~500-800 tokens                                   | ~200-300 tokens

Service Selection Strategy

Critical: The system explicitly selects services rather than asking the LLM to pick randomly:

def get_llm_sayings_for_cache(provider: str, model_name: str):
    # Select 3 different services explicitly
    selected_services = []
    for _ in range(3):
        selection = select_random_service()  # Avoids recently used
        selected_services.append(selection.service_name)

    # Create prompt with explicit services
    services_list = '\n'.join([f"- {service}" for service in selected_services])
    # ... prompt includes these exact services

This ensures:

  • Variety: Services are explicitly different
  • Control: No reliance on LLM randomness
  • Tracking: Recently used services are avoided
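
One possible shape of select_random_service with recently-used tracking; everything here (the catalog, the deque size, the reset rule) is illustrative rather than the actual implementation:

import random
from collections import deque, namedtuple

ServiceSelection = namedtuple("ServiceSelection", ["service_name", "category_name"])

ALL_SERVICES = [                      # assumed catalog, abbreviated
    ServiceSelection("Amazon S3", "Storage"),
    ServiceSelection("AWS Lambda", "Compute"),
    ServiceSelection("Amazon DynamoDB", "Database"),
]

_recently_used = deque(maxlen=10)     # remember the last N picks to avoid repeats

def select_random_service() -> ServiceSelection:
    candidates = [s for s in ALL_SERVICES if s.service_name not in _recently_used]
    if not candidates:                # everything was used recently: allow all again
        candidates = list(ALL_SERVICES)
    selection = random.choice(candidates)
    _recently_used.append(selection.service_name)
    return selection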

Response Parsing

The system uses multiple parsing strategies to extract sayings from LLM responses:

Strategy 1: Newline Splitting

import re

sayings = []
lines = [line.strip() for line in content.split('\n') if line.strip()]
for line in lines:
    saying = line.strip().strip('"\'')
    saying = re.sub(r'^\d+[\.\)]\s*', '', saying)  # Remove numbering
    saying = re.sub(r'^[-*]\s*', '', saying)  # Remove markdown bullets
    if saying and saying.endswith('!') and len(saying) <= 130:
        sayings.append(saying)

Strategy 2: Quoted Sayings

quoted_matches = re.findall(r'"[^"]+!"', content)

Strategy 3: Pattern Matching

pattern = r'([A-Za-z0-9\s\-]+(?:AWS|Amazon)?\s+[A-Za-z0-9\s\-]+)\s+is\s+like\s+([^!]+)!'
matches = re.findall(pattern, content, re.IGNORECASE)

Character Limit Handling:

  • Target: 120 characters per saying
  • Acceptance: Up to 130 characters (allows slight overflow from OpenAI)
  • Truncation: If >130 chars, truncate at last complete sentence (ending with !)
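
A small sketch of this acceptance/truncation rule, with the thresholds taken from the text above:

MAX_CHARS = 130                            # target is 120, but up to 130 is accepted

def enforce_length(saying: str) -> str | None:
    if len(saying) <= MAX_CHARS:
        return saying                      # accept, allowing slight overflow past 120
    cut = saying[:MAX_CHARS].rfind("!")    # truncate at the last complete sentence
    return saying[:cut + 1] if cut != -1 else None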