Assay Architecture

Event-Driven Serverless Design for Document Intelligence

8. Operational Considerations

Error Handling & Fault Tolerance

Retry Logic: Pub/Sub redelivers messages that a worker fails to acknowledge, using exponential backoff between attempts. Workers are idempotent, so re-executing a message is safe.
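
As a minimal sketch of what idempotent re-execution can look like, the example below records the result of each (document, stage) pair and skips the work when the same message arrives again. The StageStore class, the stage name, and the in-memory dictionary are hypothetical stand-ins for illustration; they are not Assay's actual worker code or datastore.

```python
from dataclasses import dataclass, field


@dataclass
class StageStore:
    """In-memory stand-in (hypothetical) for the store of completed stages."""
    completed: dict = field(default_factory=dict)

    def run_once(self, document_id, stage, work):
        """Run `work` only if this (document, stage) pair has no stored result."""
        key = (document_id, stage)
        if key in self.completed:
            # Redelivered message: return the stored result, do no new work.
            return self.completed[key]
        result = work()
        self.completed[key] = result
        return result


store = StageStore()
calls = []


def summarize():
    calls.append("summarize")
    return "summary-v1"


first = store.run_once("doc-123", "summary", summarize)
second = store.run_once("doc-123", "summary", summarize)  # simulated redelivery
```

Keying on the document ID and stage means a redelivered message finds the earlier result instead of producing a second one.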

Error States: Failed processing updates Firestore status to failed with error details. Users can see error messages in the UI and retry if needed.
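
The failed state described above can be sketched as a wrapper that turns an exception into the status fields a worker would persist. The field names, the "completed" status value, and the process_with_status helper are assumptions made for this example; only the "failed" status and the presence of error details come from the text.

```python
def process_with_status(document_id, process):
    """Run `process` and return the status fields a worker might write."""
    try:
        process()
    except Exception as exc:
        # Keep enough detail for the UI to show a message and offer a retry.
        return {
            "documentId": document_id,
            "status": "failed",
            "error": {"type": type(exc).__name__, "message": str(exc)},
        }
    # "completed" is a hypothetical name for the success state.
    return {"documentId": document_id, "status": "completed", "error": None}
```

A real worker would write this record to Firestore and, for transient errors, would likely also let the failure propagate so the message is redelivered.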

Monitoring: All workers log structured events with document IDs, user IDs, and processing metadata. Errors are logged with full context for debugging.

Observability

Logging: Structured logging using Google Cloud Logging with:

  • Document ID and user ID in every log entry
  • Processing stage and progress percentage
  • Error details with stack traces
  • Performance metrics (processing time, token counts)
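
A log entry carrying the fields listed above might look like the following sketch. It writes one JSON object per line to stdout, which Cloud Logging can parse as a structured entry when running on Cloud Functions; the severity and message keys are recognized by Cloud Logging, while the remaining field names (documentId, userId, stage, progress, processingMs, tokenCount, stackTrace) and the sample values are assumptions chosen for illustration.

```python
import json
import traceback


def log_event(severity, message, document_id, user_id, **fields):
    """Build one JSON log line, print it to stdout, and return it."""
    entry = {
        "severity": severity,
        "message": message,
        "documentId": document_id,
        "userId": user_id,
        **fields,
    }
    line = json.dumps(entry)
    print(line)
    return line


# Progress entry with performance metrics (values are made up).
info_line = log_event(
    "INFO", "summary generated", "doc-123", "user-456",
    stage="summarize", progress=80, processingMs=1840, tokenCount=4200,
)

# Error entry that carries the stack trace as a field.
try:
    raise ValueError("unreadable PDF")
except ValueError:
    error_line = log_event(
        "ERROR", "processing failed", "doc-123", "user-456",
        stage="extract", stackTrace=traceback.format_exc(),
    )
```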

Monitoring: Cloud Functions provide built-in metrics:

  • Invocation count
  • Execution time
  • Error rate
  • Memory usage

Alerting: Alerts can be configured on error rates, processing failures, or system health issues.

Cost Optimization

Serverless Scaling: Functions scale to zero during quiet periods, eliminating idle costs.

Summary-Only Storage: PDFs are deleted after processing and only the AI-generated summaries are retained (no extracted text is kept), reducing storage costs by 10-20x.

Efficient Queries: Firestore indexes optimized for common query patterns, reducing read costs.

Caching: Frontend uses static generation and CDN caching to minimize database reads.

Security

Authentication: Firebase Authentication for user management, API keys for programmatic access.

Authorization: Document-level access control (users can only access their own documents or public documents).
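
The access rule above can be expressed as a single predicate. This is a sketch under assumptions: the Document type, its field names, and the can_read function are hypothetical, and admin overrides are left out for brevity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Document:
    owner_id: str
    visibility: str  # "public" or "private"


def can_read(user_id: Optional[str], doc: Document) -> bool:
    """Public documents are readable by anyone; private ones only by the owner."""
    if doc.visibility == "public":
        return True
    # user_id is None for anonymous visitors.
    return user_id is not None and user_id == doc.owner_id
```

For example, can_read(None, Document("alice", "public")) allows an anonymous visitor, while can_read("bob", Document("alice", "private")) denies another user.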

Data Privacy: Original PDFs are deleted after processing; only summaries are retained, and no extracted text from the PDFs is kept.

API Security: API keys are stored as hashed secrets, accepted only in request headers, and expire after a limited time.
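
One way these three properties could fit together is sketched below. The choice of SHA-256, the x-api-key header name, the record field names, and the helper functions are all assumptions for illustration; the document does not specify them.

```python
import hashlib
import hmac
import secrets
import time


def hash_secret(secret: str) -> str:
    """Hash a key secret so the plaintext never needs to be stored."""
    return hashlib.sha256(secret.encode("utf-8")).hexdigest()


def issue_key(ttl_seconds: float, now: float = None):
    """Return (plaintext secret for the user, record to store server-side)."""
    now = time.time() if now is None else now
    secret = secrets.token_urlsafe(32)
    record = {"secretHash": hash_secret(secret), "expiresAt": now + ttl_seconds}
    return secret, record


def verify(headers: dict, record: dict, now: float = None) -> bool:
    """Accept the key only from a header, only before expiry, only if it matches."""
    now = time.time() if now is None else now
    presented = headers.get("x-api-key")  # hypothetical header name
    if presented is None:
        return False
    if now >= record["expiresAt"]:
        return False
    # Constant-time comparison of the two hex digests.
    return hmac.compare_digest(hash_secret(presented), record["secretHash"])
```

Reading the key only from headers keeps it out of URLs, which tend to end up in access logs and browser history.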

User Tiers

Anonymous: Can browse public documents only, no uploads.

Member (Registered): Can upload documents, access private collection, use MCP integration.

Premium (Upgraded): Same as Member, plus access to Premium quality processing (Gemini 2.5 Pro).

Admin: Full system access for moderation and management.
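
The tiers above can be read as a capability table. In the sketch below, the Tier enum, the capability names, and the allowed helper are hypothetical; treating Admin as holding every capability plus moderation is an assumption based on "full system access".

```python
from enum import Enum


class Tier(Enum):
    ANONYMOUS = "anonymous"
    MEMBER = "member"
    PREMIUM = "premium"
    ADMIN = "admin"


MEMBER_CAPS = {"browse_public", "upload", "private_collection", "mcp"}

CAPABILITIES = {
    Tier.ANONYMOUS: {"browse_public"},
    Tier.MEMBER: MEMBER_CAPS,
    Tier.PREMIUM: MEMBER_CAPS | {"premium_processing"},
    # Assumption: admins hold every capability plus moderation.
    Tier.ADMIN: MEMBER_CAPS | {"premium_processing", "moderate"},
}


def allowed(tier: Tier, capability: str) -> bool:
    """Check whether a tier includes a capability."""
    return capability in CAPABILITIES[tier]
```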

Document Visibility

Public: Document summaries are visible to all users, enabling community discovery.

Private: Document summaries are visible only to the uploader.

Visibility Selection: Visibility is chosen at upload time and cannot be changed afterward, for privacy and data integrity reasons.

9. Conclusion

Assay's architecture demonstrates how event-driven, serverless design can create a scalable, maintainable, and user-friendly document intelligence system. By decoupling processing stages, enabling parallel execution, and providing real-time updates, we've built a system that scales automatically while delivering intelligent insights to users.

Key Strengths:

  • Automatic Scaling: Handles traffic spikes without manual intervention
  • Fault Tolerance: Retry logic and idempotent workers ensure reliable processing
  • Real-Time Updates: Users see progress as it happens
  • Cost Efficiency: Pay-per-use model with automatic scaling to zero
  • Extensibility: New features can be added by subscribing to existing topics

Future Enhancements:

  • Additional processing stages (e.g., citation extraction, figure analysis)
  • Enhanced similarity algorithms (e.g., semantic embeddings)
  • Advanced search capabilities (e.g., full-text search, semantic search)
  • Multi-language support
  • Batch processing for large document sets

The architecture continues to evolve as we add new capabilities, but the core principles (event-driven design, serverless execution, and real-time updates) remain the foundation for everything we build.

Explore Assay: assay.cirrusly-clever.com

Glossary

L0 Theme: Root domain in the canonical taxonomy (a broad category, e.g., ARTIFICIAL_INTELLIGENCE).

L1 Theme: Specific theme in the canonical taxonomy (e.g., ARTIFICIAL_INTELLIGENCE.AI_SAFETY).

MCP: Model Context Protocol, a standard protocol for AI assistants to interact with external tools.

Pub/Sub: Google Cloud Pub/Sub, a messaging service for event-driven architecture.

Jaccard Similarity: Set similarity measure, |A ∩ B| / |A ∪ B|, used for theme overlap calculation.

Idempotent: An operation that can be repeated safely because running it again has the same effect as running it once.

Single-Pass Strategy: Direct summary generation for documents < 5,000 tokens.

Hierarchical Strategy: Chunk-and-merge summarization for documents ≥ 5,000 tokens.
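
Two of the glossary entries are small enough to sketch directly: Jaccard similarity over theme sets, and the token threshold that separates the single-pass and hierarchical strategies. The function names, the strategy labels, and the handling of two empty sets are assumptions made for this example; the formula and the 5,000-token boundary come from the definitions above.

```python
def jaccard(a: set, b: set) -> float:
    """Return |A ∩ B| / |A ∪ B| for two sets."""
    union = a | b
    if not union:
        # Assumption: two empty sets are treated as having no overlap.
        return 0.0
    return len(a & b) / len(union)


SINGLE_PASS_LIMIT = 5_000  # tokens


def choose_strategy(token_count: int) -> str:
    """Pick a summarization strategy from the document's token count."""
    if token_count < SINGLE_PASS_LIMIT:
        return "single_pass"
    return "hierarchical"


doc_a = {"ARTIFICIAL_INTELLIGENCE.AI_SAFETY", "ARTIFICIAL_INTELLIGENCE.NLP"}
doc_b = {"ARTIFICIAL_INTELLIGENCE.AI_SAFETY", "ARTIFICIAL_INTELLIGENCE.VISION"}
overlap = jaccard(doc_a, doc_b)  # one shared theme out of three distinct themes
```

The theme identifiers other than ARTIFICIAL_INTELLIGENCE.AI_SAFETY are made up for the example.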