Workers: Cloud Functions scale automatically based on Pub/Sub message volume. During peak upload times, more workers spin up to handle the load. During quiet periods, workers scale down to zero, minimizing costs.
Frontend: The Next.js application is statically generated and served via the Firebase Hosting CDN, providing global edge caching and automatic scaling.
Database: Firestore automatically scales based on read/write volume, with no manual sharding required.
Multiple processing stages run in parallel, significantly reducing total processing time:
Firestore onSnapshot listeners provide real-time updates to the frontend, so users see progress as it happens. No polling required—updates appear instantly as processing completes.
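The listener pattern can be sketched without tying it to a specific SDK. In the minimal example below, `Subscribe` is a hypothetical abstraction over Firestore's `onSnapshot`, and `watchDocument` and `render` are illustrative names. The shape of the display document is left generic because its fields are not specified here; only the `display/{documentId}` path comes from the update flow described in this section.

```typescript
// Cancels a subscription; Firestore's onSnapshot also returns a function like this.
type Unsubscribe = () => void;

// Hypothetical abstraction over a real-time listener: onNext is called with the
// current document data and then again on every change.
type Subscribe<T> = (
  path: string,
  onNext: (data: T | undefined) => void
) => Unsubscribe;

// Subscribes to display/{documentId} and forwards each snapshot to the UI.
function watchDocument<T>(
  subscribe: Subscribe<T>,
  documentId: string,
  render: (data: T) => void
): Unsubscribe {
  return subscribe(`display/${documentId}`, (data) => {
    // A missing document yields undefined; skip rendering until it exists.
    if (data !== undefined) render(data);
  });
}
```

With the Firebase modular SDK, `subscribe` could wrap `onSnapshot(doc(db, path), (snap) => onNext(snap.data()))`. Calling the returned function when the component unmounts stops the listener.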
Update Flow:
- display/{documentId}
- onSnapshot listener in frontend
Users can select processing quality, affecting speed and cost:
Fast (Flash) Quality:
Premium (Pro) Quality:
Beyond processing individual documents, Assay enables intelligent document discovery through theme-based search and similarity matching.
Users can search for documents by selecting themes from the hierarchical taxonomy. The system uses fuzzy matching to find relevant themes, then queries documents that match those themes.
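The two steps (fuzzy theme matching, then querying by the matched themes) can be sketched as follows. This is a minimal illustration, not the actual implementation: the fuzzy-matching algorithm is not named in this section, so a bigram Dice coefficient stands in for it, and `Theme`, `scoreTheme`, and `batchThemeIds` are hypothetical names. The field weights and the 10-value batch size are the ones given under Search Implementation and Query Optimization; the batch size is a parameter so it can follow whatever limit Firestore currently enforces.

```typescript
interface Theme {
  id: string;
  label: string;
  synonyms: string[];
}

// Field weights as listed under "Search Implementation".
const WEIGHTS = { label: 0.5, id: 0.3, synonyms: 0.2 };

// Character bigrams of a normalized string, e.g. "abc" -> ["ab", "bc"].
function bigrams(text: string): string[] {
  const s = text.toLowerCase().trim();
  const grams: string[] = [];
  for (let i = 0; i < s.length - 1; i++) grams.push(s.slice(i, i + 2));
  return grams;
}

// Stand-in fuzzy measure (assumption): Dice coefficient over bigrams, in [0, 1].
// Strings shorter than two characters have no bigrams and score 0.
function fuzzy(a: string, b: string): number {
  const x = bigrams(a);
  const y = bigrams(b);
  if (x.length === 0 || y.length === 0) return 0;
  const counts: Record<string, number> = {};
  for (const g of y) counts[g] = (counts[g] || 0) + 1;
  let shared = 0;
  for (const g of x) {
    if ((counts[g] || 0) > 0) {
      shared++;
      counts[g]--;
    }
  }
  return (2 * shared) / (x.length + y.length);
}

// Weighted score for one theme; synonyms contribute their best match.
function scoreTheme(query: string, theme: Theme): number {
  const bestSynonym = theme.synonyms.reduce(
    (best, s) => Math.max(best, fuzzy(query, s)),
    0
  );
  return (
    WEIGHTS.label * fuzzy(query, theme.label) +
    WEIGHTS.id * fuzzy(query, theme.id) +
    WEIGHTS.synonyms * bestSynonym
  );
}

// Splits theme IDs into batches that each fit one array-contains-any query.
function batchThemeIds(ids: string[], maxPerQuery = 10): string[][] {
  const batches: string[][] = [];
  for (let i = 0; i < ids.length; i += maxPerQuery) {
    batches.push(ids.slice(i, i + maxPerQuery));
  }
  return batches;
}
```

Each batch would back one `array-contains-any` query, with the results merged and de-duplicated by document ID afterwards.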
Search Implementation:
- Theme label (weight: 0.5), Theme id (weight: 0.3), Theme synonyms[] (weight: 0.2)
Query Optimization:
- array-contains-any queries (supports up to 10 values per query)
When viewing a document, users can discover similar documents based on theme overlap. The system calculates similarity scores by comparing theme sets between documents, prioritizing documents with more specific theme matches.
Similarity is calculated using a weighted Jaccard coefficient that measures the proportion of overlapping themes between documents:
L1_Jaccard = |L1_intersection| / |L1_union|
L0_Jaccard = |L0_intersection| / |L0_union|
Final_Score = (0.8 × L1_Jaccard) + (0.2 × L0_Jaccard)
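The formulas above translate directly into code. The sketch below assumes each document carries two arrays of theme IDs, one per taxonomy level; `DocumentThemes`, `jaccard`, and `similarityScore` are illustrative names. Returning 0 when both theme sets are empty is also an assumption, since that case is not covered here.

```typescript
interface DocumentThemes {
  l0: string[]; // L0 theme IDs
  l1: string[]; // L1 theme IDs
}

// Plain Jaccard coefficient: |A ∩ B| / |A ∪ B|.
function jaccard(a: string[], b: string[]): number {
  const setA = new Set(a);
  const setB = new Set(b);
  // Assumption: two empty sets are treated as having no similarity.
  if (setA.size === 0 && setB.size === 0) return 0;
  let intersection = 0;
  setA.forEach((item) => {
    if (setB.has(item)) intersection++;
  });
  const union = setA.size + setB.size - intersection;
  return intersection / union;
}

// Final_Score = (0.8 × L1_Jaccard) + (0.2 × L0_Jaccard)
function similarityScore(a: DocumentThemes, b: DocumentThemes): number {
  return 0.8 * jaccard(a.l1, b.l1) + 0.2 * jaccard(a.l0, b.l0);
}
```

For example, two documents that share 2 of 4 distinct L1 themes (L1_Jaccard = 0.5) and have identical L0 themes (L0_Jaccard = 1) score 0.8 × 0.5 + 0.2 × 1 = 0.6.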
Why Jaccard?
Weighting:
Query Process:
- array-contains-any)
Result Grouping:
Assay supports multiple integration patterns, enabling users to interact with their document library through various interfaces:
The primary interface is a Next.js web application that provides:
Technology Stack:
Cloud Functions expose HTTPS endpoints that enable programmatic access to document processing, search, and retrieval capabilities.
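As a rough sketch of what programmatic access could look like, the helper below builds a request for the document endpoints using the base URL, `X-API-Key` header, and key format documented in this section. The names `looksLikeApiKey`, `ApiRequest`, and `documentRequest` are hypothetical. The key check is only a loose shape test that assumes `keyId` contains no underscore; the real validation rules are not described here.

```typescript
// Base URL as documented under "API Structure".
const BASE_URL = "https://api.assay.cirrusly-clever.com/api/v1";

// Loose shape check for ask_live_{keyId}_{keySecret}.
// Assumption: keyId contains no underscore.
function looksLikeApiKey(key: string): boolean {
  return /^ask_live_[^_]+_.+$/.test(key);
}

interface ApiRequest {
  url: string;
  headers: Record<string, string>;
}

// Builds a GET request for /documents/:id, optionally for its
// /summary or /similar sub-resource.
function documentRequest(
  apiKey: string,
  documentId: string,
  resource?: "summary" | "similar"
): ApiRequest {
  if (!looksLikeApiKey(apiKey)) {
    throw new Error("API key does not match ask_live_{keyId}_{keySecret}");
  }
  const base = `${BASE_URL}/documents/${encodeURIComponent(documentId)}`;
  return {
    url: resource ? `${base}/${resource}` : base,
    headers: { "X-API-Key": apiKey },
  };
}
```

The result can be passed to `fetch(request.url, { headers: request.headers })`. Query parameters and response formats are not described in this section, so they are left out.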
API Structure:
- Base URL: https://api.assay.cirrusly-clever.com/api/v1
- Authentication: X-API-Key header
- API key format: ask_live_{keyId}_{keySecret}
API Key Management:
Available Endpoints:
- GET /api/v1/health - Health check
- GET /api/v1/me - Current user/key info
- GET /api/v1/documents/search - Unified search
- GET /api/v1/documents/:id - Get document
- GET /api/v1/documents/:id/summary - Get summaries (comprehensive, casual, or FAQ)
- GET /api/v1/documents/:id/similar - Get similar documents
- GET /api/v1/themes - Browse canonical themes
Assay supports the Model Context Protocol, enabling integration with AI assistants and other tools that support the protocol.
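Under the Model Context Protocol, a client invokes a server tool by sending a JSON-RPC 2.0 request with the method `tools/call`. The sketch below builds such a message for one of Assay's tools. The `buildToolCall` helper and the `documentId` argument name are hypothetical, since the tool input schemas are not listed here; a client would normally read them from the server's `tools/list` response.

```typescript
// Shape of an MCP tool invocation (JSON-RPC 2.0 request).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: {
    name: string;
    arguments: Record<string, unknown>;
  };
}

// Builds the request message for calling a named tool with arguments.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Hypothetical call: the argument name is an assumption, not Assay's schema.
const similarRequest = buildToolCall(1, "get_similar_documents", {
  documentId: "doc123",
});
```

In practice an MCP client library or AI assistant constructs these messages itself; the sketch only shows what travels over the wire.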
MCP Server:
Available Tools (13 Total):
Search & Discovery:
- search_documents - Search by theme, author, or title
- search_by_theme - Search documents by specific canonical theme
- search_by_author - Find documents by specific authors
- search_by_title - Search document titles
- search_by_keywords - Search in keywords, concepts, and phrases
- browse_themes - Explore the canonical theme taxonomy
- browse_all_documents - Browse all documents with optional filters
Document Retrieval:
- get_document_summary - Get comprehensive, casual, or FAQ summaries
- get_similar_documents - Find related documents using Jaccard similarity
- get_library_insight - Get personalized research profile insight
Advanced Analysis:
- ask_question - Ask questions about your library with AI-powered answer synthesis
- compare_documents - Compare up to 10 documents with AI-powered comparison synthesis
- produce_faq - Generate FAQs from multiple documents by theme
MCP Interaction Modes:
Mode 1: MCP Only (Basic Integration)
Mode 2: MCP + Skills (Enhanced Integration)