BRONZE LAYER — Raw Data Ingestion & ETL
Goal: Get all 16,852 historical records into unified JSON format

--------------------------------------------------------------------------------
STORY B-01: Build IRIS Solution Intake Excel Ingestion Pipeline
--------------------------------------------------------------------------------
Story Name: Build IRIS Solution Intake Excel Ingestion Pipeline
Epic: Bronze Layer - Data Ingestion
Story Points: 5
Sprint: Sprint 1
Priority: Critical
Assignee: [TBD - Backend Developer]
Labels: bronze, etl, iris, data-pipeline

Description:
As a data engineer, I need to build a Python script using openpyxl that reads the IRIS Solution Intake XLSX file (3,091 records, 124 columns) and outputs normalized JSON records so that downstream semantic processing can consume clean, standardized data.

The script must:
- Read IRIS_Solution_Intake_requests.xlsx
- Parse all 124 fields per record
- Handle missing/null values gracefully (empty string, not None)
- Extract requestor names from "Full Name (ID)" format
  e.g., "Johnny Hoogenboom (406508)" → name: "Johnny Hoogenboom", id: "406508"
- Normalize date fields to ISO 8601 format
  e.g., "2025-08-26 04:55:58" → "2025-08-26T04:55:58Z"
- Output each record as a JSON object
- Store output to Content Sphere Bronze bucket (partitioned: source=iris)

Key fields to extract:
- number (RITM number), cat_item, stage, state, approval
- request.requested_for, request.opened_by, due_date
- short_description, description, work_notes, comments
- business_service, service_offering, assignment_group
- opened_at, closed_at, made_sla, project_id

Acceptance Criteria:
AC-1: Script reads IRIS_Solution_Intake_requests.xlsx without errors
AC-2: Outputs exactly 3,091 JSON records
AC-3: All 124 fields mapped to output JSON
AC-4: Date fields converted to ISO 8601 (e.g., "2025-08-26T04:55:58Z")
AC-5: Names extracted from parenthetical format (name separate from ID)
AC-6: Null/missing values handled as empty strings
AC-7: Output JSON stored to Content Sphere Bronze bucket
AC-8: Script logs processing stats (records processed, errors, time taken)
AC-9: Unit tests cover date parsing, name extraction, null handling

Dependencies: None (first story in pipeline)
Test Data: IRIS_Solution_Intake_requests.xlsx (in project files)

--------------------------------------------------------------------------------
STORY B-02: Build Jira Epics Excel Ingestion Pipeline with Summary Parsing
--------------------------------------------------------------------------------
Story Name: Build Jira Epics Excel Ingestion Pipeline with Summary Parsing
Epic: Bronze Layer - Data Ingestion
Story Points: 8
Sprint: Sprint 1
Priority: Critical
Assignee: [TBD - Backend Developer]
Labels: bronze, etl, jira, epics, regex

Description:
As a data engineer, I need to build a Python script that reads the MLL_JIRA_Epics_extract.xlsx file (7,210 records, 476 columns) and parses the Epic Summary field using a regex pattern to extract structured site and request type information.
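A minimal sketch of such a parser, assuming the documented "{SITE}, {COUNTRY_CODE} - {Request Type} - {RITM_NUMBER}" Summary pattern (the group names mirror the target fields; the exact regex is an illustration, not the final implementation):

```python
import re

# Assumed pattern: "{SITE}, {COUNTRY_CODE} - {Request Type} - {RITM_NUMBER}"
SUMMARY_RE = re.compile(
    r"^(?P<site_name>[^,]+),\s*(?P<site_country_code>[A-Z]{2})"
    r"\s*-\s*(?P<request_type_parsed>.+?)"
    r"\s*-\s*(?P<linked_ritm>RITM\d+)\s*$"
)

def parse_summary(summary: str) -> dict:
    """Return the parsed fields, or empty strings when the pattern does not match."""
    keys = ("site_name", "site_country_code", "request_type_parsed", "linked_ritm")
    m = SUMMARY_RE.match(summary or "")
    if not m:
        # Non-matching records get empty parsed fields, no errors (per AC-7)
        return {k: "" for k in keys}
    return {k: m.group(k) for k in keys}
```

The non-greedy request-type group lets the final " - RITM…" segment anchor the match, so request types containing spaces parse cleanly.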
The Epic Summary follows this pattern:
  "{SITE}, {COUNTRY_CODE} - {Request Type} - {RITM_NUMBER}"
Example: "JACKSONVILLE, US - InfoLink Modules Implementation - RITM000023585344"

The regex must extract:
- site_name: "JACKSONVILLE"
- site_country_code: "US"
- request_type_parsed: "InfoLink Modules Implementation"
- linked_ritm: "RITM000023585344"

The script must also extract:
- site_id from labels (e.g., "Site_5619" → "5619")
- site_region from epic_region field
- site_tier from labels (e.g., "Tier_1")
- epic_components as array (e.g., ["ProjectDemand", "Network"])
- epic_labels as array
- epic_issue_links as structured objects

Acceptance Criteria:
AC-1: Script reads MLL_JIRA_Epics_extract.xlsx without errors
AC-2: Outputs exactly 7,210 JSON records
AC-3: Regex correctly parses Summary field for records matching pattern
AC-4: Extracted fields: site_name, site_country_code, request_type_parsed, linked_ritm
AC-5: site_id extracted from labels (Site_XXXX pattern)
AC-6: site_tier extracted from labels (Tier_X pattern)
AC-7: Records not matching Summary pattern have empty parsed fields (no errors)
AC-8: epic_components and epic_labels stored as JSON arrays
AC-9: Output stored to Content Sphere Bronze bucket (partitioned: source=jira_epics)
AC-10: Unit tests cover regex parsing with 10+ sample summaries

Dependencies: None (can run parallel with B-01)
Test Data: MLL_JIRA_Epics_extract.xlsx (in project files)

--------------------------------------------------------------------------------
STORY B-03: Build Jira User Stories Excel Ingestion Pipeline
--------------------------------------------------------------------------------
Story Name: Build Jira User Stories Excel Ingestion Pipeline
Epic: Bronze Layer - Data Ingestion
Story Points: 5
Sprint: Sprint 1
Priority: High
Assignee: [TBD - Backend Developer]
Labels: bronze, etl, jira, user-stories

Description:
As a data engineer, I need to build a Python script that reads the MLL_JIRA_Userstory_extracts.xlsx file (6,551 records, 407 columns) and links each story to its parent Epic via the Parent ID field so that stories can be nested under their Epics in the unified record.

Key fields to extract per story:
- story_key (Issue key, e.g., "ABFZ-97354")
- summary
- status (Completed, Open, In Progress, etc.)
- assignee
- story_points
- sprint (sprint name)
- description (full text)
- parent_id (links to Epic)
- acceptance_criteria (if present in description)

The output must be a lookup dictionary { parent_epic_id: [list of stories] } so the Unified Field Mapper (B-06) can attach stories to each Epic.

Acceptance Criteria:
AC-1: Script reads MLL_JIRA_Userstory_extracts.xlsx without errors
AC-2: Outputs exactly 6,551 story records
AC-3: Each story linked to parent Epic via parent_id
AC-4: Output includes: story_key, summary, status, assignee, story_points, sprint, description
AC-5: Lookup dictionary keyed by parent Epic ID produced
AC-6: Stories with no parent_id logged as warnings (not errors)
AC-7: Output stored to Content Sphere Bronze bucket (partitioned: source=jira_stories)
AC-8: Unit tests verify parent linkage with known Epic-Story pairs

Dependencies: None (can run parallel with B-01 and B-02)
Test Data: MLL_JIRA_Userstory_extracts.xlsx (in project files)

--------------------------------------------------------------------------------
STORY B-04: Build PDF Invoice OCR Extraction Pipeline
--------------------------------------------------------------------------------
Story Name: Build PDF Invoice OCR Extraction Pipeline
Epic: Bronze Layer - Data Ingestion
Story Points: 8
Sprint: Sprint 2
Priority: High
Assignee: [TBD - Backend Developer]
Labels: bronze, etl, ocr, pdf, invoices

Description:
As a data engineer, I need to build an OCR pipeline using Tesseract + PyMuPDF + pdfplumber that extracts structured cost data from ~2,200 PDF vendor invoices so that cost estimation features can use real historical pricing data.
The pipeline must extract from each invoice:
- vendor_name (e.g., "Crown Equipment Corporation")
- invoice_number (e.g., "INV-2024-AESR-0847")
- invoice_date
- po_number (Purchase Order)
- currency (USD, EUR, etc.)
- line_items: array of { description, quantity, unit_cost, total }
  e.g., { "InfoLink Terminal IT5000": qty=24, unit=$1,850, total=$44,400 }
- subtotal
- total_amount

The pipeline must handle:
- Multi-page invoices (concatenate pages before parsing)
- Scanned vs digital PDFs (OCR for scanned, text extraction for digital)
- Multiple table formats (vendors use different layouts)
- Currency symbols and number formatting

Acceptance Criteria:
AC-1: Pipeline processes PDF files from input directory
AC-2: Extracts vendor_name, invoice_number, invoice_date, po_number
AC-3: Extracts line items with description, quantity, unit_cost, total
AC-4: Handles multi-page invoices correctly
AC-5: Uses OCR (Tesseract) for scanned PDFs, text extraction for digital
AC-6: Output as structured JSON per invoice
AC-7: Error logging for invoices that fail extraction (with PDF filename)
AC-8: >85% successful extraction rate on test set of 100 invoices
AC-9: Output stored to Content Sphere Bronze bucket (partitioned: source=pdf_invoices)
AC-10: Processing time <2 minutes per invoice on average

Dependencies: None
Test Data: Sample PDF invoices (to be provided)

--------------------------------------------------------------------------------
STORY B-05: Build Confluence Knowledge Base Ingestion Pipeline
--------------------------------------------------------------------------------
Story Name: Build Confluence Knowledge Base Ingestion Pipeline
Epic: Bronze Layer - Data Ingestion
Story Points: 5
Sprint: Sprint 2
Priority: Medium
Assignee: [TBD - Backend Developer]
Labels: bronze, etl, confluence, knowledge-base

Description:
As a data engineer, I need to build a pipeline that extracts content from key Confluence pages used in MLL intake processes so that the Knowledge Agent has access to process documentation and guidelines.

Target Confluence pages:
- MLLF Intake Process (detailed submission instructions)
- MLL Network and Firewall (network team engagement process)
- Intake vs Incident decision guide
- TS Engage or IRIS routing guide
- User Story Checklist template

The pipeline must:
- Connect to Confluence REST API (or use exported HTML)
- Extract page title, body content (cleaned HTML → plain text)
- Preserve section structure (headings, lists, tables)
- Extract linked KB article references
- Store as JSON documents with page_id, title, content, last_updated

Acceptance Criteria:
AC-1: All 5 target Confluence pages extracted
AC-2: HTML cleaned to plain text with structure preserved
AC-3: Section headings maintained for chunking
AC-4: Tables converted to structured text
AC-5: Output JSON includes: page_id, title, content, sections[], last_updated
AC-6: Stored to Content Sphere Bronze bucket (partitioned: source=confluence)
AC-7: Handles Confluence API authentication

Dependencies: Confluence API access credentials
Test Data: Confluence page URLs (in project screenshots)

--------------------------------------------------------------------------------
STORY B-06: Build Unified Field Mapper (IRIS + Epic + Stories + PDF Merge)
--------------------------------------------------------------------------------
Story Name: Build Unified Field Mapper - Merge All Sources
Epic: Bronze Layer - Data Ingestion
Story Points: 8
Sprint: Sprint 2-3
Priority: Critical
Assignee: [TBD - Senior Backend Developer]
Labels: bronze, etl, unified-schema, merge, critical-path

Description:
As a data engineer, I need to create the Unified Field Mapper that merges data from all four sources (IRIS, Jira Epics, Jira Stories, PDF invoices) into a single unified JSON record per request, following the MLL_Unified_Ticket_Schema.json schema with 12 logical sections.

Merge logic:
1. Start with each IRIS RITM record as the base
2. Link to Jira Epic via work_notes field (contains Jira URL like "https://jira.jnj.com/browse/ABFZ-97353")
3. Attach User Stories to Epic via parent_id linkage
4. Attach PDF invoice data to Epic via RITM number match
5. For Epics without IRIS RITM, create record from Epic data only
6. Compute derived fields:
   - duration_days (closed_at - opened_at)
   - has_network_requirement (true if "Network" in components)
   - has_quotation (true if PDF invoice linked)
   - story_count (number of linked stories)
   - attachment_count (number of attachments)

The 12 schema sections:
1. iris_identity
2. iris_workflow
3. iris_people
4. epic_identity
5. epic_classification
6. epic_description
7. epic_scope
8. epic_timeline
9. user_stories
10. network_request
11. quotation
12. computed_fields

Output: ~16,852 unified JSON records (union of IRIS + Epics)

Acceptance Criteria:
AC-1: Merges IRIS + Epics + Stories + PDF data into single records
AC-2: IRIS-to-Epic linkage via work_notes Jira URL extraction
AC-3: Stories nested under their parent Epic
AC-4: PDF invoice data attached via RITM number
AC-5: 12-section schema structure maintained per record
AC-6: Computed fields calculated correctly (duration, story_count, etc.)
AC-7: ~16,852 unified records produced (IRIS + Epics union)
AC-8: Records with partial data (missing Epic or missing IRIS) handled gracefully
AC-9: Deduplication: no duplicate records for same RITM/Epic pair
AC-10: Output stored to Content Sphere Bronze bucket (partitioned: source=unified)
AC-11: Mapping report generated: fields mapped, fields dropped, merge stats

Dependencies: B-01, B-02, B-03, B-04 (all source ingestion complete)
Test Data: Output from B-01 through B-04

--------------------------------------------------------------------------------
STORY B-07: Bronze Layer Validation & Quality Checks
--------------------------------------------------------------------------------
Story Name: Bronze Layer Data Validation & Quality Report
Epic: Bronze Layer - Data Ingestion
Story Points: 3
Sprint: Sprint 3
Priority: High
Assignee: [TBD - Backend Developer]
Labels: bronze, validation, quality

Description:
As a data engineer, I need to build a validation script that verifies the quality and completeness of all Bronze layer data before it moves to the Silver layer for semantic processing.

Validation checks:
- Record count verification (3,091 + 7,210 + 6,551 expected)
- Required field completeness rates (description, site_name, etc.)
- Date format consistency (all ISO 8601)
- RITM-to-Epic linkage success rate
- Story-to-Epic linkage success rate
- PDF extraction success rate
- Duplicate detection (same RITM or Epic appearing twice)

Output: HTML quality report with pass/fail status and statistics.
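The completeness-rate check behind the pass/fail gate can be sketched as follows (the 95% threshold comes from the acceptance criteria; the record shape and function names are illustrative):

```python
def completeness_rate(records: list[dict], field: str) -> float:
    """Fraction of records where `field` is present and non-empty."""
    if not records:
        return 0.0
    filled = sum(1 for r in records if r.get(field))
    return filled / len(records)

def passes_gate(records: list[dict], required: list[str],
                threshold: float = 0.95) -> bool:
    """Pass only if every required field meets the completeness threshold."""
    return all(completeness_rate(records, f) >= threshold for f in required)
```

Treating empty strings as missing (the falsy check in `r.get(field)`) matches the Bronze convention of defaulting nulls to "".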
Acceptance Criteria:
AC-1: Validates record counts per source
AC-2: Reports field completeness percentages
AC-3: Flags records with missing required fields
AC-4: Reports linkage success rates (RITM→Epic, Story→Epic)
AC-5: Identifies duplicate records
AC-6: Generates HTML quality report
AC-7: Returns pass/fail status (pass = >95% completeness)
AC-8: Report stored to Content Sphere Bronze bucket

Dependencies: B-06 (unified records available)

SILVER LAYER — Semantic Processing & Embedding
Goal: Convert unified JSON into searchable semantic vectors

--------------------------------------------------------------------------------
STORY S-01: Build Text Cleaning Module (clean_text)
--------------------------------------------------------------------------------
Story Name: Build Text Cleaning Module for Embedding Quality
Epic: Silver Layer - Semantic Processing
Story Points: 3
Sprint: Sprint 3
Priority: Critical
Assignee: [TBD - ML Engineer]
Labels: silver, semantic, text-cleaning, nlp

Description:
As an ML engineer, I need to implement the clean_text() function that removes noise from raw ticket text fields before they are converted to semantic documents. This directly impacts embedding quality — noisy text produces poor vectors that reduce RAG retrieval accuracy.

The function must handle these noise patterns:
- Jira markup: [~username], {panel:title=...}, {code:...}, {noformat}
- Confluence artifacts: _x000D_ carriage returns
- URLs → replace with [URL] token (URLs add noise to embeddings)
- Email addresses → replace with [EMAIL] token
- MAC addresses (e.g., 00:07:4d:a0:1b:cb) → replace with [MAC] token
- Multiple consecutive newlines → collapse to double newline
- Multiple spaces/tabs → collapse to single space
- Non-breaking spaces (U+00A0) → regular space
- Leading/trailing whitespace → strip

Important: Do NOT remove RITM numbers or Jira keys — they carry identity meaning for the embedding model.
Acceptance Criteria:
AC-1: Removes all Jira markup patterns ([~user], {panel}, {code}, {noformat})
AC-2: Removes _x000D_ artifacts
AC-3: URLs replaced with [URL] token
AC-4: Email addresses replaced with [EMAIL] token
AC-5: MAC addresses replaced with [MAC] token
AC-6: Whitespace normalized (no triple+ newlines, no double+ spaces)
AC-7: Non-breaking spaces converted to regular spaces
AC-8: RITM numbers and Jira keys preserved (NOT removed)
AC-9: Function returns empty string for None/empty input
AC-10: Unit tests with 15+ test cases covering all patterns

Dependencies: None (can start while Bronze completes)
Reference: mll_semantic_document_converter.py Section 3

--------------------------------------------------------------------------------
STORY S-02: Build Schema Normalization Module (normalize_schema)
--------------------------------------------------------------------------------
Story Name: Build Schema Normalization Module
Epic: Silver Layer - Semantic Processing
Story Points: 3
Sprint: Sprint 3
Priority: Critical
Assignee: [TBD - ML Engineer]
Labels: silver, semantic, normalization

Description:
As an ML engineer, I need to implement normalize_schema() that standardizes field formats from the unified JSON into a structure suitable for the semantic template engine.

Normalization tasks:
- Extract names: "Johnny Hoogenboom (406508)" → "Johnny Hoogenboom"
- Normalize dates: "2025-08-26 04:55:58" → "2025-08-26T04:55:58Z"
- Human-readable dates: "2025-08-26T04:55:58Z" → "August 26, 2025"
- Compute derived fields:
  * days_open = (today - opened_at).days
  * sla_sentence = "The request met its SLA target." or "missed"
  * nexus_id_sentence = "A Nexus ID is required..." or ""
- Default empty values for missing optional sections
- Structure classification sub-object (request_type, capability_center, routing_system, complexity_tier, confidence_score)

Acceptance Criteria:
AC-1: Names extracted from "Name (ID)" format correctly
AC-2: All dates normalized to ISO 8601
AC-3: Human-readable date strings generated (e.g., "August 26, 2025")
AC-4: days_open computed correctly
AC-5: sla_sentence generated based on made_sla field
AC-6: nexus_id_sentence generated based on capability_center
AC-7: Classification sub-object populated
AC-8: Missing optional fields defaulted (empty string, not None)
AC-9: Unit tests with 10+ test cases

Dependencies: B-06 (unified JSON schema must be defined)
Reference: mll_semantic_document_converter.py Section 2

--------------------------------------------------------------------------------
STORY S-03: Build Semantic Document Template Engine (THE CORE)
--------------------------------------------------------------------------------
Story Name: Build Semantic Document Template Engine
Epic: Silver Layer - Semantic Processing
Story Points: 8
Sprint: Sprint 4
Priority: Critical (MOST IMPORTANT STORY IN SILVER)
Assignee: [TBD - Senior ML Engineer]
Labels: silver, semantic, template-engine, critical-path

Description:
As an ML engineer, I need to implement the semantic template engine that converts structured JSON records into natural language documents across 8 section templates. This is the CORE of the entire pipeline — the quality of these semantic documents directly determines RAG retrieval accuracy.

WHY THIS MATTERS: Embedding models (like all-MiniLM-L6-v2) are trained on natural language text, NOT on database field names. The template engine bridges this gap.

BAD input for embedding (raw key-value):
  "site_name: JACKSONVILLE, site_country_code: US, site_id: 5619"
GOOD input for embedding (semantic text):
  "This request is for site JACKSONVILLE, US (Site ID: 5619). The site is in the NA region and is classified as Tier 1. The sector is MedTech."

The 8 templates to implement:
1. TEMPLATE_IDENTITY - RITM + Epic + Site identifiers
   "MLL Solution Intake request {ritm_number} linked to Jira Epic {epic_key} in ABFZ project..."
2. TEMPLATE_SUMMARY - Request type + capability + confidence
   "Request summary: {epic_summary}. Request type: {request_type_parsed}. Classified as {classification_request_type}..."
3. TEMPLATE_DESCRIPTION - Business problem in natural language
   "Business problem and description for {ritm_number} ({epic_key}): {epic_description_cleaned}"
4. TEMPLATE_SCOPE - Components, team, labels
   "Components: {epic_components_text}. Labels: {epic_labels_text}. Submitted by {opened_by} for {requested_for}..."
5. TEMPLATE_TIMELINE - Dates, status, SLA
   "Timeline for {ritm_number}: Opened {opened_at_human}, state {state}, approval {approval}..."
6. TEMPLATE_STORIES - Linked stories and sprints (OPTIONAL - omit if empty)
   "User stories for {epic_key}: Story 1: {story_key}..."
7. TEMPLATE_NETWORK - LAN/WAN/Firewall (OPTIONAL - omit if no network)
   "Network requirements for {ritm_number}: Network type: {type}..."
8. TEMPLATE_QUOTATION - Vendor pricing (OPTIONAL - omit if no invoice)
   "Quotation and cost estimate for {ritm_number}: Vendor: {vendor}..."

Design principles:
- Write as natural language paragraphs, never key:value pairs
- Group related fields into coherent sentences
- Use domain vocabulary (RITM, Epic, MLL, FLNEC, etc.)
- Omit empty/null sections — shorter docs embed better
- Lead with most important info (summary, site, type)

Two output modes:
A) Full document: All 8 sections concatenated with section headers
B) Chunked: Dict of {section_name: text} for per-section embedding

Acceptance Criteria:
AC-1: All 8 templates implemented
AC-2: Each template produces natural language paragraphs (not key:value)
AC-3: Empty/null sections omitted from output
AC-4: Full document mode: concatenates all sections with [SECTION] headers
AC-5: Chunked mode: returns dict of {section_name: text}
AC-6: Template variables populated from normalized record via build_template_variables()
AC-7: Output matches reference format in semantic_document_output.txt
AC-8: doc_hash (SHA-256) generated for deduplication
AC-9: Processing time <50ms per record
AC-10: Unit tests verify output for sample record (InfoLink Jacksonville)

Dependencies: S-01 (clean_text), S-02 (normalize_schema)
Reference: mll_semantic_document_converter.py Sections 4-6
Test Data: sample_input_json.json → expected: semantic_document_output.txt

--------------------------------------------------------------------------------
STORY S-04: Build Embedding Generation Pipeline
--------------------------------------------------------------------------------
Story Name: Build Embedding Generation Pipeline (all-MiniLM-L6-v2)
Epic: Silver Layer - Semantic Processing
Story Points: 5
Sprint: Sprint 4
Priority: Critical
Assignee: [TBD - ML Engineer]
Labels: silver, embedding, sentence-transformers, vectors

Description:
As an ML engineer, I need to implement the embedding pipeline using sentence-transformers (all-MiniLM-L6-v2) that generates 384-dimensional vectors for each semantic document and chunk.

The pipeline must support two modes:
A) Full-document embedding: entire semantic doc → single 384-dim vector
B) Chunked embedding: each of the 8 sections → separate 384-dim vector

All vectors must be L2-normalized for cosine similarity computation.
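A dependency-free sketch of the L2 normalization and the cosine check it enables (all-MiniLM-L6-v2 would produce the 384-dim inputs; in practice these operations run on NumPy arrays, not Python lists):

```python
import math

def l2_normalize(vec: list[float]) -> list[float]:
    """Scale a vector to unit length so that dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return vec  # degenerate zero vector; later flagged by Silver validation
    return [x / norm for x in vec]

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two already L2-normalized vectors (plain dot product)."""
    return sum(x * y for x, y in zip(a, b))
```

Normalizing once at generation time keeps query-time similarity a single dot product, which is what the HNSW cosine metric in the vector store relies on.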
Batch processing requirements:
- Process all ~16,852 records
- Batch size: 256 records per batch (GPU) or 64 (CPU)
- Show progress bar during processing
- Save intermediate results every 1,000 records (resume on failure)
- Total processing time target: <60 minutes on CPU

Output format per record:
{
  "ritm_number": "RITM000023587097",
  "full_embedding": [0.023, -0.041, ...],   // 384 floats
  "chunk_embeddings": {
    "identity_summary": [0.018, ...],       // 384 floats
    "classification": [0.031, ...],         // 384 floats
    ...
  }
}

Acceptance Criteria:
AC-1: all-MiniLM-L6-v2 model loads successfully
AC-2: Full document → single 384-dim vector per record
AC-3: Each chunk → separate 384-dim vector
AC-4: All vectors L2-normalized (unit length)
AC-5: Batch processing for all ~16,852 records
AC-6: Progress bar shows processing status
AC-7: Intermediate saves every 1,000 records
AC-8: Total processing time <60 min on CPU
AC-9: Output embeddings saved as NumPy arrays (.npy)
AC-10: Spot-check: cosine similarity between related tickets > 0.7

Dependencies: S-03 (semantic documents must be generated first)
Reference: mll_semantic_document_converter.py Section 9

--------------------------------------------------------------------------------
STORY S-05: Build Metadata Extraction Module
--------------------------------------------------------------------------------
Story Name: Build Metadata Extraction for Filtered Vector Retrieval
Epic: Silver Layer - Semantic Processing
Story Points: 3
Sprint: Sprint 4
Priority: High
Assignee: [TBD - ML Engineer]
Labels: silver, metadata, filtering

Description:
As an ML engineer, I need to implement extract_metadata() that pulls filterable dimensions from each normalized record. These metadata fields are stored ALONGSIDE vectors in the vector store and used for filtered retrieval (e.g., "find similar tickets only from the same region").
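The doc_hash computation and flag derivation can be sketched as follows (field names follow the story; the record shape and helper names are illustrative, not the final extract_metadata() signature):

```python
import hashlib

def doc_hash(semantic_doc: str) -> str:
    """SHA-256 of the full semantic document, truncated to the first 16 hex chars,
    used for deduplication."""
    return hashlib.sha256(semantic_doc.encode("utf-8")).hexdigest()[:16]

def extract_flags(record: dict) -> dict:
    """Boolean retrieval flags; missing optional sections default to falsy."""
    return {
        "has_network": bool(record.get("network_request")),
        "has_quotation": bool(record.get("quotation")),
        "has_stories": bool(record.get("user_stories")),
    }
```

Hashing the rendered semantic document (rather than the raw record) means two records that render to identical text, and would therefore embed identically, collapse to one vector.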
Metadata fields to extract:
- Primary keys: ritm_number, request_number, epic_key, epic_id
- Site dimensions: site_name, site_country_code, site_id, site_region, site_tier
- Classification: request_type, capability_center, routing_system, complexity_tier, confidence_score
- Status: epic_status, epic_priority, epic_sector, state, stage
- Flags: has_network, has_quotation, has_stories
- Computed: doc_hash (SHA-256 of full semantic doc, first 16 chars)

These metadata fields enable queries like:
- "Find similar network requests at Tier 1 sites in NA region"
- "Find InfoLink implementations with vendor quotations"

Acceptance Criteria:
AC-1: All filterable fields extracted per record
AC-2: doc_hash computed (SHA-256, first 16 chars) for deduplication
AC-3: Metadata output as JSON compatible with vector store format
AC-4: Boolean flags (has_network, has_quotation, has_stories) computed
AC-5: Null/missing fields defaulted to empty string (not None)
AC-6: Metadata stored to Content Sphere Silver bucket
AC-7: Unit tests verify extraction for sample record

Dependencies: S-02 (normalize_schema)
Reference: mll_semantic_document_converter.py Section 7

--------------------------------------------------------------------------------
STORY S-06: Load Embeddings into LangFlow Vector Store
--------------------------------------------------------------------------------
Story Name: Load Embeddings into LangFlow-Compatible Vector Store
Epic: Silver Layer - Semantic Processing
Story Points: 8
Sprint: Sprint 5
Priority: Critical
Assignee: [TBD - ML Engineer]
Labels: silver, vector-store, langflow, hnsw, critical-path

Description:
As an ML engineer, I need to load all generated embeddings and metadata into the LangFlow-compatible vector store (AstraDB or Chroma) and configure the HNSW index for fast similarity search.

Setup tasks:
1. Provision vector store (AstraDB cloud or Chroma local)
2. Create collection/index: "mll_intake_vectors"
3. Configure HNSW index: 384 dimensions, cosine similarity metric
4. Bulk load all embeddings with associated metadata
5. Verify query latency: Top-K=5 retrieval in <3ms

Loading strategy:
- Load full-document embeddings (primary retrieval)
- Load chunk embeddings (secondary, for section-level search)
- Attach metadata to each vector for filtered queries
- Batch upload: 500 vectors per batch

Verification queries to run after loading:
1. "network drops Jacksonville warehouse" → should return InfoLink ticket
2. "server implementation Mexico" → should return JUAREZ server tickets
3. "firewall MACD request" → should return network/firewall tickets

Acceptance Criteria:
AC-1: Vector store provisioned (AstraDB or Chroma)
AC-2: Collection created with HNSW index (384-dim, cosine)
AC-3: All ~16,852 full-document embeddings loaded
AC-4: All chunk embeddings loaded (~100,000+ vectors)
AC-5: Metadata attached to each vector
AC-6: Top-K=5 query latency <3ms (measured)
AC-7: 3 verification queries return semantically correct results
AC-8: Filtered query works (e.g., site_region="NA" + similarity search)
AC-9: Vector store connection config documented for LangFlow
AC-10: Content Sphere Silver bucket updated with index state snapshot

Dependencies: S-04 (embeddings), S-05 (metadata)

--------------------------------------------------------------------------------
STORY S-07: Build Confluence KB Embedding Pipeline
--------------------------------------------------------------------------------
Story Name: Build Confluence Knowledge Base Embedding Pipeline
Epic: Silver Layer - Semantic Processing
Story Points: 5
Sprint: Sprint 5
Priority: Medium
Assignee: [TBD - ML Engineer]
Labels: silver, confluence, embedding, knowledge-base

Description:
As an ML engineer, I need to process the Confluence KB documents through the same Silver layer pipeline: clean text, chunk by section headings, generate embeddings, and load into the vector store as a separate collection ("mll_kb_vectors") so the Knowledge Agent can perform RAG search over process documentation.

Chunking strategy for Confluence pages:
- Split by H2 headings (each section = one chunk)
- Include page title as prefix for each chunk
- Target chunk size: 200-500 words
- Overlap: 50 words between consecutive chunks

Acceptance Criteria:
AC-1: Confluence pages chunked by section headings
AC-2: Each chunk prefixed with page title
AC-3: Chunks embedded using all-MiniLM-L6-v2
AC-4: Loaded into "mll_kb_vectors" collection in vector store
AC-5: Metadata includes: page_id, page_title, section_heading
AC-6: Verification query: "how to submit intake request" returns MLLF Intake Process
AC-7: Stored to Content Sphere Silver bucket

Dependencies: B-05 (Confluence ingestion), S-06 (vector store ready)

--------------------------------------------------------------------------------
STORY S-08: Silver Layer Validation & Embedding Quality Report
--------------------------------------------------------------------------------
Story Name: Silver Layer Validation & Embedding Quality Report
Epic: Silver Layer - Semantic Processing
Story Points: 5
Sprint: Sprint 5
Priority: High
Assignee: [TBD - ML Engineer]
Labels: silver, validation, quality, embedding

Description:
As an ML engineer, I need to build a validation suite that verifies embedding quality and vector store integrity before the Gold layer agents can use it for RAG retrieval.
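The zero-vector/NaN integrity check can be sketched as a small pure-Python scan (a real suite would vectorize this with NumPy over the .npy arrays; the function name is illustrative):

```python
import math

def find_bad_vectors(vectors: dict[str, list[float]]) -> list[str]:
    """Return the IDs of vectors that are all-zero or contain NaN values."""
    bad = []
    for vec_id, vec in vectors.items():
        has_nan = any(math.isnan(x) for x in vec)
        all_zero = not any(x != 0.0 for x in vec)
        if has_nan or all_zero:
            bad.append(vec_id)
    return bad
```

Zero vectors typically mean an empty semantic document slipped through; NaNs point to a numerical fault in the embedding run. Both should fail the quality gate rather than silently load into the index.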
Validation checks:
- Vector count matches record count (~16,852)
- No zero vectors or NaN values
- Cosine similarity sanity checks:
  * Same-site tickets should cluster (similarity > 0.6)
  * Different-domain tickets should be distant (similarity < 0.4)
- Top-K retrieval relevance test (20 curated queries with expected results)
- Metadata filter verification (filtered search returns correct subset)
- Latency benchmark (measure p50, p95, p99 query times)

Acceptance Criteria:
AC-1: Vector count verified (matches ~16,852 records)
AC-2: No zero vectors or NaN values found
AC-3: Same-site clustering verified (avg similarity > 0.6)
AC-4: Cross-domain separation verified (avg similarity < 0.4)
AC-5: 20 curated queries return expected top results
AC-6: Metadata filtering works correctly
AC-7: Latency: p50 <2ms, p95 <5ms, p99 <10ms
AC-8: HTML quality report generated
AC-9: Report stored to Content Sphere Silver bucket

Dependencies: S-06, S-07 (all vectors loaded)

GOLD LAYER — Multi-Agent RAG System & Chat UI
Goal: Build 7 agents in LangFlow + Chainlit chat UI

--------------------------------------------------------------------------------
STORY G-01: Build Supervisor Agent Flow in LangFlow
--------------------------------------------------------------------------------
Story Name: Build Supervisor Agent - Central Orchestrator
Epic: Gold Layer - Multi-Agent System
Story Points: 5
Sprint: Sprint 6
Priority: Critical
Assignee: [TBD - AI Engineer]
Labels: gold, agent, supervisor, langflow, critical-path

Description:
As an AI engineer, I need to create the Supervisor Agent flow in LangFlow that serves as the central orchestrator for all user interactions. Every user message first hits the Supervisor, which decides where to route it.

Routing logic:
1. New message + no active workflow → Route to Intent Classifier (G-02)
2. Active workflow + awaiting user response → Route back to active agent
3. User says "cancel" or "start over" → Reset state, fresh start
4. User asks about status → Route directly to Status Agent (G-09)
5. Classification confidence < 0.6 → Ask user to clarify intent
6. Agent cannot resolve → Escalate to human agent

State management (via Redis):
- session_id: unique per user session
- active_agent: which agent currently owns the conversation
- collected_fields: dict of info gathered so far
- missing_fields: list of required fields not yet collected
- classification_result: output from Intent Classifier
- created_tickets: list of RITMs/Jira tickets created in session

Acceptance Criteria:
AC-1: Supervisor receives all user messages as entry point
AC-2: Routes new messages to Intent Classifier
AC-3: Routes follow-up messages to active agent
AC-4: "Cancel"/"start over" resets conversation state
AC-5: Status keywords route directly to Status Agent
AC-6: Low confidence (<0.6) triggers clarification question
AC-7: Escalation path to human agent works
AC-8: Session state persisted in Redis
AC-9: State survives page refresh (session resumed from Redis)
AC-10: LangFlow flow exported and version-controlled

Dependencies: None (assumes LangFlow setup is complete)

--------------------------------------------------------------------------------
STORY G-02: Build Intent Classifier Agent with RAG Pipeline
--------------------------------------------------------------------------------
Story Name: Build Intent Classifier Agent with RAG Retrieval
Epic: Gold Layer - Multi-Agent System
Story Points: 8
Sprint: Sprint 6-7
Priority: Critical
Assignee: [TBD - Senior AI Engineer]
Labels: gold, agent, classifier, rag, langflow, critical-path

Description:
As an AI engineer, I need to create the Intent Classifier Agent in LangFlow that uses the RAG pipeline to classify user requests by comparing them against historical data in the vector store.

RAG Classification Flow (step by step):
1. Receive user message from Supervisor
2. Embed user message using all-MiniLM-L6-v2 (384-dim vector)
3. Query vector store for Top-K=5 most similar historical tickets
4. Build LLM prompt with:
   - User's original message
   - 5 retrieved similar tickets (semantic document text)
   - Classification instructions
5. Send to LLM (Claude/GPT-4) for classification
6. Parse LLM response into structured output

Classification output:
{
  "request_type": "NetworkRequest",             // or IntakeRequest, IncidentRequest, AccessRequest, StatusQuery
  "capability_center": "ApprovedProjectDemand", // or AgileDemand, TSInternalDemand, EstimateDemand
  "confidence_score": 0.96,
  "similar_tickets": [
    {"ritm": "RITM000023590001", "summary": "...", "similarity": 0.94},
    ...
  ],
  "extracted_entities": {
    "site_name": "JACKSONVILLE",
    "site_id": "5619",
    "request_details": "12 network drops for wireless access points"
  }
}

LangFlow components to use:
- Embedding component (all-MiniLM-L6-v2)
- Vector Store Retriever (connected to S-06 vector store)
- Prompt Template (classification prompt)
- LLM component (Claude/GPT-4)
- Output Parser (structured JSON)

Acceptance Criteria:
AC-1: User message embedded via all-MiniLM-L6-v2 in LangFlow
AC-2: Top-5 similar tickets retrieved from vector store
AC-3: LLM classifies using retrieved context (not just keywords)
AC-4: Output includes: request_type, capability_center, confidence_score
AC-5: similar_tickets returned with RITM, summary, similarity score
AC-6: Entities extracted from user message (site, quantities, etc.)
AC-7: Classification accuracy >95% on 50-case test set
AC-8: Total classification time <2 seconds end-to-end
AC-9: Handles ambiguous requests (asks for clarification if confidence <0.6)
AC-10: LangFlow flow tested and exported

Dependencies: S-06 (vector store loaded), G-01 (Supervisor routes to this)
Test Data: 50 test messages covering all 5 request types

--------------------------------------------------------------------------------
STORY G-03: Build Clarity Agent - Smart Question Generator
--------------------------------------------------------------------------------
Story Name: Build Clarity Agent - Missing Field Detection & Smart Questions
Epic: Gold Layer - Multi-Agent System
Story Points: 8
Sprint: Sprint 7
Priority: Critical
Assignee: [TBD - AI Engineer]
Labels: gold, agent, clarity, questions, langflow, critical-path

Description:
As an AI engineer, I need to create the Clarity Agent in LangFlow that identifies missing required fields based on the classified request type and asks targeted, context-aware follow-up questions.

Required fields by request type:
IntakeRequest:
- site_id (4-digit, validated against known sites)
- capability_center (Agile Demand / Approved Project Demand / etc.)
- nexus_id (required for Agile and Approved Project) - business_problem (min 50 chars, natural language) - business_value (why this matters) NetworkRequest: - existing_intake_ritm (REQUIRED prerequisite — must exist) - network_type (LAN/WAN, Firewall MACD, Switch Config — multi-select) - cable_drops (number) - switch_names (optional, format validation) - network_zone (ICE or standard) IncidentRequest: - what_stopped_working (description) - when_it_broke (date/time) - who_is_affected (scope) - business_impact (urgency justification) AccessRequest: - application_name (e.g., "TS ENGAGE - PROD") - access_type (new account, modify, remove) - user_network_id Smart defaults from similar tickets: - If Intent Classifier returned similar_tickets, extract likely values - Example: Jacksonville network ticket → suggest "ICE zone" and known switch names - Present as: "Based on similar projects at Jacksonville, this would typically be an ICE zone network with switches JAX-MDF-SW01. Can you confirm?" Question behavior: - Ask ONE question at a time (not a form dump) - Accept substantive text responses (if user provides business problem when asked for site ID, extract both) - Validate responses (site_id must be 4 digits, RITM must match pattern) - Track collected vs missing fields in state Acceptance Criteria: AC-1: Required field list defined per request type AC-2: Missing fields identified by comparing collected vs required AC-3: Questions asked one at a time AC-4: Smart defaults suggested from similar historical tickets AC-5: Accepts substantive text (extracts multiple fields from one response) AC-6: Input validation: site_id (4 digits), RITM (pattern match), etc. 
AC-7: State tracks collected_fields and missing_fields AC-8: Passes completed fields to specialized agent when all required gathered AC-9: Handles "I don't know" responses gracefully (mark as optional or explain why needed) AC-10: Works for all 4 request types Dependencies: G-02 (classification result provides request_type and similar_tickets) -------------------------------------------------------------------------------- STORY G-04: Build Intake Agent - Solution Intake RITM Creation -------------------------------------------------------------------------------- Story Name: Build Intake Agent - IRIS RITM Creation for Solution Intake Epic: Gold Layer - Multi-Agent System Story Points: 5 Sprint: Sprint 8 Priority: High Assignee: [TBD - AI Engineer] Labels: gold, agent, intake, iris, servicenow Description: As an AI engineer, I need to create the Intake Agent in LangFlow that handles new Solution Intake requests. It takes the complete field set from the Clarity Agent and creates an IRIS RITM via the ServiceNow API. Workflow: 1. Receive collected fields from Clarity Agent 2. Validate all required fields present 3. Format payload for ServiceNow API (sc_req_item table) 4. Create RITM via POST /api/now/table/sc_req_item 5. Return RITM number to user 6. Inform user that IRIS automation will create Jira Epic in ABFZ 7. Provide link format: https://jira.jnj.com/browse/ABFZ-XXXXX 8. 
Log action to Content Sphere Gold audit bucket Acceptance Criteria: AC-1: Receives complete field set from Clarity Agent AC-2: Validates all required fields before API call AC-3: Creates RITM in IRIS via ServiceNow REST API AC-4: Returns RITM number to user (e.g., "RITM000023XXXXXX") AC-5: Informs user about Jira Epic auto-creation AC-6: Handles API errors gracefully (timeout, auth failure, validation error) AC-7: Retry logic for transient failures (3 retries with backoff) AC-8: Audit log entry written to Content Sphere Gold bucket AC-9: Response time <5 seconds for RITM creation Dependencies: G-03 (Clarity Agent provides collected fields) ServiceNow API access credentials -------------------------------------------------------------------------------- STORY G-05: Build Network Agent - Network RITM Creation -------------------------------------------------------------------------------- Story Name: Build Network Agent - Network/Firewall RITM with Prerequisite Check Epic: Gold Layer - Multi-Agent System Story Points: 5 Sprint: Sprint 8 Priority: High Assignee: [TBD - AI Engineer] Labels: gold, agent, network, firewall, pete-ward Description: As an AI engineer, I need to create the Network Agent in LangFlow that handles LAN/WAN, Firewall MACD, and switch configuration requests. It enforces the prerequisite that an MLL Intake RITM must exist before a Network RITM can be created. Workflow: 1. Validate prerequisite: parent Intake RITM exists (query IRIS API) 2. If no parent RITM → route back to Intake Agent first 3. Collect network-specific fields (from Clarity Agent): - network_type, cable_drops, switch_names, network_zone, firewall_required 4. Assign priority (P1-P4) based on business impact: - P1: Emergency/security concern - P2: High priority project work - P3: Standard (DEFAULT) - P4: Deferred/back burner 5. Create Network RITM in IRIS, linked to parent Intake RITM 6. Inform user: "Network RITM created. 
Will be reviewed at next triage call and assigned to Pete Ward's network team." 7. Log to Content Sphere Gold audit bucket Acceptance Criteria: AC-1: Validates parent Intake RITM exists before proceeding AC-2: If no parent RITM, redirects to Intake Agent workflow AC-3: Network RITM created and linked to parent RITM AC-4: Priority assigned (P1-P4), defaults to P3 AC-5: Network RITM includes: type, cable_drops, switch_names, zone AC-6: User informed about triage call review process AC-7: Handles API errors gracefully AC-8: Audit log written to Content Sphere Gold bucket Dependencies: G-03 (Clarity Agent), G-04 (Intake Agent for prerequisite creation) -------------------------------------------------------------------------------- STORY G-06: Build Incident Agent - Break/Fix Handling -------------------------------------------------------------------------------- Story Name: Build Incident Agent - Break/Fix Scenario Handler Epic: Gold Layer - Multi-Agent System Story Points: 5 Sprint: Sprint 9 Priority: High Assignee: [TBD - AI Engineer] Labels: gold, agent, incident, break-fix Description: As an AI engineer, I need to create the Incident Agent in LangFlow that handles break-fix scenarios where existing functionality has stopped working. The agent must distinguish between incidents and intakes. 
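The Network Agent workflow above (prerequisite check, route-back, P3 default) can be sketched as a small routing function. This is a minimal sketch: the IRIS lookup is injected as a callable so the logic is testable offline, and all payload field names are illustrative assumptions, not confirmed IRIS schema.

```python
# Sketch of Network Agent steps 1-4 (G-05). `ritm_exists` stands in for an
# IRIS API lookup; payload keys are illustrative assumptions.
def route_network_request(fields: dict, ritm_exists) -> dict:
    """Decide whether a Network RITM can be created.

    fields      -- values collected by the Clarity Agent
    ritm_exists -- callable(ritm_number) -> bool, e.g. an IRIS API query
    """
    parent = fields.get("existing_intake_ritm", "")
    if not parent or not ritm_exists(parent):
        # Prerequisite missing: route back to the Intake Agent first
        return {"route": "intake_agent", "reason": "parent Intake RITM required"}

    payload = {
        "parent_ritm": parent,
        "network_type": fields.get("network_type"),
        "cable_drops": fields.get("cable_drops"),
        "switch_names": fields.get("switch_names"),
        "network_zone": fields.get("network_zone"),
        # P3 is the documented default unless business impact says otherwise
        "priority": fields.get("priority", "P3"),
    }
    return {"route": "create_network_ritm", "payload": payload}
```

For example, a request with no `existing_intake_ritm` routes to `"intake_agent"`, while a complete field set yields a creation payload with priority defaulted to P3.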
Decision logic:

  INTAKE (new demand)                  INCIDENT (break-fix)
  - Something never worked before      - Something WAS working, now broke
  - New capability request             - Unplanned interruption
  - Enhancement or upgrade             - Degraded service quality
  → Route to Intake Agent              → Continue as Incident

If classified as incident, collect:
- What stopped working (description)
- When it broke (date/time)
- Who is affected (individuals, teams, sites)
- Business impact (critical, high, medium, low)
- Any error messages or symptoms

Acceptance Criteria:
AC-1: Intake vs Incident decision logic implemented
AC-2: Collects: what, when, who, impact, symptoms
AC-3: If user describes new demand, routes to Intake Agent
AC-4: Creates incident ticket via IRIS API
AC-5: Assigns priority based on business impact
AC-6: Provides incident reference number to user
AC-7: Audit log written to Content Sphere Gold bucket

Dependencies: G-03 (Clarity Agent)

--------------------------------------------------------------------------------
STORY G-07: Build Access Agent - Application Access Requests
--------------------------------------------------------------------------------
Story Name: Build Access Agent - TS ENGAGE and Application Access
Epic: Gold Layer - Multi-Agent System
Story Points: 3
Sprint: Sprint 9
Priority: Medium
Assignee: [TBD - AI Engineer]
Labels: gold, agent, access, ts-engage

Description:
As an AI engineer, I need to create the Access Agent in LangFlow that handles
application access requests, particularly TS ENGAGE-PROD access.

TS ENGAGE-PROD access workflow:
1. Instruct user: Submit in IRIS → "Create a new account for an application"
2. Application CI: TS ENGAGE - PROD
3. Application Login ID: User's Network Account
4. Account Type: Standard Account
5. Provide direct link to IRIS form if available

For other applications:
- Ask for application name
- Check if application is in known list (CMDB lookup)
- Guide through appropriate access request process

Acceptance Criteria:
AC-1: TS ENGAGE-PROD workflow guides user through all 5 steps
AC-2: Other application access requests handled
AC-3: CMDB lookup for application CI (if available)
AC-4: Provides IRIS form link
AC-5: Audit log written to Content Sphere Gold bucket

Dependencies: G-03 (Clarity Agent)

--------------------------------------------------------------------------------
STORY G-08: Build Status Agent - RITM/Jira Ticket Tracking
--------------------------------------------------------------------------------
Story Name: Build Status Agent - Real-Time Ticket Tracking
Epic: Gold Layer - Multi-Agent System
Story Points: 5
Sprint: Sprint 9
Priority: High
Assignee: [TBD - AI Engineer]
Labels: gold, agent, status, tracking

Description:
As an AI engineer, I need to create the Status Agent in LangFlow that queries
both IRIS and Jira APIs to provide real-time ticket status.

The agent must handle three input types:
1. RITM number: "What's the status of RITM000023587097?"
   → Query IRIS API for RITM state, stage, assignment
2. Jira key: "What's happening with ABFZ-97353?"
   → Query Jira API for Epic status, stories, assignees
3. Natural language: "What's my latest request?"
   → Search by user's ID across both systems

Cross-reference capability:
- Given RITM → find linked Jira Epic → show both statuses
- Given Jira key → find linked RITM → show both statuses

Status response format:
"Your request RITM000023587097 is currently in [Fulfilled] state.
The linked Jira Epic ABFZ-97353 is [In Progress] with 2 user stories:
- ABFZ-97354: Gather User Stories (Completed)
- ABFZ-97355: Network Assessment (Open)
Next step: Network assessment is pending assignment."
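The Status Agent's three input types can be detected with a couple of regular expressions before any API call is made. A minimal sketch follows; the RITM pattern matches the 12-digit sample numbers used in this document and should be adjusted if the real instance uses a different length.

```python
import re

# Sketch of Status Agent input-type detection (G-08). Patterns follow the
# sample ticket numbers in this document (RITM + 12 digits, ABFZ-<n>).
RITM_RE = re.compile(r"RITM\d{12}")
JIRA_RE = re.compile(r"ABFZ-\d+")

def classify_status_query(text: str):
    """Return (input_type, ticket_id) for a status question."""
    m = RITM_RE.search(text)
    if m:
        return ("ritm", m.group(0))        # query IRIS, then the linked Jira Epic
    m = JIRA_RE.search(text)
    if m:
        return ("jira", m.group(0))        # query Jira, then the linked RITM
    return ("natural_language", None)      # search both systems by user ID
```

For example, `classify_status_query("What's the status of RITM000023587097?")` returns `("ritm", "RITM000023587097")`, and a free-form question falls through to the natural-language branch.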
Acceptance Criteria:
AC-1: Accepts RITM numbers (regex: RITM\d{12})
AC-2: Accepts Jira keys (regex: ABFZ-\d+)
AC-3: Accepts natural language status queries
AC-4: Queries IRIS REST API for RITM status
AC-5: Queries Jira REST API for Epic/Story status
AC-6: Cross-references RITM to Epic (and vice versa)
AC-7: Returns formatted status with next steps
AC-8: Handles "ticket not found" gracefully
AC-9: Response time <3 seconds

Dependencies: G-01 (Supervisor routes status queries here)

--------------------------------------------------------------------------------
STORY G-09: Build Chainlit Chat UI with LangFlow Integration
--------------------------------------------------------------------------------
Story Name: Build Chainlit Chat Interface with LangFlow Backend
Epic: Gold Layer - Chat UI
Story Points: 8
Sprint: Sprint 10
Priority: Critical
Assignee: [TBD - Full-Stack Developer]
Labels: gold, chainlit, ui, langflow, chat, critical-path

Description:
As a full-stack developer, I need to build the Chainlit-based chat interface
that serves as the user-facing front end, connecting to the LangFlow
multi-agent backend via REST API.

Implementation:
@cl.on_chat_start → Initialize session, display welcome message
@cl.on_message → Forward to LangFlow API, stream response back

Features required:
1. Welcome message with quick-action buttons:
   "New Intake Request | Check Status | Report Incident | Get Access"
2. Streaming responses (tokens appear as generated)
3. Agent step visualization (show which agent is handling:
   "🔍 Classifier analyzing your request..." → "Clarity Agent: asking follow-up...")
4. File upload support (for PDF invoices and supporting docs)
5. Clickable option buttons (when Clarity Agent offers choices)
6. SSO authentication (OAuth2/SAML integration)
7. Conversation history (sidebar, persisted via Redis)
8. Mobile responsive layout
9. Custom branding (MLL/J&J colors and logo)

Acceptance Criteria:
AC-1: Chainlit app launches and displays welcome message
AC-2: Messages forwarded to LangFlow REST API endpoint
AC-3: Streaming responses work (tokens appear progressively)
AC-4: Agent step visualization shows active agent name and status
AC-5: File upload works for PDF and common document types
AC-6: Quick-action buttons on welcome screen
AC-7: SSO authentication configured
AC-8: Conversation history in sidebar
AC-9: Mobile responsive (tested on iPhone and Android)
AC-10: Custom MLL branding applied (colors, logo)

Dependencies: G-01 through G-08 (all agents must be functional)

--------------------------------------------------------------------------------
STORY G-10: Build Knowledge Agent - Confluence KB RAG Search
--------------------------------------------------------------------------------
Story Name: Build Knowledge Agent - Process Documentation RAG Search
Epic: Gold Layer - Multi-Agent System
Story Points: 3
Sprint: Sprint 10
Priority: Medium
Assignee: [TBD - AI Engineer]
Labels: gold, agent, knowledge, confluence, rag

Description:
As an AI engineer, I need to create the Knowledge Agent in LangFlow that
searches Confluence KB articles using RAG to answer process questions like
"How do I submit an intake request?" or "What's the difference between intake
and incident?" The agent queries the "mll_kb_vectors" collection (from S-07)
and returns relevant Confluence content with source attribution.
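The Knowledge Agent's post-retrieval step (answer with source attribution, graceful fallback when nothing relevant is found) can be sketched independently of the vector store. The chunk dict shape, the 0.5 relevance threshold, and the fallback wording below are assumptions for illustration.

```python
# Sketch of Knowledge Agent answer assembly (G-10). `chunks` are assumed to be
# results from the mll_kb_vectors collection: {"text", "score", "page_url"}.
FALLBACK = ("I couldn't find a KB article covering that. "
            "You may want to raise the question through the intake process.")

def answer_from_kb(chunks: list, min_score: float = 0.5) -> str:
    relevant = [c for c in chunks if c.get("score", 0.0) >= min_score]
    if not relevant:
        return FALLBACK                               # graceful fallback
    best = max(relevant, key=lambda c: c["score"])
    answer = best["text"]
    if best.get("page_url"):                          # link when available
        answer += f"\n\nSource: {best['page_url']}"
    return answer
```

The threshold keeps low-similarity matches from being presented as authoritative answers; tuning it against the 20 curated Silver-layer queries would be a reasonable starting point.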
Acceptance Criteria:
AC-1: Queries mll_kb_vectors collection for relevant KB content
AC-2: Returns synthesized answer with source page reference
AC-3: Handles: intake process, network engagement, intake vs incident, TS Engage
AC-4: Falls back gracefully if no relevant KB article found
AC-5: Provides Confluence page link when available

Dependencies: S-07 (KB vectors loaded)

--------------------------------------------------------------------------------
STORY G-11: Build Cost Estimation Feature
--------------------------------------------------------------------------------
Story Name: Build Cost Estimation from Historical Invoice Data
Epic: Gold Layer - Multi-Agent System
Story Points: 5
Sprint: Sprint 11
Priority: High
Assignee: [TBD - AI Engineer]
Labels: gold, cost-estimation, invoices, rag

Description:
As an AI engineer, I need to add cost estimation capability to the Intake
Agent that uses historical PDF invoice data (from Silver layer embeddings) to
provide ballpark cost estimates for new requests.

Estimation formula:
Est_Cost = SUM(Unit_Cost_i × Qty_i) × Geo_Factor × Complexity_Multiplier
           + Network_Surcharge + Firewall_Fee + MedDevice_Compliance

Factors:
- Geolocation: US=1.0x, EU=1.3x, LATAM=0.8x, APAC=1.1x
- Complexity: Simple (1-5 items)=1.0x, Medium (6-20)=1.2x, Complex (21+)=1.5x
- Network surcharge: $350/drop
- Firewall MACD: $2,500 per rule set
- Medical device compliance: +15%

The agent retrieves Top-3 similar historical tickets with quotation data and
uses their actual costs as reference points.
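The estimation formula above can be sketched directly in code. Factor values come straight from the story; the function shape, the rounding, and the choice to apply the +15% medical-device uplift to the running total (the story leaves this ambiguous) are assumptions.

```python
# Sketch of the G-11 estimation formula. Factor values are from the story;
# everything else (signature, rounding, uplift base) is an assumption.
GEO = {"US": 1.0, "EU": 1.3, "LATAM": 0.8, "APAC": 1.1}

def complexity_multiplier(n_items: int) -> float:
    if n_items <= 5:
        return 1.0        # Simple
    if n_items <= 20:
        return 1.2        # Medium
    return 1.5            # Complex

def estimate_cost(line_items, geo="US", cable_drops=0,
                  firewall_rule_sets=0, medical_device=False):
    """line_items: iterable of (unit_cost, qty) taken from similar quotations."""
    base = sum(unit * qty for unit, qty in line_items)
    items = sum(qty for _, qty in line_items)
    cost = base * GEO[geo] * complexity_multiplier(items)
    cost += 350 * cable_drops            # network surcharge: $350/drop
    cost += 2500 * firewall_rule_sets    # Firewall MACD: $2,500 per rule set
    if medical_device:
        cost *= 1.15                     # +15% compliance (assumed on total)
    return round(cost, 2)
```

For example, ten items at $100 each with two cable drops in the US gives 1000 × 1.0 × 1.2 + 700 = 1900.0. A low-high range (AC-3) could then be produced by running this against each of the Top-3 reference tickets.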
Acceptance Criteria:
AC-1: Retrieves Top-3 similar tickets with quotation_cost chunks
AC-2: Applies estimation formula with all factors
AC-3: Returns estimated cost range (low-high) with confidence
AC-4: Shows reference tickets: "Based on similar project at Jacksonville: $98,200"
AC-5: Geo, complexity, and surcharge factors applied correctly
AC-6: Handles cases with no historical cost data gracefully

Dependencies: G-04 (Intake Agent), S-06 (quotation vectors available)

--------------------------------------------------------------------------------
STORY G-12: End-to-End Integration Testing
--------------------------------------------------------------------------------
Story Name: End-to-End Integration Testing - All Agents & Workflows
Epic: Gold Layer - Quality Assurance
Story Points: 8
Sprint: Sprint 11-12
Priority: Critical
Assignee: [TBD - QA Engineer]
Labels: gold, testing, e2e, integration, critical-path

Description:
As a QA engineer, I need to verify the complete flow from user message through
all agents to RITM/Jira creation for every supported request type.

Test matrix (50 test cases total):
- IntakeRequest: 10 cases (various sites, capability centers)
- NetworkRequest: 10 cases (LAN/WAN, Firewall, Switch, with/without parent RITM)
- IncidentRequest: 10 cases (various break/fix scenarios)
- AccessRequest: 5 cases (TS ENGAGE, other apps)
- StatusQuery: 10 cases (RITM lookup, Jira lookup, natural language)
- KnowledgeQuery: 5 cases (process questions)

Verification per test case:
1. Intent Classifier produces correct request_type
2. Clarity Agent asks correct follow-up questions
3. Smart defaults from similar tickets are relevant
4. Specialized agent creates correct ticket
5. Response is clear and helpful
6. Audit log entry created in Content Sphere Gold bucket
7. Response time <5 seconds end-to-end

Acceptance Criteria:
AC-1: 50 test cases executed across all request types
AC-2: 100% pass rate (all workflows complete successfully)
AC-3: Classification accuracy >95% (at least 48 of 50 correct)
AC-4: RAG retrieval returns relevant similar tickets
AC-5: RITM creation verified in IRIS for intake/network tests
AC-6: Audit logs verified in Content Sphere Gold bucket
AC-7: End-to-end response time <5s for 95th percentile
AC-8: Test report generated with pass/fail per test case
AC-9: Any failures have detailed repro steps and logs

Dependencies: All G-stories (complete agent system)

--------------------------------------------------------------------------------
STORY G-13: Production Deployment & Monitoring Setup
--------------------------------------------------------------------------------
Story Name: Production Deployment, Monitoring & Content Sphere Audit Pipeline
Epic: Gold Layer - Deployment
Story Points: 5
Sprint: Sprint 12
Priority: Critical
Assignee: [TBD - DevOps / Senior Engineer]
Labels: gold, deployment, monitoring, production, content-sphere

Description:
As a DevOps engineer, I need to deploy the complete MLL Intake system to
production and configure monitoring, alerting, and the Content Sphere audit
logging pipeline.
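The story's latency alert rule (p95 alert if >5s) amounts to a percentile check over recent samples. A minimal sketch using the nearest-rank percentile method follows; the real alerting plumbing (Prometheus, Grafana, or similar) is assumed and out of scope here.

```python
import math

# Sketch of the G-13 "p95 alert if >5s" rule. Nearest-rank percentile; the
# sample window and alert transport are assumptions.
def percentile(samples, pct: float) -> float:
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))  # nearest-rank method
    return ordered[rank - 1]

def latency_alerts(samples_s) -> list:
    """Return alert messages for a window of response times in seconds."""
    alerts = []
    if percentile(samples_s, 95) > 5.0:
        alerts.append("p95 latency above 5s")
    return alerts
```

The same `percentile` helper can serve the p50/p99 dashboard panels and the vector-store p95 >10ms rule with different thresholds.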
Deployment components:
- Chainlit app (containerized, 3 replicas behind load balancer)
- LangFlow backend (containerized, with all agent flows)
- Redis cluster (3 nodes for session state)
- Vector store (AstraDB cloud or managed Chroma)
- FastAPI gateway (2 replicas for IRIS/Jira API integration)

Content Sphere Gold audit pipeline:
- Every agent action logged: timestamp, session_id, agent_name, action, result
- Every RITM/Jira creation logged: ticket_number, fields, user
- Every RAG retrieval logged: query, top_k results, classification
- Conversation transcripts stored (anonymized)
- Daily aggregation job for analytics dashboard

Monitoring & alerting:
- Response latency: p95 alert if >5s
- Error rate: alert if >1%
- Classification accuracy: weekly report
- IRIS/Jira API health: alert on >1% failure rate
- Vector store query latency: alert if p95 >10ms

Acceptance Criteria:
AC-1: All components deployed and running in production
AC-2: Health checks pass for all services
AC-3: Load balancer routes traffic to Chainlit replicas
AC-4: SSL/TLS configured for all endpoints
AC-5: SSO authentication working in production
AC-6: Content Sphere Gold audit pipeline capturing all actions
AC-7: Monitoring dashboards live with alerting rules
AC-8: Runbook documented for common operational scenarios
AC-9: Rollback procedure tested
AC-10: Stakeholder sign-off obtained

Dependencies: G-12 (E2E testing passed)

LAYER       | STORIES | POINTS | SPRINTS  | FOCUS
------------|---------|--------|----------|----------------------------------
Bronze      | 7       | 42     | 1-2      | ETL, Data Ingestion, Unification
Silver      | 8       | 44     | 3-4      | Semantic Text, Embeddings, Vector DB
Gold        | 13      | 67     | 6-12     | 7 Agents, Chainlit UI, Deployment
------------|---------|--------|----------|----------------------------------
TOTAL       | 28      | 153    | 12       | Complete MLL Intake System

CRITICAL PATH (must complete in order):
B-01/02/03 → B-06 → S-01/02 → S-03 → S-04 → S-06 → G-01 → G-02 → G-03 →
G-04/05 → G-09 → G-12 → G-13

PARALLEL TRACKS:
Track A: B-01, B-02, B-03 (all Sprint 1, independent)
Track B: B-04, B-05 (Sprint 2, independent)
Track C: S-01, S-02 (Sprint 3, can start before Bronze complete)
Track D: G-06, G-07, G-08 (Sprint 9, after G-03)
Track E: G-10, G-11 (Sprints 10-11, after vector store ready)

STORAGE (Content Sphere):
Bronze Bucket: Raw JSON records (~16,852), partitioned by source
Silver Bucket: Semantic docs, embeddings (.npy), metadata JSON
Gold Bucket: Audit logs, conversation history, analytics (append-only)
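Given the "partitioned by source" note for the Bronze bucket, object keys could be laid out Hive-style so downstream jobs can prune by source and date. This is a sketch only: the `bronze/` prefix, the `dt=` partition, and the record-per-object layout are illustrative assumptions, not a confirmed Content Sphere convention.

```python
from datetime import date

# Sketch of a partitioned Bronze object-key scheme (assumed layout).
def bronze_key(source: str, record_id: str, day: date) -> str:
    """e.g. source='iris', record_id='RITM000023587097'."""
    return f"bronze/source={source}/dt={day.isoformat()}/{record_id}.json"
```

With this layout, an ETL job for the IRIS source would list only the `bronze/source=iris/` prefix rather than scanning all ~16,852 records.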