ClaimDesk
Solo-built AI co-pilot that auto-resolves 80% of warranty claims at 99.4% accuracy on synthetic data. Built in 17 weeks.
Self-administered warranty programs drown operators in free-text claims — every one needs facts extracted, policy adjudicated, and a customer email drafted. Most claims are repetitive, but each still eats human time.
Multi-agent LLM pipeline on LangGraph that extracts structured facts, adjudicates against policy with verbatim citations, drafts the customer email, and calibrates confidence via an XGBoost classifier. Claims at ≥0.70 confidence are auto-resolved; the rest route to an operator queue with everything pre-filled. Every side effect is idempotent, every LLM call is traced and cost-tracked, and every change is regression-gated against a locked 200-claim eval set.
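The routing rule and the idempotency guarantee can be illustrated with a minimal sketch. Only the 0.70 threshold comes from the description above; the names `Decision`, `route`, `idempotency_key`, and `SideEffectLog` are hypothetical, and the in-memory set stands in for whatever durable store the real system uses.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable

AUTO_RESOLVE_THRESHOLD = 0.70  # threshold stated in the description above


@dataclass(frozen=True)
class Decision:
    """Hypothetical container for one adjudicated claim."""
    claim_id: str
    outcome: str        # illustrative labels, e.g. "approve" or "deny"
    confidence: float   # calibrated probability from the classifier


def route(decision: Decision) -> str:
    """Auto-resolve at or above the threshold, otherwise send to operators."""
    if decision.confidence >= AUTO_RESOLVE_THRESHOLD:
        return "auto_resolve"
    return "operator_queue"


def idempotency_key(claim_id: str, action: str) -> str:
    """Derive a stable key so a retried action maps to the same record."""
    return hashlib.sha256(f"{claim_id}:{action}".encode()).hexdigest()


class SideEffectLog:
    """In-memory stand-in (an assumption) for a durable side-effect log."""

    def __init__(self) -> None:
        self._done: set[str] = set()

    def run_once(self, key: str, effect: Callable[[], None]) -> bool:
        """Run `effect` only if `key` is new; return True if it ran."""
        if key in self._done:
            return False
        effect()
        self._done.add(key)
        return True
```

With this shape, a retried pipeline step that calls `run_once` with the same key sends the customer email at most once, and a claim scored exactly 0.70 is auto-resolved rather than queued.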
Synthetic baseline: 80.5% of claims auto-resolved at 99.4% accuracy, 99.5% verbatim citations, $0.0009 per claim (325× under target), latency p50 7.0s / p95 9.2s. Model outputs were 100% stable across self-consistency runs, indicating the pipeline is at the gpt-4o-mini reasoning ceiling. Calibration on real data is the next step before production deployment.
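A regression gate against the locked eval set could look roughly like the sketch below. The baseline values are the synthetic numbers reported above; the `EvalMetrics` and `gate` names, the choice of metrics to gate on, and the 0.005 tolerance are all assumptions for illustration, not the project's actual configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalMetrics:
    """Hypothetical summary of one run over the locked eval set."""
    auto_resolve_rate: float
    accuracy: float
    citation_rate: float


# Synthetic baseline figures reported above.
BASELINE = EvalMetrics(auto_resolve_rate=0.805, accuracy=0.994, citation_rate=0.995)


def gate(candidate: EvalMetrics,
         baseline: EvalMetrics = BASELINE,
         tolerance: float = 0.005) -> list[str]:
    """Return one message per metric that dropped more than `tolerance`.

    An empty list means the change passes the gate. The tolerance is an
    assumed value, not one taken from the project.
    """
    failures = []
    for name in ("auto_resolve_rate", "accuracy", "citation_rate"):
        new, old = getattr(candidate, name), getattr(baseline, name)
        if new < old - tolerance:
            failures.append(f"{name}: {new:.3f} is below baseline {old:.3f}")
    return failures
```

In CI, a non-empty result from `gate` would block the change, so a prompt tweak that lifts the auto-resolve rate but costs accuracy is caught before merge.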