AIEngineersLabs
BFSI · Case study

73% reduction in document review time at a Tier-1 bank

Citation-gated corporate banking RAG over 1.4M contract pages, shipped in nine weeks under a managed pod.

Sector: BFSI
Duration: 9 weeks · build pod
Pod: 1 architect · 2 senior engineers · 1 MLOps
Headline metric: 73%

The challenge

The bank's corporate banking team reviewed contracts manually. Median review time was six hours per contract, against a backlog of ~1.4M pages. The previous attempt — an external vendor's "AI-powered document review" — had been running for eleven months without a production deployment, and an internal proof-of-concept using off-the-shelf retrieval was producing answers that didn't survive audit because they couldn't cite their sources.

The legal and compliance team had two non-negotiables: every answer had to point at a specific clause in a specific document, and the system had to pass internal model risk review before it touched a real contract.
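The first non-negotiable can be enforced mechanically at the output layer rather than left to the model's goodwill. A minimal sketch of such a gate, assuming an illustrative `[DOC-ID §clause]` citation format (the `gate_citations` helper and the format are our assumptions, not the bank's actual implementation):

```python
import re

def gate_citations(answer: str, chunk_ids: set[str]) -> tuple[bool, list[str]]:
    """Reject any answer sentence that lacks a resolvable citation.

    Assumed citation format: [DOC-ID §clause], where the cited id must
    match a chunk actually retrieved for this query.
    """
    problems: list[str] = []
    # Naive sentence split; a production gate would use claim extraction.
    for sentence in re.split(r"(?<=[.!?])\s+", answer.strip()):
        cites = re.findall(r"\[([^\]]+)\]", sentence)
        if not cites:
            problems.append(f"uncited claim: {sentence!r}")
        else:
            problems += [f"unresolvable citation {c!r}"
                         for c in cites if c not in chunk_ids]
    return (not problems), problems
```

An answer passes only if every sentence carries at least one citation that resolves to a retrieved chunk; anything else is refused before it reaches the reviewer.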

The architecture

We replaced the existing build with a new architecture. Ingestion ran semantic chunking with structural awareness — preserving section headings, tables, and clause numbering — then embedded with text-embedding-3-large and stored in Qdrant with per-document filters. Retrieval was hybrid: vector + BM25 + structured filters on contract type and counterparty. We added a cross-encoder reranker in front of generation, and citation tagging at the prompt layer that refused to emit a claim without a source span.
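The hybrid retrieval step can be sketched as follows. Reciprocal rank fusion is our assumption for how the vector and BM25 rankings are merged (the source describes hybrid retrieval but not the fusion method), and the structured filter runs before fusion, mirroring Qdrant's per-document payload filters:

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    doc_id: str
    clause: str          # clause numbering preserved by structural chunking
    contract_type: str
    text: str

def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal rank fusion over several ranked lists of chunk ids."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, cid in enumerate(ranking):
            scores[cid] = scores.get(cid, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=lambda cid: scores[cid], reverse=True)

def retrieve(vector_ranking, bm25_ranking, chunks, contract_type=None, top_n=5):
    """Hybrid retrieval: structured filter first, then fuse the two rankings.

    `vector_ranking` and `bm25_ranking` are chunk ids already ranked by
    each retriever; the cross-encoder reranker would run on the fused
    top-n before generation.
    """
    allowed = {cid for cid, c in chunks.items()
               if contract_type is None or c.contract_type == contract_type}
    fused = rrf_fuse([
        [cid for cid in vector_ranking if cid in allowed],
        [cid for cid in bm25_ranking if cid in allowed],
    ])
    return fused[:top_n]
```

Filtering before fusion means an off-type chunk can never leak into the candidate set on the strength of a high lexical score alone.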

The eval harness ran a curated golden set of 240 queries across contract types, with retrieval metrics (hit rate at k, MRR, NDCG) and generation metrics (faithfulness, answer relevancy) using LLM-as-judge. Every PR ran the harness in CI and blocked merges on regression.
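The retrieval side of such a harness is small enough to sketch. The baseline numbers and the `ci_gate` helper below are illustrative placeholders, not the bank's actual figures; the real harness also ran NDCG and the LLM-as-judge generation metrics:

```python
def hit_rate_at_k(results: dict[str, list[str]],
                  golden: dict[str, set[str]], k: int = 5) -> float:
    """Fraction of golden queries with a relevant chunk in the top k."""
    hits = sum(any(cid in golden[q] for cid in ranked[:k])
               for q, ranked in results.items())
    return hits / len(golden)

def mrr(results, golden):
    """Mean reciprocal rank of the first relevant chunk per query."""
    total = 0.0
    for q, ranked in results.items():
        for rank, cid in enumerate(ranked, start=1):
            if cid in golden[q]:
                total += 1.0 / rank
                break
    return total / len(golden)

# Placeholder baselines; in practice these come from the last green build.
BASELINE = {"hit_rate@5": 0.90, "mrr": 0.75}

def ci_gate(results, golden, tolerance=0.01):
    """Return (metrics, regressions); any regression blocks the merge."""
    metrics = {"hit_rate@5": hit_rate_at_k(results, golden, k=5),
               "mrr": mrr(results, golden)}
    regressions = [m for m, v in metrics.items()
                   if v < BASELINE[m] - tolerance]
    return metrics, regressions
```

Wiring `ci_gate` into the PR pipeline turns "the retrieval got worse" from a reviewer's hunch into a failing check.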

The handover

The pod ran for nine weeks. Deliverables were the running system, the eval harness (with the golden set handed over for the bank to extend internally), the runbook, and a knowledge transfer to the bank's existing engineering team. The bank's MLOps function took ownership at week ten; we stayed on retainer for thirty days of incident support.

The outcome

Median review time fell from six hours to ~95 minutes — a 73% reduction — across the first 8,200 contracts reviewed under the system. Zero unsupported claims appeared in the audit window, and internal model risk review passed on first submission.

The system is now extending to two adjacent legal review workflows under the same architecture, run by the bank's internal team.

Next step

Talk to an engineer, not a salesperson.

30 minutes. No slides. Bring an architecture, a stalled roadmap, or a vendor proposal you want a second opinion on. We'll tell you what we'd do.