Target outcomes
- Achieve citation and safety compliance for customer-facing channels
- Shorten risk review cycles with measurable safety thresholds
Initiative playbook
Typical delivery arc for this pattern in enterprise programs.
- 1. Discovery (2 to 4 wks)
Define governed corpora and safety thresholds; agree on which answers require citations and which require escalation.
- 2. Pilot (6 to 8 wks)
Ship sequential retrieve → generate → safety pipeline with run timing; calibrate thresholds with compliance reviewers.
- 3. Scale (ongoing)
Add tool calling, DLP, and Foundry monitoring across domains; formalize policy packs by channel (internal vs customer-facing).
Business use case
Problem
Enterprises cannot deploy “raw chat” for policy and compliance workloads. They need an orchestrated agent that:
- Retrieves approved sources
- Generates only from those sources
- Applies a safety/compliance gate before the answer reaches users
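One way to make "generates only from those sources" checkable is to validate citations after generation. The sketch below assumes a hypothetical `[doc:<id>]` citation tag and a hypothetical `check_grounding` helper; neither comes from the demo, and a real check would likely also compare the cited text against the answer.

```python
from __future__ import annotations

import re

# Hypothetical citation format: the generator is prompted to tag claims as [doc:<id>].
CITATION_PATTERN = re.compile(r"\[doc:([A-Za-z0-9_-]+)\]")


def check_grounding(answer: str, retrieved_ids: set[str]) -> dict:
    """Report whether every citation in the answer points at a retrieved document."""
    cited = set(CITATION_PATTERN.findall(answer))
    unknown = cited - retrieved_ids
    return {
        "cited": sorted(cited),
        "unknown": sorted(unknown),
        # Grounded only if there is at least one citation and none are unknown.
        "grounded": bool(cited) and not unknown,
    }


result = check_grounding(
    "Employees accrue leave monthly [doc:hr-001]. Carry-over is capped [doc:hr-007].",
    {"hr-001", "hr-002"},
)
print(result["grounded"], result["unknown"])
```

Here the second citation refers to a document that was never retrieved, so the answer would be routed to escalation rather than shown.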
Who benefits
- Compliance: citations and safety envelopes
- Support/HR: fewer repetitive tickets
- Platform engineering: a reusable pattern for governed agents
Success metrics
- 100% of answers grounded in approved corpora
- Safety thresholds calibrated with measurable category scores
- Deflect 20 to 30% of L1 “policy” queries during pilot
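A threshold gate over category scores could look roughly like the following. The category names follow Azure AI Content Safety's harm categories; the threshold values and the 0 to 7 severity scale are assumptions for illustration, and real ceilings would come out of calibration with compliance reviewers.

```python
from __future__ import annotations

# Hypothetical per-category ceilings (assumed 0-7 severity scale).
THRESHOLDS = {"Hate": 2, "SelfHarm": 2, "Sexual": 2, "Violence": 4}


def safety_gate(scores: dict[str, int], thresholds: dict[str, int] = THRESHOLDS) -> dict:
    """Block the answer when any category score exceeds its threshold."""
    # Categories without a configured threshold default to 0, so any
    # non-zero score in an unknown category fails closed.
    violations = {c: s for c, s in scores.items() if s > thresholds.get(c, 0)}
    return {"allowed": not violations, "violations": violations}


decision = safety_gate({"Hate": 0, "SelfHarm": 0, "Sexual": 0, "Violence": 6})
print(decision)
```

Keeping the thresholds in a plain mapping makes them easy to version and review alongside the policy they implement.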
Solution
This example is a sequential orchestration (retrieve → generate → safety) rather than a single opaque LLM call. It returns a step timeline to make the agent’s behaviour auditable and easy to tune.
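As a minimal sketch of that shape, the three steps can be run in order with a timer around each. The step bodies here are stand-ins (no Azure calls), and the `run_pipeline` name, the state dictionary, and the `{"step", "ms"}` timeline format are assumptions rather than the demo's actual API.

```python
from __future__ import annotations

import time


def retrieve(state: dict) -> dict:
    # Stand-in for a search call: return a hard-coded approved passage.
    sources = [{"id": "policy-001", "text": "Remote work requires manager approval."}]
    return {**state, "sources": sources}


def generate(state: dict) -> dict:
    # Stand-in for a chat completion: restate the first source with a citation.
    first = state["sources"][0]
    return {**state, "answer": f"{first['text']} [doc:{first['id']}]"}


def safety(state: dict) -> dict:
    # Stand-in for a safety score: flag answers containing a blocked term.
    flagged = "confidential" in state["answer"].lower()
    return {**state, "safe": not flagged}


STEPS = [("retrieve", retrieve), ("generate", generate), ("safety", safety)]


def run_pipeline(question: str) -> dict:
    """Run the steps in order and record how long each one took."""
    state: dict = {"question": question}
    timeline = []
    for name, step in STEPS:
        start = time.perf_counter()
        state = step(state)
        elapsed_ms = (time.perf_counter() - start) * 1000
        timeline.append({"step": name, "ms": round(elapsed_ms, 3)})
    # Withhold the answer if the safety step did not pass.
    answer = state["answer"] if state["safe"] else None
    return {"answer": answer, "safe": state["safe"], "timeline": timeline}


response = run_pipeline("Can I work remotely?")
print([entry["step"] for entry in response["timeline"]])
```

Because each step only reads and returns the shared state, any one of them can be replaced without touching the loop or the timing logic.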
Technical implementation
Stack
- Azure OpenAI generation (chat completion)
- Azure AI Search retrieval when configured (falls back to local seed docs)
- Azure Content Safety scoring when configured (falls back to heuristic scoring)
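The "when configured" fallbacks can be sketched as a check on environment variables. The variable names, seed documents, and word-overlap ranking below are all hypothetical; the demo's actual keys and fallback logic may differ.

```python
from __future__ import annotations

import os

# Hypothetical environment variable names, used only for illustration.
SEARCH_ENV = ("AZURE_SEARCH_ENDPOINT", "AZURE_SEARCH_KEY")
SAFETY_ENV = ("AZURE_CONTENT_SAFETY_ENDPOINT", "AZURE_CONTENT_SAFETY_KEY")

SEED_DOCS = [
    {"id": "seed-1", "text": "Expense claims must be filed within 30 days."},
    {"id": "seed-2", "text": "Remote work requires manager approval."},
]


def is_configured(names, env=None) -> bool:
    """True only if every named variable is present and non-empty."""
    env = os.environ if env is None else env
    return all(env.get(name) for name in names)


def choose_backends(env=None) -> dict:
    """Pick the managed service when its keys are set, else the local fallback."""
    return {
        "retrieval": "azure_ai_search" if is_configured(SEARCH_ENV, env) else "local_seed_docs",
        "safety": "azure_content_safety" if is_configured(SAFETY_ENV, env) else "heuristic",
    }


def local_retrieve(question: str, docs=SEED_DOCS, top_k: int = 1) -> list:
    """Fallback retrieval: rank seed docs by the number of shared lowercase words."""
    q_words = set(question.lower().split())

    def overlap(doc: dict) -> int:
        return len(q_words & set(doc["text"].lower().rstrip(".").split()))

    return sorted(docs, key=overlap, reverse=True)[:top_k]


print(choose_backends({}))
print(local_retrieve("Is remote work allowed?")[0]["id"])
```

Reporting which backend served each step in the response would help reviewers tell a fallback run from a fully configured one.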
Architecture
A deliberate pipeline, not one chat call, so each step can be owned, timed, and swapped independently.
Implementation highlights
- The API returns explicit step timing so orchestration is measurable
- Retrieval and safety are separable steps you can evolve independently (e.g., add DLP, add tool calls)
Outcomes and learnings
- Orchestrated agents are easier to govern than “one-shot chat”
- Separate “retrieve” and “safety” steps support compliance sign-off
- Returning a timeline helps stakeholders understand latency/cost tradeoffs by step
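To illustrate the last point, a returned timeline can be reduced to each step's share of total latency. The `{"step", "ms"}` entry shape is an assumed format, and the sample numbers are illustrative, not measurements from the demo.

```python
from __future__ import annotations


def latency_share(timeline: list) -> dict:
    """Return each step's percentage of total latency, rounded to one decimal."""
    total = sum(entry["ms"] for entry in timeline)
    if total == 0:
        return {entry["step"]: 0.0 for entry in timeline}
    return {entry["step"]: round(100 * entry["ms"] / total, 1) for entry in timeline}


# Illustrative numbers only.
sample_timeline = [
    {"step": "retrieve", "ms": 120.0},
    {"step": "generate", "ms": 800.0},
    {"step": "safety", "ms": 80.0},
]

shares = latency_share(sample_timeline)
print(shares)
```

A breakdown like this shows stakeholders which step to tune first before anyone proposes removing the safety step to save time.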
Where else this applies
Sequential retrieve → generate → safety is the template for “governed agent” programs that are not ready for opaque end-to-end autonomy.
Legal contract first draft
Retrieve clause library, draft language, then safety and policy scan before lawyer edit.
Clinical admin assistants
Retrieve protocol snippets, answer staff questions, and run output checks; never skip the safety step.
Insurance FNOL guidance
Pull policy excerpts, guide the caller, score answers before CRM notes are saved.
Executive briefing bots
Retrieve latest board metrics pack, summarise, and filter speculative claims in post-processing.
Using this stack elsewhere
Foundry orchestration (or custom step functions in Azure) gives compliance a named owner per step and timings for incident review.
Live demo
The demo is the same code path described above, not a simplified mock UI. Add keys in .env.local when you are ready; the narrative and diagrams stand on their own without them.
Business
One question, three visible steps: retrieve, generate, safety. Easier to defend in a review than a single chat bubble.
Technical
Sequential orchestrator with per-step timing returned for tuning and audit.