Target outcomes
- Baseline hallucination rate measured before enabling RAG
- Clear channel rules: baseline vs citation-required assistants
Initiative playbook
Typical delivery arc for this pattern in enterprise programs.
- 1. Discovery (2 to 4 wks)
List channels allowed on baseline vs governed RAG; define hallucination sampling process.
- 2. Pilot (6 to 8 wks)
Run baseline vs RAG on the same 50 questions; quantify citation lift.
- 3. Scale (ongoing)
Restrict baseline to low-risk internal tasks; enforce Foundry project boundaries.
Business use case
Problem
Teams jump straight to RAG. Without a baseline, you cannot tell whether retrieval helped, or whether the model already knew the answer (or hallucinated convincingly).
Who benefits
- Architecture reviewers: compare baseline vs RAG vs orchestration on the same questions
- Compliance: separate “general chat” from “citation-required” channels
- Foundry admins: prove OpenAI connectivity before indexing corpora
Success metrics
- Baseline hallucination rate measured on policy questions (expect non-zero)
- Clear product rule: which channels may use baseline vs governed RAG
- Deployment and region documented for audit
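One way to make the channel rule explicit is a small lookup that maps each channel to an assistant mode. This is a minimal sketch, not the demo's code: the channel names, the `AssistantMode` type, and the `resolveMode` helper are all hypothetical, and the choice to send unknown channels to the governed path is an assumption.

```typescript
// Hypothetical types and channel names, for illustration only.
type AssistantMode = "baseline" | "governed-rag";

// Replace these entries with the channels listed during discovery.
const CHANNEL_RULES = new Map<string, AssistantMode>([
  ["internal-brainstorm", "baseline"],
  ["dev-explainers", "baseline"],
  ["hr-policy", "governed-rag"],
  ["customer-support", "governed-rag"],
]);

// Assumption: any channel without an explicit rule falls back to the
// stricter, citation-required path rather than to baseline chat.
function resolveMode(channel: string): AssistantMode {
  return CHANNEL_RULES.get(channel) ?? "governed-rag";
}
```

Keeping the rule in one table gives reviewers a single place to audit which channels may use baseline chat.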
Solution
A minimal Azure OpenAI chat route with a system prompt that steers policy questions toward the governed RAG example, showing how to position baseline chat without blending modes.
Technical implementation
Architecture
A simple chat completion call inside your Azure tenant. Use it as a control when measuring RAG lift.
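The baseline route can be sketched as a function that builds the request for the Azure OpenAI chat completions REST endpoint. This is an illustrative sketch under stated assumptions: the config fields, the prompt wording, the `temperature` setting, and the helper name `buildBaselineChatRequest` are hypothetical and are not taken from the demo.

```typescript
// Hypothetical config shape; values would normally come from .env.local.
interface AzureChatConfig {
  endpoint: string;   // e.g. "https://<resource>.openai.azure.com"
  deployment: string; // Azure OpenAI deployment name
  apiKey: string;
  apiVersion: string; // assumption: use a version your resource supports
}

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatRequest {
  url: string;
  headers: Record<string, string>;
  body: { messages: ChatMessage[]; temperature: number };
}

// Assumed wording: keeps baseline chat from stating policy facts itself.
const BASELINE_SYSTEM_PROMPT =
  "You are a general-purpose assistant with no document retrieval. " +
  "If the user asks about company policy, do not state policy facts; " +
  "direct them to the governed, citation-backed assistant instead.";

function buildBaselineChatRequest(cfg: AzureChatConfig, question: string): ChatRequest {
  // Drop trailing slashes so the path is not doubled up.
  const base = cfg.endpoint.replace(/\/+$/, "");
  const path = `/openai/deployments/${encodeURIComponent(cfg.deployment)}/chat/completions`;
  return {
    url: `${base}${path}?api-version=${encodeURIComponent(cfg.apiVersion)}`,
    headers: { "Content-Type": "application/json", "api-key": cfg.apiKey },
    body: {
      messages: [
        { role: "system", content: BASELINE_SYSTEM_PROMPT },
        { role: "user", content: question },
      ],
      // Assumption: low temperature makes control runs easier to repeat.
      temperature: 0,
    },
  };
}
```

To send it, pass the result to `fetch(req.url, { method: "POST", headers: req.headers, body: JSON.stringify(req.body) })`. Separating request construction from the network call lets the prompt and routing be checked without credentials.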
Outcomes and learnings
Even if baseline answers are “good enough,” you may still need RAG for citations rather than for fluency.
- Document which workloads are allowed on baseline vs governed paths
- Use the same deployment name in RAG and chat routes for operational simplicity
- Add content safety and retrieval before opening external-facing channels
Where else this applies
A deliberate non-RAG baseline helps you measure when retrieval is worth the operational cost, and keeps general Q&A channels honest.
Brainstorming workshops
Facilitators use general chat for ideation where citations are not required.
Drafting starting points
Comms teams outline emails or talking points before legal moves them to governed RAG for fact checks.
Developer explainers
Platform teams answer “how does X work?” questions that do not need document grounding.
RAG lift measurement
Run identical policy questions through baseline vs RAG to quantify hallucination reduction.
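The comparison can be reduced to a few rates once each answer has been graded. The sketch below assumes that grading (hallucinated or not, cited or not) comes from human reviewers or a separate evaluation step; the `GradedAnswer` shape and the `compareRuns` helper are hypothetical names, and the lift figures are plain absolute differences.

```typescript
// One graded answer from either run. Field names are illustrative.
interface GradedAnswer {
  questionId: string;
  hallucinated: boolean;
  hasCitation: boolean;
}

interface LiftReport {
  questions: number;
  baselineHallucinationRate: number;
  ragHallucinationRate: number;
  hallucinationReduction: number; // baseline rate minus RAG rate
  citationLift: number;           // RAG citation rate minus baseline citation rate
}

// Share of answers for which `pick` is true; 0 for an empty list.
function rate(items: GradedAnswer[], pick: (a: GradedAnswer) => boolean): number {
  return items.length === 0 ? 0 : items.filter(pick).length / items.length;
}

function compareRuns(baseline: GradedAnswer[], rag: GradedAnswer[]): LiftReport {
  // Keep only questions present in both runs so the rates are like-for-like.
  const ragIds = new Set(rag.map((a) => a.questionId));
  const baselineIds = new Set(baseline.map((a) => a.questionId));
  const b = baseline.filter((a) => ragIds.has(a.questionId));
  const r = rag.filter((a) => baselineIds.has(a.questionId));

  const baselineHallucinationRate = rate(b, (a) => a.hallucinated);
  const ragHallucinationRate = rate(r, (a) => a.hallucinated);
  return {
    questions: b.length,
    baselineHallucinationRate,
    ragHallucinationRate,
    hallucinationReduction: baselineHallucinationRate - ragHallucinationRate,
    citationLift: rate(r, (a) => a.hasCitation) - rate(b, (a) => a.hasCitation),
  };
}
```

Restricting both runs to the shared question set keeps the comparison fair when one run is missing answers.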
Using this stack elsewhere
Use the same Azure OpenAI deployment and monitoring as RAG paths so comparisons are fair; Foundry projects can host both experiences side by side.
Live demo
The demo is the same code path described above, not a simplified mock UI. Add keys in .env.local when you are ready; the narrative and diagrams stand on their own without them.
Business
Straight chat without retrieval. Compare its answers to the RAG example on the same policy question.
Technical
Azure OpenAI chat.completions with a system prompt that defers policy facts to governed RAG.