AI Labs by The Ops Toolbox


Privacy is a design constraint, not a late gate

Generative AI systems touch **prompts, retrieved documents, logs, embeddings (OpenAI embeddings guide), and tool outputs**. Each layer can persist personal data longer or wider than users expect. A privacy review that first happens in week five of a pilot forces rework or shutdown.

Legal and privacy teams need plain-language answers: what is collected, where it is stored, who can access it, how long it is kept, and whether providers train on customer content. Engineering implements those answers in architecture, not in footnotes.

Sensible pilots proceed when **data classification (Microsoft Copilot data protection), minimisation, and retention schedules** are documented before indexing. Blocking everything blocks learning; indexing everything blocks compliance.

Privacy sign-off in pilot week one, not week five
Document subprocessors and regions per provider route
Workshop question: "Would an employee expect this chat to be stored five years?"
What good looks like: one-page privacy appendix in every charter

Classify data before you index

Not every document belongs in a **vector index** (Azure vector search) or general copilot. Separate **public, internal, and restricted** corpora with different access controls, retention, and model routes.

HR files, customer PII, health information, and legal privilege material often require exclusion, redaction, or dedicated indexes with stricter IAM. Default deny beats default include.

Classification must match source system ACLs. If SharePoint denies a user the file, RAG (Azure RAG concepts) must not surface it. Pilots that ignore ACLs create audit findings at scale.
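A minimal sketch of ACL-aware retrieval, assuming each indexed chunk carries the group list copied from its source system at index time. The `Chunk` type, the `allowed_groups` field, and the group names are hypothetical and stand in for whatever your search service exposes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    doc_id: str
    text: str
    allowed_groups: frozenset[str]  # hypothetical: copied from the source ACL at index time


def filter_by_acl(chunks: list[Chunk], user_groups: set[str]) -> list[Chunk]:
    """Keep only chunks the user could already open in the source system."""
    # Default deny: a chunk with no ACL metadata is never returned.
    return [c for c in chunks if c.allowed_groups and c.allowed_groups & user_groups]


retrieved = [
    Chunk("policy-001", "Leave policy text", frozenset({"all-staff"})),
    Chunk("hr-case-17", "Case notes", frozenset({"hr-advisers"})),
    Chunk("orphan-02", "Unknown origin", frozenset()),
]
visible = filter_by_acl(retrieved, {"all-staff", "finance"})
```

Filtering after retrieval is the simplest form to reason about. Where the search service supports it, applying the same group filter inside the query avoids fetching restricted chunks at all.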

Champions often push for "just add one more folder." Council should require classification label and owner signature per new source.

Public: marketing brochures, external FAQs
Internal: operating procedures without personal data
Restricted: HR cases, customer records, legal matters
Metadata: region, business unit, effective date, classification tag
Common mistake: indexing entire drives without inventory
Checklist: data owner sign-off per corpus
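The default-deny rule and the per-source sign-off can be sketched as an ingestion gate. This is an illustration under stated assumptions: the `Source` type, the `idx-` naming scheme, and the field names are hypothetical, not part of any particular platform.

```python
from dataclasses import dataclass
from typing import Optional

ALLOWED_LABELS = {"public", "internal", "restricted"}


@dataclass
class Source:
    name: str
    classification: Optional[str]  # None means nobody has labelled it yet
    owner_signoff: Optional[str]   # name of the data owner who approved indexing


def target_index(source: Source) -> Optional[str]:
    """Return the index a source may enter, or None if it must stay out."""
    if source.classification not in ALLOWED_LABELS:
        return None  # unlabelled or unknown label: default deny
    if not source.owner_signoff:
        return None  # no owner signature, no indexing
    # One index per classification so access and retention can differ.
    return f"idx-{source.classification}"
```

A source that arrives as "just one more folder" without a label or an owner returns `None` and never reaches the indexing job.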

PII and sensitive data in prompts

Users paste names, employee IDs, customer account numbers, and medical details into chat despite training. Assume prompts contain sensitive data unless proven otherwise.

Minimisation means redacting or blocking patterns before model send where policy requires. Detectors are imperfect; combine automated redaction with user warnings and refusal for high-risk workflows.

Structured workflows should collect identifiers in forms with explicit purpose, not free-text chat, when possible. Forms enable validation and audit.

Tool calls can leak PII into logs and third-party systems. Scrub tool arguments in logs and restrict tool scopes to least privilege.

Redact email, phone, tax file number patterns before provider send
Warn users not to paste customer records in general copilots
Separate restricted workflow with stronger controls and training
Log redaction events for audit without logging raw PII
What good looks like: redaction tested on golden prompts with PII
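A rough sketch of pattern redaction before provider send. The regular expressions are illustrative assumptions (an Australian-style phone format and a nine-digit tax file number layout) and will miss real cases; as noted above, detectors are imperfect. The function returns only counts per pattern, so the redaction event can be logged without the raw value.

```python
import re

# Illustrative patterns only; real detectors need tuning and testing per jurisdiction.
PATTERNS = [
    ("EMAIL", re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")),
    ("PHONE", re.compile(r"(?<!\d)(?:\+61[\s-]?|0)\d(?:[\s-]?\d){8}(?!\d)")),
    ("TFN", re.compile(r"\b\d{3}[\s-]?\d{3}[\s-]?\d{3}\b")),
]


def redact(prompt: str) -> tuple[str, dict[str, int]]:
    """Replace matches with placeholders and report how many of each were found."""
    counts: dict[str, int] = {}
    # Order matters: phone runs before TFN because the digit groups can overlap.
    for label, pattern in PATTERNS:
        prompt, n = pattern.subn(f"[{label}]", prompt)
        if n:
            counts[label] = n
    return prompt, counts


safe_prompt, events = redact(
    "Email jane.doe@example.com or call +61 412 345 678 about TFN 123 456 782."
)
```

The `events` dictionary is what goes to the audit log. A golden prompt set with known PII can then assert both the redacted text and the counts.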

Prompt, output, and conversation retention

Decide explicitly whether you store full prompts, model outputs, hashes only, or redacted excerpts. Each choice affects debugging, eval replay, and legal discovery.

Align retention with DPAs, employment law, and sector rules. Some providers offer zero-retention or no-training options for eligible tiers. Document what you purchased vs what you assume.

Shorter retention reduces risk but limits incident investigation. Typical pattern: 30 to 90 days for redacted conversation logs in production, longer for aggregated metrics only.

User-facing privacy notices must match actual retention. If logs exist for security, say so plainly.

Retention schedule (Microsoft Copilot data protection) per data type: prompts, outputs, embeddings (OpenAI embeddings guide), logs
Legal hold process pauses deletion without silent exceptions
Zero-retention route for highest sensitivity workflows if available
Deletion job tested quarterly, not only documented
Workshop question: "What must we produce in litigation discovery?"
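A retention schedule per data type, with a legal hold that pauses deletion, might look like the sketch below. The day counts are placeholder assumptions, not recommendations; take the real values from your DPAs and sector rules.

```python
from datetime import date, timedelta

# Hypothetical schedule in days, one entry per data type.
RETENTION_DAYS = {"prompt": 90, "output": 90, "log": 30, "embedding": 365}


def is_due_for_deletion(data_type: str, created: date, today: date,
                        on_legal_hold: bool = False) -> bool:
    """True when a record has outlived its retention period and is not on hold."""
    if on_legal_hold:
        return False  # the hold pauses deletion; the hold itself should be recorded
    limit = RETENTION_DAYS[data_type]  # unknown data types raise KeyError on purpose
    return today - created > timedelta(days=limit)
```

Raising on an unknown data type is a deliberate choice here: a new kind of record should force a schedule decision rather than default to being kept forever.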

Indexes, embeddings, and derived data

Vector indexes store **embeddings (OpenAI embeddings guide) derived from source documents**. Even when raw text is not stored in the index, embeddings can enable reconstruction or inference in some threat models. Treat indexes as sensitive assets.

Re-index when documents change classification or are deleted. Stale chunks in an index are a retention violation if source deletion should remove access.

Separate indexes by classification and region. Do not mix ANZ HR policies with EU customer data in one searchable pool without legal review.

Backup and disaster recovery copies inherit the same retention and access rules as primary indexes.

Index encryption at rest and in transit
ACL (Azure RAG concepts) sync job from source systems on schedule
Tombstone or purge pipeline when source document deleted
Inventory: which indexes exist, owner, classification, region
Common mistake: dev index copied from prod without scrubbing
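The purge rule for deleted or reclassified sources can be sketched as a comparison between what the index holds and what the source system currently reports. The `IndexedChunk` type and the `source_labels` mapping are hypothetical; a real pipeline would read them from the index and the source inventory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexedChunk:
    chunk_id: str
    doc_id: str
    classification: str  # label at the time the chunk was indexed


def chunks_to_purge(chunks: list[IndexedChunk], source_labels: dict[str, str]) -> list[str]:
    """Return chunk ids whose source document was deleted or changed classification."""
    purge = []
    for chunk in chunks:
        current = source_labels.get(chunk.doc_id)  # None when the source is gone
        if current is None or current != chunk.classification:
            purge.append(chunk.chunk_id)
    return purge
```

Reclassified documents are purged rather than relabelled in place, on the assumption that they are then re-indexed into the index that matches their new classification.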

Regional residency and cross-border flows

Data residency commitments often drive cloud anchor choice. Document which regions host models, search, logs, and backups. User travel and remote work can complicate jurisdiction assumptions.

Cross-border transfer requires legal mechanism: standard contractual clauses, adequacy decisions, or binding corporate rules. Engineering cannot resolve this alone.

Failover to another region for availability may violate residency if not disclosed. Gateway failover rules need legal review, not only SRE review.

Australian organisations often require APAC regions with a clear subprocessor list. Publish the region map in the security evidence pack (review pack guide).

Region per component: inference, index, logs, object storage
Subprocessor (production readiness conversation) register updated when provider changes data handling
Block routing to non-approved regions in gateway config
Scenario: EU employee queries ANZ-only index, what happens?
What good looks like: architecture diagram with region labels
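One way to express "block routing to non-approved regions" is an allowlist per component that the gateway consults before both primary routing and failover. The region identifiers below are made-up placeholders, and the structure is an assumption rather than any specific gateway's configuration format.

```python
# Hypothetical allowlist: component -> regions legal has approved.
APPROVED_REGIONS = {
    "inference": {"au-east"},
    "index": {"au-east", "au-southeast"},
    "logs": {"au-east"},
    "object_storage": {"au-east", "au-southeast"},
}


def is_route_allowed(component: str, region: str) -> bool:
    """Unknown components have no approved regions, so they are denied."""
    return region in APPROVED_REGIONS.get(component, set())


def failover_targets(component: str, candidates: list[str]) -> list[str]:
    """Filter an SRE failover list down to regions that are also approved."""
    return [region for region in candidates if is_route_allowed(component, region)]
```

If `failover_targets` returns an empty list, the component has no compliant failover and the availability trade-off needs a decision from legal as well as SRE.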

Provider training and subprocessors

Vendor questionnaires ask whether customer content is used to train foundation models. Answers vary by product tier, configuration, and date. Version your answers when providers update terms.

Maintain a **subprocessor (production readiness conversation) list**: model providers, embedding services, safety APIs, observability (AI SDK telemetry) vendors, and log storage. Procurement and privacy rely on the same register.

Enterprise agreements may add contractual terms beyond public policies. Track which workloads are covered by which agreement.

When switching models via gateway, subprocessors change. Treat model promotion as a privacy change requiring review if data handling differs.

Document opt-out or zero-retention flags per environment
Annual review of provider trust centre and DPA (Microsoft Copilot data protection)
Notify privacy when adding new tool integration
Common mistake: assuming all Azure OpenAI configs behave identically
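A subprocessor register can be as simple as a list of records with a review date, which makes the annual review checkable rather than aspirational. The fields and the 365-day threshold in this sketch are assumptions.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Subprocessor:
    name: str
    role: str            # e.g. model provider, embedding service, log storage
    last_reviewed: date  # date privacy last checked the provider's terms


def overdue_reviews(register: list[Subprocessor], today: date,
                    max_age_days: int = 365) -> list[str]:
    """Names of subprocessors whose terms have not been reviewed within the window."""
    return [s.name for s in register if (today - s.last_reviewed).days > max_age_days]
```

Running this check whenever a model is promoted through the gateway also catches a new subprocessor that was added without a review date.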

Scenario: HR policy Q&A without overexposure

A bank pilots HR policy Q&A for 2,000 staff. Corpus excludes individual case files and performance reviews. Only published policies with effective dates enter the index.

Users receive notice that questions may be logged in redacted form for 90 days. PII redaction (Azure Content Safety) runs on outbound prompts. Escalation to HR advisers for personal cases is mandatory when questions include individual identifiers.

Legal accepts pilot because citations tie to approved documents and restricted data never entered the index. Scale requires quarterly ACL (Azure RAG concepts) sync and privacy impact assessment update.

Lesson: narrow corpus and clear escalation beat broad "ask anything HR" scope.

Corpus: published policies only, versioned
No individual employee records in index
Redaction plus escalation to HR advisers for personal cases
Retention: 90-day redacted logs, aggregated metrics longer

Scenario: CRM assist with customer data

A telco builds **CRM (AI SDK agents) research assist** for account managers. Customer names and account numbers appear in tool responses. Logs scrub tool arguments; only internal user IDs correlate sessions.

Customer PII never enters the general copilot index. The CRM tool reads live data with OAuth scoped to the user’s accounts. Outputs stay inside the CRM UI and are not emailed externally by default.

Retention on CRM tool audit logs follows customer record policy, often seven years. The ephemeral chat layer retains redacted logs for 30 days.

Privacy sign-off requires a DPIA referencing both the CRM DPA (Microsoft Copilot data protection) and the model provider DPA.

Live CRM read, no bulk export to index
Scoped OAuth per user, not service account god mode
Separate retention tiers for chat vs CRM audit
Block copy-to-clipboard external share without DLP (Microsoft Copilot data protection) where required
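Scrubbing tool arguments before they reach the log can be sketched as a recursive walk over the argument payload. The key names in `SENSITIVE_KEYS` are hypothetical; a real list would come from the CRM schema and the data classification.

```python
# Hypothetical argument names that must never appear in logs.
SENSITIVE_KEYS = {"customer_name", "account_number", "email", "phone"}


def scrub(value, sensitive=SENSITIVE_KEYS):
    """Return a copy of a tool-call payload with sensitive fields masked."""
    if isinstance(value, dict):
        return {
            key: "[REDACTED]" if key in sensitive else scrub(item, sensitive)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [scrub(item, sensitive) for item in value]
    return value  # scalars pass through unchanged


log_record = {
    "internal_user_id": "u-1042",
    "tool": "crm_lookup",
    "args": scrub({"account_number": "88-1234", "filters": [{"email": "a@b.example"}]}),
}
```

Key-based masking only covers structured arguments. Free-text fields still need pattern redaction, since a customer name can appear inside any string value.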

Access control and logging visibility

Who may read conversation logs, index contents, and eval datasets? Restrict to break-glass roles with MFA (OWASP LLM Top 10) and audit trail.

Support staff debugging production need redacted views by default. Full prompt access requires ticket and manager approval.

Champions and sponsors should not browse employee chats casually. Programmes lose trust quickly when measurement feels like surveillance.

Align access model with existing SIEM (AI SDK telemetry) and ticketing roles rather than inventing parallel admin groups.

Role matrix: user, support, admin, auditor, break-glass
MFA (OWASP LLM Top 10) for log and index admin access
Audit log of who viewed which conversation record
Workshop question: "Who should never see full prompts?"
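The redacted-by-default rule and the audit trail of who viewed which record can be sketched together. The role names follow the role matrix above; the in-memory `AUDIT_LOG`, the function name, and its parameters are assumptions for illustration, and a real system would write to an append-only store.

```python
from datetime import datetime, timezone
from typing import Optional

LOG_READER_ROLES = {"support", "admin", "auditor", "break-glass"}
AUDIT_LOG: list[dict] = []  # stand-in for an append-only audit store


def open_conversation(viewer: str, role: str, conversation_id: str,
                      ticket_id: Optional[str] = None,
                      manager_approved: bool = False) -> str:
    """Decide which view a reader gets and record the access either way."""
    if role not in LOG_READER_ROLES:
        raise PermissionError(f"role {role!r} may not read conversation logs")
    # Full prompts need both a ticket and manager approval; otherwise redacted.
    view = "full" if (ticket_id and manager_approved) else "redacted"
    AUDIT_LOG.append({
        "at": datetime.now(timezone.utc).isoformat(),
        "viewer": viewer,
        "role": role,
        "conversation_id": conversation_id,
        "view": view,
        "ticket_id": ticket_id,
    })
    return view
```

Every successful read is logged, including redacted ones, so the audit trail answers "who looked" and not only "who saw full prompts".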

Content safety and harmful outputs

Privacy intersects safety when outputs expose third-party personal data or confidential material from retrieved documents. Content safety (Azure Content Safety) filters reduce harm but do not replace access control.

Configure safety thresholds with legal for regulated industries. Log safety scores without storing blocked harmful content verbatim when possible.

Incident response for privacy breach via model output mirrors traditional data breach playbooks: contain, notify, preserve evidence, remediate index or prompt path.

Azure Content Safety and similar services are subprocessors with their own data handling terms; see the Azure Content Safety overview.

Input and output filtering for harassment and leakage patterns
Refuse when retrieval would expose wrong ACL (Azure RAG concepts) document
Breach runbook linked from production readiness conversation guide
Test: prompt injection (OpenAI mitigations) attempting to exfiltrate other users' data

Vendor due diligence and evidence pack

Security and privacy reviews ask for data-flow diagrams, retention schedules, **subprocessor (production readiness conversation) lists, and sample redacted logs**. Prepare these once and version per release.

Answer questionnaires with specific configuration facts: region, retention days, training opt-out status, encryption modes. Avoid generic "we use Azure" responses.

Link controls to demo patterns in vendor documentation (Azure AI Foundry documentation) as reference implementations, not proof your production config is compliant.

When auditors visit, show deletion job success metrics and ACL (Azure RAG concepts) sync lag dashboards, not only policy PDFs.

Artefact: data-flow diagram (Microsoft Copilot data protection) with classification colours
Artefact: retention table by data type
Artefact: subprocessor (production readiness conversation) register with review date
Artefact: sample redacted log line with field legend
Checklist: DPA (Microsoft Copilot data protection) signed before prod customer data

Pilot minimum vs production privacy bar

Pilots may use synthetic or anonymised data and smaller cohorts, but should still implement classification, ACL (Azure RAG concepts)-aware retrieval, and documented retention. "Pilot" is not an excuse for production customer PII in dev tenants.

Production bar adds tested deletion, legal hold integration, DPIA (Microsoft Copilot data protection) or PIA on file, user notice, and quarterly access reviews.

Graduating pilot to production triggers privacy change assessment if scope, region, or data classes expand.

Stop rules (NIST AI RMF) from the scoping guide apply: if privacy blockers cannot close in two weeks, pause rather than bypass.

Pilot minimum: classified corpus, ACL (Azure RAG concepts) sync, redaction on send, 90-day log cap
Production: deletion jobs, hold process, notices, access reviews, DPIA (Microsoft Copilot data protection)
Never: personal API keys (OWASP LLM Top 10) with customer data
Never: skip notice because "internal only"
What good looks like: privacy sign-off recorded in council minutes
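The two bars can be encoded as a checklist that a promotion review runs against. The control identifiers below are shorthand invented for this sketch; they map one-to-one to the pilot minimum and production items listed above.

```python
PILOT_MINIMUM = {"classified_corpus", "acl_sync", "redaction_on_send", "log_cap_90_days"}
PRODUCTION_EXTRA = {"deletion_jobs", "legal_hold", "user_notice", "access_reviews", "dpia"}


def missing_controls(stage: str, implemented: set[str]) -> set[str]:
    """Return the controls still outstanding for the given stage."""
    required = set(PILOT_MINIMUM)
    if stage == "production":
        required |= PRODUCTION_EXTRA  # production keeps every pilot control
    elif stage != "pilot":
        raise ValueError(f"unknown stage: {stage}")
    return required - implemented
```

An empty result is a precondition for sign-off, not a substitute for it; the review still has to confirm that each control works as documented.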
