AI Labs by The Ops Toolbox

Business sponsorsTechnical leaders

How AI security differs from classic app security

Traditional apps have deterministic logic; AI apps add probabilistic behaviour, untrusted natural language input, and tool calling side effects. Attackers target prompts, retrieved documents, and tool parameters per OWASP LLM Top 10.

Sponsors should expect the same accountability as any production system: named owners, audit trails, and incident response per NIST AI RMF Manage.

Shadow AI (personal accounts, unlogged browser plugins, keys in Slack) is a security programme issue. NIST AI RMF Govern champions should route teams to approved sandboxes.

This guide complements the AWS AI compliance: controls here, artefacts there. Pair with production readiness conversation.

Threat focus: prompt injection, RAG exfiltration, tool abuse
The model said so is not an authorisation model (OpenAI safety best practices)
NIST AI RMF Govern tracks exceptions with expiry dates, not permanent waivers
Workshop question: kill switch per MITRE ATLAS tabletop

Control domains at a glance

Use this map in NIST AI RMF Govern intake and architecture reviews. Each domain needs an owner and evidence mapped in NIST AI RMF Playbook.

Mature programmes link domains to patterns: HITL for tools, deployment gates for deploys, telemetry for runtime.

Red-team exercises should rotate across domains quarterly per Anthropic jailbreak guidance, not only chat jailbreaks.

InfoSec reviews fail when controls exist in policy but not in the request path.

Identity & access: SSO via Entra conditional access
Data: classification per Microsoft Copilot data protection
Model & vendor: approved routes (production readiness conversation)
Application: Content Safety, citations, refusal
Tools & agents: tool calling, HITL, least privilege
Runtime: cost caps, Bedrock logging, kill switches
SDLC: golden set evals in CI, prompt versioning

Identity, access, and segregation

Bind every session to corporate identity (SSO, Entra ID (conditional access), Okta). Separate roles for end users, approvers, admins, and break-glass operators.

Retrieval indexes must enforce the same document ACLs as source systems. A user who cannot open a file in SharePoint must not see it in RAG (Azure RAG concepts) answers.

Service principals for model APIs per production readiness conversation. Short-lived tokens for tool calls with explicit read vs write scope.

Admin actions such as disable tools require MFA and immutable Bedrock logging audit trails.

Role matrix published quarterly (production readiness conversation)
Break-glass accounts monitored (NIST Govern)
Guest access scoped (Copilot privacy)
What good looks like: JWT claims in telemetry docs

Microsoft Entra ID: conditional access

Secrets, keys, and configuration

API keys in repos are the fastest path to breach. Store secrets in a vault per production readiness conversation; rotate quarterly.

Environment separation: dev keys must not point at production vector indexes. CI needs secret scanning per OWASP LLM.

Never log full API keys. Redact tool arguments before SIEM export per data privacy.

Dual control for production prompt promotion via AI Gateway and eval CI.

Vault for model and Azure AI Search credentials
Repo secret scanning with merge blocking (AWS AI compliance)
Personal API keys prohibited (NIST AI RMF Govern policy)
Common mistake: keys in workshop recordings (Copilot adoption training instead)

OpenAI: production best practices (API keys)

Data handling and retrieval boundaries

Classify corpora before vector indexing. Restricted data excluded from Microsoft 365 Copilot and general apps.

Align retention with Microsoft Copilot data protection and legal holds. Document subprocessors per AWS AI compliance.

Metadata filters on every RAG query: tenant, region, classification, effective dates.

Ingestion pipelines need allow lists per Azure RAG solution guide and malware scanning.

Index only approved data classes (NIST AI RMF)
ACL sync failures alert ops (Foundry monitor)
Redact PII before model send (reference architecture patterns patterns)
Block unknown file types (prompt injection via PDF risk)

Prompt injection and untrusted content

Prompt injection (OpenAI mitigations) is when user or document text overrides system instructions (ignore previous rules, hidden instructions in PDFs). You cannot eliminate it with a single filter; layer controls.

Treat retrieved text as untrusted: delimit context per OpenAI safety best practices; prefer cite-only RAG Q&A.

Use Azure Prompt Shields and Anthropic jailbreak tests; still red-team your domain.

Log jailbreak patterns to SIEM. Spikes often precede incidents cited in MITRE ATLAS.

Separate system policy from user content (function calling structure)
Do not let retrieved HTML execute in rich clients (OWASP LLM)
High-risk: workflow + citation over open agents
Post-incident: add attacks to golden set

Tool calling and agent abuse

Tools are the highest-risk surface per OWASP LLM Top 10. Allow lists beat open function calling registries.

Validate tool arguments with schemas via AI SDK tools. Run tools through WAF and API auth like other internal APIs.

Human-in-the-loop before any customer-facing write. Global kill switch: disable agents without taking down read-only Q&A.

Pen-test tools directly. Chat UI tests miss parameter tampering on backend endpoints (agent vs workflow).

Tool registry reviewed before each new tool (AWS AI compliance)
Rate limits per user (cost controls)
Idempotent execute after HITL approval
Kill-switch drill in 90 days (production readiness conversation)

AI SDK: tool calling

Content safety and output controls

Filter inputs and outputs via Azure Content Safety and Bedrock Guardrails. Tune thresholds with legal.

Require citations for policy Q&A per Azure RAG solution guide. Refuse when retrieval is empty (Anthropic hallucinations).

Budget guardrails add latency; document in cost controls architecture diagrams.

Align Microsoft 365 Copilot and custom app rules via Copilot coexistence.

Logging, monitoring, and incident response

Security teams need visibility like SRE: who asked what, which tools fired, which RAG passages retrieved.

Ship logs to SIEM via Bedrock logging or Foundry monitor. Alert on override rate spikes.

Incident runbook: disable tools, human queue, preserve logs, notify legal, root cause with evals.

Post-incident reviews update golden sets and NIST AI RMF Playbook checklist.

Correlation ID across chat, retrieval, tools (telemetry)
Dashboards shared monthly (OpenAI evals)
Retention matches data privacy legal holds
Tabletop twice per year (MITRE ATLAS)

AWS Bedrock: model invocation logging

Secure SDLC for prompts, indexes, and models

Version prompts, chunking config, and embeddings like application code. PR review for tool schema changes and new data sources.

golden set (OpenAI evals) evals in CI on every prompt or index change. Staging must use production-equivalent safety defaults, not relaxed dev settings.

Model version pinning via AI Gateway. Dependency scanning for AI SDK containers.

Rollback plans for vector index rebuilds that shift RAG behaviour overnight.

Two-person review for production prompts (NIST Govern)
Eval regression threshold (Azure prompt flow evaluation)
Subprocessor register (production readiness conversation)
Common mistake: hot-fix prompts without golden set

Baseline vs production control bar

Pilots can start lighter if stop rules (NIST AI RMF) and data class are honest. Production requires the full bar unless council approves a time-bound exception.

Document pilot vs production vs enterprise bars so champions align with NIST AI RMF and production readiness conversation.

Private endpoints and DLP are enterprise add-ons per AWS Well-Architected ML lens; plan before procurement.

Pilot minimum: SSO (Entra), vault, logging, no writes, classified corpus
Production: + SIEM, eval CI, cost caps, tool allow list, pen test
Enterprise: + private endpoints, MITRE ATLAS red team, DLP
Exceptions expire (NIST AI RMF Govern register)

Council and champion security practices

Security should be a standing NIST AI RMF Govern agenda item, not a one-off gate before production readiness conversation.

NIST AI RMF Govern reviews exceptions (new tool, data class, vendor) with expiry. Champions report shadow AI via Copilot adoption paths.

Platform publishes a control checklist on every intake. Pair champions with security liaison fortnightly.

Celebrate near-miss reports; punishment drives shadow AI underground per Microsoft responsible AI.

Monthly: findings, exception register, NIST Manage learnings
Champions never rotate production keys (OpenAI key guidance)
Intake template includes data class and write scope
Exception count trending down (OpenAI evals)

Evidence for InfoSec reviews

When procurement or InfoSec asks for artefacts, use the AWS AI compliance: data-flow diagrams, IAM matrix, sample logs.

Map each control domain to one reference architecture patterns pattern plus production config evidence.

Honest amber gaps beat overclaiming. Reviewers remember credibility at ISO 42001 scale funding time.

Keep evidence per workflow: architecture, eval report, runbook drill, OWASP pen test.

Control-to-artefact matrix (NIST AI RMF Playbook) before review
Sample log walkthrough in 15 minutes (telemetry)
Red-team summary (Prompt Shields + manual tests)
Workshop: what blocks 10x scale? (choosing cloud)

AI security controls & practices

How AI security differs from classic app security

Control domains at a glance

Identity, access, and segregation

Secrets, keys, and configuration

Data handling and retrieval boundaries

Prompt injection and untrusted content

Tool calling and agent abuse

Content safety and output controls

Logging, monitoring, and incident response

Secure SDLC for prompts, indexes, and models

Baseline vs production control bar

Council and champion security practices

Evidence for InfoSec reviews

Plan your next pilot

AI security controls & practices

Executive summary

How AI security differs from classic app security

Control domains at a glance

Identity, access, and segregation

Secrets, keys, and configuration

Data handling and retrieval boundaries

Prompt injection and untrusted content

Tool calling and agent abuse

Content safety and output controls

Logging, monitoring, and incident response

Secure SDLC for prompts, indexes, and models

Baseline vs production control bar

Council and champion security practices

Evidence for InfoSec reviews

Plan your next pilot