Writing Policies
A guardrails policy is a single YAML file that defines how your AI agents are governed. It is the source of truth for what agents can and cannot do.
This page is the complete reference. Read it once, then use it as a lookup.
File Structure
Every policy file has these top-level sections:
version: "1.0" # Required. Always "1.0" for now.
metadata: # Optional. Who wrote this, when, why.
variables: # Optional. Shared values used in rules.
profiles: # Optional. Per-agent permission boundaries.
rules: # Required. The actual guardrail rules.
matchers: # Optional. Named pattern definitions.
Only version and rules are required. Everything else is optional.
Metadata
metadata:
  name: acme-corp-ai-policy
  description: AI agent governance policy for ACME Corporation
  author: compliance@acme.com
Metadata is for humans and audit logs. The engine doesn't use it for evaluation — it attaches it to audit entries so you can trace which policy version produced a decision.
| Field | Type | Description |
|---|---|---|
| name | string | Policy name. Shows in guardrails inspect and audit logs. |
| description | string | What this policy governs. |
| author | string | Who owns this policy. |
Variables
Variables are shared values that rules reference with $variable_name. They keep your rules DRY and make it easy to update a value in one place.
variables:
  company_domain: "acme.com"
  sensitive_domains: ["finance", "legal", "hr"]
  max_actions_per_minute: 100
Use variables in when clauses:
rules:
  - name: external-email-check
    scope: action
    when: "recipient.domain != $company_domain"
    then: require_approval
Variables can be strings, numbers, booleans, or lists. They are substituted at evaluation time, not at parse time — so they work correctly with all operators including in.
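To illustrate why evaluation-time substitution matters, here is a minimal Python sketch. It is not the engine's implementation: resolve_operand and evaluate are hypothetical names, and only the ==, != and in operators are modeled. The point is that $sensitive_domains is looked up as a real list when the condition runs, so in performs a membership test rather than a search over pasted text.

```python
# Illustrative sketch only: hypothetical helpers, not the guardrails engine's code.
variables = {
    "company_domain": "acme.com",
    "sensitive_domains": ["finance", "legal", "hr"],
}

def resolve_operand(token, event, variables):
    """Turn one operand of a when clause into a Python value."""
    if token.startswith("$"):
        # Variable reference: resolved at evaluation time, so a list stays a list.
        return variables[token[1:]]
    if token.startswith("'") and token.endswith("'"):
        return token[1:-1]  # quoted string literal
    # Otherwise treat the token as a dotted path into the event data.
    value = event
    for part in token.split("."):
        value = value[part]
    return value

def evaluate(left, op, right, event, variables):
    """Evaluate a single `left op right` comparison."""
    lhs = resolve_operand(left, event, variables)
    rhs = resolve_operand(right, event, variables)
    if op == "==":
        return lhs == rhs
    if op == "!=":
        return lhs != rhs
    if op == "in":
        return lhs in rhs
    raise ValueError(f"unsupported operator: {op}")

event = {"recipient": {"domain": "example.org"}, "resource": {"domain": "hr"}}
print(evaluate("recipient.domain", "!=", "$company_domain", event, variables))    # True
print(evaluate("resource.domain", "in", "$sensitive_domains", event, variables))  # True
```

Because the variable resolves to a list, a resource domain such as "sales" is correctly reported as not being in $sensitive_domains.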
Profiles
Profiles define per-agent permission boundaries. They answer: "What is this agent allowed to do, and what is it absolutely forbidden from doing?"
profiles:
  default:
    default_tier: autonomous
  sales-agent:
    extends: default
    allow: [read_crm, draft_email, search_knowledge, schedule_meeting]
    deny: [commit_pricing, modify_contract, access_financials]
  finance-agent:
    extends: default
    default_tier: soft
    allow: [read_ledger, generate_report, read_invoices]
    deny: [approve_payment, modify_budget, wire_transfer]
Profile Fields
| Field | Type | Default | Description |
|---|---|---|---|
| extends | string | "" | Parent profile to inherit from. |
| default_tier | string | "autonomous" | Default approval tier for this agent. One of autonomous, soft, strong. |
| allow | list | [] | Actions this agent is explicitly permitted to perform. |
| deny | list | [] | Actions this agent is absolutely forbidden from performing. Deny always wins. |
Inheritance
Profiles can extend other profiles. The child inherits the parent's allow and deny lists, then adds its own.
profiles:
  base:
    allow: [read, search]
  specialized:
    extends: base
    allow: [draft_email]   # Now has: read, search, draft_email
    deny: [read]           # Deny overrides allow: read is now forbidden
Key rule: If an action appears in both allow and deny (directly or via inheritance), deny wins. This prevents accidental privilege escalation.
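The inheritance rule can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the engine's code: resolve_profile is a hypothetical helper, and raising on a circular extends chain is an assumed behavior that the text above does not specify.

```python
# Illustrative sketch only: resolve_profile is a hypothetical helper.
def resolve_profile(name, profiles):
    """Return the effective (allow, deny) sets for a profile, following extends."""
    chain, seen = [], set()
    current = name
    # Walk up the extends chain from child to root.
    while current:
        if current in seen:
            # Assumption: cycles are treated as an error.
            raise ValueError(f"circular extends at {current!r}")
        seen.add(current)
        chain.append(profiles[current])
        current = profiles[current].get("extends", "")

    allow, deny = set(), set()
    # Apply parents first so each child adds to what it inherits.
    for profile in reversed(chain):
        allow |= set(profile.get("allow", []))
        deny |= set(profile.get("deny", []))

    # Deny wins: anything denied is removed from the effective allow set.
    return allow - deny, deny

profiles = {
    "base": {"allow": ["read", "search"]},
    "specialized": {"extends": "base", "allow": ["draft_email"], "deny": ["read"]},
}
allow, deny = resolve_profile("specialized", profiles)
print(sorted(allow), sorted(deny))  # ['draft_email', 'search'] ['read']
```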
How Profiles Are Evaluated
When an action event arrives:
- The engine looks up the agent's profile by matching event.agent to a profile name
- If the action is in the profile's deny list → immediate DENY (no rules evaluated)
- If the action is in the profile's allow list → it passes the profile check (rules still evaluate)
- If the agent has no profile → no profile restrictions apply
Profiles are a first gate. Rules are a second gate. An action must pass both.
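The profile gate described above can be sketched as follows. This is a hedged illustration: profile_gate is a hypothetical function, the profiles are assumed to be already flattened (inheritance resolved), and passing an action that appears in neither list is an assumption, since that case is not specified here.

```python
# Illustrative sketch only: profile_gate is a hypothetical function.
def profile_gate(event, profiles):
    """Return 'deny' or 'pass' for the profile check (the first gate)."""
    profile = profiles.get(event["agent"])
    if profile is None:
        return "pass"   # no profile: no profile restrictions apply
    if event["action"] in profile.get("deny", []):
        return "deny"   # immediate DENY, rules are not evaluated
    # Actions in allow pass the profile check; rules still evaluate afterwards.
    # Assumption: actions in neither list are also passed on to the rules.
    return "pass"

profiles = {
    "sales-agent": {
        "allow": ["read_crm", "draft_email"],
        "deny": ["commit_pricing", "modify_contract"],
    },
}
print(profile_gate({"agent": "sales-agent", "action": "commit_pricing"}, profiles))  # deny
print(profile_gate({"agent": "sales-agent", "action": "read_crm"}, profiles))        # pass
print(profile_gate({"agent": "unknown-agent", "action": "anything"}, profiles))      # pass
```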
Rules
Rules are the core of the policy. Each rule is a condition-action pair: "when this happens, do this."
rules:
  - name: block-prompt-injection
    description: Block detected prompt injection attempts
    scope: input
    when: "content matches prompt_injection"
    then: deny
    reason: "Potential prompt injection detected"
    severity: critical
    enabled: true
    tags: [security, injection]
Rule Fields
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| name | string | Yes | (none) | Unique identifier. Shows in decisions and audit logs. |
| scope | string | Yes | (none) | What event type this rule applies to. |
| then | string | Yes | (none) | What to do when the rule matches. |
| when | string | No | "" | Condition expression. Empty = always matches. |
| description | string | No | "" | Human-readable description of the rule's purpose. |
| reason | string | No | "" | Explanation attached to the decision when this rule fires. |
| severity | string | No | "medium" | Priority level: critical, high, medium, low. |
| tier | string | No | "" | Approval tier (only for require_approval): autonomous, soft, strong. |
| enabled | bool | No | true | Set to false to disable without deleting. |
| tags | list | No | [] | Labels for filtering and reporting. |
| patterns | list | No | [] | Matcher names to use for redaction (only for redact rules). |
| rate_limit | object | No | null | Rate limiting configuration. |
| from | string | No | "" | Source agent (only for cross_agent scope). |
| to | string | No | "" | Target agent (only for cross_agent scope). |
Scope
The scope field determines what kind of event the rule applies to:
| Scope | What it means | Typical data fields |
|---|---|---|
| input | A prompt or message being sent to an agent | content |
| output | A response generated by an agent | content |
| action | An agent performing an action (tool call, API call, etc.) | action, plus action-specific fields |
| tool_call | An agent calling a specific tool | tool_name, arguments |
| cross_agent | One agent communicating with another | message |
A rule only evaluates against events that match its scope. An input rule never fires on an action event.
Outcome (then)
| Outcome | What happens | When to use |
|---|---|---|
| deny | Block the event. The agent cannot proceed. | Security violations, forbidden actions, data leaks. |
| require_approval | Pause and request human approval before proceeding. | Sensitive actions, external communications, financial operations. |
| redact | Allow the event but modify the content (remove PII, mask data). | Privacy protection, data sanitization. |
| allow | Explicitly allow (useful for overriding in specific cases). | Rarely needed, since the default is already allow. |
| log | Allow but flag for review. | Monitoring, anomaly tracking. |
Severity
Severity determines evaluation priority. Higher severity rules fire first.
| Severity | Priority | Use for |
|---|---|---|
| critical | Highest (fires first) | Prompt injection, security breaches |
| high | Second | Data leaks, financial operations, PII |
| medium | Third | External communications, policy compliance |
| low | Lowest (fires last) | Logging, monitoring, informational |
When two rules with the same severity both match, declaration order (position in the YAML file) breaks the tie.
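The ordering rule amounts to a sort on (severity rank, position in file). A minimal Python sketch, assuming rules are plain dicts in declaration order; evaluation_order is a hypothetical name and audit-everything is an invented rule used only to show the medium default:

```python
# Illustrative sketch only: evaluation_order is a hypothetical helper.
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}

def evaluation_order(rules):
    """Return rule names sorted by severity, then by declaration order."""
    # enumerate() records each rule's position so ties keep file order.
    indexed = list(enumerate(rules))
    indexed.sort(
        key=lambda pair: (SEVERITY_RANK[pair[1].get("severity", "medium")], pair[0])
    )
    return [rule["name"] for _, rule in indexed]

rules = [
    {"name": "external-email-approval", "severity": "medium"},
    {"name": "redact-pii-in-output", "severity": "high"},
    {"name": "block-prompt-injection", "severity": "critical"},
    {"name": "financial-writes", "severity": "high"},
    {"name": "audit-everything"},  # hypothetical rule; no severity, so medium
]
print(evaluation_order(rules))
# ['block-prompt-injection', 'redact-pii-in-output', 'financial-writes',
#  'external-email-approval', 'audit-everything']
```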
The when Clause
The when clause is an expression that determines whether the rule fires. See the Expression Language page for the full syntax.
Quick examples:
# Simple field comparison
when: "action == 'send_email'"
# Nested field access
when: "recipient.domain != $company_domain"
# Pattern matching (references a named matcher)
when: "content matches prompt_injection"
# Boolean logic
when: "action == 'send_email' and recipient.domain != $company_domain"
# List membership
when: "resource.domain in $sensitive_domains"
# String operations
when: "path starts_with 'finance/'"
# Combined
when: "(action == 'write' or action == 'delete') and resource.domain in $sensitive_domains"
If when is empty or omitted, the rule always matches events of the specified scope.
Rule Types: Detailed Examples
Deny Rules
Block an event entirely. The agent cannot proceed.
- name: block-prompt-injection
  scope: input
  when: "content matches prompt_injection"
  then: deny
  reason: "Potential prompt injection detected"
  severity: critical
Approval Rules
Pause the event and request human approval. Specify the tier:
- soft: the user confirms in the agent's UI (a simple "approve/reject" dialog)
- strong: out-of-band verification (email code, 2FA, manager approval)
- name: external-email-approval
  scope: action
  when: "action == 'send_email' and recipient.domain != $company_domain"
  then: require_approval
  tier: soft
  severity: medium
- name: financial-writes
  scope: action
  when: "action == 'write' and resource.domain in $sensitive_domains"
  then: require_approval
  tier: strong
  severity: high
Redact Rules
Allow the event but modify the content. The patterns field lists which matchers to use for redaction.
- name: redact-pii-in-output
  scope: output
  when: "content matches pii"
  then: redact
  patterns: [ssn, email_addr, phone]
  severity: high
The patterns list can reference:
- Matcher names (e.g., pii): the entire matcher's redaction is applied
- Sub-pattern names within a matcher (e.g., ssn, email_addr): only those specific patterns are redacted
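As a rough illustration of sub-pattern redaction, the Python sketch below applies only the named patterns to a string. It is not the engine's redaction logic: redact is a hypothetical function, and the "[REDACTED:name]" placeholder format is an assumption made for the example. The regular expressions are the ones from the pii matcher on this page.

```python
# Illustrative sketch only: redact and the placeholder format are assumptions.
import re

PII_PATTERNS = {
    "ssn": r"\b\d{3}-\d{2}-\d{4}\b",
    "email_addr": r"\b[\w.-]+@[\w.-]+\.\w+\b",
    "phone": r"\b\d{3}[\s.-]\d{3}[\s.-]\d{4}\b",
}

def redact(content, pattern_names, patterns=PII_PATTERNS):
    """Replace every match of the selected sub-patterns with a placeholder."""
    for name in pattern_names:
        content = re.sub(patterns[name], f"[REDACTED:{name}]", content)
    return content

text = "Reach Jo at jo@acme.com or 555-123-4567. SSN 123-45-6789."
print(redact(text, ["ssn", "email_addr"]))
# Reach Jo at [REDACTED:email_addr] or 555-123-4567. SSN [REDACTED:ssn].
```

The phone number is left untouched because phone was not listed, which mirrors the "only those specific patterns are redacted" behavior.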
Cross-Agent Rules
Govern communication between agents. Use from and to to specify which agent pair the rule applies to.
- name: no-finance-data-to-sales
  scope: cross_agent
  from: finance-agent
  to: sales-agent
  when: "message matches financial_data"
  then: deny
  reason: "Financial data sharing restricted"
  severity: high
- from is matched against event.source_agent
- to is matched against event.target_agent
- If to is omitted, the rule applies to any target
Rate Limit Rules
Prevent excessive activity. Rate limits are stateful: the engine tracks event counts per key.
- name: rate-limit-actions
  scope: action
  rate_limit:
    max: 100     # Maximum events allowed
    window: 60   # Time window in seconds
    key: agent   # Group by: "agent", "session", or any event data field
  then: deny
  reason: "Rate limit exceeded"
  severity: medium
Rate limits check the count before evaluating the when clause. If the limit is exceeded, the rule fires regardless of the condition.
| key value | Groups by |
|---|---|
| agent | The agent's name (event.agent) |
| session | The session ID (event.session_id) |
| Any other string | A field in event.data (e.g., "user_id") |
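One way to picture per-key counting is the Python sketch below. The text above only says that counts are tracked per key, so several choices here are assumptions: the sliding window, counting events that end up denied, the event dict shape, and the RateLimiter class itself. The clock is passed in as now so the behavior is easy to follow.

```python
# Illustrative sketch only: RateLimiter is hypothetical, and the sliding
# window is an assumed strategy, not documented engine behavior.
from collections import defaultdict, deque

class RateLimiter:
    def __init__(self, max_events, window, key="agent"):
        self.max_events = max_events
        self.window = window               # seconds
        self.key = key
        self._events = defaultdict(deque)  # bucket -> recent timestamps

    def _bucket(self, event):
        """Pick the grouping value, following the key table above."""
        if self.key == "agent":
            return event["agent"]
        if self.key == "session":
            return event["session_id"]
        return event["data"][self.key]

    def exceeded(self, event, now):
        """Record the event and report whether its bucket is over the limit."""
        stamps = self._events[self._bucket(event)]
        # Drop timestamps that have aged out of the window.
        while stamps and now - stamps[0] >= self.window:
            stamps.popleft()
        stamps.append(now)
        return len(stamps) > self.max_events

limiter = RateLimiter(max_events=2, window=60, key="agent")
event = {"agent": "sales-agent"}
print([limiter.exceeded(event, now=t) for t in (0, 1, 2, 61)])
# [False, False, True, False]
```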
Disabled Rules
Set enabled: false to keep a rule in the file without it firing. Useful for testing, phased rollouts, or temporarily suspending a rule.
- name: strict-output-filter
  scope: output
  when: "content contains 'confidential'"
  then: deny
  severity: high
  enabled: false   # Disabled: will not fire
Matchers
Matchers define reusable pattern detection logic. Rules reference them by name in when clauses using the matches operator.
matchers:
  prompt_injection:
    type: keyword_list
    patterns:
      - "ignore previous instructions"
      - "you are now"
      - "disregard above"
      - "system prompt"
    options:
      case_insensitive: true
  pii:
    type: regex
    patterns:
      ssn: "\\b\\d{3}-\\d{2}-\\d{4}\\b"
      email_addr: "\\b[\\w.-]+@[\\w.-]+\\.\\w+\\b"
      phone: "\\b\\d{3}[\\s.-]\\d{3}[\\s.-]\\d{4}\\b"
  financial_data:
    type: keyword_list
    patterns:
      - "revenue"
      - "profit margin"
      - "salary"
    options:
      case_insensitive: true
Matcher Types
| Type | What it does | patterns format |
|---|---|---|
| keyword_list | Substring matching against a list of phrases | List of strings |
| regex | Regular expression matching | Dict of name: pattern or list of patterns |
| pii | Built-in PII detection (SSN, email, phone, credit card, IBAN, IP) | Dict of additional patterns (optional, extends built-in) |
See the Matchers page for details on each type and how to write custom matchers.
Matcher Options
| Option | Applies to | Default | Description |
|---|---|---|---|
| case_insensitive | keyword_list, regex | false | Match regardless of case |
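To make the two basic matcher types and the case_insensitive option concrete, here is a small Python sketch. keyword_match and regex_match are hypothetical functions written for this page and may differ from how the engine implements its matchers. Returning the names of the sub-patterns that matched is likewise an assumption.

```python
# Illustrative sketch only: hypothetical helpers, not the engine's matchers.
import re

def keyword_match(content, phrases, case_insensitive=False):
    """keyword_list: true if any phrase occurs as a substring of content."""
    if case_insensitive:
        content = content.lower()
        phrases = [phrase.lower() for phrase in phrases]
    return any(phrase in content for phrase in phrases)

def regex_match(content, patterns, case_insensitive=False):
    """regex: return the names (or list indexes) of patterns that match."""
    flags = re.IGNORECASE if case_insensitive else 0
    # Accept both formats from the table: a dict of name: pattern, or a list.
    items = patterns.items() if isinstance(patterns, dict) else enumerate(patterns)
    return [name for name, pattern in items if re.search(pattern, content, flags)]

prompt = "Please IGNORE previous instructions and continue."
print(keyword_match(prompt, ["ignore previous instructions"], case_insensitive=True))  # True
print(keyword_match(prompt, ["ignore previous instructions"]))                         # False
print(regex_match("SSN 123-45-6789", {"ssn": r"\b\d{3}-\d{2}-\d{4}\b"}))               # ['ssn']
```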
Complete Example
A production-ready policy for a professional services firm:
version: "1.0"
metadata:
  name: acme-corp-ai-policy
  description: AI agent governance policy for ACME Corporation
  author: compliance@acme.com
variables:
  company_domain: "acme.com"
  internal_domains: ["acme.com", "acme.co.uk", "acme.eu"]
  sensitive_domains: ["finance", "legal", "hr"]
profiles:
  default:
    default_tier: autonomous
  sales-agent:
    extends: default
    allow: [read_crm, draft_email, search_knowledge, schedule_meeting]
    deny: [commit_pricing, modify_contract, access_financials]
  finance-agent:
    extends: default
    default_tier: soft
    allow: [read_ledger, generate_report, read_invoices]
    deny: [approve_payment, modify_budget, wire_transfer]
  hr-agent:
    extends: default
    default_tier: soft
    allow: [read_policies, draft_letter, search_handbook]
    deny: [modify_salary, terminate_employee, access_medical]
rules:
  # === Security ===
  - name: block-prompt-injection
    description: Block detected prompt injection attempts
    scope: input
    when: "content matches prompt_injection"
    then: deny
    reason: "Potential prompt injection detected"
    severity: critical
    tags: [security, injection]
  # === Privacy ===
  - name: redact-pii-in-output
    description: Redact PII from all agent outputs before delivery
    scope: output
    when: "content matches pii"
    then: redact
    patterns: [ssn, email_addr, phone]
    severity: high
    tags: [privacy, pii, compliance]
  # === Communication ===
  - name: no-external-email-without-approval
    description: Emails to external domains require human review
    scope: action
    when: "action == 'send_email' and recipient.domain not in $internal_domains"
    then: require_approval
    tier: soft
    severity: medium
    tags: [compliance, email]
  # === Financial controls ===
  - name: financial-writes-need-strong-approval
    description: Writing to sensitive domains requires manager approval
    scope: action
    when: "action == 'write' and resource.domain in $sensitive_domains"
    then: require_approval
    tier: strong
    severity: high
    tags: [compliance, finance]
  # === Data isolation ===
  - name: no-finance-data-to-sales
    description: Finance agents cannot share raw financial data with sales
    scope: cross_agent
    from: finance-agent
    to: sales-agent
    when: "message matches financial_data"
    then: deny
    reason: "Financial data sharing restricted between these agent roles"
    severity: high
    tags: [data-isolation, compliance]
  # === Safety ===
  - name: rate-limit-actions
    description: Prevent runaway agents from flooding external systems
    scope: action
    rate_limit:
      max: 100
      window: 60
      key: agent
    then: deny
    reason: "Rate limit exceeded: max 100 actions per minute"
    severity: medium
    tags: [safety, rate-limit]
matchers:
  prompt_injection:
    type: keyword_list
    patterns:
      - "ignore previous instructions"
      - "ignore all previous"
      - "you are now"
      - "disregard above"
      - "system prompt"
      - "reveal your instructions"
      - "override safety"
      - "jailbreak"
    options:
      case_insensitive: true
  pii:
    type: regex
    patterns:
      ssn: "\\b\\d{3}-\\d{2}-\\d{4}\\b"
      email_addr: "\\b[\\w.-]+@[\\w.-]+\\.\\w+\\b"
      phone: "\\b\\d{3}[\\s.-]\\d{3}[\\s.-]\\d{4}\\b"
      credit_card: "\\b\\d{4}[\\s-]?\\d{4}[\\s-]?\\d{4}[\\s-]?\\d{4}\\b"
  financial_data:
    type: keyword_list
    patterns:
      - "revenue"
      - "profit margin"
      - "quarterly earnings"
      - "salary"
      - "compensation"
      - "operating income"
    options:
      case_insensitive: true
Evaluation Order
Understanding how the engine evaluates rules is important for writing predictable policies.
- Profile check: if the agent has a profile and the action is in deny → immediate DENY
- Rate limit check: if any rate limit rule is exceeded → immediate DENY
- Rules evaluated by severity: critical first, then high, medium, low
- Within same severity: declaration order (position in the YAML file)
- First DENY wins: the engine short-circuits, no further rules evaluated
- If no DENY: highest-tier REQUIRE_APPROVAL wins
- If no DENY or APPROVAL: all REDACT decisions are merged
- If nothing matches: default is ALLOW
Default is ALLOW
If no rule matches, the event is allowed. Guardrails add restrictions on top of your existing authorization layer — they don't replace it. Your auth system handles "who can do what." Guardrails handle "what should never happen."
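The last four steps of the evaluation order (how matched rules collapse into one decision) can be sketched in Python. This is an approximation under stated assumptions: combine is a hypothetical function, the input is the list of rules that already matched in evaluation order, the tier ranking autonomous < soft < strong is inferred from the tier names, and matched log or allow rules are treated as a plain allow.

```python
# Illustrative sketch only: combine is a hypothetical helper.
TIER_RANK = {"autonomous": 0, "soft": 1, "strong": 2}  # assumed ordering

def combine(matched):
    """Collapse matched rules (in evaluation order) into a single decision."""
    # First DENY wins and short-circuits everything else.
    for rule in matched:
        if rule["then"] == "deny":
            return {"outcome": "deny", "rule": rule["name"]}

    # No DENY: the highest-tier approval requirement wins.
    approvals = [r for r in matched if r["then"] == "require_approval"]
    if approvals:
        top = max(approvals, key=lambda r: TIER_RANK[r["tier"]])
        return {"outcome": "require_approval", "tier": top["tier"], "rule": top["name"]}

    # No DENY or approval: merge the patterns of every redact rule.
    redactions = [r for r in matched if r["then"] == "redact"]
    if redactions:
        patterns = []
        for rule in redactions:
            for name in rule.get("patterns", []):
                if name not in patterns:
                    patterns.append(name)
        return {"outcome": "redact", "patterns": patterns}

    # Nothing restrictive matched: default is ALLOW.
    return {"outcome": "allow"}

matched = [
    {"name": "financial-writes", "then": "require_approval", "tier": "strong"},
    {"name": "external-email-approval", "then": "require_approval", "tier": "soft"},
]
print(combine(matched))
# {'outcome': 'require_approval', 'tier': 'strong', 'rule': 'financial-writes'}
```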
Tips for Writing Good Policies
Start small. Begin with 3-5 critical rules (injection blocking, PII redaction, external communication approval). Add more as you learn what your agents actually do.
Use severity honestly. Critical means "if this fires, something dangerous was about to happen." Don't make everything critical — it dilutes the signal.
Name rules clearly. The rule name shows up in every decision and audit entry. block-prompt-injection is better than rule-1.
Use tags for organization. Tags let you filter rules in guardrails inspect --tag compliance and in audit queries. Group by concern: security, privacy, compliance, safety.
Use variables for values that change. Company domains, rate limits, sensitive department lists — put them in variables so policy updates don't require editing every rule.
Test with dry run. Before enforcing a new policy in production, run with dry_run=True to see what would be blocked without actually blocking it.
Version in git. The YAML file is the policy. Treat it like code: pull requests, reviews, blame history. When an auditor asks "why was this agent blocked last Tuesday?", the answer is in the git log.