Security¶
theaios-context-router is designed to handle untrusted YAML configurations and fetch content from multiple sources. This page documents the security controls in place and what operators should be aware of.
Threat Model¶
The library operates in environments where:
- YAML configs may be written by different teams (some less trusted than others)
- Data sources include local files, git repos, and external APIs
- Multiple agents with different permission levels query the same router
- Cached data is stored on disk and loaded across restarts
The library assumes the process owner is trusted but config authors and data sources may not be.
Protections¶
SSRF Protection (HTTP API Source)¶
The http_api source validates all URLs before making requests:
- Scheme whitelist: only http:// and https:// are allowed. file://, gopher://, and ftp:// are blocked.
- Private IP blocking: requests to 127.0.0.1, 10.x.x.x, 172.16-31.x.x, 192.168.x.x, 169.254.x.x (link-local), and ::1 are blocked.
- Applied after template substitution: the URL is validated after {{query}} is replaced, preventing query-based SSRF bypasses.
# This config would be rejected at fetch time:
sources:
  evil:
    type: http_api
    url: "http://169.254.169.254/latest/meta-data"  # AWS metadata endpoint: BLOCKED
Limitation: DNS rebinding attacks (where a hostname resolves to a private IP) are not blocked. For high-security environments, use a network-level firewall or HTTP proxy.
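The checks listed above can be sketched in a few lines of Python. This is a minimal illustration, not the library's actual code: the names is_url_allowed and render_and_check are hypothetical, and the sketch only inspects IP literals, so it shares the DNS limitation just described.

```python
import ipaddress
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}


def is_url_allowed(url: str) -> bool:
    """Return True if the URL passes the scheme and literal-IP checks."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False  # malformed URL, e.g. an unterminated IPv6 bracket
    if parts.scheme not in ALLOWED_SCHEMES:
        return False
    host = parts.hostname
    if not host:
        return False
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal. Hostnames are not resolved in this sketch,
        # so a name that resolves to a private address still passes.
        return True
    return not (addr.is_private or addr.is_loopback or addr.is_link_local)


def render_and_check(template: str, query: str) -> str:
    """Substitute {{query}} first, then validate the final URL."""
    url = template.replace("{{query}}", query)
    if not is_url_allowed(url):
        raise ValueError("URL rejected by SSRF checks")
    return url
```

Validating the rendered URL rather than the template matters because a template such as http://{{query}}/x looks harmless until the query supplies the host.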
Command Injection Protection (Git Source)¶
The git_repo source runs git ls-tree and git show via subprocess. All inputs are validated:
- Git refs (branch names, tags, SHAs) are validated against ^[a-zA-Z0-9._/-]+$. Characters like ;, $, `, (, ) are rejected.
- File paths from git ls-tree output are validated against ^[a-zA-Z0-9._/- ]+$. Files with special characters in their names are skipped.
- No shell=True: all subprocess calls use list syntax, preventing shell metacharacter injection.
- Timeouts: git ls-tree has a 30-second timeout, git show has a 10-second timeout.
# This config would be rejected:
sources:
  evil:
    type: git_repo
    path: /repo
    ref: "main; cat /etc/passwd"  # REJECTED: fails validation
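As a rough sketch of how these controls fit together, the following validates a ref and then calls git with list arguments and a timeout. The function names and the exact git arguments are assumptions for illustration, not the library's implementation.

```python
import re
import subprocess
from typing import List

# Character classes follow the patterns documented above. The hyphen is
# written last in PATH_PATTERN so Python's re reads it as a literal
# character rather than as part of a range.
REF_PATTERN = re.compile(r"[a-zA-Z0-9._/-]+")
PATH_PATTERN = re.compile(r"[a-zA-Z0-9._/ -]+")


def is_valid_ref(ref: str) -> bool:
    # fullmatch also rejects a trailing newline, which "$" alone would allow.
    return REF_PATTERN.fullmatch(ref) is not None


def list_files(repo_path: str, ref: str) -> List[str]:
    """List files at a ref, skipping names with unexpected characters."""
    if not is_valid_ref(ref):
        raise ValueError("invalid git ref")
    result = subprocess.run(
        # List syntax: each element is passed to git as one argument,
        # so no shell ever interprets the ref.
        ["git", "-C", repo_path, "ls-tree", "-r", "--name-only", ref],
        capture_output=True,
        text=True,
        timeout=30,
        check=True,
    )
    return [p for p in result.stdout.splitlines() if PATH_PATTERN.fullmatch(p)]
```

Validation and list syntax are complementary: the pattern rejects hostile refs early with a clear error, while list syntax keeps anything that does reach git from being parsed by a shell.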
Path Traversal Protection (Directory Source)¶
The directory source reads files from a configured base path:
- Path resolution: Path.resolve() is called on both the base directory and each file, resolving all symlinks.
- Containment check: every resolved file path is verified to start with the resolved base directory path. Files outside the base (via symlinks or .. patterns) are silently skipped.
- File size limit: max_file_size (default 1 MB) prevents reading very large files.
# Symlink escape is blocked:
# /data/policies/evil_link -> /etc/passwd
# The router skips this file because resolve() shows it's outside /data/policies/
Limitation: the path field in the YAML config can point to any directory the process can read. Restrict which directories are accessible by running the router with appropriate OS-level permissions or validating configs before deployment.
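The resolve-then-check pattern can be sketched as follows. This is an illustrative approximation under stated assumptions: is_contained and iter_safe_files are hypothetical names, the byte value used for the 1 MB default is a guess, and the sketch uses Path.relative_to() for the containment test rather than whatever comparison the library performs internally.

```python
from pathlib import Path
from typing import Iterator

MAX_FILE_SIZE = 1_000_000  # assumed byte value for the 1 MB default


def is_contained(base: Path, candidate: Path) -> bool:
    """True if candidate, after resolving symlinks and '..', is under base."""
    base_resolved = base.resolve()
    resolved = candidate.resolve()
    try:
        resolved.relative_to(base_resolved)
    except ValueError:
        return False
    return True


def iter_safe_files(base: Path) -> Iterator[Path]:
    """Yield regular files under base that pass containment and size checks."""
    for path in sorted(base.rglob("*")):
        if not path.is_file():
            continue
        if not is_contained(base, path):
            continue  # symlink or '..' escape: skipped silently
        if path.stat().st_size > MAX_FILE_SIZE:
            continue
        yield path
```

One design note on the sketch: relative_to() compares whole path components, so a sibling such as /data/policies-old is not mistaken for something inside /data/policies, which a plain string prefix comparison would get wrong.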
Atomic Writes (Cache)¶
Cache files are written using the atomic tempfile + rename pattern:
- Content is written to a temporary file in the same directory
- The temporary file is renamed to the final path via Path.replace()
- On most filesystems, replace() is atomic, so readers never see partial writes
This prevents cache corruption from process crashes or concurrent access.
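A minimal sketch of the tempfile + rename pattern, assuming a text payload; the helper name atomic_write_text is hypothetical and the library's own cache writer may differ in detail.

```python
import os
import tempfile
from pathlib import Path


def atomic_write_text(target: Path, content: str) -> None:
    """Write content to target so readers see the old or new file, never a mix."""
    target.parent.mkdir(parents=True, exist_ok=True)
    # The temp file lives in the same directory as the target so the final
    # rename stays within one filesystem.
    fd, tmp_name = tempfile.mkstemp(
        dir=target.parent, prefix=target.name + ".", suffix=".tmp"
    )
    tmp_path = Path(tmp_name)
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(content)
        tmp_path.replace(target)
    except BaseException:
        # Clean up the temp file if writing or renaming failed.
        tmp_path.unlink(missing_ok=True)
        raise
```

Creating the temporary file next to the target is the important detail: a rename across filesystems is not a single operation, so a temp file in a system-wide temp directory could lose the atomicity guarantee.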
Safe Expression Language¶
Route conditions are evaluated by a custom recursive descent parser, not Python's eval() or exec():
- The parser only supports: field access, comparisons, boolean operators, string operations, variables, and literals
- No function calls, imports, or arbitrary code execution
- No access to Python builtins or the os module
- Parsing errors raise ExpressionError with the source position
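To illustrate why this approach is safe by construction, here is a deliberately tiny recursive descent evaluator. Its grammar (integers, single-quoted strings, dotted field names, ==, !=, and, or) is a hypothetical subset chosen for the example, and the ExpressionError constructor shown is an assumption; the router's real grammar and error type are richer than this sketch.

```python
import re
from typing import Any, List, Tuple


class ExpressionError(Exception):
    """Error that records where in the source text the problem occurred."""

    def __init__(self, message: str, position: int) -> None:
        super().__init__(f"{message} at position {position}")
        self.position = position


# Hypothetical token set: integers, single-quoted strings, ==, !=, names.
TOKEN_RE = re.compile(
    r"(?P<num>\d+)|(?P<str>'[^']*')|(?P<op>==|!=)|(?P<name>[A-Za-z_][\w.]*)"
)
Token = Tuple[str, str, int]  # (kind, text, position)


def tokenize(source: str) -> List[Token]:
    tokens: List[Token] = []
    pos = 0
    while True:
        while pos < len(source) and source[pos].isspace():
            pos += 1
        if pos == len(source):
            break
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise ExpressionError("unexpected character", pos)
        tokens.append((match.lastgroup, match.group(), pos))
        pos = match.end()
    tokens.append(("end", "", len(source)))
    return tokens


class Parser:
    def __init__(self, source: str, context: dict) -> None:
        self.tokens = tokenize(source)
        self.index = 0
        self.context = context

    def peek(self) -> Token:
        return self.tokens[self.index]

    def advance(self) -> Token:
        token = self.tokens[self.index]
        self.index += 1
        return token

    def parse(self) -> bool:
        value = self.or_expr()
        kind, _, pos = self.peek()
        if kind != "end":
            raise ExpressionError("unexpected token", pos)
        return bool(value)

    def or_expr(self) -> Any:  # or_expr := and_expr ("or" and_expr)*
        value = self.and_expr()
        while self.peek()[:2] == ("name", "or"):
            self.advance()
            right = self.and_expr()
            value = bool(value) or bool(right)
        return value

    def and_expr(self) -> Any:  # and_expr := comparison ("and" comparison)*
        value = self.comparison()
        while self.peek()[:2] == ("name", "and"):
            self.advance()
            right = self.comparison()
            value = bool(value) and bool(right)
        return value

    def comparison(self) -> Any:  # comparison := operand (("==" | "!=") operand)?
        left = self.operand()
        kind, text, _ = self.peek()
        if kind == "op":
            self.advance()
            right = self.operand()
            return left == right if text == "==" else left != right
        return left

    def operand(self) -> Any:  # operand := number | string | dotted field name
        kind, text, pos = self.advance()
        if kind == "num":
            return int(text)
        if kind == "str":
            return text[1:-1]
        if kind == "name" and text not in ("and", "or"):
            value: Any = self.context
            for part in text.split("."):
                if not isinstance(value, dict) or part not in value:
                    raise ExpressionError(f"unknown field {text!r}", pos)
                value = value[part]
            return value
        raise ExpressionError("expected a value", pos)


def evaluate(source: str, context: dict) -> bool:
    return Parser(source, context).parse()
```

Because the grammar has no rule for parentheses after a name, an input such as __import__('os') fails during tokenizing with a position, and names are only ever looked up in the supplied context dictionary, never in Python's namespace.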
YAML Deserialization¶
All YAML loading uses yaml.safe_load(), which constructs only standard YAML types such as str, int, float, bool, list, dict, and None. It does not instantiate arbitrary Python objects, which prevents deserialization attacks.
Environment Variable Safety¶
Config files support ${ENV_VAR} interpolation for secrets (API keys, tokens). The library:
- Validates the YAML structure before interpolating environment variables
- Never includes interpolated values in error messages
- Does not log or cache raw secret values
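The behaviour described above might look roughly like this. The sketch assumes the config has already been parsed and structurally validated, and it uses hypothetical names (interpolate, ConfigError); it is not the library's implementation.

```python
import os
import re
from typing import Any, Mapping

ENV_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


class ConfigError(Exception):
    """Hypothetical error type for configuration problems."""


def interpolate(node: Any, env: Mapping[str, str] = os.environ) -> Any:
    """Replace ${NAME} in string values of an already-validated config tree."""
    if isinstance(node, dict):
        return {key: interpolate(value, env) for key, value in node.items()}
    if isinstance(node, list):
        return [interpolate(item, env) for item in node]
    if isinstance(node, str):

        def substitute(match: "re.Match[str]") -> str:
            name = match.group(1)
            if name not in env:
                # The message names the variable but never echoes a value.
                raise ConfigError(f"environment variable {name} is not set")
            return env[name]

        return ENV_PATTERN.sub(substitute, node)
    return node  # numbers, booleans, and None pass through unchanged
```

Running interpolation after validation means validation errors are raised against the raw ${NAME} placeholders, so a failed check has no secret values available to leak into its message.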
Recommendations for Operators¶
- Restrict config write access. The YAML config is the primary attack surface. Only trusted team members should be able to modify it.
- Run with least privilege. The router process should have read access only to the directories it needs. Don't run as root.
- Review source paths. The directory source can read any path the process has access to. Audit path values in your configs.
- Use HTTPS for API sources. The router allows HTTP, but HTTPS should be used for production API endpoints.
- Monitor cache directory. Cache files in .context_router_cache/ contain source content. Apply appropriate filesystem permissions.
- Validate configs in CI. Run context-router validate --config your-config.yaml in your CI pipeline before deploying config changes.
Reporting Vulnerabilities¶
If you find a security vulnerability, please email charafeddine@cohorte.co instead of opening a public issue. We will acknowledge receipt within 48 hours and aim to release a fix within 7 days for critical issues.