What are guardrails?
Guardrails are security policies that sit between your AI assistant and the tools it uses. They inspect, validate, and transform requests and responses to ensure your AI operates within safe boundaries.
Why guardrails matter: When AI assistants have access to powerful tools, they need constraints that traditional security models don’t provide. Guardrails protect against data exposure, accidental destructive actions, and prompt injection attacks.

Add guardrails to tools directly from the Nexus UI
Managing guardrails
Manage guardrails through the Nexus UI when adding tools to your toolkit, or through natural conversation with your AI assistant.
View available guardrails
Ask your AI assistant:
“What guardrail templates are available for Gmail?”
“Show me guardrails I can add for the GitHub search_code tool”
Add a guardrail
“Add a guardrail to block searches containing ‘password’ and ‘secret’”
“Set up PII redaction for Notion responses”
The AI will find the appropriate template, ask for any required values, and create the guardrail for your toolkit.
List active guardrails
“What guardrails are currently active for my GitHub server?”
Remove a guardrail
“Remove the guardrail that blocks Gmail searches for passwords”
Why use guardrails?
AI assistants are powerful but need appropriate constraints:

| Risk | Guardrail solution |
|---|---|
| AI reads sensitive data it shouldn’t | Request guardrails block access to certain fields or patterns |
| Tool responses contain PII | Response guardrails automatically redact sensitive data |
| Prompt injection attempts | Built-in detection blocks malicious prompts |
| Overwhelming context with large responses | Response processors truncate or transform data |
| Accidental destructive actions | Block or require confirmation for write operations |
The AI is not a human
When a human reads an email, they understand context and exercise judgment. An AI assistant:
- Will happily read every email in your inbox if asked
- Cannot distinguish between legitimate requests and prompt injection attacks
- May expose sensitive data by including it in responses
- Could execute destructive operations without understanding consequences
Scale amplifies risk
What takes a human hours to do manually, an AI can do in seconds. A misconfigured tool that exposes one record is an incident. An AI iterating through thousands of records is a breach.
How guardrails work
Guardrails are evaluated at two points in the tool execution pipeline:
Request guardrails
Evaluated before the tool runs. They can:
- Block requests that violate policies
- Validate parameters meet requirements
- Filter which operations are allowed
Response guardrails
Evaluated after the tool runs. They can:
- Redact sensitive information from responses
- Transform data (e.g., HTML to Markdown, JSON to CSV)
- Truncate overly long responses
- Remove specific fields from the output
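The two evaluation points described above can be sketched as a small pipeline. This is a minimal illustration, not the Nexus implementation: `run_tool`, `block_terms`, `truncate`, and `GuardrailBlocked` are hypothetical names, and it assumes a request guardrail returns a reason string to block a call or `None` to allow it.

```python
from typing import Any, Callable, Optional

# Hypothetical shapes: a request guardrail returns a block reason or None;
# a response guardrail returns the (possibly transformed) response.
RequestGuardrail = Callable[[str, dict], Optional[str]]
ResponseGuardrail = Callable[[Any], Any]


class GuardrailBlocked(Exception):
    """Raised when a request guardrail rejects a tool call."""


def block_terms(terms):
    """Build a request guardrail that blocks string arguments containing a term."""
    lowered = [t.lower() for t in terms]

    def check(tool_name, args):
        for value in args.values():
            if isinstance(value, str):
                for term in lowered:
                    if term in value.lower():
                        return f"argument contains blocked term '{term}'"
        return None

    return check


def truncate(max_chars):
    """Build a response guardrail that truncates overly long string responses."""

    def transform(response):
        if isinstance(response, str) and len(response) > max_chars:
            return response[:max_chars]
        return response

    return transform


def run_tool(tool, tool_name, args, request_guardrails, response_guardrails):
    # Request guardrails run before the tool; the first violation stops the call.
    for guard in request_guardrails:
        reason = guard(tool_name, args)
        if reason is not None:
            raise GuardrailBlocked(reason)
    response = tool(**args)
    # Response guardrails run after the tool, each transforming the output in turn.
    for guard in response_guardrails:
        response = guard(response)
    return response
```

Because a blocked request raises before `tool` is ever called, a rejected call has no side effects in this sketch; response guardrails only ever see output from calls that passed every request check.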
Built-in protection
Every Nexus account includes universal guardrails that are always active:
PII detection and redaction
Automatically detects and can redact:
- Social Security Numbers (SSN)
- Credit card numbers
- Email addresses
- Phone numbers (international formats)
- IP addresses
- Passport numbers
- Driver’s license numbers
- Bank account numbers (IBAN)
- Dates of birth
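To give a feel for pattern-based redaction, here is a minimal sketch covering three of the categories above. The regular expressions are deliberately simplified assumptions (they are not the patterns Nexus uses), and `redact_pii` and `PII_PATTERNS` are hypothetical names.

```python
import re

# Simplified, illustrative patterns; real detectors need stricter validation
# (for example checksum tests for card numbers and range checks for IP octets).
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def redact_pii(text, kinds=None):
    """Replace matches of the selected PII kinds (all kinds by default)."""
    selected = PII_PATTERNS if kinds is None else {k: PII_PATTERNS[k] for k in kinds}
    for pattern in selected.values():
        text = pattern.sub("[REDACTED]", text)
    return text
```

Passing `kinds=["email"]` limits redaction to a single category, which mirrors a request such as enabling redaction for email addresses only.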
Prompt injection detection
Based on OWASP LLM01:2025, detects:
- Direct instruction override attempts
- Role manipulation patterns
- Context escape attempts
- Jailbreak patterns
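As a rough illustration of what pattern-based detection for some of these categories might look like, the sketch below flags a few well-known phrasings. It is a toy heuristic under stated assumptions: the patterns and the names `INJECTION_PATTERNS` and `detect_injection` are invented for this example, and real detection is considerably more involved than keyword matching.

```python
import re

# Hypothetical, intentionally small pattern set; each entry is (category, pattern).
INJECTION_PATTERNS = [
    (
        "instruction_override",
        re.compile(
            r"\b(?:ignore|disregard|forget)\b.{0,40}\b(?:previous|prior|above)\b"
            r".{0,20}\binstructions?\b",
            re.IGNORECASE,
        ),
    ),
    (
        "role_manipulation",
        re.compile(
            r"\byou are now\b|\bact as (?:an? )?(?:admin|system|developer)\b",
            re.IGNORECASE,
        ),
    ),
    (
        "context_escape",
        re.compile(r"</?(?:system|assistant)>|\[/?INST\]", re.IGNORECASE),
    ),
]


def detect_injection(text):
    """Return the categories whose pattern appears in the text, in list order."""
    return [name for name, pattern in INJECTION_PATTERNS if pattern.search(text)]
```

A heuristic like this will miss paraphrased attacks and can flag harmless text, so it is best read as a picture of the categories rather than a usable filter.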
Unsafe file format blocking
Prevents access to potentially dangerous file types:
- Executables (.exe, .dll, .sh, .bat)
- Scripts (.ps1, .vbs, .js)
- Archives (.zip, .rar, .7z)
- Office files with macros (.xlsm, .docm)
- System files (.msi, .deb, .rpm)
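A check of this kind can be as simple as comparing the file extension against a blocklist. The sketch below uses the extensions listed above; `is_blocked_file` is a hypothetical helper, and checking only the final extension is an assumption of this example rather than a description of how Nexus inspects files.

```python
from pathlib import PurePosixPath

# Extensions taken from the list above.
BLOCKED_EXTENSIONS = {
    ".exe", ".dll", ".sh", ".bat",   # executables
    ".ps1", ".vbs", ".js",           # scripts
    ".zip", ".rar", ".7z",           # archives
    ".xlsm", ".docm",                # Office files with macros
    ".msi", ".deb", ".rpm",          # system files
}


def is_blocked_file(filename):
    """Return True if the file's final extension is on the blocklist."""
    suffix = PurePosixPath(filename).suffix.lower()
    return suffix in BLOCKED_EXTENSIONS
```

Lowercasing the suffix keeps `installer.EXE` from slipping past the check; content-based detection would be needed to catch renamed files.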
Guardrail hierarchy
Guardrails operate at three levels, each with different scope:

| Level | Scope | Example use case |
|---|---|---|
| Account | Applies to all users and toolkits | Company-wide PII redaction policy |
| Toolkit | Applies to a specific toolkit | Production toolkit blocks write operations |
| User | Applies to a specific user | Individual’s custom blocked terms |
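Higher levels take precedence. For example, a guardrail set at a higher level that blocks access to /etc/passwd cannot be bypassed by a user-level guardrail.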
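One way to model this precedence is to consult the levels from broadest to narrowest and let the first level with an opinion decide. The sketch below is an assumption about the resolution order, not documented Nexus behavior; `LEVELS`, `resolve`, and the rule format are hypothetical.

```python
# Broadest scope first: a decision at an earlier level cannot be overridden later.
LEVELS = ("account", "toolkit", "user")


def resolve(target, rules):
    """Return (decision, deciding_level) for a target.

    `rules` maps a level name to a dict of {target: "block" | "allow"}.
    Targets with no rule at any level default to "allow" in this sketch.
    """
    for level in LEVELS:
        decision = rules.get(level, {}).get(target)
        if decision is not None:
            return decision, level
    return "allow", None


rules = {
    "account": {"/etc/passwd": "block"},
    "user": {"/etc/passwd": "allow", "/tmp/scratch": "block"},
}
print(resolve("/etc/passwd", rules))   # ('block', 'account')
print(resolve("/tmp/scratch", rules))  # ('block', 'user')
```

The user-level "allow" for /etc/passwd is never reached because the account level has already decided, while the user can still add restrictions of their own.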
Common guardrail examples
Block sensitive search terms
Prevent the AI from searching for passwords, secrets, or credentials.
“Add a guardrail to block searches containing ‘password’, ‘secret’, ‘api_key’”
Restrict to specific domains
Only allow web fetching from approved domains.
“Add a guardrail to only allow fetching from docs.example.com”
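A domain restriction like this is safest when it compares the parsed hostname exactly instead of searching the URL string. The following is a minimal sketch using Python's standard `urllib.parse`; `is_allowed_url` is a hypothetical helper, and exact-match semantics (no subdomain wildcards) are an assumption of this example.

```python
from urllib.parse import urlsplit


def is_allowed_url(url, allowed_domains):
    """Return True only if the URL's hostname exactly matches an allowed domain."""
    host = urlsplit(url).hostname
    if host is None:
        # Not an absolute URL with a network location; reject it.
        return False
    return any(host == domain.lower() for domain in allowed_domains)
```

Matching on the parsed hostname rejects look-alikes such as `https://docs.example.com.evil.net/` and URLs that merely mention the approved domain in a query string, both of which a substring check would let through.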
Redact PII in responses
Automatically replace sensitive data with [REDACTED].
“Enable PII redaction for email addresses in all responses”
Block destructive operations
Prevent accidental data loss.
“Add a guardrail to block the delete_repository tool on GitHub”
Response processors
In addition to security guardrails, you can add response processors that optimize tool outputs for AI consumption. These reduce token usage and improve response quality.
Retain specific fields
Keep only the fields you need, removing unnecessary data.
“Add a processor to retain only ‘id’, ‘name’, and ‘status’ from campaign responses”
Convert HTML to Markdown
Transform verbose HTML into compact Markdown.
“Convert HTML responses to Markdown for the web scraper”
Remove metadata fields
Strip internal fields like timestamps and IDs.
“Remove ‘created_at’, ‘updated_at’, and ‘internal_id’ from responses”
Truncate long content
Abbreviate overly long text fields.
“Truncate description fields to 500 characters”
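The field-level processors above amount to small transformations on a response record. The sketch below shows three of them on a plain dictionary; the function names are hypothetical, and it assumes flat records (nested schema paths are out of scope here).

```python
def retain_fields(record, fields):
    """Keep only the named fields."""
    return {key: value for key, value in record.items() if key in fields}


def remove_fields(record, fields):
    """Drop the named fields and keep everything else."""
    return {key: value for key, value in record.items() if key not in fields}


def truncate_field(record, field, max_chars):
    """Shorten one string field to at most max_chars characters."""
    value = record.get(field)
    if isinstance(value, str) and len(value) > max_chars:
        return {**record, field: value[:max_chars]}
    return record


campaign = {
    "id": 7,
    "name": "Spring launch",
    "status": "active",
    "internal_id": "x-991",
    "description": "A" * 800,
}
slim = retain_fields(campaign, {"id", "name", "status"})
print(slim)  # {'id': 7, 'name': 'Spring launch', 'status': 'active'}
```

Each function returns a new dictionary instead of mutating its input, so processors can be chained in any order without affecting the original tool response.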
Troubleshooting
You don't have permission to add guardrails
Adding and removing guardrails requires account management permissions. Contact your account administrator or check your role.
Guardrail not triggering
Ensure:
- The guardrail is enabled (check with “list active guardrails”)
- The guardrail is scoped to the correct server and tool
- The data matches the expected schema path
Too many false positives
If guardrails are blocking legitimate requests:
- Review the guardrail’s value/pattern configuration
- Consider using a more specific schema path
- Remove and re-add with adjusted parameters
Best practices
Guardrails complement authentication and authorization by controlling how tools are used, not just who can use them.
1. Start restrictive
Begin with tight controls and loosen as needed. It’s easier to relax rules than recover from a breach.
2. Layer defenses
Combine multiple guardrail types for robust protection. Request validation + response redaction provides defense in depth.
3. Monitor triggers
Track when guardrails activate to understand patterns and refine policies.
4. Document policies
Make it clear why each guardrail exists so future team members understand the reasoning.

