
What are guardrails?

Guardrails are security policies that sit between your AI assistant and the tools it uses. They inspect, validate, and transform requests and responses to ensure your AI operates within safe boundaries.
Why guardrails matter: When AI assistants have access to powerful tools, they need constraints that traditional security models don’t provide. Guardrails protect against data exposure, accidental destructive actions, and prompt injection attacks.
[Image: Nexus UI showing guardrail options for GitHub tools]

Add guardrails to tools directly from the Nexus UI

Managing guardrails

Manage guardrails through the Nexus UI when adding tools to your toolkit, or through natural conversation with your AI assistant.

View available guardrails

Ask your AI assistant:
“What guardrail templates are available for Gmail?”
“Show me guardrails I can add for the GitHub search_code tool”

Add a guardrail

“Add a guardrail to block searches containing ‘password’ and ‘secret’”
“Set up PII redaction for Notion responses”
The AI will find the appropriate template, ask for any required values, and create the guardrail for your toolkit.

List active guardrails

“What guardrails are currently active for my GitHub server?”

Remove a guardrail

“Remove the guardrail that blocks Gmail searches for passwords”

Why use guardrails?

AI assistants are powerful but need appropriate constraints:
| Risk | Guardrail solution |
| --- | --- |
| AI reads sensitive data it shouldn’t | Request guardrails block access to certain fields or patterns |
| Tool responses contain PII | Response guardrails automatically redact sensitive data |
| Prompt injection attempts | Built-in detection blocks malicious prompts |
| Overwhelming context with large responses | Response processors truncate or transform data |
| Accidental destructive actions | Block or require confirmation for write operations |

The AI is not a human

When a human reads an email, they understand context and exercise judgment. An AI assistant:
  • Will happily read every email in your inbox if asked
  • Cannot distinguish between legitimate requests and prompt injection attacks
  • May expose sensitive data by including it in responses
  • Could execute destructive operations without understanding consequences

Scale amplifies risk

What takes a human hours to do manually, an AI can do in seconds. A misconfigured tool that exposes one record is an incident. An AI iterating through thousands of records is a breach.

How guardrails work

Guardrails are evaluated at two points in the tool execution pipeline:
Request → [Request guardrails] → Tool execution → [Response guardrails] → Response
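The two evaluation points can be sketched in Python. This is a minimal illustration of the pipeline shape, not the Nexus implementation; `execute_tool` and `RequestBlocked` are hypothetical names:

```python
# Minimal sketch of a two-stage guardrail pipeline.
# Request guardrails may block or rewrite parameters before the tool runs;
# response guardrails may transform the output after it runs.

class RequestBlocked(Exception):
    """Raised when a request guardrail rejects the call."""

def execute_tool(tool, params, request_guards, response_guards):
    # Request guardrails: each may raise RequestBlocked or return
    # (possibly modified) parameters.
    for guard in request_guards:
        params = guard(params)
    response = tool(params)
    # Response guardrails: each may redact, transform, or truncate.
    for guard in response_guards:
        response = guard(response)
    return response
```

A blocking guard simply raises before the tool ever executes, while a redacting guard rewrites the response on its way back to the AI.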

Request guardrails

Evaluated before the tool runs. They can:
  • Block requests that violate policies
  • Validate parameters meet requirements
  • Filter which operations are allowed
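A request guardrail can be modeled as a predicate over the call parameters. The blocked terms and the `query` parameter below are illustrative, not a real Nexus policy:

```python
import re

# Hypothetical request guardrail: validate required parameters and
# block queries containing sensitive terms.
BLOCKED_TERMS = re.compile(r"password|secret|api_key", re.IGNORECASE)

def check_request(params: dict) -> tuple[bool, str]:
    """Return (allowed, reason) for a tool-call request."""
    query = params.get("query", "")
    if not query:
        return False, "missing required parameter: query"
    if BLOCKED_TERMS.search(query):
        return False, "query contains a blocked term"
    return True, "ok"
```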

Response guardrails

Evaluated after the tool runs. They can:
  • Redact sensitive information from responses
  • Transform data (e.g., HTML to Markdown, JSON to CSV)
  • Truncate overly long responses
  • Remove specific fields from the output
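Two of these capabilities, redaction and field removal, can be illustrated over JSON-like response data. The field names here are made up for the example:

```python
import re

# Illustrative response guardrails; assumes dict/list/str response data.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def redact_emails(data):
    """Recursively replace email addresses with a redaction marker."""
    if isinstance(data, str):
        return EMAIL.sub("[REDACTED]", data)
    if isinstance(data, dict):
        return {k: redact_emails(v) for k, v in data.items()}
    if isinstance(data, list):
        return [redact_emails(v) for v in data]
    return data

def drop_fields(record: dict, fields=("internal_id",)) -> dict:
    """Remove named fields from a response record."""
    return {k: v for k, v in record.items() if k not in fields}
```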

Built-in protection

Every Nexus account includes universal guardrails that are always active:
PII detection

Automatically detects and can redact:
  • Social Security Numbers (SSN)
  • Credit card numbers
  • Email addresses
  • Phone numbers (international formats)
  • IP addresses
  • Passport numbers
  • Driver’s license numbers
  • Bank account numbers (IBAN)
  • Dates of birth
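Pattern-based detection of categories like these can be sketched with regular expressions. The patterns below are deliberately naive stand-ins, far simpler than a production PII detector:

```python
import re

# Simplified, illustrative PII patterns (not Nexus's actual detectors).
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[A-Za-z]{2,}\b"),
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}

def redact_pii(text: str) -> str:
    """Replace each detected PII match with a labeled redaction marker."""
    for name, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[REDACTED:{name}]", text)
    return text
```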
Prompt injection detection

Based on OWASP LLM01:2025, detects:
  • Direct instruction override attempts
  • Role manipulation patterns
  • Context escape attempts
  • Jailbreak patterns
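A toy version of such pattern matching might look like the following; real detectors use far richer signals than a handful of regexes, so treat these heuristics as purely illustrative:

```python
import re

# Hypothetical injection heuristics inspired by the OWASP LLM01 categories.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.I),  # instruction override
    re.compile(r"you are now (a|an) ", re.I),                          # role manipulation
    re.compile(r"</?(system|assistant)>", re.I),                       # context escape
    re.compile(r"\bDAN mode\b", re.I),                                 # known jailbreak token
]

def looks_like_injection(text: str) -> bool:
    return any(p.search(text) for p in INJECTION_PATTERNS)
```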
Dangerous file type blocking

Prevents access to potentially dangerous file types:
  • Executables (.exe, .dll, .sh, .bat)
  • Scripts (.ps1, .vbs, .js)
  • Archives (.zip, .rar, .7z)
  • Office files with macros (.xlsm, .docm)
  • System files (.msi, .deb, .rpm)
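This kind of blocklist reduces to a simple extension check. The set below mirrors the categories listed above and is illustrative, not the actual rule set:

```python
from pathlib import PurePosixPath

# Illustrative extension blocklist mirroring the categories above.
BLOCKED_EXTENSIONS = {
    ".exe", ".dll", ".sh", ".bat",   # executables
    ".ps1", ".vbs", ".js",           # scripts
    ".zip", ".rar", ".7z",           # archives
    ".xlsm", ".docm",                # macro-enabled Office files
    ".msi", ".deb", ".rpm",          # system packages
}

def is_blocked_file(path: str) -> bool:
    """True if the file's extension (case-insensitive) is blocked."""
    return PurePosixPath(path).suffix.lower() in BLOCKED_EXTENSIONS
```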

Guardrail hierarchy

Guardrails operate at three levels, each with different scope:
| Level | Scope | Example use case |
| --- | --- | --- |
| Account | Applies to all users and toolkits | Company-wide PII redaction policy |
| Toolkit | Applies to a specific toolkit | Production toolkit blocks write operations |
| User | Applies to a specific user | Individual’s custom blocked terms |
Higher levels cannot be overridden by lower levels. An account-level guardrail blocking access to /etc/passwd cannot be bypassed by a user-level guardrail.
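One way to model this precedence: evaluate levels from broadest to narrowest, and treat any block as final, so a lower level can never undo a higher-level decision. The data structures here are hypothetical:

```python
# Sketch of hierarchy evaluation: account rules are checked first,
# then toolkit, then user; a block at any level is final.
LEVELS = ("account", "toolkit", "user")

def is_allowed(request: str, guardrails: dict[str, list]) -> bool:
    """guardrails maps level -> list of predicates returning True to block."""
    for level in LEVELS:  # broadest scope evaluated first
        for blocks in guardrails.get(level, []):
            if blocks(request):
                return False  # cannot be overridden by a narrower level
    return True
```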

Common guardrail examples

Block sensitive search terms

Prevent the AI from searching for passwords, secrets, or credentials.
“Add a guardrail to block searches containing ‘password’, ‘secret’, ‘api_key’”

Restrict to specific domains

Only allow web fetching from approved domains.
“Add a guardrail to only allow fetching from docs.example.com”
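Under the hood, a domain restriction amounts to a hostname allowlist check, sketched here with the standard library. The allowed domain follows the example above; checking the parsed hostname (rather than substring-matching the URL) avoids trivial bypasses:

```python
from urllib.parse import urlparse

# Illustrative allowlist check for a web-fetch tool.
ALLOWED_DOMAINS = {"docs.example.com"}

def fetch_allowed(url: str) -> bool:
    """Allow only URLs whose parsed hostname is on the allowlist."""
    host = urlparse(url).hostname or ""
    return host.lower() in ALLOWED_DOMAINS
```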

Redact PII in responses

Automatically replace sensitive data with [REDACTED].
“Enable PII redaction for email addresses in all responses”

Block destructive operations

Prevent accidental data loss.
“Add a guardrail to block the delete_repository tool on GitHub”

Response processors

In addition to security guardrails, you can add response processors that optimize tool outputs for AI consumption. These reduce token usage and improve response quality.

Retain specific fields

Keep only the fields you need, removing unnecessary data.
“Add a processor to retain only ‘id’, ‘name’, and ‘status’ from campaign responses”
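Such a processor is essentially a dictionary filter. A sketch using the field names from the example:

```python
# Illustrative "retain fields" response processor.
def retain_fields(record: dict, keep=("id", "name", "status")) -> dict:
    """Keep only the named fields from a response record."""
    return {k: v for k, v in record.items() if k in keep}
```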

Convert HTML to Markdown

Transform verbose HTML into compact Markdown.
“Convert HTML responses to Markdown for the web scraper”
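A deliberately tiny HTML-to-Markdown sketch using only Python’s standard library; a real processor handles far more tags and nesting:

```python
from html.parser import HTMLParser

# Minimal HTML -> Markdown conversion for a few common tags.
class MarkdownConverter(HTMLParser):
    def __init__(self):
        super().__init__()
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self.out.append("# ")
        elif tag in ("strong", "b"):
            self.out.append("**")
        elif tag == "li":
            self.out.append("- ")

    def handle_endtag(self, tag):
        if tag in ("strong", "b"):
            self.out.append("**")
        elif tag in ("h1", "p", "li"):
            self.out.append("\n")

    def handle_data(self, data):
        self.out.append(data)

def html_to_markdown(html: str) -> str:
    conv = MarkdownConverter()
    conv.feed(html)
    return "".join(conv.out).strip()
```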

Remove metadata fields

Strip internal fields like timestamps and IDs.
“Remove ‘created_at’, ‘updated_at’, and ‘internal_id’ from responses”

Truncate long content

Abbreviate overly long text fields.
“Truncate description fields to 500 characters”
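Truncation is a small transform over a named field; a sketch with defaults matching the example (the field name and limit are illustrative):

```python
# Illustrative truncation processor for long text fields.
def truncate_field(record: dict, field="description", limit=500) -> dict:
    """Truncate the named field to `limit` characters, marking the cut."""
    value = record.get(field)
    if isinstance(value, str) and len(value) > limit:
        record = {**record, field: value[:limit].rstrip() + "…"}
    return record
```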
Response processors use the same management interface as guardrails: just ask your AI assistant to add them.

Troubleshooting

If you can’t add or remove guardrails: these actions require account management permissions. Contact your account administrator or check your role.
If a guardrail isn’t triggering, ensure:
  1. The guardrail is enabled (check with “list active guardrails”)
  2. The guardrail is scoped to the correct server and tool
  3. The data matches the expected schema path
If guardrails are blocking legitimate requests:
  1. Review the guardrail’s value/pattern configuration
  2. Consider using a more specific schema path
  3. Remove and re-add with adjusted parameters

Best practices

Guardrails complement authentication and authorization by controlling how tools are used, not just who can use them.
1. Start restrictive

Begin with tight controls and loosen as needed. It’s easier to relax rules than recover from a breach.

2. Layer defenses

Combine multiple guardrail types for robust protection. Request validation + response redaction provides defense in depth.

3. Monitor triggers

Track when guardrails activate to understand patterns and refine policies.

4. Document policies

Make it clear why each guardrail exists so future team members understand the reasoning.