Guardrails Overview

Sluice ships with 10 built-in guardrails that analyze every outbound email. Each enabled guardrail evaluates the email independently and produces a risk level. You can enable, disable, and configure each guardrail from Settings > Guardrails in the dashboard.

Risk levels

Every guardrail assigns one of three risk levels to each email:

Risk level	Meaning	What happens
Green	Safe — no issues detected	Auto-forwarded (when tuning mode is off)
Orange	Warning — review recommended	Held for human review
Red	Block — clear violation detected	Held for human review

The overall risk level for an email is the highest risk level returned by any individual guardrail. For example, if nine guardrails return green and one returns orange, the overall level is orange and the email is held for review.
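
The rule above can be sketched in a few lines of Python. This is an illustration only, not Sluice's implementation: the names `Risk`, `overall_risk`, and `route` are assumptions made for this example, as is the choice to treat an empty result list as green.

```python
from enum import IntEnum


class Risk(IntEnum):
    """Risk levels ordered so that a higher value means higher risk."""
    GREEN = 0
    ORANGE = 1
    RED = 2


def overall_risk(results):
    """Return the highest risk level among the individual guardrail results."""
    # Assumption for this sketch: no results at all counts as green.
    return max(results, default=Risk.GREEN)


def route(results, tuning_mode):
    """Decide what happens to an email: 'forward' or 'hold'."""
    # Only green emails are auto-forwarded, and only when tuning mode is off.
    if overall_risk(results) is Risk.GREEN and not tuning_mode:
        return "forward"
    return "hold"


# Nine green results and one orange: the email is held for review.
results = [Risk.GREEN] * 9 + [Risk.ORANGE]
print(overall_risk(results).name)          # ORANGE
print(route(results, tuning_mode=False))   # hold
```

Ordering the levels as integers lets a single `max` call express "highest risk wins".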

The 10 guardrails

#	Guardrail	What it checks	Default
1	Tone Analysis	Aggressive, threatening, or unprofessional language	Enabled
2	Content Policy	Policy violations, spam, phishing, and hallucinations (with knowledge base)	Enabled
3	Prompt Injection	Embedded instructions designed to manipulate AI agent behavior	Enabled
4	Rate Limiting	Excessive sending volume — prevents runaway agents from damaging domain reputation	Enabled
5	Duplicate Detection	Catches repeated emails — prevents agent loops and accidental re-sends	Enabled
6	PII Detection	Social Security numbers, credit cards, bank accounts, and 20+ other PII types	Disabled
7	Recipient Rules	Blocked/allowed lists and recipient count limits	Disabled
8	Attachment Scanning	Flags emails with file attachments for review	Disabled
9	Compliance	CAN-SPAM requirements and customizable regulatory checks	Disabled
10	Agent Signal	Lets agents self-escalate via a hidden HTML comment when they're uncertain	Always on

How guardrails work together

Guardrails run independently and in parallel. Each one evaluates the email and returns its own risk level. The email's overall risk is the highest individual result.

Example: An AI agent sends a customer support reply, and PII Detection has been enabled. The guardrail results might look like this:

Guardrail	Result	Reason
PII Detection	Red	Credit card number detected (confidence: 0.95)
Tone Analysis	Green	Professional and helpful tone
Content Policy	Green	No policy violations
Prompt Injection	Green	No injection attempts detected

The email's overall risk is red because of the PII Detection result, even though every other guardrail returned green. The reviewer sees exactly which guardrail flagged the email and why.
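
As a rough Python sketch of this flow, the snippet below runs guardrails in parallel, takes the highest result, and lists the guardrails that flagged the email. The guardrail functions are stand-ins that return the hard-coded results from the table above. `Risk`, `Result`, and `evaluate` are hypothetical names for illustration, not part of Sluice.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from enum import IntEnum


class Risk(IntEnum):
    GREEN = 0
    ORANGE = 1
    RED = 2


@dataclass
class Result:
    guardrail: str
    risk: Risk
    reason: str


# Stand-in guardrails: each takes the email text and returns a Result.
def pii_detection(email):
    return Result("PII Detection", Risk.RED,
                  "Credit card number detected (confidence: 0.95)")


def tone_analysis(email):
    return Result("Tone Analysis", Risk.GREEN, "Professional and helpful tone")


def content_policy(email):
    return Result("Content Policy", Risk.GREEN, "No policy violations")


def prompt_injection(email):
    return Result("Prompt Injection", Risk.GREEN,
                  "No injection attempts detected")


def evaluate(email, guardrails):
    """Run each guardrail independently and in parallel, then combine results."""
    with ThreadPoolExecutor() as pool:
        # pool.map preserves the order of the guardrails list.
        results = list(pool.map(lambda check: check(email), guardrails))
    overall = max(r.risk for r in results)
    flagged = [r for r in results if r.risk is not Risk.GREEN]
    return overall, flagged


checks = [pii_detection, tone_analysis, content_policy, prompt_injection]
overall, flagged = evaluate("Thanks for contacting support.", checks)
print(overall.name)  # RED
for r in flagged:
    print(f"{r.guardrail}: {r.reason}")
```

Because no guardrail depends on another's output, the checks can run concurrently and be combined only at the end.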

Recommended setup

Starting out? Keep the defaults — Tone Analysis, Content Policy, Prompt Injection, Rate Limiting, and Duplicate Detection are enabled out of the box and provide strong baseline protection.

Ready for more control? Enable additional guardrails based on your needs:

Turn on PII Detection after configuring whitelists for expected data (e.g., sender contact details) to avoid false positives
Turn on Recipient Rules if your agents should only email certain domains or addresses
Turn on Attachment Scanning if your organization requires review of file attachments
Turn on Compliance if you send commercial or marketing emails subject to CAN-SPAM or similar regulations

Leave Tuning Mode on while you're getting started. With it on, no email is auto-forwarded, so you review every email, including green ones, and see how the guardrails perform on your real traffic. Turn it off once you're confident in the results.
