Sluice Docs

Guardrails Overview

Sluice ships with 10 built-in guardrails that analyze every outbound email. Each guardrail independently evaluates the email and produces a risk level. You can configure guardrails from Settings > Guardrails in the dashboard. Every guardrail except Agent Signal, which is always on, can be enabled or disabled there.

Risk levels

Every guardrail assigns one of three risk levels to each email:

| Risk level | Meaning | What happens |
|---|---|---|
| Green | Safe: no issues detected | Auto-forwarded (when tuning mode is off) |
| Orange | Warning: review recommended | Held for human review |
| Red | Block: clear violation detected | Held for human review |

The overall risk level for an email is the highest risk from any individual guardrail. So if 9 guardrails return green but one returns orange, the email is held for review.
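
The "highest risk wins" rule can be sketched in a few lines of Python. This is an illustration only, not Sluice's implementation: the names `Risk`, `overall_risk`, and `disposition` are hypothetical, and the sketch assumes that with tuning mode on even green emails are held for review.

```python
from enum import IntEnum


class Risk(IntEnum):
    """Risk levels ordered so that a larger value means higher risk."""
    GREEN = 0
    ORANGE = 1
    RED = 2


def overall_risk(results: dict[str, Risk]) -> Risk:
    """Return the highest risk level reported by any guardrail.

    Assumption: an email with no guardrail results is treated as green.
    """
    return max(results.values(), default=Risk.GREEN)


def disposition(risk: Risk, tuning_mode: bool) -> str:
    """Green is auto-forwarded only when tuning mode is off (assumed);
    orange and red are always held for human review."""
    if risk is Risk.GREEN and not tuning_mode:
        return "auto-forwarded"
    return "held for review"


# Nine guardrails return green, one returns orange.
results = {f"guardrail_{i}": Risk.GREEN for i in range(1, 10)}
results["guardrail_10"] = Risk.ORANGE

level = overall_risk(results)
print(level.name, "->", disposition(level, tuning_mode=False))
```

Modeling the levels as an ordered enum lets the aggregation be a single `max` call, which matches the rule that one orange or red result outweighs any number of green ones.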

The 10 guardrails

| # | Guardrail | What it checks | Default |
|---|---|---|---|
| 1 | Tone Analysis | Aggressive, threatening, or unprofessional language | Enabled |
| 2 | Content Policy | Policy violations, spam, phishing, and hallucinations (with knowledge base) | Enabled |
| 3 | Prompt Injection | Embedded instructions designed to manipulate AI agent behavior | Enabled |
| 4 | Rate Limiting | Excessive sending volume; prevents runaway agents from damaging domain reputation | Enabled |
| 5 | Duplicate Detection | Repeated emails; prevents agent loops and accidental re-sends | Enabled |
| 6 | PII Detection | Social Security numbers, credit cards, bank accounts, and 20+ other PII types | Disabled |
| 7 | Recipient Rules | Blocked/allowed lists and recipient count limits | Disabled |
| 8 | Attachment Scanning | Flags emails with file attachments for review | Disabled |
| 9 | Compliance | CAN-SPAM requirements and customizable regulatory checks | Disabled |
| 10 | Agent Signal | Lets agents self-escalate via a hidden HTML comment when they're uncertain | Always on |

How guardrails work together

Guardrails run independently and in parallel. Each one evaluates the email and returns its own risk level. The email's overall risk is the highest individual result.

Example: An AI agent sends a customer support reply. Here's what the guardrail results might look like:

| Guardrail | Result | Reason |
|---|---|---|
| PII Detection | Red | Credit card number detected (confidence: 0.95) |
| Tone Analysis | Green | Professional and helpful tone |
| Content Policy | Green | No policy violations |
| Prompt Injection | Green | No injection attempts detected |

The email is flagged as red because of the PII detection result, even though all other guardrails passed. A reviewer will see exactly which guardrail flagged the email and why.
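
The example above can be reproduced with a small sketch that runs several checks concurrently and reports which one set the overall level. Everything here is hypothetical: the check functions are stand-ins that return the rows from the example table, and `evaluate` is an illustrative name, not part of Sluice.

```python
from concurrent.futures import ThreadPoolExecutor
from enum import IntEnum
from typing import Callable, NamedTuple


class Risk(IntEnum):
    GREEN = 0
    ORANGE = 1
    RED = 2


class Result(NamedTuple):
    guardrail: str
    risk: Risk
    reason: str


# Stand-in checks (assumptions) that return the results from the example.
def pii_detection(email: str) -> Result:
    return Result("PII Detection", Risk.RED,
                  "Credit card number detected (confidence: 0.95)")


def tone_analysis(email: str) -> Result:
    return Result("Tone Analysis", Risk.GREEN, "Professional and helpful tone")


def content_policy(email: str) -> Result:
    return Result("Content Policy", Risk.GREEN, "No policy violations")


def prompt_injection(email: str) -> Result:
    return Result("Prompt Injection", Risk.GREEN,
                  "No injection attempts detected")


GUARDRAILS: list[Callable[[str], Result]] = [
    pii_detection, tone_analysis, content_policy, prompt_injection,
]


def evaluate(email: str) -> tuple[Risk, list[Result]]:
    """Run every check in parallel; return the overall risk and the
    non-green results that are responsible for it."""
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda check: check(email), GUARDRAILS))
    overall = max(r.risk for r in results)
    flagged = [r for r in results
               if r.risk == overall and overall > Risk.GREEN]
    return overall, flagged


overall, flagged = evaluate("example email body")
print(f"Overall: {overall.name}")
for r in flagged:
    print(f"Flagged by {r.guardrail}: {r.reason}")
```

Because each check receives the same email and returns its own result, no check depends on another, which is what makes running them in parallel straightforward.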

Starting out? Keep the defaults — Tone Analysis, Content Policy, Prompt Injection, Rate Limiting, and Duplicate Detection are enabled out of the box and provide strong baseline protection.

Ready for more control? Enable additional guardrails based on your needs:

  • Turn on PII Detection after configuring whitelists for expected data (e.g., sender contact details) to avoid false positives
  • Turn on Recipient Rules if your agents should only email certain domains or addresses
  • Turn on Attachment Scanning if your organization requires review of file attachments
  • Turn on Compliance if you send commercial or marketing emails subject to CAN-SPAM or similar regulations
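
To make the Recipient Rules idea concrete, here is a minimal sketch of an allowed-domain check combined with a recipient count limit. It is an assumption-laden illustration: the function `check_recipients`, its parameters, and the mapping of each violation to orange or red are all hypothetical and do not describe how Sluice actually scores these cases.

```python
from enum import IntEnum


class Risk(IntEnum):
    GREEN = 0
    ORANGE = 1
    RED = 2


def check_recipients(
    recipients: list[str],
    allowed_domains: set[str],
    max_recipients: int,
) -> tuple[Risk, str]:
    """Hypothetical recipient check: allowed domains plus a count limit."""
    # Collect recipients whose domain is not on the allowed list.
    outside = [
        r for r in recipients
        if r.rsplit("@", 1)[-1].lower() not in allowed_domains
    ]
    if outside:
        # Assumption: a recipient outside the allowed list maps to red.
        return Risk.RED, f"Not on allowed list: {', '.join(outside)}"
    if len(recipients) > max_recipients:
        # Assumption: exceeding the count limit maps to orange.
        return Risk.ORANGE, (
            f"{len(recipients)} recipients exceeds limit of {max_recipients}"
        )
    return Risk.GREEN, "All recipients allowed"


risk, reason = check_recipients(
    ["ana@example.com", "bo@other.test"],
    allowed_domains={"example.com"},
    max_recipients=5,
)
print(risk.name, "-", reason)
```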

Leave tuning mode on while you're getting started. Review every email to understand how the guardrails perform on your real traffic, then turn it off once you're confident in the results.
