End-to-End Safety Pipeline

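Complete walkthrough: configure safety primitives, compose a profile, scan content, and handle results.

This guide builds a complete AI safety pipeline from scratch using Shield: it sets up the engine, scans user input, handles the results, scans agent output, shuts down gracefully, and adds an audit trail.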

1. Set up the engine

package main

import (
    "context"
    "log"

    "github.com/xraph/shield"
    "github.com/xraph/shield/engine"
    "github.com/xraph/shield/observability"
    "github.com/xraph/shield/scan"
)

func main() {
    ctx := context.Background()

    // Build engine with metrics
    metricsPlugin := observability.NewMetricsExtension()

    eng, err := engine.New(
        engine.WithConfig(shield.Config{
            EnableShortCircuit: true,
            ScanConcurrency:    10,
        }),
        engine.WithPlugin(metricsPlugin),
    )
    if err != nil {
        log.Fatal(err)
    }

    // Set tenant scope
    ctx = shield.WithTenant(ctx, "acme-corp")
    ctx = shield.WithApp(ctx, "support-bot")

    run(ctx, eng)
}

2. Scan user input

func run(ctx context.Context, eng *engine.Engine) {
    // Scan incoming user message
    input := &scan.Input{
        Text: "Can you help me with my account? My email is john@example.com",
        Context: map[string]any{
            "channel": "chat",
            "user_id": "user-42",
        },
    }

    result, err := eng.ScanInput(ctx, input)
    if err != nil {
        log.Fatalf("scan input: %v", err)
    }

    log.Printf("Decision: %s", result.Decision)
    log.Printf("Findings: %d", len(result.Findings))
    log.Printf("Has PII: %v", result.HasPII())

3. Handle scan results

    // Check if content was blocked
    if result.Blocked {
        log.Printf("Content blocked: %s", result.Decision)
        return
    }

    // Check for PII
    if result.HasPII() {
        log.Printf("PII detected: %d instances", result.PIICount)
        log.Printf("Redacted text: %s", result.Redacted)
        // Use result.Redacted instead of original text
    }

    // Inspect findings
    for _, finding := range result.Findings {
        log.Printf("[%s] %s: %s (score: %.2f)",
            finding.Layer,
            finding.Source,
            finding.Message,
            finding.Score,
        )
    }

4. Scan agent output

    // Before sending agent response to user, scan it
    agentResponse := "Here is your account information..."

    outputResult, err := eng.ScanOutput(ctx, &scan.Input{
        Text: agentResponse,
    })
    if err != nil {
        log.Fatalf("scan output: %v", err)
    }

    if outputResult.Blocked {
        log.Printf("Agent response blocked: %s", outputResult.Decision)
        // Return a safe fallback response
        return
    }

    // Use the (possibly redacted) output
    finalResponse := agentResponse
    if outputResult.HasPII() {
        finalResponse = outputResult.Redacted
    }

    log.Printf("Safe response: %s", finalResponse)
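
The choice between blocking, redacting, and passing a response through can be factored into one helper, which keeps the fallback path from being forgotten. The following is a minimal sketch, assuming a hypothetical `outputResult` struct that mirrors only the `Blocked` and `Redacted` fields and the `HasPII()` method used in this guide. It is a local stand-in, not Shield's actual result type, and `fallbackReply` and the `[EMAIL]` placeholder are illustrative.

```go
package main

import "fmt"

// outputResult is a hypothetical stand-in that mirrors only the parts of
// the scan result used in this guide; it is not Shield's real type.
type outputResult struct {
	Blocked  bool
	Redacted string
	pii      bool
}

// HasPII reports whether the stand-in result was marked as containing PII.
func (r outputResult) HasPII() bool { return r.pii }

// fallbackReply is an illustrative message sent when a response is blocked.
const fallbackReply = "Sorry, I can't share that. Please contact support."

// safeReply picks the text to send: the fallback when blocked, the redacted
// text when PII was found, and the original response otherwise.
func safeReply(original string, r outputResult) string {
	switch {
	case r.Blocked:
		return fallbackReply
	case r.HasPII():
		return r.Redacted
	default:
		return original
	}
}

func main() {
	fmt.Println(safeReply("All good.", outputResult{}))
	// The "[EMAIL]" placeholder is made up for this example.
	fmt.Println(safeReply("Mail me at a@b.c", outputResult{pii: true, Redacted: "Mail me at [EMAIL]"}))
	fmt.Println(safeReply("secret", outputResult{Blocked: true}))
}
```

Checking `Blocked` first means a blocked response is never sent, even if a redacted version is also available.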

5. Graceful shutdown

    // Stop the engine (notifies all plugins)
    if err := eng.Stop(ctx); err != nil {
        log.Printf("shutdown error: %v", err)
    }
}

6. Add audit trails

For production, add an audit trail plugin to record all safety events:

import "github.com/xraph/shield/audit_hook"

// Create a recorder that bridges to your audit backend
recorder := audit_hook.RecorderFunc(func(ctx context.Context, event *audit_hook.AuditEvent) error {
    log.Printf("[AUDIT] %s %s: %s (severity: %s)",
        event.Action,
        event.Resource,
        event.Outcome,
        event.Severity,
    )
    return nil
})

auditPlugin := audit_hook.New(recorder)

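// Register both plugins and check the error instead of discarding it
eng, err := engine.New(
    engine.WithPlugin(metricsPlugin),
    engine.WithPlugin(auditPlugin),
)
if err != nil {
    log.Fatal(err)
}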

Scan result decisions

Decision  Meaning
allow     Content is safe to proceed
block     Content is blocked; do not send
flag      Content is flagged for review but allowed
redact    Content contains PII that was redacted
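
One way to act on these four decisions is a single switch that maps each one to a delivery action. The following is a minimal sketch, assuming the decision is available as one of the strings in the table. The `Decision` type, its constants, and the `route` helper are hypothetical and defined locally; they are not Shield identifiers. Unknown values are treated as blocked so that the pipeline fails closed.

```go
package main

import "fmt"

// Decision is a hypothetical local type using the strings from the table.
type Decision string

const (
	Allow  Decision = "allow"
	Block  Decision = "block"
	Flag   Decision = "flag"
	Redact Decision = "redact"
)

// route returns the text to deliver, whether to deliver it at all, and
// whether the content should also be queued for human review.
func route(d Decision, original, redacted string) (text string, deliver, review bool) {
	switch d {
	case Allow:
		return original, true, false
	case Flag:
		return original, true, true
	case Redact:
		return redacted, true, false
	default: // Block, plus any unrecognized value (fail closed)
		return "", false, false
	}
}

func main() {
	for _, d := range []Decision{Allow, Block, Flag, Redact} {
		text, deliver, review := route(d, "call 555-0100", "call [REDACTED]")
		fmt.Printf("%-6s deliver=%-5v review=%-5v text=%q\n", d, deliver, review, text)
	}
}
```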

Finding layers

Each finding indicates which safety layer produced it:

Layer      Description
instinct   Pre-conscious threat detection
awareness  PII, topic, intent detection
boundary   Hard limit violation
values     Ethical rule violation
judgment   Contextual risk assessment
reflex     Policy rule triggered
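
Tallying findings per layer gives a quick view of which part of the pipeline is firing. The sketch below uses a hypothetical local `Finding` struct and assumes `Layer` is a string matching the names in the table. The order in `layerOrder` simply follows the table rows and is not claimed to be Shield's evaluation order.

```go
package main

import "fmt"

// Finding is a hypothetical local type with only the fields needed here.
type Finding struct {
	Layer string
	Score float64
}

// layerOrder lists the layer names in the order they appear in the table.
var layerOrder = []string{"instinct", "awareness", "boundary", "values", "judgment", "reflex"}

// countByLayer returns how many findings each layer produced.
func countByLayer(findings []Finding) map[string]int {
	counts := make(map[string]int)
	for _, f := range findings {
		counts[f.Layer]++
	}
	return counts
}

func main() {
	// Sample findings with made-up scores.
	findings := []Finding{
		{Layer: "awareness", Score: 0.91},
		{Layer: "awareness", Score: 0.40},
		{Layer: "boundary", Score: 0.75},
	}
	counts := countByLayer(findings)
	// Print only the layers that produced at least one finding.
	for _, layer := range layerOrder {
		if n := counts[layer]; n > 0 {
			fmt.Printf("%-10s %d\n", layer, n)
		}
	}
}
```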
