Trust & Supervision

The safety layer

Letting an AI agent touch real customer accounts is a trust decision, not a capability decision. Kaarna’s answer is structural: independent supervisors, mechanical gates, and a trace of everything — guarantees written in code, never in a prompt.

Nothing reaches your customer — or your systems — unchecked

Generate: Agent drafts a reply or proposes an action
Policy: Your guardrails, evaluated independently
Grounding: Every factual claim must cite a verified source
Sensitive topics: Legal, medical, distress → forced human handoff
Deliver: Send, rewrite, or escalate, with the verdict on record

Supervisors are separate models with one job: judging. They don’t share the generator’s context or its blind spots. If a supervisor fails, the system fails closed: the message is held and a human steps in. It never fails open by sending an unchecked message.
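The fail-closed rule is easiest to see in code. The following is a minimal sketch, not Kaarna's implementation: the names `Verdict`, `Decision`, `supervise`, and the keyword-based checks are hypothetical stand-ins for independent judging models, and the sketch assumes each supervisor sees only the draft.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List


class Verdict(Enum):
    PASS = "pass"
    REWRITE = "rewrite"
    HANDOFF = "handoff"  # message is held for a human


@dataclass
class Decision:
    verdict: Verdict
    supervisor: str
    reason: str


# Assumption: a supervisor receives only the draft, not the generator's context.
Supervisor = Callable[[str], Decision]


def policy(draft: str) -> Decision:
    # Hypothetical guardrail: never promise a guaranteed outcome.
    if "guarantee" in draft.lower():
        return Decision(Verdict.REWRITE, "policy", "forbidden promise")
    return Decision(Verdict.PASS, "policy", "ok")


def sensitive_topics(draft: str) -> Decision:
    # Hypothetical keyword check standing in for a separate judging model.
    if any(word in draft.lower() for word in ("lawsuit", "diagnosis")):
        return Decision(Verdict.HANDOFF, "sensitive_topics", "forced human handoff")
    return Decision(Verdict.PASS, "sensitive_topics", "ok")


def broken(draft: str) -> Decision:
    # Simulates a supervisor outage.
    raise TimeoutError("model unavailable")


def supervise(draft: str, supervisors: List[Supervisor]) -> Decision:
    """Run every check in order; the first non-pass verdict wins."""
    for check in supervisors:
        name = getattr(check, "__name__", "unknown")
        try:
            decision = check(draft)
        except Exception as exc:
            # Fail closed: a broken supervisor holds the message for a human.
            return Decision(Verdict.HANDOFF, name, f"supervisor error: {exc}")
        if decision.verdict is not Verdict.PASS:
            return decision
    return Decision(Verdict.PASS, "all", "every supervisor passed")
```

Calling `supervise("Your order shipped.", [broken, policy])` returns a `HANDOFF` decision rather than letting the draft through, which is the property the sketch is meant to show: an error in a check can only make the outcome more conservative.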

Three tiers of action

Tool permissions are enforced in the executor — code that runs the same way every time — not by asking the model nicely.

Read

Look up an order, check a subscription. Executes freely, always recorded.

Write, reversible

Update a case, add a note. Requires explicit customer confirmation in-conversation; the undo path is registered before the tool can ship.

Write, irreversible

Issue a refund, cancel an order. Requires customer confirmation, a supervisor verdict, and, when your policy demands it, human approval with the approver’s identity on record.
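To illustrate what enforcing the tiers in the executor could look like, here is a small sketch under stated assumptions. `Tier`, `Tool`, `Approvals`, and `Executor` are hypothetical names, not Kaarna's API; the point is only that the gate is ordinary code that checks the same conditions on every call.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, List, Optional


class Tier(Enum):
    READ = 1
    WRITE_REVERSIBLE = 2
    WRITE_IRREVERSIBLE = 3


@dataclass
class Tool:
    name: str
    tier: Tier
    run: Callable[[], str]
    undo: Optional[Callable[[], str]] = None


@dataclass
class Approvals:
    customer_confirmed: bool = False
    supervisor_passed: bool = False
    human_approver: Optional[str] = None  # identity kept on record


class Executor:
    def __init__(self, require_human_for_irreversible: bool = True) -> None:
        self.tools: Dict[str, Tool] = {}
        self.log: List[tuple] = []  # every attempt is recorded, allowed or not
        self.require_human = require_human_for_irreversible

    def register(self, tool: Tool) -> None:
        # A reversible write cannot be registered without its undo path.
        if tool.tier is Tier.WRITE_REVERSIBLE and tool.undo is None:
            raise ValueError(f"{tool.name}: reversible tools need an undo path")
        self.tools[tool.name] = tool

    def execute(self, name: str, approvals: Approvals) -> str:
        tool = self.tools[name]
        missing = []
        if tool.tier is not Tier.READ and not approvals.customer_confirmed:
            missing.append("customer confirmation")
        if tool.tier is Tier.WRITE_IRREVERSIBLE:
            if not approvals.supervisor_passed:
                missing.append("supervisor verdict")
            if self.require_human and approvals.human_approver is None:
                missing.append("human approval")
        if missing:
            self.log.append((name, "denied", tuple(missing)))
            raise PermissionError(f"{name} blocked: missing {', '.join(missing)}")
        self.log.append((name, "executed", approvals.human_approver))
        return tool.run()
```

In this sketch a read such as a hypothetical `lookup_order` tool runs with an empty `Approvals()`, while an irreversible `refund` tool raises `PermissionError` until all required approvals are present. The model never gets a chance to argue its way past the check, because the check is not a prompt.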

An audit trail your compliance team will actually use

Every decision, replayable

Plans, model calls, retrievals, tool executions, supervisor verdicts — an append-only trace per conversation. Reconstruct exactly why the agent did what it did, months later.

  • Per-decision evidence: which guardrail, which source, which approver
  • Append-only by design — no edits, no gaps
  • GDPR/CCPA erasure without destroying the audit structure
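One way to reconcile an append-only trace with erasure requests is to keep personal data outside the chained structure. The sketch below is an assumption about how such a design could work, not a description of Kaarna's storage: the `Trace` class, its hash chain, and the separate payload store are all hypothetical. Each entry records only a hash of its payload, so erasing the payload leaves the sequence and its integrity check intact.

```python
import hashlib
import json
from typing import Dict, List, Optional


def _digest(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


class Trace:
    def __init__(self) -> None:
        # Structural entries: appended only, never edited or removed.
        self._entries: List[dict] = []
        # Payloads may hold personal data and can be erased individually.
        self._payloads: Dict[int, Optional[str]] = {}

    def append(self, kind: str, payload: dict) -> int:
        body = json.dumps(payload, sort_keys=True)
        prev = self._entries[-1]["entry_hash"] if self._entries else ""
        payload_hash = _digest(body)
        entry = {
            "seq": len(self._entries),
            "kind": kind,
            "payload_hash": payload_hash,
            "prev_hash": prev,
            "entry_hash": _digest(f"{prev}|{kind}|{payload_hash}"),
        }
        self._entries.append(entry)
        self._payloads[entry["seq"]] = body
        return entry["seq"]

    def erase(self, seq: int) -> None:
        # Erasure drops the payload only; the chained entry stays in place.
        self._payloads[seq] = None

    def payload(self, seq: int) -> Optional[dict]:
        body = self._payloads[seq]
        return None if body is None else json.loads(body)

    def verify(self) -> bool:
        """Check ordering, chain links, and any payloads still present."""
        prev = ""
        for i, entry in enumerate(self._entries):
            expected = _digest(f"{prev}|{entry['kind']}|{entry['payload_hash']}")
            if (entry["seq"] != i or entry["prev_hash"] != prev
                    or entry["entry_hash"] != expected):
                return False
            body = self._payloads[i]
            if body is not None and _digest(body) != entry["payload_hash"]:
                return False
            prev = entry["entry_hash"]
        return True
```

After `erase(seq)`, `verify()` still succeeds because the chain depends on the stored payload hash rather than on the payload itself, while any edit to a retained entry breaks the chain.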

Answers that cite their sources

Grounded mode requires every factual claim to cite an ingested, versioned knowledge chunk. No citation, no claim — the message is blocked and replanned, not “probably fine.”

  • Exact-match entailment on prices, policies, and figures
  • Stale-source protection: citations pin content revisions
  • Unanswerable questions escalate instead of improvising
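The grounding rules above can be sketched as a simple gate. This is a simplified illustration under assumptions: `Claim`, `KnowledgeBase`, and `check_claim` are hypothetical names, "exact match" is reduced to checking that every number in a claim appears verbatim in the cited chunk, and a citation counts as stale when its pinned revision differs from the chunk's current revision.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Claim:
    text: str
    chunk_id: str = ""   # empty means the claim carries no citation
    revision: int = 0    # content revision the citation was pinned to


# Hypothetical knowledge store: chunk_id -> (current revision, content).
KnowledgeBase = Dict[str, Tuple[int, str]]

# Matches integers and decimals such as "30" or "4.99".
_FIGURE = re.compile(r"\d+(?:\.\d+)?")


def check_claim(claim: Claim, kb: KnowledgeBase) -> str:
    # No citation, no claim.
    if not claim.chunk_id or claim.chunk_id not in kb:
        return "blocked: no citation"
    current_revision, content = kb[claim.chunk_id]
    # The citation must point at the revision that is still current.
    if claim.revision != current_revision:
        return "blocked: stale source"
    # Every figure in the claim must appear verbatim in the cited chunk.
    source_figures = set(_FIGURE.findall(content))
    for figure in _FIGURE.findall(claim.text):
        if figure not in source_figures:
            return f"blocked: figure {figure} not in source"
    return "ok"


def check_message(claims: List[Claim], kb: KnowledgeBase) -> List[str]:
    """Return the reasons a message is blocked; an empty list means it may go out."""
    results = (check_claim(claim, kb) for claim in claims)
    return [result for result in results if result != "ok"]
```

A message is released only when `check_message` returns an empty list. Anything else would send the draft back for replanning or escalation rather than letting an unsupported figure through.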

How we test the safety layer

Security review materials

A technical brief on supervision, tracing, and data handling, written for your security and compliance review.

Request the security brief