
📱 Hardened Human-in-the-Loop (HITL) via SMS

Gobii v2.21.0 brings real human oversight to autonomous agents. Here is why that matters when self-evaluation fails.

The Trust Problem: Self-Evaluation Is Broken

Hermes Agent relies on self-evaluation: the agent itself decides whether it did a good job. According to the Kilo.ai analysis of 1,300+ Reddit comments, users report that the agent rates its own work as a success even after catastrophic failures:

"It always thinks it did a good job. ALWAYS. I had it pull water test results from the Indiana DNR site and it jumbled up everything... It thought it kicked ass!"
— u/CustomMerkins4u (+107 upvotes)

This creates a compounding failure loop: when a failed run is scored as a success, its errors can be encoded into auto-generated skills and reused, so performance degrades over time with no human checkpoint to break the cycle.

Gobii v2.21.0: SMS Contact Approval

Gobii v2.21.0 introduces SMS Contact Approval, a hardened HITL mechanism that routes approvals for critical actions through an out-of-band SMS channel.
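
To make the gating idea concrete, here is a minimal sketch of an approval gate. It is not Gobii's API: the action names, `run_gated`, `ask_human`, and `ApprovalDenied` are all hypothetical, and `ask_human` stands in for whatever sends the SMS and collects the reply.

```python
from typing import Callable, TypeVar

T = TypeVar("T")

# Hypothetical action names; a real deployment would define its own list.
GATED_ACTIONS = {"deploy_production", "contact_outreach", "change_config"}


class ApprovalDenied(Exception):
    """Raised when a human declines a gated action."""


def run_gated(action: str,
              do_action: Callable[[], T],
              ask_human: Callable[[str], bool]) -> T:
    # Gated actions run only after an explicit human "yes".
    # Ungated actions run without asking anyone.
    if action in GATED_ACTIONS and not ask_human(action):
        raise ApprovalDenied(action)
    return do_action()
```

The point of the design is that the agent never scores its own request: the decision comes from a callback the agent does not control.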

🔐 Out-of-Band Security

Approval requests arrive on a separate device via SMS. Even if the agent session is compromised, an attacker who controls only that session cannot approve destructive actions; they would also need access to the approver's phone or phone number.
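
One way to bind a reply to a specific request is a one-time code that is sent only over SMS. The sketch below is illustrative and assumes a "YES <code>" / "NO <code>" reply format; `ApprovalRequest`, `new_request`, `sms_body`, and `check_reply` are hypothetical names, not part of Gobii.

```python
import hmac
import secrets
from dataclasses import dataclass


@dataclass
class ApprovalRequest:
    action: str          # human-readable description of the gated action
    approver_phone: str  # number the request is sent to
    code: str            # one-time code the approver must echo back


def new_request(action: str, approver_phone: str) -> ApprovalRequest:
    # Six-digit code drawn from a cryptographically secure generator.
    code = f"{secrets.randbelow(10**6):06d}"
    return ApprovalRequest(action, approver_phone, code)


def sms_body(req: ApprovalRequest) -> str:
    return f"Approve '{req.action}'? Reply YES {req.code} or NO {req.code}."


def check_reply(req: ApprovalRequest, from_phone: str, text: str) -> bool:
    """True only for an explicit YES with the right code from the right number."""
    parts = text.strip().upper().split()
    if len(parts) != 2:
        return False
    verdict, code = parts
    # Compare as bytes in constant time to avoid leaking partial matches.
    same_sender = hmac.compare_digest(from_phone.encode(), req.approver_phone.encode())
    same_code = hmac.compare_digest(code.encode(), req.code.encode())
    return verdict == "YES" and same_sender and same_code
```

Because the code never passes through the agent session, a compromised session cannot forge a valid reply by guessing the request contents alone.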

✅ Real Accountability

No more "always pass" self-evaluation. A human explicitly approves or denies each gated action. The audit trail records who approved what and when.
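
As a rough illustration of "who approved what and when", an audit entry might look like the sketch below. The field names and the JSON-lines layout are assumptions made for this example; the release notes do not specify Gobii's log format.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class AuditRecord:
    action: str      # what was requested
    approver: str    # who decided (for example, the replying phone number)
    decision: str    # "approved" or "denied"
    decided_at: str  # when, as an ISO 8601 UTC timestamp


def record_decision(log: List[str], action: str, approver: str,
                    approved: bool, now: Optional[datetime] = None) -> AuditRecord:
    # Default to the current UTC time; tests can pass a fixed value.
    now = now or datetime.now(timezone.utc)
    rec = AuditRecord(action, approver,
                      "approved" if approved else "denied",
                      now.isoformat())
    # One JSON object per entry keeps the log append-only and easy to search.
    log.append(json.dumps(asdict(rec), sort_keys=True))
    return rec
```

Denials are recorded as well as approvals, so the log shows every decision a human made, not just the ones that let an action through.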

📲 Mobile-First Workflow

Approve production deployments, contact outreach, or configuration changes from your phone. No desk required — the agent waits for your SMS reply.
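
The waiting step can be sketched as a polling loop that fails closed. This is an assumption-heavy illustration: `wait_for_decision` and `poll_reply` are hypothetical, and treating a timeout as "not approved" is a design choice made here, since the source only says that the agent waits.

```python
import time
from typing import Callable, Optional


def wait_for_decision(poll_reply: Callable[[], Optional[str]],
                      timeout_s: float,
                      interval_s: float = 1.0) -> str:
    """Poll for an SMS reply until one arrives or the timeout passes.

    poll_reply returns the latest reply text, or None if nothing has arrived.
    Returns "approved", "denied", or "timed_out".
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        reply = poll_reply()
        if reply is not None:
            word = reply.strip().upper()
            if word == "YES":
                return "approved"
            if word == "NO":
                return "denied"
            # Anything else is ignored and the agent keeps waiting.
        time.sleep(interval_s)
    # No clear answer in time: the caller should treat this as not approved.
    return "timed_out"
```

Only an explicit "YES" unblocks the action; silence or an ambiguous reply never does.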

🛡 Compliance Ready

Timestamped SMS approval logs give auditors evidence that a human authorized each gated action, which can support SOC 2 and ISO 27001 audits. An LLM's self-evaluation provides no such evidence.

Comparison: Hermes Self-Eval vs. Gobii SMS HITL

Approval Dimension	Hermes Self-Evaluation	Gobii SMS HITL
Decision Maker	LLM (always approves itself)	Human (SMS reply)
Error Detection	Near-zero — always "success"	Human judgment
Audit Trail	Auto-generated "pass" log	Timestamped SMS + agent log
Channel Security	Same session as agent	Out-of-band SMS device
Compliance	No evidence of human authorization for SOC 2 / ISO 27001	Auditable human authorization

📚 Sources

Gobii Platform v2.21.0 Release Notes — SMS Contact Approval
Kilo.ai: OpenClaw vs Hermes Analysis — Self-evaluation design flaw (1,300+ comments)
r/hermesagent: Self-Sabotaging Plugins — User report of silent integration breakage