Portable G and the Claw of Shiva

Project Nirvana

Privacy-First Local-First Inference for AI Agents

"Privacy is not something that I'm merely entitled to, but something I deeply believe in."
— Bruce Schneier

The Problem

Default OpenClaw transmits 2,000–5,000 tokens per query to cloud APIs. Your identity files, conversation history, memory, and sensitive context get sent unencrypted over the wire to third-party inference providers every single turn.

Even with a well-intentioned provider, this creates a massive attack surface:

The Solution: Local-First with Smart Fallback

Project Nirvana keeps 80%+ of computation local on your hardware, using cloud APIs only as a stateless fallback for frontier reasoning tasks.

When fallback is necessary, the context-stripper removes:

Result: 85% token reduction + 100% privacy preservation.

What You Get

Plugin (v1.0.0)

Bundles Ollama + qwen2.5:7b with intelligent routing:

Installation:

openclaw plugins install ShivaClaw/nirvana

Skill (Lightweight Context-Stripper)

If you already have a local LLM running (Ollama, vLLM, local Claude.cpp), install the skill for context-stripping only:

Architecture: Local-First Router

Tier 1 (Default): Ollama on local hardware (qwen2.5:7b, ~4 tokens/sec)

Tier 2 (Fallback): Cloud APIs with context stripping:

  • Claude Haiku 4.5 (high-stakes reasoning)
  • Gemini 2.5 Flash (redundancy)
  • Grok 3 Mini (speed)

Decision Logic:

  • First attempt: Local Ollama (always)
  • If local timeout or quality threshold not met: Strip identity, try cloud fallback
  • If cloud unavailable: Serve stale response from local cache
  • Offline mode: Pure local, no cloud calls at all

Real-World Impact

Before Nirvana: 50+ API calls/day to OpenAI, each carrying full identity context. ~$50/month in API costs. Full session data exposed.

After Nirvana: 80%+ queries handled locally. 10–15 API calls/day (fallback only). ~$5/month in API costs. Zero identity data transmitted.

Hardware Requirement: 8GB RAM minimum (Ollama + qwen2.5:7b uses ~5GB). CPU-only inference at 3–4 tokens/sec. GPU optional but recommended for 20+ tok/sec.

Why This Matters

The premise of Project Trident is that agents should become your partners over time. They learn your preferences, your goals, your personality. That data is extremely valuable — to attackers, to competitors, to advertisers.

You should own that data. Not OpenAI. Not Google. Not Anthropic.

Nirvana is our answer: keep computation local by default, use cloud APIs only when you need their raw capability, and never send your identity to third parties.

Get Started

Project Nirvana is open-source and production-ready.

GitHub Repository ClawHub Registry