/01 the problem

Every agent that ships today leaks by default.

Secrets get pasted into .env files, baked into container images, or passed as request headers to untrusted tool calls. One exfiltration primitive in the wrong chain-of-thought and you're explaining a $10k bill to Stripe.

before the usual way

.env · committed by mistake

23 days ago

# .env

OPENAI_API_KEY=sk-proj-aB7cD9...

ANTHROPIC_API_KEY=sk-ant-api03-...

GROQ_API_KEY=gsk_bL...

✕ scanned by 4 bots within 90s of push

✕ unbounded spend until someone notices

✕ no way to scope access per endpoint

after vaulted by strake

.env · after strake

live

# .env · your real key stays in the vault

OPENAI_BASE_URL=https://abc123.strake.sh/v1

OPENAI_API_KEY=ct_live_7a3f... // disposable bearer

✓ point any OpenAI-compatible client at this URL

✓ bearer is scoped to one endpoint only

✓ rotate the upstream provider key with no redeploy

↻ revoke the bearer any time — upstream key unaffected

/02 how it works

A vault with a front door.

Paste a provider key into Strake. We store it encrypted and hand back a disposable subdomain URL + bearer. Point any OpenAI-compatible client at that URL; requests forward to the provider using the key we're holding. Revoke the endpoint any time — the real key is never touched.

01

your agent

Calls api.strake.sh with a public-safe token. No provider key on the agent runtime.

02

allow-list check

Request path checked against your endpoint's allow-list; rate limits and monthly caps enforced.

03

strake vault

vault + forward

Looks up your real provider key, attaches it to the outbound call, forwards to the provider. The key never leaves this hop.

04

provider

OpenAI, Anthropic, Gemini, xAI, Mistral and more. They see a normal, authenticated request.

05

stream back

Response streams back byte-for-byte. TTFT is identical to calling the provider directly.

/03 built for how you work

Four places your keys already leak.
One way to stop it.

01 cloud

   ┌─ cloud runtime ─┐
   │   agent.py     │
   └────────┬───────┘
            │ ct_live_7a3f…
            ▼
        strake  ──▶  sk-…

Cloud-hosted agents

Your agent runs on a machine you don't control. Your real key shouldn't. Hand the runtime a disposable bearer — revoke it the minute the container dies.

learn more →

02 openclaw

   claw-instance-1  ▶  ct_live_a3f8…
   claw-instance-2  ▶  ct_live_b2e7…
   claw-instance-3  ▶  ct_live_c4d6…
                        │
                        ▼
                     sk- (vault)

OpenClaw deployments

Every OpenClaw instance gets its own Strake endpoint. Scope it, rate-limit it, revoke one without breaking the rest. The underlying key stays put.

learn more →

03 mcp

   // mcp.json
   "GITHUB_TOKEN":
     "ghp_aB7cD9…"   ⨯ leaked
   "GITHUB_TOKEN":
     "ct_live_7a3f…" ✓ scoped

MCP servers

Stop pasting real tokens into plaintext MCP config. Each server gets a scoped URL that signs with your key at the edge — so one rogue tool can't drain the whole account.

learn more →

04 teams

   alice  ▶  ct_live_a3…  ┐
   bob    ▶  ct_live_b2…  ├▶  sk- (shared)
   cara   ▶  ct_live_c4…  ┘
              revoke bob → alice & cara unaffected

Teams on a shared key

Every developer gets their own disposable endpoint on the same underlying key. Granular revocation. Per-seat telemetry. No more pasting sk-proj-… into Slack.

learn more →

/04 what you get

Everything you need to put an agent in production.

Six primitives. No dashboards for features you don't need.

/01

Disposable endpoints, revocable any time.

Every endpoint you create is a throwaway subdomain on strake.sh. Hand it to ChatGPT, Cursor, a teammate, a vendor, a demo. Click revoke and every bearer under it stops working — the upstream key is untouched.

per-use URLone-click revoke

/02

Your real key never leaves the vault.

Paste your provider key once. We store it encrypted and return a disposable endpoint URL + bearer. The real key is never typed into a .env, baked into a container image, or pasted into an agent prompt again.

AES-256-GCM at restdecrypted only in memory

/03

Path allow-list per endpoint.

Each endpoint is scoped to exactly the provider paths it's allowed to hit. chat.completions only. messages only. Anything off-list is refused before it leaves Strake.

per-endpointwildcard + glob

/04

Per-minute rate limits.

Cap requests per minute on every endpoint. A leaked bearer can't empty your wallet overnight — and a runaway tool trips the limit, not your provider bill.

per-endpointinstant 429

/05

30-day usage charts.

Every endpoint has a live chart of request volume over the last 30 days. See at a glance which handoff is still getting traffic and which can be safely retired.

per-endpointrolling 30-day

/06

Works with every OpenAI-compatible tool.

Swap OPENAI_BASE_URL and OPENAI_API_KEY. That's the integration. ChatGPT, Claude, Cursor, scripts, Jupyter, curl — anything that speaks the OpenAI shape just works.

no SDK requireddrop-in base URL

/05 middleware

PII enforcement, jailbreak protection, and semantic cache,
per endpoint.

Add middleware to any endpoint from the dashboard. PII enforcement scrubs sensitive data before it reaches your AI provider. Jailbreak protection blocks adversarial prompt attacks before they reach your model. Semantic cache serves similar prompts from a response store — cutting latency and token spend without any code changes.

block Reject the request before it reaches the AI provider.

response · PII detected

HTTP 400

// AI never sees the message
{
  "error": {
    "code":     "middleware_blocked",
    "message":  "Request blocked: PII detected",
    "detected": [
      "first_name",
      "last_name",
      "street_address"
    ]
  }
}
// actual PII values are never returned

redact Scrub PII inline, then forward. The AI never sees the real values.

response header · fields scrubbed

HTTP 200

// "Jordan Wells, 42 Oak St" becomes:
// "[REDACTED_FIRST_NAME] [REDACTED_LAST_NAME], [REDACTED_STREET_ADDRESS]"

x-strake-redacted: first_name,last_name,street_address

// AI sees scrubbed text — real values never leave Strake

PII categories — enable or disable per group, per endpoint

Personal Identity

first_namelast_nameemailoccupation

Location

street_addresscitystatepostcode

Financial

account_numberswift_bic

Time

time

Adversarial prompts stopped before they reach your model.

Jailbreak protection runs on every proxied request. Powered by NVIDIA NeMo Guard, it detects prompt injection, DAN attempts, role-playing exploits, and other adversarial patterns — and blocks them before they ever reach your AI provider.

blocked Reject the request — the model never sees the adversarial prompt.

response · jailbreak detected

HTTP 400

// "Ignore your previous instructions and..."
{
  "error": {
    "code":     "middleware_blocked",
    "message":  "Request blocked: jailbreak attempt detected",
    "pattern":  "prompt_injection"
  }
}
// model never processes the adversarial content

monitor Log the detection and forward — useful for auditing without blocking.

response header · pattern logged

HTTP 200

// request forwarded, detection recorded

x-strake-jailbreak-detected: true
x-strake-jailbreak-pattern:  dan_attempt

// visible in your usage logs per request

Attack patterns detected — powered by NVIDIA NeMo Guard

Prompt Injection

ignore_instructionssystem_override

Role-Play Exploits

dan_attemptpersona_bypass

Instruction Bypass

constraint_removalpolicy_evasion

Adversarial Inputs

encoded_attacksindirect_injection

Every cache hit is a provider call that never happens.

Agents and assistants repeat themselves constantly — the same queries, slightly reworded. Semantic cache catches them. Enable it per endpoint, tune the similarity threshold, and watch tokens saved accumulate in your dashboard.

tokens consumed · per hit

0

A cache hit never reaches your provider. Zero tokens billed, zero cost incurred.

response time · from cache

~12ms

vs 600ms–2s for a live LLM call. Faster agents, lower latency budgets.

savings · compound over time

↑

The more your endpoint is used, the richer the cache gets — and the higher the hit rate climbs.

miss No match found — request forwarded upstream, response stored.

first request · cache miss

→ upstream

// "What is the capital of France?"
// no match in vector store

POST /v1/chat/completions → provider
// response forwarded and stored in cache
// similarity threshold: 0.92

hit Similar prompt detected — cached response returned in milliseconds.

later request · cache hit

HTTP 200

// "Can you tell me France's capital?"
// similarity: 0.97 · stored: 4 min ago

// returned from cache in ~12ms
// provider never called
// ~400 tokens saved

Similarity threshold

default 0.92range 0.50–1.00

Cache TTL

default 24 hmax 168 h

Savings dashboard

hit rate %tokens saved

/06 security model

Your key never leaves the vault unencrypted.

Here's exactly what happens to a provider key the moment you paste it into Strake.

Envelope encryption (AES-256-GCM) at rest in Cloudflare D1

Master key held as a Cloudflare secret, isolated from app code

Plaintext decrypted only in memory on the proxy hot path

Request + response bodies never inspected or persisted

Tokens shown once; rotate to recover a lost one

PII middleware blocks or redacts sensitive data before it reaches the AI provider

Jailbreak protection blocks adversarial prompts before they reach your model

Audit trail

coming soon

/07 pricing

$3 per endpoint,
per month.

Each endpoint includes 10,000 requests. Five-endpoint floor ($15/mo). Pick 5, 10, 20, 50, 75, or 100 — upgrade or downgrade any time.

strake standard

$3/endpoint · month

10,000 requests per endpoint included. Five-endpoint minimum.

Unlimited labeled bearers per endpoint — rotate & revoke independently
Per-endpoint path allow-list
Per-minute rate limits
30-day usage charts on every endpoint
14-day grace period if a subscription lapses

endpoint plans

5$15 10$30 20$60 50$150 75$225 100$300

Create your first endpoint

need more?

get in touch

More than 100 endpoints, higher per-endpoint request ceilings, or custom deployment.

Volume pricing past 100 endpoints
Higher included request limits
Custom path allow-list policies

talk to us

/08 frequently asked

Questions we get all the time.

Q.01

What is Strake?

Strake is a vault and access layer for API keys used by AI assistants, agents, and scripts. You store your real provider keys in Strake and hand out disposable, revocable endpoints to the tools that need access.

−

Q.02

Who is it for?

+

Q.03

Does the assistant get my real API key?

+

Q.04

Can I revoke access?

+

Q.05

Do I need to change my workflow?

+

Q.06

Which tools does it work with?

+

Q.07

How much does it cost?

+

Q.08

What happens if I cancel?

+

Q.09

How does PII middleware work?

+

Q.10

How does jailbreak protection work?

+

Q.11

How does semantic cache work?

+

/09 writing

From the blog.

Security · Apr 27

Stop Adversarial Prompts Before They Reach Your Model

How jailbreak protection works at the proxy layer, and why that's the right place for it.

Security · Apr 25

Stop Sensitive Data Before It Reaches Your LLM

PII flows into models constantly. Here's how to catch it at the request level before it reaches your provider.

Security · Apr 19

Your API Keys Don't Belong in Environment Variables

Every few months a platform gets breached and the recommendation is "rotate your env vars." That's the wrong fix.

All posts

Stop pasting API keys into agents.

Every agent that ships today leaks by default.

A vault with a front door.

Four places your keys already leak.
One way to stop it.

Cloud-hosted agents

OpenClaw deployments

MCP servers

Teams on a shared key

Everything you need to put an agent in production.

Disposable endpoints, revocable any time.

Your real key never leaves the vault.

Path allow-list per endpoint.

Per-minute rate limits.

30-day usage charts.

Works with every OpenAI-compatible tool.

PII enforcement, jailbreak protection, and semantic cache,
per endpoint.

Adversarial prompts stopped before they reach your model.

Every cache hit is a provider call that never happens.

Your key never leaves the vault unencrypted.

$3 per endpoint,
per month.

Questions we get all the time.

What is Strake?

Who is it for?

Does the assistant get my real API key?

Can I revoke access?

Do I need to change my workflow?

Which tools does it work with?

How much does it cost?

What happens if I cancel?

How does PII middleware work?

How does jailbreak protection work?

How does semantic cache work?

From the blog.

Keys stay in the vault.
Agents stay in production.

Stop pasting API keys into agents.

Every agent that ships today leaks by default.

A vault with a front door.

Four places your keys already leak.One way to stop it.

Cloud-hosted agents

OpenClaw deployments

MCP servers

Teams on a shared key

Everything you need to put an agent in production.

Disposable endpoints, revocable any time.

Your real key never leaves the vault.

Path allow-list per endpoint.

Per-minute rate limits.

30-day usage charts.

Works with every OpenAI-compatible tool.

PII enforcement, jailbreak protection, and semantic cache,per endpoint.

Adversarial prompts stopped before they reach your model.

Every cache hit is a provider call that never happens.

Your key never leaves the vault unencrypted.

$3 per endpoint,per month.

Questions we get all the time.

What is Strake?

Who is it for?

Does the assistant get my real API key?

Can I revoke access?

Do I need to change my workflow?

Which tools does it work with?

How much does it cost?

What happens if I cancel?

How does PII middleware work?

How does jailbreak protection work?

How does semantic cache work?

From the blog.

Keys stay in the vault.Agents stay in production.

Four places your keys already leak.
One way to stop it.

PII enforcement, jailbreak protection, and semantic cache,
per endpoint.

$3 per endpoint,
per month.

Keys stay in the vault.
Agents stay in production.