Hardware-encrypted inference.

Zero logs. Zero limits.

An encrypted AI gateway for every model.

Integration

Run inference on leading LLMs inside verifiably secure runtimes, powered by Intel TDX and NVIDIA Confidential Computing.

inference.ts
import OpenAI from "openai"

const zima = new OpenAI({
  baseURL: "https://www.zima.chat/api/v1",
  apiKey: process.env.ZIMA_KEY,
})

const response = await zima.chat.completions.create({
  model: "qwen3-next-80B-a3B-instruct",
  messages: [{ role: "user", content: "..." }],
})

// encrypted in hardware. zero data retained.
www.zima.chat

Seamless migration to secure compute

Use your existing OpenAI client. Swap the base URL and API key to start encrypting every inference request today.

Install

Drop in the OpenAI SDK. No proprietary client, no vendor lock-in. One dependency you already know.

Point to Zima

Swap the base URL. Every request now routes through our secure enclave — encrypted in hardware, zero data retained.

Ship to Production

Same models, same latency, hardware-level encryption. Your compliance team will thank you.
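
For readers who want to see what the swap amounts to on the wire, here is a minimal sketch of the request an OpenAI-compatible client would build once the base URL points at Zima. The helper `buildChatRequest` and the placeholder key are hypothetical names for illustration, and the sketch assumes the gateway accepts the standard `Authorization: Bearer` header that the OpenAI SDK sends. It only constructs the request and does not send anything.

```typescript
// Hypothetical helper (not part of any SDK): builds a chat/completions
// request without sending it, so the shape can be inspected.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

interface ChatRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

function buildChatRequest(
  baseURL: string,
  apiKey: string,
  model: string,
  messages: ChatMessage[],
): ChatRequest {
  // Trim trailing slashes so the endpoint path joins cleanly.
  const url = `${baseURL.replace(/\/+$/, "")}/chat/completions`;
  return {
    url,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        // Assumption: bearer-token auth, as used by the OpenAI SDK.
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model, messages }),
    },
  };
}

// Base URL and model name are taken from the snippet on this page;
// the API key is a placeholder.
const exampleRequest = buildChatRequest(
  "https://www.zima.chat/api/v1",
  "YOUR_ZIMA_KEY",
  "qwen3-next-80B-a3B-instruct",
  [{ role: "user", content: "Hello" }],
);
```

Passing `exampleRequest.url` and `exampleRequest.init` to `fetch` would send the call. In practice the OpenAI SDK does this for you, which is why only the base URL and key need to change.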

#Architecture

From SDK request to protected output, without storing prompts or responses.

#Models

Provider costs passed through at-rate. Zero markup.

Model             Provider    Input / 1M tokens   Output / 1M tokens
GPT-4o            OpenAI      $2.50               $10.00
Claude Sonnet 4   Anthropic   $3.00               $15.00
Gemini 2.5 Pro    Google      $1.25               $5.00
Llama 3.1 70B     Meta        $0.50               $0.75
Mistral Large     Mistral     $2.00               $6.00

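
To make the per-token rates concrete, the sketch below estimates the cost of a single request from the prices listed above. The function name `estimateCostUSD` and the `RATES` map are illustrative, and the keys are the display names from the table rather than confirmed API model identifiers. Actual billing may differ from this simple arithmetic.

```typescript
// Rates in USD per 1M tokens, copied from the table above.
// Keys are display names, not confirmed API model identifiers.
interface ModelRate {
  inputPerMillion: number;
  outputPerMillion: number;
}

const RATES: Record<string, ModelRate> = {
  "GPT-4o": { inputPerMillion: 2.5, outputPerMillion: 10.0 },
  "Claude Sonnet 4": { inputPerMillion: 3.0, outputPerMillion: 15.0 },
  "Gemini 2.5 Pro": { inputPerMillion: 1.25, outputPerMillion: 5.0 },
  "Llama 3.1 70B": { inputPerMillion: 0.5, outputPerMillion: 0.75 },
  "Mistral Large": { inputPerMillion: 2.0, outputPerMillion: 6.0 },
};

// Hypothetical helper: cost = (input tokens x input rate
// + output tokens x output rate) / 1,000,000.
function estimateCostUSD(
  model: string,
  inputTokens: number,
  outputTokens: number,
): number {
  const rate = RATES[model];
  if (rate === undefined) {
    throw new Error(`Unknown model: ${model}`);
  }
  return (
    (inputTokens * rate.inputPerMillion +
      outputTokens * rate.outputPerMillion) /
    1_000_000
  );
}

// 1,000 input tokens and 500 output tokens on GPT-4o:
// (1000 * 2.50 + 500 * 10.00) / 1,000,000 = 0.0075 USD
const gpt4oCost = estimateCostUSD("GPT-4o", 1000, 500);
```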
Secure Enclave

Zero bytes retained

Every request executes inside a hardware-isolated enclave. No prompts, outputs, or metadata survive past completion.

Zima Enclave
Intel TDX v5 · NVIDIA CC
0 bytes retained

AES-256-GCM

Memory Encryption

Data sealed in silicon. Encrypted at rest, in transit, and in use.

Attestation

Cryptographic proof before any key is released.
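
The idea of releasing a key only after attestation succeeds can be sketched as a fail-closed check. This is a conceptual illustration, not Zima's implementation: the `AttestationReport` shape, the `releaseKey` function, and the measurement values are all hypothetical, and real Intel TDX or NVIDIA attestation involves verifying signed binary quotes against the vendor's certificate chain, which is assumed here to have happened upstream.

```typescript
// Hypothetical, simplified report. Real attestation evidence is a signed
// binary structure; signature verification is assumed to happen upstream.
interface AttestationReport {
  measurement: string; // hex digest of the software loaded in the enclave
  signatureValid: boolean; // result of the (not shown) signature check
}

// Fail-closed key release: the key is returned only if the report is
// authentic AND its measurement is on the approved list. Any other
// outcome yields null, so no data can be decrypted or processed.
function releaseKey(
  report: AttestationReport,
  approvedMeasurements: ReadonlySet<string>,
  key: Uint8Array,
): Uint8Array | null {
  if (!report.signatureValid) {
    return null; // evidence could not be verified
  }
  if (!approvedMeasurements.has(report.measurement.toLowerCase())) {
    return null; // software is unknown or has been modified
  }
  return key;
}

// Illustrative values only.
const approved = new Set(["a1b2c3"]);
const demoKey = new Uint8Array([1, 2, 3, 4]);

const granted = releaseKey(
  { measurement: "A1B2C3", signatureValid: true },
  approved,
  demoKey,
);
const deniedModified = releaseKey(
  { measurement: "ffffff", signatureValid: true },
  approved,
  demoKey,
);
```

Here `granted` receives the key, while `deniedModified` is `null` because the measurement is not on the approved list. The important design choice is that every failure path returns nothing, rather than falling back to unprotected processing.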

#FAQ

Security, compatibility, and pricing.

Zima does not store your data. Requests exist only for the duration of execution inside the enclave. No prompts, outputs, metadata, or logs are persisted. This is enforced by hardware, not policy.

When attestation fails, access is blocked immediately. If the system cannot cryptographically verify that the runtime environment is running approved, unmodified software, no data is processed.

Zima is compatible with the OpenAI SDK. It uses the chat/completions format, so you can point your existing OpenAI SDK at our base URL and it works: same request shape, same response shape, zero retention.

Every model in the catalog is labeled with its security tier. TEE-hosted models run entirely inside hardware enclaves.

Pricing is per-token, per-model, and pay-as-you-go. No platform fees, no hidden tiers, no commitments. Rates are displayed on every model card before you send a request.

$10 in free credits. Start encrypting your inference today.

Point the OpenAI SDK at Zima, keep model choice and pricing visibility, and move sensitive traffic into attested hardware.