An encrypted AI gateway for every model.

Integration

Secure computation, end-to-end

Leading LLMs run inference inside verifiably secure runtimes, powered by Intel TDX and NVIDIA Confidential Computing.

inference.ts
import OpenAI from "openai"

const zima = new OpenAI({
  baseURL: "https://api.zimalabs.io/v1",
  apiKey: process.env.ZIMA_KEY,
})

const response = await zima.chat.completions.create({
  model: "qwen3-next-80B-a3B-instruct",
  messages: [{ role: "user", content: "..." }],
})

// encrypted in hardware. zero data retained.

From SDK request to protected output, without storing prompts or responses.

Verified inference path

Secure by design

From key exchange to model output, every step runs inside hardware-attested enclaves.

Hardware encryption

Intel TDX and NVIDIA Confidential Computing enclaves protect every request made through Zima.

Zero data retention

No prompts, outputs, or metadata survive past completion. GPU memory gets wiped after every request.

OpenAI SDK compatible

Point your existing OpenAI client at Zima's base URL. Get fully encrypted inference with a single-line swap.
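As a rough illustration of why changing the base URL is enough, the sketch below shapes the HTTP request that an OpenAI-compatible client sends for a chat completion. The helper `buildChatRequest` and the placeholder key are hypothetical and not part of any SDK; the sketch assumes Zima serves the standard `/chat/completions` path under its `/v1` base URL, as the sample above implies. Nothing is sent over the network.

```typescript
// Hypothetical helper (not part of the OpenAI SDK or Zima): builds the request
// an OpenAI-compatible client would send for a chat completion.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

interface ChatRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildChatRequest(
  baseURL: string,
  apiKey: string,
  model: string,
  messages: ChatMessage[],
): ChatRequest {
  // Strip trailing slashes so the path joins cleanly.
  const root = baseURL.replace(/\/+$/, "");
  return {
    url: `${root}/chat/completions`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({ model, messages }),
  };
}

// Same payload shape, different host: only the base URL argument changes.
const sampleMessages: ChatMessage[] = [{ role: "user", content: "..." }];
const viaZima = buildChatRequest(
  "https://api.zimalabs.io/v1",
  "placeholder-key", // assumption: a real key would come from ZIMA_KEY
  "qwen3-next-80B-a3B-instruct",
  sampleMessages,
);
console.log(viaZima.url);
```

The request path, headers, and JSON body are the same for any OpenAI-compatible endpoint, which is why the SDK only needs a different `baseURL` and API key.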

100+ model catalog

OpenAI, Anthropic, Mistral and more. Access the models you already use, through one secure endpoint.

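| Model | Provider | Input / 1M tokens | Output / 1M tokens |
| --- | --- | --- | --- |
| GPT-OSS 20B | OpenAI | $0.020 | $0.100 |
| GPT-OSS 120B | OpenAI | $0.100 | $0.400 |
| Qwen3 Next 80B Instruct | Alibaba Cloud | $0.200 | $1.600 |
| DeepSeek V3 | DeepSeek | $0.270 | $1.100 |
| Kimi K2.5 | Moonshot | $0.650 | $3.100 |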
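To make the per-1M pricing concrete, here is a minimal sketch that estimates the cost of a workload from the listed prices. It assumes "1M" means one million tokens and that prices are in USD; the `PRICES` map and `estimateCostUSD` function are hypothetical names used only for illustration.

```typescript
// Prices copied from the table above, assumed to be USD per 1M tokens.
interface Price {
  inputPerM: number;
  outputPerM: number;
}

const PRICES: Record<string, Price> = {
  "GPT-OSS 20B": { inputPerM: 0.02, outputPerM: 0.1 },
  "GPT-OSS 120B": { inputPerM: 0.1, outputPerM: 0.4 },
  "Qwen3 Next 80B Instruct": { inputPerM: 0.2, outputPerM: 1.6 },
  "DeepSeek V3": { inputPerM: 0.27, outputPerM: 1.1 },
  "Kimi K2.5": { inputPerM: 0.65, outputPerM: 3.1 },
};

// Hypothetical helper: cost = tokens / 1M * price, summed over input and output.
function estimateCostUSD(model: string, inputTokens: number, outputTokens: number): number {
  const price = PRICES[model];
  if (price === undefined) {
    throw new Error(`Unknown model: ${model}`);
  }
  return (
    (inputTokens / 1_000_000) * price.inputPerM +
    (outputTokens / 1_000_000) * price.outputPerM
  );
}

// 2M input tokens and 500k output tokens on DeepSeek V3:
// 2 * 0.27 + 0.5 * 1.10 = 1.09
console.log(estimateCostUSD("DeepSeek V3", 2_000_000, 500_000).toFixed(2));
```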

Zero markup

Provider pricing with zero intermediary markup. Hardware-grade encryption at no extra cost.

Zero bytes retained

Every request executes inside a hardware-isolated enclave. No prompts, outputs, or metadata survive past completion.

Zima Enclave
Intel TDX v5 · NVIDIA CC
0 bytes retained


Memory Encryption

Data sealed in silicon. Encrypted at rest, in transit, and in use.

Attestation

Cryptographic proof before any key is released.
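The idea of gating key release on attestation can be sketched as follows. This is a simplified illustration, not Zima's protocol: a real Intel TDX or NVIDIA Confidential Computing quote is a signed binary structure checked against vendor certificates, and that verification is reduced here to a boolean field. The names `AttestationReport`, `equalBytes`, and `releaseKey` are hypothetical.

```typescript
// Hypothetical, simplified report: real quote parsing and signature
// verification against vendor certificates are omitted.
interface AttestationReport {
  measurement: Uint8Array; // hash of the code loaded into the enclave
  signatureValid: boolean; // stands in for real quote-signature verification
}

// Byte-wise equality check for two measurements.
function equalBytes(a: Uint8Array, b: Uint8Array): boolean {
  return a.length === b.length && a.every((byte, i) => byte === b[i]);
}

// Release the key only if the report is signed and the measurement matches
// the value the client expects; otherwise refuse.
function releaseKey(
  report: AttestationReport,
  expectedMeasurement: Uint8Array,
  key: Uint8Array,
): Uint8Array {
  if (!report.signatureValid) {
    throw new Error("attestation signature rejected");
  }
  if (!equalBytes(report.measurement, expectedMeasurement)) {
    throw new Error("unexpected enclave measurement");
  }
  return key;
}

// Usage: a matching report receives the key, a mismatching one does not.
const expected = new Uint8Array([1, 2, 3, 4]);
const sessionKey = new Uint8Array([9, 9, 9, 9]);
const trusted: AttestationReport = { measurement: new Uint8Array([1, 2, 3, 4]), signatureValid: true };
const tampered: AttestationReport = { measurement: new Uint8Array([1, 2, 3, 5]), signatureValid: true };

console.log(releaseKey(trusted, expected, sessionKey).length);
try {
  releaseKey(tampered, expected, sessionKey);
} catch (err) {
  console.log((err as Error).message);
}
```

The point of the ordering is that the secret never leaves the client until both checks pass, so an unverified or modified runtime never sees it.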

Frequently asked questions

Security, compatibility, and pricing.

No, Zima does not store your data. Requests exist only for the duration of execution inside the enclave. No prompts, outputs, metadata, or logs are persisted. This is enforced by hardware, not policy.

Point the OpenAI SDK at Zima, keep model choice and pricing visibility, and move sensitive traffic into attested hardware.