AKM sits quietly in your backend and keeps your LLM calls from collapsing right when it matters. Smart rotation, auto-failover, zero proxy.
$ npm install ai-key-manager
One key hits its limit → everything breaks. Fallback logic scattered everywhere. Retry logic = guesswork. Logs accidentally exposing secrets. You've been there. AKM fixes that entire layer — in one wrapper.
Whether you're shipping in 48 hours or in production — AKM keeps your AI layer stable.
Ship fast without worrying about rate limits breaking your demo at 2am.
Judge demos never fail. Multiple keys, multiple providers — fully covered.
Free tier keys spread intelligently. Never lose an experiment to a 429.
Multi-provider setups managed automatically. Focus on the product, not the plumbing.
You write only the generation logic. AKM handles key selection, retries, cooldowns, failover, and health tracking automatically.
Greedy LRU selection picks the least-recently-used healthy key. Rate-limited keys cool in a min-heap and return to pool automatically.
Hit a rate limit? AKM marks that key, rotates to the next healthy one, and retries — all inside withRetry(). You never see the failure.
If an entire provider route breaks (404, 403, UPSTREAM_ERROR), AKM cascades to the next configured provider/model group automatically.
After a successful fallback, AKM remembers that route and prefers it on future calls. Smarter with every request.
API keys wrapped in SecretString. console.log, JSON, inspect — all show [REDACTED]. Zero leaks.
Non-secret state (key IDs, cooldowns, health) persists across restarts via FileStateAdapter. Raw keys never touch disk.
AKM thinks in two levels — providers and their keys. Each level has independent logic.
Each provider + model pair is a route. AKM tries routes in order, remembers which one works, and falls back if one breaks. You can configure OpenRouter, Google, Vercel AI Gateway, or any custom endpoint side-by-side.
Each provider has multiple keys. AKM picks the least-recently-used healthy key, tracks health scores, and moves rate-limited keys to a cooldown heap sorted by reset time. Rate limits on one key never block another.
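The key-level scheme can be pictured with a toy model. The sketch below is illustrative only (`ToyKeyPool` is invented here, not AKM's real code): healthy keys rotate through an LRU queue, rate-limited keys wait in a min-heap ordered by reset time, and expired cooldowns drain back into the queue before each pick.

```typescript
// Toy sketch of LRU rotation plus a cooldown min-heap — not AKM's actual internals.
type CooldownEntry = { keyId: string; readyAt: number }

class ToyKeyPool {
  private lru: string[] = []            // front = least recently used
  private cooling: CooldownEntry[] = [] // binary min-heap ordered by readyAt

  constructor(keyIds: string[]) { this.lru = [...keyIds] }

  // Drain expired cooldowns back into rotation, then take the LRU key.
  acquire(now: number): string | undefined {
    while (this.cooling.length && this.cooling[0].readyAt <= now) {
      this.lru.push(this.popMin().keyId)
    }
    const key = this.lru.shift()
    if (key !== undefined) this.lru.push(key) // key becomes most recently used
    return key
  }

  // Rate-limited: remove from rotation until readyAt.
  markRateLimited(keyId: string, readyAt: number): void {
    this.lru = this.lru.filter(k => k !== keyId)
    this.cooling.push({ keyId, readyAt })
    // Bubble the new entry up: O(log n) insertion.
    let i = this.cooling.length - 1
    while (i > 0) {
      const parent = (i - 1) >> 1
      if (this.cooling[parent].readyAt <= this.cooling[i].readyAt) break
      ;[this.cooling[parent], this.cooling[i]] = [this.cooling[i], this.cooling[parent]]
      i = parent
    }
  }

  // Remove the entry with the earliest readyAt: O(log n).
  private popMin(): CooldownEntry {
    const min = this.cooling[0]
    const last = this.cooling.pop()!
    if (this.cooling.length) {
      this.cooling[0] = last
      let i = 0
      for (;;) {
        const l = 2 * i + 1, r = 2 * i + 2
        let smallest = i
        if (l < this.cooling.length && this.cooling[l].readyAt < this.cooling[smallest].readyAt) smallest = l
        if (r < this.cooling.length && this.cooling[r].readyAt < this.cooling[smallest].readyAt) smallest = r
        if (smallest === i) break
        ;[this.cooling[smallest], this.cooling[i]] = [this.cooling[i], this.cooling[smallest]]
        i = smallest
      }
    }
    return min
  }
}
```

The point of the heap is that the next key to wake up is always at the root, so checking "is anything ready yet?" is O(1) and inserts stay O(log n) no matter how many keys are cooling.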
Don't specify any provider or model. AKM tries all configured groups in order, cascading on failure. The simplest way — and the most resilient.
A 93-test suite, battle-tested against the edge cases below.
Detects 429, "rate limit", "quota", "exhausted" — marks key, waits for cooldown, retries automatically. Uses a min-heap for O(log n) scheduling.
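That detection heuristic can be sketched as a small classifier. The statuses and strings below come from the description above, but the function itself is illustrative, not AKM's actual matching logic:

```typescript
// Illustrative rate-limit detector — the real matching rules may differ.
function isRateLimitError(err: { status?: number; message?: string }): boolean {
  if (err.status === 429) return true
  const msg = (err.message ?? "").toLowerCase()
  return ["rate limit", "quota", "exhausted"].some(s => msg.includes(s))
}
```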
Every API key is wrapped in SecretString. It redacts itself in console.log, JSON.stringify, String(), and util.inspect. Only .value() reveals the raw key.
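The redaction trick itself is a handful of method overrides. A minimal sketch of the pattern (`RedactedSecret` here is illustrative, not AKM's SecretString implementation):

```typescript
import { inspect } from "node:util"

// Illustrative sketch of the redaction pattern — not AKM's actual class.
class RedactedSecret {
  #raw: string
  constructor(raw: string) { this.#raw = raw }

  value(): string { return this.#raw }               // the only way to read the key
  toString(): string { return "[REDACTED]" }         // String(s), `${s}`
  toJSON(): string { return "[REDACTED]" }           // JSON.stringify(s)
  [inspect.custom](): string { return "[REDACTED]" } // console.log / util.inspect
}
```

Because the raw string lives in a private field, nothing short of calling `.value()` (or deliberately breaking encapsulation) gets it out, so every accidental serialization path shows the placeholder instead.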
Route fails with UPSTREAM_ERROR, 404, or blacklist? AKM automatically tries the next configured provider/model group. No manual fallback logic needed.
Keys degrade on rate limits and recover on success. When multiple keys tie on LRU, health score breaks the tie, so AKM always picks the most reliable key.
Use withStreamRetry() for SSE/streaming. It retries only on startup failures, never mid-stream. Keys are marked healthy once the stream begins.
Pass an AbortSignal to cancel at any point: before acquire, during execute, or while waiting for cooldown. Throws RetryAbortedError.
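Cancellation follows the standard AbortSignal pattern. A minimal sketch of the kind of abortable wait a cooldown sleep needs (illustrative, not AKM's internals):

```typescript
// Illustrative abortable sleep: resolves after ms, or rejects as soon as the
// signal fires — the same shape a cancellable cooldown wait takes.
function abortableSleep(ms: number, signal?: AbortSignal): Promise<void> {
  return new Promise((resolve, reject) => {
    if (signal?.aborted) return reject(new Error("aborted"))
    const timer = setTimeout(() => { cleanup(); resolve() }, ms)
    const onAbort = () => { cleanup(); reject(new Error("aborted")) }
    const cleanup = () => {
      clearTimeout(timer)
      signal?.removeEventListener("abort", onAbort)
    }
    signal?.addEventListener("abort", onAbort, { once: true })
  })
}
```

Wiring it up looks like any other AbortController usage: create a controller, pass `controller.signal` in, and call `controller.abort()` from a timeout handler or a request teardown hook.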
Cooldowns and health survive restarts via FileStateAdapter, using an atomic write (temp + rename). Only non-secret metadata is stored, never raw keys.
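The temp + rename trick is what makes the write atomic: on POSIX filesystems a rename is a single operation, so readers see either the old state file or the new one, never a torn write. A minimal sketch of the pattern (`writeStateAtomically` is illustrative, not FileStateAdapter's actual code):

```typescript
import { writeFileSync, readFileSync, renameSync } from "node:fs"
import { tmpdir } from "node:os"
import { join } from "node:path"

// Illustrative atomic write: fully write a temp file, then swap it into place.
function writeStateAtomically(path: string, state: object): void {
  const tmp = `${path}.tmp`
  writeFileSync(tmp, JSON.stringify(state), "utf8") // crash here leaves old file intact
  renameSync(tmp, path)                             // atomic swap on POSIX filesystems
}

// Usage: write non-secret metadata, then read it back.
const statePath = join(tmpdir(), "akm-demo-state.json")
writeStateAtomically(statePath, { cooldowns: {}, health: { "or-a7f3": 1 } })
const loaded = JSON.parse(readFileSync(statePath, "utf8"))
```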
Enable identity checks so AKM detects swapped environment variables after restart. Stores only an HMAC fingerprint — never the key or secret.
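A fingerprint like this can be built from a keyed hash. The sketch below shows the general HMAC approach; the function name, parameters, and scheme are assumptions for illustration, not AKM's actual format:

```typescript
import { createHmac } from "node:crypto"

// Illustrative: derive a non-reversible fingerprint of an API key.
// Only the HMAC would be persisted — never the raw key or the HMAC secret.
function keyFingerprint(rawKey: string, hmacSecret: string): string {
  return createHmac("sha256", hmacSecret).update(rawKey).digest("hex")
}
```

On restart, recomputing the fingerprint of the current environment variable and comparing it to the stored one reveals whether the key was swapped, without ever writing the key itself to disk.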
Works with any AI SDK — LangChain, Vercel AI, raw fetch, the Anthropic SDK, anything.
```ts
import { KeyScheduler, FileStateAdapter } from "ai-key-manager"

const scheduler = new KeyScheduler({
  providers: [
    {
      name: "openrouter",
      model: "google/gemma-4-26b-a4b-it:free",
      defaultCooldownMs: 60_000,
      keys: [
        { id: "or-a7f3", value: process.env.OPENROUTER_KEY_A7F3 },
        { id: "or-k2m9", value: process.env.OPENROUTER_KEY_K2M9 },
      ]
    },
    {
      name: "google",
      model: "gemini-2.5-flash",
      defaultCooldownMs: 60_000,
      keys: [
        { id: "g-b1r8", value: process.env.GOOGLE_KEY_B1R8 },
        { id: "g-c5t2", value: process.env.GOOGLE_KEY_C5T2 },
      ]
    }
  ],
  state: new FileStateAdapter(".llm-key-state.json")
})
```
```ts
// Auto-pick: no provider/model needed.
// AKM picks the best available group and cascades on failure.
const result = await scheduler.withRetry({
  execute: async ({ apiKey, provider, model, signal }) => {
    // Use the injected values — they change on fallback
    return generateContent({ apiKey, provider, model, prompt, signal })
  }
})
```
```ts
// Targeted: start with a specific route, auto-fallback if it breaks
const result = await scheduler.withRetry({
  provider: "openrouter",
  model: "google/gemma-4-26b-a4b-it:free",
  // Optional explicit fallback list
  fallbacks: [
    { provider: "google", model: "gemini-2.5-flash" },
  ],
  execute: async ({ apiKey, provider, model }) => {
    return callSDK({ apiKey, provider, model, prompt })
  }
})
```
```ts
import { ChatOpenAI } from "@langchain/openai"
import { KeyScheduler } from "ai-key-manager"

export async function askWithLangChain(scheduler, prompt) {
  return scheduler.withRetry({
    execute: async ({ apiKey, provider, model }) => {
      const llm = new ChatOpenAI({
        model,
        apiKey,
        configuration: { baseURL: "https://openrouter.ai/api/v1" }
      })
      return llm.invoke(prompt)
    }
  })
}
```
```ts
import { generateText } from "ai"
import { createOpenAICompatible } from "@ai-sdk/openai-compatible"
import { KeyScheduler } from "ai-key-manager"

export async function askWithVercelAI(scheduler, prompt) {
  return scheduler.withRetry({
    execute: async ({ apiKey, provider, model }) => {
      const gateway = createOpenAICompatible({
        name: provider,
        apiKey,
        baseURL: "https://ai-gateway.vercel.sh/v1"
      })
      const { text } = await generateText({ model: gateway(model), prompt })
      return text
    }
  })
}
```
Everything you'd eventually build, tested and shipped.
| Feature | AKM ✨ | DIY | AI Proxy |
|---|---|---|---|
| LRU key selection | ✓ Built-in | ✗ You write it | ~ Maybe |
| Rate-limit cooldown heap | ✓ Min-heap, O(log n) | ✗ setTimeout hacks | ~ Basic |
| Provider failover cascade | ✓ Automatic | ✗ Nested try/catch | ~ Varies |
| Secret redaction in logs | ✓ SecretString | ✗ Oops | ✗ Rarely |
| State persistence across restarts | ✓ FileStateAdapter | ✗ Cold start every time | ~ Depends |
| Zero network calls / telemetry | ✓ 100% local | ✓ If you build right | ✗ Your keys travel |
| Works with any SDK | ✓ SDK-agnostic | ~ Per SDK | ✗ Locked in |
| 93-test suite | ✓ Shipped | ✗ Who has time | ~ Unknown |
Add AKM to your backend in 5 minutes. No proxy. No platform. No lock-in.
$ npm i ai-key-manager