Most infrastructure only thinks about inbound. But your outbound calls have the same problems — rate limits, failures, fairness. We handle both sides.
You're the API provider. One customer hammers your endpoints. Others wait. EZThrottle gives every customer their own queue — noisy neighbors can't starve the rest.
You're calling Stripe, OpenAI, GitHub. 50 distributed workers share one API key. They all fight over the same rate limit. EZThrottle gives each user's API key its own dedicated queue — no fighting, no contention.
This is what fair queuing looks like for HTTP. The same principle that lets your Netflix stream while someone else is downloading — applied to API calls.
customer_a (1000 req) ──┐
customer_b (2 req)    ──┤──▶ [ SHARED QUEUE ] ──▶ api.stripe.com
customer_c (5 req)    ──┘           ↑
                          customer_b waits hours
                          customer_c never runs
The noisy neighbor problem. One heavy customer fills the queue. Everyone else starves.
customer_a ──▶ [Queue A] ──┐
customer_b ──▶ [Queue B] ──┤──▶ api.stripe.com
customer_c ──▶ [Queue C] ──┘

customer_b executes at 2 req/sec ✓
customer_c executes at 5 req/sec ✓
customer_a executes at their pace ✓
Fair queuing. Every customer runs at their own pace. One flood never affects another.
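The scheduling idea above can be sketched in a few lines — a minimal round-robin fair scheduler, assuming one in-memory deque per customer (EZThrottle's real queues are distributed processes, not Python objects):

```python
from collections import deque

# Minimal sketch of fair queuing: one deque per customer,
# one request dispatched per customer per round-robin pass.
queues = {
    "customer_a": deque(f"a{i}" for i in range(1000)),
    "customer_b": deque(["b0", "b1"]),
    "customer_c": deque(f"c{i}" for i in range(5)),
}

def dispatch(queues):
    """Yield (customer, request) pairs in round-robin order."""
    while any(queues.values()):
        for name, q in queues.items():
            if q:
                yield name, q.popleft()

order = [name for name, _ in dispatch(queues)]
# customer_b finishes in the second pass even though
# customer_a queued 1000 requests ahead of it.
```

A flood from one customer lengthens only that customer's queue; everyone else's requests keep flowing at their own pace.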
Today, your single app hitting a rate limit is annoying. Tomorrow, you have 50 distributed workers all sharing sk_live_abc123 to call Stripe. They each think they have a 10 req/sec budget. They don't — they share it. Race conditions, duplicate requests, and thundering herd follow.
EZThrottle creates one dedicated queue per user, per API key — globally across your cluster. Every worker for that user routes through it. The rate limit is respected. No contention. This is resource contention solved at the infrastructure layer.
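The routing rule — one queue per user, per API key — can be sketched as a registry lookup. This is a single-process stand-in (the helper name `account_queue` is illustrative, not the EZThrottle API); the real registry is cluster-wide via Syn:

```python
import queue
import threading

# Sketch: one dedicated queue per (user, API key), created on first use.
_registry: dict = {}
_lock = threading.Lock()

def account_queue(user_id: str, api_key: str) -> queue.Queue:
    """Return the dedicated queue for this user + key, creating it once."""
    ident = (user_id, api_key)
    with _lock:
        if ident not in _registry:
            _registry[ident] = queue.Queue()
        return _registry[ident]

# Fifty workers sharing sk_live_abc123 all resolve to the same queue,
# so the key's rate budget is enforced in exactly one place.
q1 = account_queue("user_1", "sk_live_abc123")
q2 = account_queue("user_1", "sk_live_abc123")
```

Because every worker resolves to the same queue, there is no per-machine budget to race over — the limit lives with the key, not the caller.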
A network router doesn't just forward packets. It manages flows, prioritizes traffic, routes around failures, and ensures no single connection starves the rest.
The internet figured this out for TCP/IP in 1988 — fair queuing, QoS, BGP failover. HTTP API calls have never had this. Every app builds its own retry logic, its own rate limiting, its own failover.
EZThrottle is that layer. Between your application and the APIs it depends on. The coordination infrastructure the next generation of the internet needs.
// Network routing layer for HTTP

Your App / Agent
        │
        ▼
┌─────────────────────────────┐
│         EZThrottle          │
│                             │
│  ┌──────────────────────┐   │
│  │  Per-key queues      │   │
│  │  Fair scheduling     │   │
│  │  Flow control        │   │
│  └──────────────────────┘   │
│                             │
│  ┌──────────────────────┐   │
│  │  BGP-style failover  │   │
│  │  Multi-region racing │   │
│  │  Health awareness    │   │
│  └──────────────────────┘   │
└─────────────────────────────┘
        │
        ▼
The Internet (Stripe, OpenAI, GitHub...)
Every API can fail. Your workflows shouldn't.
Your agents call dozens of APIs per workflow. One 429 at step 8/10 crashes the entire sequence. Restart from scratch. Lose all context.
Your ETL processes millions of records. Manual rate limiting is slow, brittle. A 6-hour job fails at hour 4. You start over.
You're running an API. One enterprise customer hammers it. Your free-tier customers experience timeouts. You didn't build a DDoS target — but you have one.
BEAM/OTP for distributed coordination. Syn for global process registry. Fly.io for multi-region.
Your app POSTs to EZThrottle with target URL, tier, and optional webhook. We handle the rest.
Has an API key? Routes to a dedicated AccountQueue — one per user, per API key, cluster-wide via Syn. Anonymous traffic uses consensus bidding.
2 req/sec default, adaptive via X-EZTHROTTLE-RPS header. Rate limits respected globally — not per-machine.
429 → backoff + retry. 500 → try another region. Timeout → race fallback URL. All automatic.
Result delivered to your webhook. Multi-region racing, quorum support, on_success workflow chaining.
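The failure handling in those steps can be sketched as a single policy function — simulated status codes, hypothetical region names, exponential backoff capped at 60 seconds. EZThrottle applies this server-side; you write none of it:

```python
# Sketch of the failure policy: 429 → backoff + retry,
# 5xx → rotate to another region, success → deliver to webhook.
REGIONS = ["iad", "fra", "syd"]  # hypothetical region names

def handle(status: int, attempt: int, region_idx: int):
    """Return (action, next_attempt, next_region) for one response."""
    if status == 429:
        backoff = min(2 ** attempt, 60)        # exponential backoff, capped
        return (f"retry after {backoff}s", attempt + 1, region_idx)
    if 500 <= status < 600:
        nxt = (region_idx + 1) % len(REGIONS)  # rotate to the next region
        return (f"retry in region {REGIONS[nxt]}", attempt + 1, nxt)
    return ("deliver result to webhook", attempt, region_idx)
```

For example, `handle(429, 1, 0)` backs off 2 seconds and retries, while `handle(503, 1, 0)` moves the job to the next region.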
// Each queue is a BEAM process
// Each job is a lightweight process
// Syn = distributed process registry
// 1M processes, microsecond scheduling

AccountQueue["sk_live_abc"] ← syn.whereis()
AccountQueue["sk_live_xyz"] ← syn.whereis()
AccountQueue["sk_live_..."] ← syn.whereis()

// One queue per user, per key, cluster-wide
// Machine crashes → syn detects
// Next request → queue respawns
// Zero coordination code needed
Add two headers to your API responses. EZThrottle reads them and adjusts in real time — no config changes, no dashboard, no coordination. Your API tells the network how fast to go.
Combined with per-user queues, MAX-CONCURRENT becomes a true global limit — enforced per user, per key, across your entire customer base.
Read the full provider guide →

# Your API response headers
# EZThrottle reads these automatically

X-EZTHROTTLE-RPS: 10
  → 10 req/sec per user, per key
  → adjusts in real time

X-EZTHROTTLE-MAX-CONCURRENT: 5
  → exactly 5 in-flight, globally
  → enforced per user, per key
  → not per machine — per user
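On the provider side, adoption is just attaching those two headers to your normal responses. A minimal sketch (the helper name and the limits 10 and 5 are illustrative; the header names are the ones documented above):

```python
# Sketch: attach EZThrottle flow-control headers to an API response.
def with_throttle_headers(body: bytes, rps: int = 10, max_concurrent: int = 5):
    """Return (body, headers) with EZThrottle flow-control headers attached."""
    headers = {
        "Content-Type": "application/json",
        # EZThrottle reads these and adjusts every queue for this key:
        "X-EZTHROTTLE-RPS": str(rps),                         # per user, per key
        "X-EZTHROTTLE-MAX-CONCURRENT": str(max_concurrent),   # global in-flight cap
    }
    return body, headers

body, headers = with_throttle_headers(b'{"ok": true}')
```

Change the values your API returns and every client queue adjusts on the next response — no config push, no dashboard.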
Python, Node.js, and Go. Integrate in 30 minutes.
from ezthrottle import EZThrottle
client = EZThrottle("your_key")
resp = client.queue_request(
url="https://api.openai.com/...",
webhook_url="https://app.com/hook"
)
View on PyPI →
const { EZThrottle } = require('ezthrottle')
const client = new EZThrottle('key')
const r = await client.queueRequest({
url: 'https://api.stripe.com/...',
webhookUrl: 'https://app.com/hook'
})
View on npm →
client := ez.NewClient("key")
resp, _ := client.QueueRequest(
&ez.QueueRequest{
URL: "https://api.github.com",
WebhookURL: "https://app.com/hook",
},
)
View on GitHub →
Start free. Scale as you grow.
Small projects · $50/mo
Growing teams · $200/mo
Production scale · $499/mo
Custom needs · Custom
Queue per user available on all paid tiers. Free tier uses shared queue. · Overage: $0.0005/request over quota.
Start free. Integrate in 30 minutes. By the time your distributed agents start fighting over rate limits, you'll already have the infrastructure to handle it.
No credit card required.
Questions? support@ezthrottle.network