Honest answers about what EZThrottle does, doesn't do, and why it exists.
Short answer: It depends on what you mean by "faster."
What EZThrottle makes faster:
What EZThrottle does NOT make faster:
Bottom line: If you're hitting rate limits often, EZThrottle makes you way faster by eliminating retries. If you never hit rate limits, don't use EZThrottle — it'll just add latency.
It depends on two things:
1. Queue depth when your request arrives
2. The API's rate limit
Example scenarios:
Scenario A: Light load, fast API
Queue depth: 5 requests | Rate limit: 10 req/sec
Your wait: ~500ms | Total: <1 second
Scenario B: Heavy load, slow API
Queue depth: 1,000 requests | Rate limit: 1 req/sec
Your wait: ~1,000 seconds | Total: 16+ minutes
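The scenario math above is just queue depth divided by the API's drain rate. A back-of-envelope helper (purely illustrative, not an SDK function):

```python
def expected_wait_seconds(queue_depth: int, rate_limit_per_sec: float) -> float:
    """Rough wait estimate: requests ahead of you / rate the API allows."""
    return queue_depth / rate_limit_per_sec

print(expected_wait_seconds(5, 10))    # 0.5    (Scenario A: ~500ms)
print(expected_wait_seconds(1000, 1))  # 1000.0 (Scenario B: ~16.7 minutes)
```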
Key insight: We don't make the API faster. We make the waiting predictable and your code simpler.
Probably not.
EZThrottle is designed for "eventually consistent" workflows where you can wait seconds, minutes, or even hours for results:
If you need responses in <5 seconds (like user-facing API calls), EZThrottle probably isn't the right tool unless you're already severely rate limited and willing to make users wait.
Coming eventually: Priority queues where you can mark urgent requests to skip the line. Not available yet. If you need this, email support@ezthrottle.network and we'll consider building it sooner.
No. Your API's rate limit stays exactly the same.
But we maximize efficiency of your existing quota:
Without EZThrottle:
100 requests → 60 succeed, 40 hit 429 errors
Those 40 retry 2-3 times each = 80-120 wasted attempts
Result: You wasted 40%+ of your quota on retries
With EZThrottle:
100 requests → 100 succeed (we only send when they'll work)
Zero 429 errors = zero wasted attempts
Result: Same quota, 40 more successful requests (100 instead of 60)
We don't retry against rate limits.
Traditional approach (what you're probably doing now):
Request → 429 error (counts against quota)
Wait 1s...
Request → 429 error (counts against quota)
Wait 2s...
Request → 429 error (counts against quota)
Wait 4s...
Request → Success
Result: 4 attempts to get 1 successful request
EZThrottle approach:
Request → (held until we know it'll succeed) → Success
Result: 1 attempt for 1 successful request
We coordinate the timing so you never hit 429s in the first place.
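The quota difference between the two approaches can be simulated with a toy API that keeps returning 429 until some `ready_at` time. This is purely illustrative; neither function is part of EZThrottle:

```python
def attempts_with_backoff(ready_at: float, base: float = 1.0) -> int:
    """Count attempts using classic exponential backoff (wait 1s, 2s, 4s...)."""
    t, delay, attempts = 0.0, base, 0
    while True:
        attempts += 1
        if t >= ready_at:   # the API is ready: this attempt succeeds
            return attempts
        t += delay          # 429 (still counts against quota): wait and retry
        delay *= 2

def attempts_with_coordination(ready_at: float) -> int:
    """A coordinator that knows `ready_at` waits first, then sends exactly once."""
    return 1

print(attempts_with_backoff(5.0))       # 4 attempts (429s at t=0, 1, 3; success at t=7)
print(attempts_with_coordination(5.0))  # 1 attempt
```

Same outcome, a quarter of the quota: that is the whole pitch in two functions.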
2 requests per second per domain — carefully chosen to be safe.
Why 2 req/sec?
The distributed magic:
2 req/sec per machine × 1,000 machines = 2,000 coordinated req/sec
Each machine sends slowly. Together, the network is fast. APIs see distributed organic traffic, not abuse.
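The per-machine pacing described above behaves like a simple slot scheduler: each domain gets one send slot every 500ms. A minimal single-machine sketch (not the actual EZThrottle implementation):

```python
from collections import defaultdict

class PerDomainThrottle:
    """Hands out send times so each domain sees at most `rate_per_sec` requests."""
    def __init__(self, rate_per_sec: float = 2.0):
        self.interval = 1.0 / rate_per_sec
        self.next_slot = defaultdict(float)  # domain -> earliest allowed send time

    def delay_for(self, domain: str, now: float) -> float:
        """Return how long to wait before sending the next request to `domain`."""
        slot = max(self.next_slot[domain], now)
        self.next_slot[domain] = slot + self.interval
        return slot - now

t = PerDomainThrottle()
print(t.delay_for("api.openai.com", now=0.0))  # 0.0 — first request goes immediately
print(t.delay_for("api.openai.com", now=0.0))  # 0.5 — next slot is 500ms later
```

Run one of these per machine and the fleet-wide total scales with the number of machines, which is the "2 × 1,000 = 2,000" arithmetic above.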
Rate limits are NOT user-configurable. Here's why:
Need a higher rate for a specific API?
Email us at support@ezthrottle.network with:
We maintain a curated config of known-safe rate limits. If your API can handle more, we'll add it to the global config and everyone benefits.
APIs can also tell us directly: If an API sends the X-RateLimit-Preferred header, our entire network adapts instantly. (Example: "1 req/sec during peak, 5 req/sec at night")
One request = one API call you send us.
You are NOT charged for:
You ARE charged for:
Example: You send 1,000 requests to us. We process all 1,000. You're charged for 1,000 requests. Simple.
Yes! Case Study Program.
What You Get:
What We Need:
Who Qualifies:
Limited to 5 companies. Apply: support@ezthrottle.network with subject "Case Study Program"
We'll email you at 80% and 100% of your quota.
When you hit 100%: Your requests continue processing, but you're charged overage fees.
Why we charge for overages instead of blocking:
We don't want to be your API gateway handling all traffic. We're your overflow aqueduct. Overage fees discourage using EZThrottle as a proxy for every request. Only send us rate-limited traffic.
Overage pricing:
To avoid overages:
Remember: EZThrottle is for overflow management. If you're hitting quota every month, you might need a higher tier — or you're routing too much traffic through us.
No.
You include your API keys in the request headers/body when you send requests to us. We forward them to the target API and then forget them.
We never:
Think of us like a proxy: We see your API keys in transit (just like any proxy or CDN would), but we don't persist them anywhere.
Metadata only. Not the actual content.
We log:
We do NOT log:
Request/response bodies are processed in-memory only and never persisted to disk or database.
Worst case scenario: Attacker sees metadata (which APIs you call, when, how often).
They CANNOT see:
We're built this way intentionally. Zero persistent storage of sensitive data = minimal breach impact.
That said: We take security seriously. HTTPS everywhere, encrypted connections, regular updates. But I'm also honest that I'm one person building this. If you need SOC2 compliance and penetration testing, wait a year or go with a bigger vendor.
Not yet.
As a solo founder, I can't realistically claim SOC2 compliance right now. That requires:
GDPR: We're better positioned here since we don't store personal data. Metadata is minimal and can be deleted on request.
Timeline: If EZThrottle grows, compliance is planned for 2026. Enterprise customers can request architecture review: support@ezthrottle.network
The honest truth: If you need certified compliance today, EZThrottle isn't ready yet. If you can accept "we're privacy-focused but not yet audited," we're here.
Smart routing: We only handle overflow, not your entire ocean.
The SDK doesn't blindly proxy all your requests through EZThrottle. That would be stupid and expensive.
What actually happens:
Smart Mode (Default):
1. Your request goes straight to OpenAI/Anthropic/etc. No EZThrottle involved, zero latency added, as long as you're under rate limits.
2. The moment you get rate limited, the SDK automatically sends the request to our aqueduct.
3. We handle the coordination; you get the result later via webhook.
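The smart-mode fallback can be sketched as a try-direct-then-queue function. Here `send_direct` and `enqueue_with_ezthrottle` are hypothetical stand-ins for SDK internals, injected so the sketch is self-contained:

```python
def smart_request(url, send_direct, enqueue_with_ezthrottle):
    """Try the API directly; only on a 429 hand the request to the queue."""
    status, body = send_direct(url)
    if status != 429:
        return ("direct", status, body)   # fast path: no extra latency
    request_id = enqueue_with_ezthrottle(url)
    return ("queued", 202, request_id)    # result arrives later via webhook

# Toy transports for illustration:
ok = lambda url: (200, "hello")           # API under its rate limit
limited = lambda url: (429, None)         # API returning 429s
enqueue = lambda url: "req_abc123"        # hypothetical queue call

print(smart_request("https://api.example.com", ok, enqueue))       # direct path
print(smart_request("https://api.example.com", limited, enqueue))  # queued path
```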
The Win-Win:
You can also force-route everything:
```python
client.request(
    url="...",
    force_route=True  # Always go through EZThrottle
)
```
But why would you? You'd just add latency and burn quota faster.
Bottom line: EZThrottle is your API aqueduct (overflow management), not your API gateway (full proxy). We only route the flood, not the normal flow.
Because distributed request coordination could be weaponized for DDoS attacks.
EZThrottle is a distributed system that:
If I open sourced it, someone could:
I'm not interested in building attack infrastructure.
What IS open source:
What stays closed:
When your request completes, we POST the result to your webhook URL.
Retry schedule:
Cross-region retry (coming soon): If webhook fails from one region, we'll retry from another region. This prevents single-region outages from losing your webhooks.
Webhook payload:
```json
{
  "request_id": "req_abc123",
  "status": 200,
  "body": {...},       // API response
  "headers": {...},
  "duration_ms": 1234,
  "timestamp": "2025-01-15T10:30:00Z"
}
```
Results are also available via GET /results/:request_id for 7 days as backup.
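A receiver for that payload mostly needs to parse, acknowledge fast, and do the real work elsewhere. A minimal sketch (field names taken from the example payload; `process` and `log_failure` are placeholders for your own code):

```python
import json

results = {}

def process(request_id, body):
    results[request_id] = body             # placeholder for your own pipeline

def log_failure(request_id, status):
    results[request_id] = f"failed:{status}"

def handle_webhook(raw: bytes) -> str:
    """Parse the webhook body, dispatch on the target API's status, ack fast."""
    payload = json.loads(raw)
    if 200 <= payload["status"] < 300:
        process(payload["request_id"], payload["body"])
    else:
        log_failure(payload["request_id"], payload["status"])
    return "ok"  # respond 2xx quickly so retries stop; heavy work goes elsewhere

event = {"request_id": "req_abc123", "status": 200, "body": {"answer": 42},
         "headers": {}, "duration_ms": 1234, "timestamp": "2025-01-15T10:30:00Z"}
print(handle_webhook(json.dumps(event).encode()))  # ok
```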
We retry for 24 hours (see schedule above).
If all retries fail:
Pro tip: Set up monitoring on your webhook endpoint. If it's down, you'll know before results start piling up.
Not easily, and you probably shouldn't try.
EZThrottle is fundamentally async:
To use it synchronously, you'd need to:
This defeats the entire purpose. If you're blocking and polling, just do exponential backoff yourself.
Better approach: Refactor your code to be async. Use webhooks. Embrace "fire and forget." Your app will be more scalable and you'll sleep better.
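If you do end up rolling your own backoff instead, full jitter is the usual shape. A generic sketch, unrelated to the EZThrottle SDK (`send_fn` is your own HTTP call; `sleep` is injectable for testing):

```python
import random
import time

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Full-jitter exponential backoff: uniform in [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0.0, min(cap, base * 2 ** attempt))

def call_with_backoff(send_fn, max_attempts: int = 5, sleep=time.sleep):
    """Retry send_fn (which returns an HTTP status) until it stops returning 429."""
    status = send_fn()
    for attempt in range(max_attempts - 1):
        if status != 429:
            break
        sleep(backoff_delay(attempt))  # each retry still burns an API attempt
        status = send_fn()
    return status
```

Note the comment in the loop: this is exactly the quota-burning pattern described earlier, which is why it only makes sense when you rarely hit limits.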
Cross-cluster routing via database coordination (the magic).
Right now, we run independent clusters per region. Eventually, they'll unite into a coordinated network:
How Cross-Cluster Routing Will Work:
1. US-East has 1,000 queued OpenAI requests, all waiting.
2. EU has capacity and can process OpenAI requests faster right now.
3. 20% of US-East overflow is routed to the EU cluster (configurable threshold).
4. The result webhooks to you, then the request record is deleted. Minimal storage cost, minimal data risk.
Why this matters:
Latency consideration:
Cross-region routing adds ~50-100ms latency vs staying in-region. But if your request would wait 10 minutes in US-East queue vs 30 seconds in EU queue, that trade-off is worth it.
Timeline: Cross-cluster coordination coming Q3 2026. Right now, clusters run independently. But the database foundation is being built for this.
I don't promise an SLA yet.
Here's why I'm being honest about this:
I'm a solo founder building infrastructure on BEAM — which is rock-solid technology that's powered telecom switches for 40 years with 99.9999999% uptime.
But: I'm still learning to master it. I'm challenging myself to build something reliable, but there will be mistakes in the first year.
I'd rather be honest about this than promise an SLA I can't guarantee and let you down.
What I CAN promise:
As EZThrottle matures: SLAs will come. But not until I'm confident I can deliver them consistently. Probably 2026.
Goal: Rolling deploys with zero downtime.
Reality: EZThrottle may drop requests during deployments.
How deploys work:
Honest assessment:
In theory, we can achieve 100% reliability with perfect rolling deploys. In practice, there's a small window where requests might be dropped if timing is unlucky.
I'm working to minimize this, but I'd rather be transparent: deploys carry a tiny risk of dropped requests. Most of the time it works perfectly. Sometimes it doesn't.
Mitigation:
As EZThrottle matures: This will get better. Eventually we'll have true zero-downtime deploys. But we're not there yet.
Requests: In-memory only (not stored)
Metadata: PostgreSQL on Fly.io (encrypted at rest)
Regions:
Data retention:
Yes. Email: support@ezthrottle.network
What to expect:
Free Tier:
Community support (GitHub discussions, email when I can). No response time guarantee.
Pro Tier ($499/mo):
Email support. I'll respond within 24 hours (usually much faster). You're helping fund development, so I prioritize you.
Case Study Program:
Direct access to me. Email, call if urgent. I'll help you succeed because your success is the case study.
Me. Rahmi Pruitt. Solo founder.
I'm a burned-out engineer who spent years building software for other people's goals. I never got to work on projects that were actually fun, like building infrastructure to help engineers sleep at night.
EZThrottle exists because I believe engineers shouldn't babysit rate limits. You should be able to close your laptop at 6pm and trust your infrastructure works.
I'm building this on BEAM because it's proven technology that lasts. I'm building it as a solo founder because I want to prove you don't need VC money and a team of 20 to build reliable infrastructure.
Connect:
Fair warning: I'm learning as I go. There will be bugs. There will be outages. But I'm committed to building this right, even if it takes longer. If you want polished enterprise software with an SLA from day one, wait a year or use a bigger vendor. If you want to be part of building something real, I'm here.
You probably shouldn't trust blindly. But here's my pitch:
1. I'm incentivized to keep this running
This is my income. If EZThrottle goes down, I don't eat. I'm WAY more motivated than a BigCo engineer who gets paid whether the service works or not.
2. I'm building on proven tech (BEAM)
WhatsApp handled 2 billion users with 50 engineers. Discord scales to millions of concurrent users. BEAM is reliable — I just need to not screw it up.
3. I'm transparent about limitations
I'm not promising SLAs I can't deliver. I'm not claiming enterprise features I don't have. I'm being honest. That builds trust over time.
4. You can talk directly to me
No support tickets that disappear into the void. No 3-day response times. Email me. I'll respond. (Might take 24 hours, but I'll respond.)
Start small: Try the free tier. Send 1,000 requests. See if it works. If it does, scale up. If it doesn't, you risked nothing and learned something.