Honest answers about what EZThrottle does, doesn't do, and why it exists.
Short answer: It depends on what you mean by "faster."
What EZThrottle makes faster:
What EZThrottle does NOT make faster:
Bottom line: If you're hitting rate limits often, EZThrottle makes you way faster by eliminating retries. If you never hit rate limits, don't use EZThrottle — it'll just add latency.
It depends on two things:
1. Queue depth when your request arrives
2. The API's rate limit
Example scenarios:
Scenario A: Light load, fast API
Queue depth: 5 requests | Rate limit: 10 req/sec
Your wait: ~500ms | Total: <1 second
Scenario B: Heavy load, slow API
Queue depth: 1,000 requests | Rate limit: 1 req/sec
Your wait: ~1,000 seconds | Total: 16+ minutes
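The scenario math above is just queue depth divided by the API's drain rate. A back-of-envelope helper (purely illustrative, not an SDK function):

```python
def expected_wait_seconds(queue_depth: int, rate_limit_per_sec: float) -> float:
    """Rough wait estimate: requests ahead of you / rate the API allows."""
    return queue_depth / rate_limit_per_sec

print(expected_wait_seconds(5, 10))    # 0.5    (Scenario A: ~500ms)
print(expected_wait_seconds(1000, 1))  # 1000.0 (Scenario B: ~16.7 minutes)
```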
Key insight: We don't make the API faster. We make the waiting predictable and your code simpler.
Probably not.
EZThrottle is designed for "eventually consistent" workflows where you can wait seconds, minutes, or even hours for results:
If you need responses in <5 seconds (like user-facing API calls), EZThrottle probably isn't the right tool unless you're already severely rate limited and willing to make users wait.
Coming eventually: Priority queues where you can mark urgent requests to skip the line. Not available yet. If you need this, email support@ezthrottle.network and we'll consider building it sooner.
No. Your API's rate limit stays exactly the same.
But we maximize efficiency of your existing quota:
Without EZThrottle:
100 requests → 60 succeed, 40 hit 429 errors
Those 40 retry 2-3 times each = 80-120 wasted attempts
Result: You wasted 40%+ of your quota on retries
With EZThrottle:
100 requests → 100 succeed (we only send when they'll work)
Zero 429 errors = zero wasted attempts
Result: Same quota, 40 more successful requests (100 instead of 60)
We don't retry against rate limits.
Traditional approach (what you're probably doing now):
Request → 429 error (counts against quota)
Wait 1s...
Request → 429 error (counts against quota)
Wait 2s...
Request → 429 error (counts against quota)
Wait 4s...
Request → Success
Result: 4 attempts to get 1 successful request
EZThrottle approach:
Request → (held until we know it'll succeed) → Success
Result: 1 attempt for 1 successful request
We coordinate the timing so you never hit 429s in the first place.
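The quota difference between the two approaches can be simulated with a toy API that keeps returning 429 until some `ready_at` time. This is purely illustrative; neither function is part of EZThrottle:

```python
def attempts_with_backoff(ready_at: float, base: float = 1.0) -> int:
    """Count attempts using classic exponential backoff (wait 1s, 2s, 4s...)."""
    t, delay, attempts = 0.0, base, 0
    while True:
        attempts += 1
        if t >= ready_at:   # the API is ready: this attempt succeeds
            return attempts
        t += delay          # 429 (still counts against quota): wait and retry
        delay *= 2

def attempts_with_coordination(ready_at: float) -> int:
    """A coordinator that knows `ready_at` waits first, then sends exactly once."""
    return 1

print(attempts_with_backoff(5.0))       # 4 attempts (429s at t=0, 1, 3; success at t=7)
print(attempts_with_coordination(5.0))  # 1 attempt
```

Same outcome, a quarter of the quota: that is the whole pitch in two functions.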
2 requests per second per domain — carefully chosen to be safe.
Why 2 req/sec?
The distributed magic:
2 req/sec per machine × 1,000 machines = 2,000 coordinated req/sec
Each machine sends slowly. Together, the network is fast. APIs see distributed organic traffic, not abuse.
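The per-machine pacing described above behaves like a simple slot scheduler: each domain gets one send slot every 500ms. A minimal single-machine sketch (not the actual EZThrottle implementation):

```python
from collections import defaultdict

class PerDomainThrottle:
    """Hands out send times so each domain sees at most `rate_per_sec` requests."""
    def __init__(self, rate_per_sec: float = 2.0):
        self.interval = 1.0 / rate_per_sec
        self.next_slot = defaultdict(float)  # domain -> earliest allowed send time

    def delay_for(self, domain: str, now: float) -> float:
        """Return how long to wait before sending the next request to `domain`."""
        slot = max(self.next_slot[domain], now)
        self.next_slot[domain] = slot + self.interval
        return slot - now

t = PerDomainThrottle()
print(t.delay_for("api.openai.com", now=0.0))  # 0.0 — first request goes immediately
print(t.delay_for("api.openai.com", now=0.0))  # 0.5 — next slot is 500ms later
```

Run one of these per machine and the fleet-wide total scales with the number of machines, which is the "2 × 1,000 = 2,000" arithmetic above.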
Rate limits are NOT user-configurable. Here's why:
Need a higher rate for a specific API?
Email us at support@ezthrottle.network with:
We maintain a curated config of known-safe rate limits. If your API can handle more, we'll add it to the global config and everyone benefits.
APIs can also tell us directly: If an API sends the X-RateLimit-Preferred header, our entire network adapts instantly. (Example: "1 req/sec during peak, 5 req/sec at night")
One request = one API call you send us.
You are NOT charged for:
You ARE charged for:
Example: You send 1,000 requests to us. We process all 1,000. You're charged for 1,000 requests. Simple.
Yes! Case Study Program.
What You Get:
What We Need:
Who Qualifies:
Limited to 5 companies. Apply: support@ezthrottle.network with subject "Case Study Program"
We'll email you at 80% and 100% of your quota.
When you hit 100%: Your requests continue processing, but you're charged overage fees.
Why we charge for overages instead of blocking:
We don't want to be your API gateway handling all traffic. We're your overflow aqueduct. Overage fees discourage using EZThrottle as a proxy for every request. Only send us rate-limited traffic.
Overage pricing:
To avoid overages:
Remember: EZThrottle is for overflow management. If you're hitting quota every month, you might need a higher tier — or you're routing too much traffic through us.
No.
You include your API keys in the request headers/body when you send requests to us. We forward them to the target API and then forget them.
We never:
Think of us like a proxy: We see your API keys in transit (just like any proxy or CDN would), but we don't persist them anywhere.
Metadata only. Not the actual content.
We log:
We do NOT log:
Request/response bodies are processed in-memory only and never persisted to disk or database.
Worst case scenario: Attacker sees metadata (which APIs you call, when, how often).
They CANNOT see:
We're built this way intentionally. Zero persistent storage of sensitive data = minimal breach impact.
That said: We take security seriously. HTTPS everywhere, encrypted connections, regular updates. But I'm also honest that I'm one person building this. If you need SOC2 compliance and penetration testing, wait a year or go with a bigger vendor.
Not yet.
As a solo founder, I can't realistically claim SOC2 compliance right now. That requires:
GDPR: We're better positioned here since we don't store personal data. Metadata is minimal and can be deleted on request.
Timeline: If EZThrottle grows, compliance is planned for 2026. Enterprise customers can request architecture review: support@ezthrottle.network
The honest truth: If you need certified compliance today, EZThrottle isn't ready yet. If you can accept "we're privacy-focused but not yet audited," we're here.
Smart routing: We only handle overflow, not your entire ocean.
The SDK doesn't blindly proxy all your requests through EZThrottle. That would be stupid and expensive.
What actually happens:
Smart Mode (Default):
1. Your request goes straight to OpenAI/Anthropic/etc. No EZThrottle involved, zero latency added, as long as you're under rate limits.
2. The moment you get rate limited, the SDK automatically sends the request to our aqueduct.
3. We handle the coordination; you get the result later via webhook.
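The smart-mode fallback can be sketched as a try-direct-then-queue function. Here `send_direct` and `enqueue_with_ezthrottle` are hypothetical stand-ins for SDK internals, injected so the sketch is self-contained:

```python
def smart_request(url, send_direct, enqueue_with_ezthrottle):
    """Try the API directly; only on a 429 hand the request to the queue."""
    status, body = send_direct(url)
    if status != 429:
        return ("direct", status, body)   # fast path: no extra latency
    request_id = enqueue_with_ezthrottle(url)
    return ("queued", 202, request_id)    # result arrives later via webhook

# Toy transports for illustration:
ok = lambda url: (200, "hello")           # API under its rate limit
limited = lambda url: (429, None)         # API returning 429s
enqueue = lambda url: "req_abc123"        # hypothetical queue call

print(smart_request("https://api.example.com", ok, enqueue))       # direct path
print(smart_request("https://api.example.com", limited, enqueue))  # queued path
```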
The Win-Win:
You can also force-route everything:
```python
client.request(
    url="...",
    force_route=True  # Always go through EZThrottle
)
```
But why would you? You'd just add latency and burn quota faster.
Bottom line: EZThrottle is your API aqueduct (overflow management), not your API gateway (full proxy). We only route the flood, not the normal flow.
Because distributed request coordination could be weaponized for DDoS attacks.
EZThrottle is a distributed system that:
If I open sourced it, someone could:
I'm not interested in building attack infrastructure.
What IS open source:
What stays closed:
When your request completes, we POST the result to your webhook URL.
Retry schedule:
Cross-region retry (coming soon): If webhook fails from one region, we'll retry from another region. This prevents single-region outages from losing your webhooks.
Webhook payload:
```json
{
  "request_id": "req_abc123",
  "status": 200,
  "body": {...},       // API response
  "headers": {...},
  "duration_ms": 1234,
  "timestamp": "2025-01-15T10:30:00Z"
}
```
Results are also available via GET /results/:request_id for 7 days as backup.
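A receiver for that payload mostly needs to parse, acknowledge fast, and do the real work elsewhere. A minimal sketch (field names taken from the example payload; `process` and `log_failure` are placeholders for your own code):

```python
import json

results = {}

def process(request_id, body):
    results[request_id] = body             # placeholder for your own pipeline

def log_failure(request_id, status):
    results[request_id] = f"failed:{status}"

def handle_webhook(raw: bytes) -> str:
    """Parse the webhook body, dispatch on the target API's status, ack fast."""
    payload = json.loads(raw)
    if 200 <= payload["status"] < 300:
        process(payload["request_id"], payload["body"])
    else:
        log_failure(payload["request_id"], payload["status"])
    return "ok"  # respond 2xx quickly so retries stop; heavy work goes elsewhere

event = {"request_id": "req_abc123", "status": 200, "body": {"answer": 42},
         "headers": {}, "duration_ms": 1234, "timestamp": "2025-01-15T10:30:00Z"}
print(handle_webhook(json.dumps(event).encode()))  # ok
```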
We retry for 24 hours (see schedule above).
If all retries fail:
Pro tip: Set up monitoring on your webhook endpoint. If it's down, you'll know before results start piling up.
Not easily, and you probably shouldn't try.
EZThrottle is fundamentally async:
To use it synchronously, you'd need to:
This defeats the entire purpose. If you're blocking and polling, just do exponential backoff yourself.
Better approach: Refactor your code to be async. Use webhooks. Embrace "fire and forget." Your app will be more scalable and you'll sleep better.
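If you do end up rolling your own backoff instead, full jitter is the usual shape. A generic sketch, unrelated to the EZThrottle SDK (`send_fn` is your own HTTP call; `sleep` is injectable for testing):

```python
import random
import time

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Full-jitter exponential backoff: uniform in [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0.0, min(cap, base * 2 ** attempt))

def call_with_backoff(send_fn, max_attempts: int = 5, sleep=time.sleep):
    """Retry send_fn (which returns an HTTP status) until it stops returning 429."""
    status = send_fn()
    for attempt in range(max_attempts - 1):
        if status != 429:
            break
        sleep(backoff_delay(attempt))  # each retry still burns an API attempt
        status = send_fn()
    return status
```

Note the comment in the loop: this is exactly the quota-burning pattern described earlier, which is why it only makes sense when you rarely hit limits.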
Cross-cluster routing via database coordination (the magic).
Right now, we run independent clusters per region. Eventually, they'll unite into a coordinated network:
How Cross-Cluster Routing Will Work:
1. US-East has 1,000 queued OpenAI requests, all waiting.
2. EU has capacity and can process OpenAI requests faster right now.
3. 20% of US-East overflow is routed to the EU cluster (configurable threshold).
4. The result webhooks to you, then the request record is deleted. Minimal storage cost, minimal data risk.
Why this matters:
Latency consideration:
Cross-region routing adds ~50-100ms latency vs staying in-region. But if your request would wait 10 minutes in US-East queue vs 30 seconds in EU queue, that trade-off is worth it.
Timeline: Cross-cluster coordination coming Q3 2026. Right now, clusters run independently. But the database foundation is being built for this.
I don't promise an SLA yet.
Here's why I'm being honest about this:
I'm a solo founder building infrastructure on BEAM — which is rock-solid technology that's powered telecom switches for 40 years with 99.9999999% uptime.
But: I'm still learning to master it. I'm challenging myself to build something reliable, but there will be mistakes in the first year.
I'd rather be honest about this than promise an SLA I can't guarantee and let you down.
What I CAN promise:
As EZThrottle matures: SLAs will come. But not until I'm confident I can deliver them consistently. Probably 2026.
Goal: Rolling deploys with zero downtime.
Reality: EZThrottle may drop requests during deployments.
How deploys work:
Honest assessment:
In theory, we can achieve 100% reliability with perfect rolling deploys. In practice, there's a small window where requests might be dropped if timing is unlucky.
I'm working to minimize this, but I'd rather be transparent: deploys carry a tiny risk of dropped requests. Most of the time it works perfectly. Sometimes it doesn't.
Mitigation:
As EZThrottle matures: This will get better. Eventually we'll have true zero-downtime deploys. But we're not there yet.
Requests: In-memory only (not stored)
Metadata: PostgreSQL on Fly.io (encrypted at rest)
Regions:
Data retention:
Yes. Email: support@ezthrottle.network
What to expect:
Free Tier:
Community support (GitHub discussions, email when I can). No response time guarantee.
Pro Tier ($499/mo):
Email support. I'll respond within 24 hours (usually much faster). You're helping fund development, so I prioritize you.
Case Study Program:
Direct access to me. Email, call if urgent. I'll help you succeed because your success is the case study.
Me. Rahmi Pruitt. Solo founder.
I'm a burned-out engineer who spent years building software for other people's goals. I never got to work on projects that were actually fun, like building infrastructure to help engineers sleep at night.
EZThrottle exists because I believe engineers shouldn't babysit rate limits. You should be able to close your laptop at 6pm and trust your infrastructure works.
I'm building this on BEAM because it's proven technology that lasts. I'm building it as a solo founder because I want to prove you don't need VC money and a team of 20 to build reliable infrastructure.
Connect:
Fair warning: I'm learning as I go. There will be bugs. There will be outages. But I'm committed to building this right, even if it takes longer. If you want polished enterprise software with an SLA from day one, wait a year or use a bigger vendor. If you want to be part of building something real, I'm here.
You probably shouldn't trust blindly. But here's my pitch:
1. I'm incentivized to keep this running
This is my income. If EZThrottle goes down, I don't eat. I'm WAY more motivated than a BigCo engineer who gets paid whether the service works or not.
2. I'm building on proven tech (BEAM)
WhatsApp handled 2 billion users with 50 engineers. Discord scales to millions of concurrent users. BEAM is reliable — I just need to not screw it up.
3. I'm transparent about limitations
I'm not promising SLAs I can't deliver. I'm not claiming enterprise features I don't have. I'm being honest. That builds trust over time.
4. You can talk directly to me
No support tickets that disappear into the void. No 3-day response times. Email me. I'll respond. (Might take 24 hours, but I'll respond.)
Start small: Try the free tier. Send 1,000 requests. See if it works. If it does, scale up. If it doesn't, you risked nothing and learned something.