Overview

API rate limits are restrictions placed on how many requests clients can make to an API within a certain timeframe. Proper rate limiting defends your API from abuse, ensures fair resource distribution, and protects backend performance. In AWS API Gateway, rate limits are configurable at the stage or method level and are enforced automatically.

Current Configuration

Rate Limit: 10 requests per hour per API client

How the Rate Limit Works

Each client is allowed up to 10 requests in any given rolling 1-hour window.
If you require a higher rate limit, consider upgrading to an enterprise account by contacting our sales team. An enterprise account can provide a rate limit of 240 videos per hour or more.
If a client exceeds this threshold, AWS API Gateway will automatically reject subsequent requests with an HTTP 429 Too Many Requests response.
Rate limiting is tracked and enforced independently for each client identity.
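
To make the rolling-window behavior described above concrete, here is a minimal sketch of a per-client sliding-window counter. It is an illustration only, not how AWS API Gateway enforces limits internally; the class name SlidingWindowLimiter, its parameters, and the injectable clock are all hypothetical.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Illustrative per-client rolling-window counter (hypothetical, not AWS code)."""

    def __init__(self, limit=10, window_seconds=3600, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        # client_id -> timestamps of requests accepted inside the window
        self._hits = defaultdict(deque)

    def allow(self, client_id):
        now = self.clock()
        hits = self._hits[client_id]
        # Drop timestamps that have aged out of the rolling window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # a server would answer with HTTP 429 here
        hits.append(now)
        return True
```

Each client identity gets its own queue of timestamps, so one client exhausting its 10 requests does not affect another.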

Handling Rate Limits: Best Practices

Error Handling in Clients
- Always check for HTTP status code 429 Too Many Requests in your client applications.
- Implement retry logic with exponential backoff: for example, wait 5, 10, 20, then 40 seconds before retrying.
- Respect the Retry-After header (if provided by AWS), which indicates when it is safe to retry.
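
The backoff advice above can be sketched as a small delay generator. This is a minimal example under stated assumptions: the function name backoff_delays and its parameters (a 5-second base, doubling each attempt, capped at 60 seconds) are hypothetical choices, and the optional jitter is a common addition rather than something this API requires.

```python
import random


def backoff_delays(base=5.0, factor=2.0, cap=60.0, attempts=4,
                   jitter=False, rng=random.random):
    """Yield wait times in seconds: base, base*factor, ... never above cap."""
    for attempt in range(attempts):
        delay = min(cap, base * factor ** attempt)
        if jitter:
            # Randomize within [0, delay) so many clients do not retry in lockstep.
            delay = rng() * delay
        yield delay


# With the defaults this produces 5, 10, 20, then 40 seconds.
schedule = list(backoff_delays())
```

When the server does send a Retry-After value, prefer it over the computed delay.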
Client Throttling
- Proactively throttle requests in your client to remain below 10 requests per hour.
- Use a counter or token bucket algorithm to track usage locally.
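
A token bucket for local throttling might look like the sketch below. The class name TokenBucket and its parameters are hypothetical. The default capacity of 1 is a deliberately conservative assumption: a bucket that holds 10 tokens could spend a full burst and then keep spending refilled tokens, which can exceed 10 requests inside one rolling hour.

```python
import time


class TokenBucket:
    """Client-side throttle sketch: refills rate_per_hour tokens per hour."""

    def __init__(self, rate_per_hour=10, capacity=1, clock=time.monotonic):
        self.rate_per_hour = rate_per_hour
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)  # start with a full bucket
        self.updated = clock()

    def try_acquire(self):
        """Return True and consume a token if a request may be sent now."""
        now = self.clock()
        elapsed = now - self.updated
        # Add the tokens earned since the last call, never exceeding capacity.
        earned = elapsed * self.rate_per_hour / 3600.0
        self.tokens = min(float(self.capacity), self.tokens + earned)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

With these defaults a request is allowed roughly once every 360 seconds. Call try_acquire() before each API call and skip or queue the call when it returns False.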
Communicate Rate Limits
- Document the rate limit for your API consumers.
- Surface appropriate feedback/UI if you’re building apps or dashboards.
Monitoring and Logging
- Monitor for 429 responses to detect clients that are being throttled.
- Use AWS CloudWatch to track request counts and throttling metrics for the API Gateway stage.
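
As one possible way to pull these numbers programmatically, the sketch below queries CloudWatch through boto3 for the hourly sum of the 4XXError metric in the AWS/ApiGateway namespace. Note the assumptions: the function names are hypothetical, boto3 must be installed and credentials configured, and 4XXError counts every 4xx response, so it is only an upper bound on the number of 429s.

```python
from datetime import datetime, timedelta, timezone


def build_4xx_query(api_name, stage, hours=24, now=None):
    """Build keyword arguments for CloudWatch get_metric_statistics."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/ApiGateway",
        "MetricName": "4XXError",  # all 4xx responses, including 429
        "Dimensions": [
            {"Name": "ApiName", "Value": api_name},
            {"Name": "Stage", "Value": stage},
        ],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": 3600,  # one datapoint per hour
        "Statistics": ["Sum"],
    }


def fetch_hourly_4xx(api_name, stage, hours=24):
    """Return (timestamp, count) pairs, oldest first. Requires boto3."""
    import boto3  # third-party; imported lazily so the builder works without it

    cloudwatch = boto3.client("cloudwatch")
    response = cloudwatch.get_metric_statistics(
        **build_4xx_query(api_name, stage, hours)
    )
    points = sorted(response["Datapoints"], key=lambda p: p["Timestamp"])
    return [(p["Timestamp"], p["Sum"]) for p in points]
```

The API name and stage passed in are placeholders for your own deployment.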
Graceful Degradation
- If requests are rejected, degrade functionality smoothly (e.g., display cached or static data, notify the user of limited functionality).
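
One way to degrade gracefully is to keep the last successful response and serve it, marked as stale, when a request is throttled. The sketch below assumes a hypothetical fetch callable that returns a (status_code, data) pair; the class name CachedFetcher and the shape of the returned dictionary are also illustrative choices.

```python
import time


class CachedFetcher:
    """Serve the last good response when the API answers HTTP 429."""

    def __init__(self, fetch, clock=time.time):
        self.fetch = fetch   # hypothetical callable: key -> (status_code, data)
        self.clock = clock
        self._cache = {}     # key -> (data, stored_at)

    def get(self, key):
        status, data = self.fetch(key)
        if status == 200:
            self._cache[key] = (data, self.clock())
            return {"data": data, "stale": False, "age": 0.0}
        if status == 429 and key in self._cache:
            cached, stored_at = self._cache[key]
            # Tell the caller the data is old so the UI can say so.
            return {"data": cached, "stale": True, "age": self.clock() - stored_at}
        # Nothing cached (or a different error): the caller shows a fallback message.
        return {"data": None, "stale": True, "age": None}
```

The stale flag and age let a dashboard notify the user that it is showing older data.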
Scaling Up (If Business Allows)
- If higher throughput is needed, consider requesting an increase in the rate limit or segmenting API functionality into multiple endpoints with independent limits.

Code Example: Handling 429 in a Client (Python)

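import time

import requests


def make_api_call(url, retries=5, default_wait=60):
    """GET url, waiting and retrying whenever the API answers HTTP 429."""
    for attempt in range(retries):
        response = requests.get(url, timeout=30)
        if response.status_code != 429:
            return response
        # Retry-After may be missing or an HTTP date instead of seconds,
        # so fall back to default_wait when it is not a plain integer.
        try:
            retry_after = int(response.headers.get("Retry-After", default_wait))
        except ValueError:
            retry_after = default_wait
        # Sleep only if another attempt will follow.
        if attempt < retries - 1:
            print(f"Rate limit hit, retrying in {retry_after} seconds...")
            time.sleep(retry_after)
    raise RuntimeError("Max retries exceeded due to rate limiting.")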

Summary Table

Limit Type	Value	Behavior When Exceeded
Requests/hour	10 (per client)	HTTP 429, request rejected

Conclusion

API rate limiting is essential for fairness and backend protection. By handling limits proactively with client-side throttling, robust error handling, and clear communication, you ensure reliable API usage under the AWS API Gateway configuration of 10 requests per hour.