Rate limiting algorithm

Posted on Thu 20 April 2023 in Journal

Abstract Rate limiting algorithm
Authors Walter Fan
Category learning note
Status v1.0
Updated 2023-04-20
License CC-BY-NC-ND 4.0

Overview

Rate limiting algorithms are used to control the rate at which requests or events are allowed to occur within a system, to prevent overload or abuse. Here are some commonly used rate limiting algorithms:

  1. Token Bucket Algorithm: In this algorithm, tokens are placed in a bucket at a fixed rate. Each time a request is made, a token is removed from the bucket. If there are no tokens left in the bucket, the request is delayed or rejected.

  2. Leaky Bucket Algorithm: Similar to the token bucket algorithm, the leaky bucket algorithm enforces a fixed processing rate. However, instead of consuming tokens, incoming requests are added to a bucket as they arrive, and the bucket drains (leaks) at a constant rate. If the bucket overflows, excess requests are delayed or rejected.

  3. Fixed Window Algorithm: In this algorithm, the number of requests allowed within a fixed time window is limited. For example, a maximum of 100 requests may be allowed within a 10 second time window. If the limit is exceeded, requests are delayed or rejected.

  4. Sliding Window Algorithm: Similar to the fixed window algorithm, the sliding window algorithm also limits the number of requests within a time window. However, the window slides continuously rather than resetting at fixed boundaries. For example, if a maximum of 100 requests is allowed per 10 seconds, then at any instant the requests counted are those made in the preceding 10 seconds, so a burst that straddles a fixed-window boundary cannot exceed the limit.

  5. Token Bucket with Burst Algorithm: This algorithm is an extension of the token bucket algorithm that allows a certain number of requests to be made at a higher rate than the fixed refill rate. This is useful for handling short bursts of traffic. Once the burst allowance is exhausted, the effective rate returns to the fixed refill rate.
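The leaky bucket described in item 2 can be sketched as a "meter": each request adds one unit to the bucket, and the bucket drains at a constant rate. This is a minimal single-threaded sketch (the class and parameter names are illustrative, not from any particular library):

```cpp
#include <algorithm>
#include <chrono>

class LeakyBucket {
public:
  LeakyBucket(double capacity, double leak_per_sec)
      : capacity_{capacity}, leak_per_sec_{leak_per_sec},
        level_{0.0}, last_drain_{std::chrono::steady_clock::now()} {}

  bool request() {
    auto now = std::chrono::steady_clock::now();
    double elapsed = std::chrono::duration<double>(now - last_drain_).count();
    // The bucket drains continuously at the configured rate.
    level_ = std::max(0.0, level_ - elapsed * leak_per_sec_);
    last_drain_ = now;
    if (level_ + 1.0 <= capacity_) {
      level_ += 1.0;  // this request adds one unit of "water"
      return true;
    }
    return false;  // bucket would overflow: delay or reject the request
  }

private:
  double capacity_;      // maximum bucket level
  double leak_per_sec_;  // drain rate, in requests per second
  double level_;
  std::chrono::steady_clock::time_point last_drain_;
};
```

Note that a leaky bucket used this way is mathematically equivalent to a token bucket; the variant that smooths output by queuing requests and releasing them at the leak rate requires a queue instead of a counter.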
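The fixed window algorithm in item 3 needs only a counter and the start time of the current window. A minimal single-threaded sketch (names are illustrative):

```cpp
#include <chrono>

class FixedWindowLimiter {
public:
  FixedWindowLimiter(int limit, std::chrono::milliseconds window)
      : limit_{limit}, window_{window},
        window_start_{std::chrono::steady_clock::now()}, count_{0} {}

  bool allow() {
    auto now = std::chrono::steady_clock::now();
    if (now - window_start_ >= window_) {
      // A new window has begun: reset the counter.
      window_start_ = now;
      count_ = 0;
    }
    if (count_ < limit_) {
      ++count_;
      return true;
    }
    return false;  // limit reached for this window
  }

private:
  int limit_;
  std::chrono::milliseconds window_;
  std::chrono::steady_clock::time_point window_start_;
  int count_;
};
```

The weakness mentioned above follows directly from the reset: up to `2 * limit_` requests can be admitted in a short span straddling a window boundary.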
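One common way to implement the sliding window of item 4 without storing a timestamp per request is the sliding window counter: keep counts for the current and previous fixed windows, and weight the previous count by how much of it still overlaps the sliding window. This is an approximation, sketched below with illustrative names:

```cpp
#include <chrono>

class SlidingWindowLimiter {
public:
  SlidingWindowLimiter(int limit, std::chrono::milliseconds window)
      : limit_{limit}, window_{window},
        window_start_{std::chrono::steady_clock::now()},
        prev_count_{0}, curr_count_{0} {}

  bool allow() {
    auto now = std::chrono::steady_clock::now();
    auto elapsed = now - window_start_;
    if (elapsed >= 2 * window_) {
      // Both stored windows have fully expired.
      prev_count_ = 0;
      curr_count_ = 0;
      window_start_ = now;
      elapsed = std::chrono::steady_clock::duration::zero();
    } else if (elapsed >= window_) {
      // The current window has ended; it becomes the previous window.
      prev_count_ = curr_count_;
      curr_count_ = 0;
      window_start_ += window_;
      elapsed -= window_;
    }
    // Weight the previous window by the fraction of it that still
    // overlaps the sliding window ending "now".
    double frac = std::chrono::duration<double>(elapsed) / window_;
    double weighted = prev_count_ * (1.0 - frac) + curr_count_;
    if (weighted + 1 <= limit_) {
      ++curr_count_;
      return true;
    }
    return false;
  }

private:
  int limit_;
  std::chrono::milliseconds window_;
  std::chrono::steady_clock::time_point window_start_;
  int prev_count_;
  int curr_count_;
};
```

An exact sliding window log (a timestamp per request) gives precise counts at the cost of memory proportional to the limit; the weighted-counter approximation above uses constant memory per client.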

Here is a minimal single-threaded C++ sketch of the token bucket algorithm (a production limiter would also need locking for concurrent use):

#include <algorithm>
#include <chrono>
#include <iostream>
#include <thread>

class TokenBucket {
public:
  TokenBucket(int capacity, double refill_rate) :
      capacity_{capacity},
      refill_rate_{refill_rate},
      tokens_{0.0}
  {}

  bool request(int tokens) {
    // Refill based on the time elapsed since the last refill. Fractional
    // tokens are kept; with integer arithmetic, frequent calls would
    // each truncate to zero refilled tokens and the bucket would never fill.
    auto now = std::chrono::steady_clock::now();
    double elapsed_sec = std::chrono::duration<double>(now - last_refill_).count();
    tokens_ = std::min(static_cast<double>(capacity_), tokens_ + elapsed_sec * refill_rate_);
    last_refill_ = now;

    // Grant the request only if enough tokens are available
    if (tokens <= tokens_) {
      tokens_ -= tokens;
      return true;
    } else {
      return false;
    }
  }

private:
  int capacity_;
  double refill_rate_;  // tokens added per second
  double tokens_;       // current token count (may be fractional)
  std::chrono::steady_clock::time_point last_refill_ = std::chrono::steady_clock::now();
};

int main() {
  TokenBucket bucket{10, 1};  // Bucket with capacity 10 and refill rate of 1 token per second

  // Make 20 requests, with a delay of 500ms between each request
  for (int i = 0; i < 20; ++i) {
    if (bucket.request(1)) {
      std::cout << "Request " << i << " granted" << std::endl;
    } else {
      std::cout << "Request " << i << " denied" << std::endl;
    }
    std::this_thread::sleep_for(std::chrono::milliseconds(500));
  }
  return 0;
}

This article was produced with the help of ChatGPT.