2024 · Active

go-gateway

Lightweight API gateway in pure Go. Load balancing, health checking, and rate limiting with zero external dependencies.

Go · net/http · goroutines · sync · atomic
Overview

go-gateway is an API gateway I built using only Go's standard library. It sits in front of backend microservices and handles request routing, load distribution across instances, health monitoring, and rate limiting. The goal was to understand how these systems work at the protocol level by building one from scratch and running into every edge case along the way.

Key features
01 Least-connections load balancing with ping latency as a tiebreaker. Routes each request to the instance with the fewest active requests, breaking ties by measured response time.
02 Background health checking every 5 seconds. Instances are marked down after 3 consecutive failures and recover automatically on the first successful ping.
03 Token bucket rate limiting per IP and endpoint. 100 tokens max with 1 token per second refill, enforced before the request reaches any backend.
04 Dynamic service registry via REST API. Register, update, and remove backend instances at runtime with no configuration files and no restarts required.

Architecture

The gateway is organized into four packages: router (request forwarding and instance selection), services (the registry and health checker), ratelimiter (token bucket middleware), and config (environment loading). There are no framework dependencies, just net/http, sync, sync/atomic, time, and encoding/json.

All service state lives in a single ServiceStore struct protected by a sync.RWMutex. Read operations like routing decisions and listing services take the read lock without blocking each other. Writes like registering or deleting a service take the exclusive lock. The health checker and request router run concurrently without contention on the happy path.

internal/services/store.go
type ServiceStore struct {
    mu       sync.RWMutex
    services map[string]*Service
}

type Service struct {
    Name      string
    Instances []*Instance
}

type Instance struct {
    Address     string
    Status      InstanceStatus
    ReqCount    int32  // atomic
    FailCount   int
    PingLatency int64  // ms
}
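The locking discipline described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the constructor and the method names (NewServiceStore, Register, Addresses, Delete) are hypothetical, and Instance is trimmed to a single field.

```go
package main

import "sync"

// Trimmed stand-ins for the structs shown above.
type Instance struct {
	Address string
}

type Service struct {
	Name      string
	Instances []*Instance
}

type ServiceStore struct {
	mu       sync.RWMutex
	services map[string]*Service
}

// NewServiceStore is a hypothetical constructor that initializes the map.
func NewServiceStore() *ServiceStore {
	return &ServiceStore{services: make(map[string]*Service)}
}

// Register replaces the instance list for name. It is a write, so it
// takes the exclusive lock.
func (s *ServiceStore) Register(name string, addrs []string) {
	insts := make([]*Instance, 0, len(addrs))
	for _, a := range addrs {
		insts = append(insts, &Instance{Address: a})
	}
	s.mu.Lock()
	defer s.mu.Unlock()
	s.services[name] = &Service{Name: name, Instances: insts}
}

// Addresses returns a copy of the registered addresses. It is a read, so
// any number of callers can hold the read lock at the same time.
func (s *ServiceStore) Addresses(name string) ([]string, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	svc, ok := s.services[name]
	if !ok {
		return nil, false
	}
	out := make([]string, len(svc.Instances))
	for i, inst := range svc.Instances {
		out[i] = inst.Address
	}
	return out, true
}

// Delete removes a service under the exclusive lock.
func (s *ServiceStore) Delete(name string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.services, name)
}
```

Returning a copy from the read path keeps callers from holding references into the map after the lock is released.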

Load Balancing

The selection algorithm is least-connections with ping latency as a tiebreaker. The router sorts all healthy instances by active request count using bubble sort, which is O(n²) and a deliberate choice. For a handful of instances behind a service, the simplicity of bubble sort beats the overhead of a heap or priority queue. For hundreds of instances, you need a different tool.
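A rough sketch of that selection step, under a few assumptions: the function name pick and the candidate type are hypothetical, a Healthy flag stands in for the status check, and request counts are snapshotted once so the ordering stays stable while the sort runs.

```go
package main

import "sync/atomic"

type Instance struct {
	Address     string
	Healthy     bool  // stand-in for Status == StatusUp
	ReqCount    int32 // updated atomically by the router
	PingLatency int64 // ms, read plainly here as in the struct above
}

// candidate holds a snapshot of one healthy instance's load.
type candidate struct {
	inst *Instance
	reqs int32
	lat  int64
}

// pick returns the healthy instance with the fewest active requests,
// using ping latency to break ties. It returns nil if none are healthy.
func pick(instances []*Instance) *Instance {
	cands := make([]candidate, 0, len(instances))
	for _, inst := range instances {
		if !inst.Healthy {
			continue
		}
		cands = append(cands, candidate{
			inst: inst,
			reqs: atomic.LoadInt32(&inst.ReqCount),
			lat:  inst.PingLatency,
		})
	}

	// Bubble sort: O(n²), acceptable for a handful of instances.
	for i := 0; i < len(cands); i++ {
		for j := 0; j+1 < len(cands)-i; j++ {
			a, b := cands[j], cands[j+1]
			if a.reqs > b.reqs || (a.reqs == b.reqs && a.lat > b.lat) {
				cands[j], cands[j+1] = b, a
			}
		}
	}

	if len(cands) == 0 {
		return nil
	}
	return cands[0].inst
}
```

Since only the first element is used, a single linear scan for the minimum would give the same answer; the full sort is kept here to mirror the description.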

Request counts are tracked with atomic int32 operations. The counter increments when a request is assigned to an instance and decrements when the response completes. This gives the load balancer an accurate real-time picture with no locking overhead on the hot path.

internal/router/router.go
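// Increment before forwarding, decrement when the handler returns
atomic.AddInt32(&instance.ReqCount, 1)
defer atomic.AddInt32(&instance.ReqCount, -1)

resp, err := http.Get("http://" + instance.Address + path)
if err != nil {
    // Handle the failed backend call; the deferred decrement still runs
    return
}
defer resp.Body.Close() // release the backend connection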

Health Checking

A background goroutine runs on a 5-second ticker and pings the /health endpoint of every registered instance using an HTTP client with a 2-second timeout. It tracks consecutive failures per instance. After 3 failures the instance is marked StatusDown and excluded from routing decisions. The first successful response resets the counter and marks it StatusUp.

Status transitions are logged only on change to avoid flooding the output with repeated lines. After each health check sweep there is a 500ms delay before logging all current statuses, which batches the summary output instead of mixing it with live request logs.

internal/services/health.go
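func checkInstance(store *ServiceStore, svcName string, inst *Instance) {
    client := &http.Client{Timeout: 2 * time.Second}
    resp, err := client.Get("http://" + inst.Address + "/health")
    if err == nil {
        resp.Body.Close() // release the connection
    }
    // Transport errors and non-2xx responses both count as a failed ping.
    // Note: inst is also read by the router, so these writes need to be
    // guarded by store.mu unless the caller already holds it.
    if err != nil || resp.StatusCode < 200 || resp.StatusCode > 299 {
        inst.FailCount++
        if inst.FailCount >= 3 && inst.Status != StatusDown {
            inst.Status = StatusDown
        }
        return
    }
    inst.FailCount = 0
    inst.Status = StatusUp
}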

Rate Limiting

Rate limiting uses a token bucket keyed by client IP and request path. Each bucket starts at 100 tokens and refills at 1 token per second based on elapsed time. Requests that arrive when a bucket is empty are rejected immediately with HTTP 429. No queuing, no waiting.

Keying by IP and endpoint rather than just IP means an abusive client on one route cannot use up the allowance for a different endpoint. The refill logic checks elapsed > 0 before updating the timestamp to prevent token leakage under high-frequency bursts.

The middleware intercepts the request before it reaches any backend, so downstream services never see rejected traffic.

internal/ratelimiter/ratelimiter.go
func (rl *RateLimiter) Allow(ip, path string) bool {
    rl.mu.Lock()
    defer rl.mu.Unlock()

    key := ip + ":" + path
    now := time.Now()
    if _, ok := rl.buckets[key]; !ok {
        rl.buckets[key] = &bucket{tokens: maxTokens, last: now}
    }

    b := rl.buckets[key]
    elapsed := now.Sub(b.last).Seconds()
    if elapsed > 0 {
        b.tokens = min(maxTokens, b.tokens+elapsed*refillRate)
        b.last = now
    }

    if b.tokens < 1 {
        return false
    }
    b.tokens--
    return true
}

Service Registry

Services are registered and managed entirely through a REST API with no configuration files and no restarts required. A POST to /services/{name} with a JSON array of addresses registers or replaces the full instance list for a named service. DELETE removes it. GET /services lists everything currently registered.

When a new service is registered, the gateway immediately triggers a health check on all its instances before the first real request can be routed to them. This means the system always knows the initial health state of new backends rather than assuming they are healthy and discovering otherwise under live traffic.

Register a service
# Register two backend instances
curl -X POST http://localhost:8080/services/books \
  -H 'Content-Type: application/json' \
  -d '["localhost:9001", "localhost:9002"]'

# Route a request through the gateway
curl http://localhost:8080/route/books