go-gateway
Lightweight API gateway in pure Go. Load balancing, health checking, and rate limiting with zero external dependencies.
go-gateway is an API gateway I built using only Go's standard library. It sits in front of backend microservices and handles request routing, load distribution across instances, health monitoring, and protection against request abuse. The goal was to understand how these systems work at the protocol level by building one from scratch and running into every edge case along the way.
Architecture
The gateway is organized into four packages: router (request forwarding and instance selection), services (the registry and health checker), ratelimiter (token bucket middleware), and config (environment loading). There are no framework dependencies, just net/http, sync, sync/atomic, time, and encoding/json.
All service state lives in a single ServiceStore struct protected by a sync.RWMutex. Read operations like routing decisions and listing services take the read lock without blocking each other. Writes like registering or deleting a service take the exclusive lock. The health checker and request router run concurrently without contention on the happy path.
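The read/write split described above can be sketched roughly as follows. This is a simplified stand-in, not the gateway's actual code: the `registry` type and the `Lookup`, `Register`, and `Delete` method names are hypothetical, and instances are reduced to plain address strings.

```go
package main

import (
	"fmt"
	"sync"
)

// registry is a simplified, hypothetical stand-in for ServiceStore:
// it maps a service name to a list of instance addresses.
type registry struct {
	mu       sync.RWMutex
	services map[string][]string
}

// Lookup takes the shared read lock, so concurrent lookups do not block each other.
func (r *registry) Lookup(name string) ([]string, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	addrs, ok := r.services[name]
	// Return a copy so callers never touch the stored slice outside the lock.
	return append([]string(nil), addrs...), ok
}

// Register takes the exclusive lock and replaces the full address list.
func (r *registry) Register(name string, addrs []string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.services[name] = append([]string(nil), addrs...)
}

// Delete takes the exclusive lock and removes the service.
func (r *registry) Delete(name string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	delete(r.services, name)
}

func main() {
	r := &registry{services: make(map[string][]string)}
	r.Register("books", []string{"localhost:9001", "localhost:9002"})

	// Several readers can hold the read lock at the same time.
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			r.Lookup("books")
		}()
	}
	wg.Wait()

	addrs, ok := r.Lookup("books")
	fmt.Println(addrs, ok)
}
```

Returning a copy from the read path is one way to keep callers from mutating shared state after the lock is released; whether the real store does this is not stated here.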
type ServiceStore struct {
	mu       sync.RWMutex
	services map[string]*Service
}
type Service struct {
	Name      string
	Instances []*Instance
}
type Instance struct {
	Address     string
	Status      InstanceStatus
	ReqCount    int32 // atomic
	FailCount   int
	PingLatency int64 // ms
}
Load Balancing
The selection algorithm is least-connections with ping latency as a tiebreaker. The router sorts all healthy instances by active request count using bubble sort, which is O(n²) and a deliberate choice. For a handful of instances behind a service, the simplicity of bubble sort beats the overhead of a heap or priority queue. For hundreds of instances, you need a different tool.
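A minimal sketch of that selection step, assuming a reduced instance type with a boolean health flag. The names `instance`, `candidate`, and `pick` are hypothetical and only illustrate least-connections with a latency tiebreaker over a bubble sort.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// instance carries only the fields the selection step needs (hypothetical type).
type instance struct {
	Address     string
	Healthy     bool
	ReqCount    int32 // active requests, accessed atomically
	PingLatency int64 // ms
}

// candidate is a snapshot of one healthy instance taken before sorting,
// so the ordering does not shift while counters change underneath it.
type candidate struct {
	inst    *instance
	active  int32
	latency int64
}

// better reports whether a should be preferred over b:
// fewer active requests first, lower ping latency as the tiebreaker.
func better(a, b candidate) bool {
	if a.active != b.active {
		return a.active < b.active
	}
	return a.latency < b.latency
}

// pick returns the preferred healthy instance, or nil if none is healthy.
func pick(all []*instance) *instance {
	var cands []candidate
	for _, in := range all {
		if in.Healthy {
			cands = append(cands, candidate{in, atomic.LoadInt32(&in.ReqCount), in.PingLatency})
		}
	}
	if len(cands) == 0 {
		return nil
	}
	// Bubble sort: O(n^2), acceptable for a handful of instances.
	for i := 0; i < len(cands)-1; i++ {
		for j := 0; j < len(cands)-1-i; j++ {
			if better(cands[j+1], cands[j]) {
				cands[j], cands[j+1] = cands[j+1], cands[j]
			}
		}
	}
	return cands[0].inst
}

func main() {
	pool := []*instance{
		{Address: "localhost:9001", Healthy: true, ReqCount: 3, PingLatency: 5},
		{Address: "localhost:9002", Healthy: true, ReqCount: 1, PingLatency: 20},
		{Address: "localhost:9003", Healthy: true, ReqCount: 1, PingLatency: 8},
		{Address: "localhost:9004", Healthy: false, ReqCount: 0, PingLatency: 1},
	}
	fmt.Println(pick(pool).Address)
}
```

In this example the two instances with one active request tie on load, so the one with the lower ping latency wins; the idle instance is skipped because it is not healthy.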
Request counts are tracked with atomic int32 operations. The counter increments when a request is assigned to an instance and decrements when the response completes. This gives the load balancer an accurate real-time picture with no locking overhead on the hot path.
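// Increment before forwarding, decrement when the enclosing function returns
atomic.AddInt32(&instance.ReqCount, 1)
defer atomic.AddInt32(&instance.ReqCount, -1)
resp, err := http.Get("http://" + instance.Address + path)
if err != nil {
	return err // assumes the enclosing function returns an error
}
defer resp.Body.Close() // release the connection once the response is consumed
Health Checking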
A background goroutine runs on a 5-second ticker and pings the /health endpoint of every registered instance using an HTTP client with a 2-second timeout. It tracks consecutive failures per instance. After 3 failures the instance is marked StatusDown and excluded from routing decisions. The first successful response resets the counter and marks it StatusUp.
Status transitions are logged only on change to avoid flooding the output with repeated lines. After each health check sweep there is a 500ms delay before logging all current statuses, which batches the summary output instead of mixing it with live request logs.
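The transition rule can be isolated from the HTTP probe so it is easy to test. The sketch below is an illustration under assumptions: the `probeState` type, the `record` method, and the initial "unknown" status are hypothetical. Only the threshold of 3 and the up/down behaviour come from the description above.

```go
package main

import "fmt"

type status int

const (
	statusUnknown status = iota // assumed initial state before the first probe
	statusUp
	statusDown
)

func (s status) String() string {
	switch s {
	case statusUp:
		return "up"
	case statusDown:
		return "down"
	}
	return "unknown"
}

// failThreshold is the number of consecutive failures before marking an instance down.
const failThreshold = 3

// probeState holds the health bookkeeping for one instance (hypothetical type).
type probeState struct {
	Status    status
	FailCount int
}

// record applies one probe result and reports whether the status changed,
// so the caller can log only on transitions.
func (p *probeState) record(ok bool) (changed bool) {
	prev := p.Status
	if ok {
		// The first success resets the counter and marks the instance up.
		p.FailCount = 0
		p.Status = statusUp
	} else {
		p.FailCount++
		if p.FailCount >= failThreshold {
			p.Status = statusDown
		}
	}
	return p.Status != prev
}

func main() {
	var p probeState
	results := []bool{true, false, false, false, false, true}
	for i, ok := range results {
		if p.record(ok) {
			fmt.Printf("probe %d: status changed to %s\n", i+1, p.Status)
		}
	}
}
```

With this sequence only probes 1, 4, and 6 produce a log line: the first success, the third consecutive failure, and the recovery.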
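func checkInstance(store *ServiceStore, svcName string, inst *Instance) {
	client := &http.Client{Timeout: 2 * time.Second}
	resp, err := client.Get("http://" + inst.Address + "/health")
	if err != nil {
		// Timeout or connection error: count one more consecutive failure.
		inst.FailCount++
		if inst.FailCount >= 3 && inst.Status != StatusDown {
			inst.Status = StatusDown
		}
		return
	}
	// Close the body so the underlying connection can be reused.
	resp.Body.Close()
	// Any HTTP response counts as success here; the status code is not inspected.
	inst.FailCount = 0
	inst.Status = StatusUp
}
Rate Limiting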
Rate limiting uses a token bucket keyed by client IP and request path. Each bucket starts at 100 tokens and refills at 1 token per second based on elapsed time. Requests that arrive when a bucket is empty are rejected immediately with HTTP 429. No queuing, no waiting.
Keying by IP and endpoint rather than just IP means an abusive client on one route cannot use up the allowance for a different endpoint. The refill logic checks elapsed > 0 before updating the timestamp to prevent token leakage under high-frequency bursts.
The middleware intercepts the request before it reaches any backend, so downstream services never see rejected traffic.
func (rl *RateLimiter) Allow(ip, path string) bool {
	rl.mu.Lock()
	defer rl.mu.Unlock()
	key := ip + ":" + path
	now := time.Now()
	if _, ok := rl.buckets[key]; !ok {
		rl.buckets[key] = &bucket{tokens: maxTokens, last: now}
	}
	b := rl.buckets[key]
	elapsed := now.Sub(b.last).Seconds()
	if elapsed > 0 {
		b.tokens = min(maxTokens, b.tokens+elapsed*refillRate)
		b.last = now
	}
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}
Service Registry
Services are registered and managed entirely through a REST API with no configuration files and no restarts required. A POST to /services/{name} with a JSON array of addresses registers or replaces the full instance list for a named service. DELETE removes it. GET /services lists everything currently registered.
When a new service is registered, the gateway immediately triggers a health check on all its instances before the first real request can be routed to them. This means the system always knows the initial health state of new backends rather than assuming they are healthy and discovering otherwise under live traffic.
# Register two backend instances
curl -X POST http://localhost:8080/services/books \
    -H 'Content-Type: application/json' \
    -d '["localhost:9001", "localhost:9002"]'

# Route a request through the gateway
curl http://localhost:8080/route/books