High‑performance in‑memory HTTP cache & reverse proxy for latency‑sensitive workloads. Built in Go on top of fasthttp, with sharded storage, TinyLFU admission, background refresh, upstream controls, and minimal‑overhead observability (Prometheus + OpenTelemetry).
- Throughput: 160–170k RPS locally; ~250k RPS sustained on 24‑core bare‑metal with a 50GB cache.
- Memory safety: 1.5–3GB overhead at 50GB (no traces); ~7GB at 100% OTEL sampling.
- Hot path discipline: zero allocations, sharded counters, per‑shard LRU, TinyLFU admission.
- Control plane: runtime API for toggles (admission, eviction, refresh, compression, observability).
- Observability: Prometheus/VictoriaMetrics metrics + OpenTelemetry tracing.
- Kubernetes‑friendly: health probes, config via ConfigMap, Docker image.
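The "sharded storage" idea from the feature list can be sketched with a plain sharded map. This is an illustrative sketch, not advCache's actual types (`shardedMap`, `keyOf`, etc. are hypothetical names); it assumes FNV-1a key hashing and a power-of-two shard count so the shard index is a cheap mask:

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sync"
)

const shardCount = 256 // power of two, so we can mask instead of mod

// shard holds one slice of the keyspace behind its own lock,
// so concurrent requests to different shards never contend.
type shard struct {
	mu      sync.RWMutex
	entries map[uint64][]byte
}

type shardedMap struct {
	shards [shardCount]*shard
}

func newShardedMap() *shardedMap {
	m := &shardedMap{}
	for i := range m.shards {
		m.shards[i] = &shard{entries: make(map[uint64][]byte)}
	}
	return m
}

// keyOf hashes a raw request key (e.g. path+query) to a uint64.
func keyOf(raw string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(raw))
	return h.Sum64()
}

func (m *shardedMap) shardFor(key uint64) *shard {
	return m.shards[key&(shardCount-1)] // mask works because shardCount is 2^n
}

func (m *shardedMap) Set(key uint64, val []byte) {
	s := m.shardFor(key)
	s.mu.Lock()
	s.entries[key] = val
	s.mu.Unlock()
}

func (m *shardedMap) Get(key uint64) ([]byte, bool) {
	s := m.shardFor(key)
	s.mu.RLock()
	v, ok := s.entries[key]
	s.mu.RUnlock()
	return v, ok
}

func main() {
	m := newShardedMap()
	k := keyOf("/api/v2/pagedata?project=1")
	m.Set(k, []byte("cached body"))
	if v, ok := m.Get(k); ok {
		fmt.Println(string(v))
	}
}
```

Splitting the lock per shard is what keeps the hot path cheap under concurrency: two goroutines touching different shards take different mutexes.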
Edit the CHANGEME fields and run. This is a complete config based on advcache.cfg.yaml, trimmed for a fast start but fully runnable.
```yaml
cache:
  env: prod
  enabled: true
  logs:
    level: debug
  runtime:
    gomaxprocs: 0
  api:
    name: advCache.local
    port: 8020 # <-- CHANGEME: API port to listen on
  upstream:
    backend:
      id: example-ams-web
      enabled: true
      policy: deny
      host: service-example:8080
      scheme: http
      rate: 15000
      concurrency: 4096
      timeout: 10s
      max_timeout: 1m
      use_max_timeout_header: ''
      healthcheck: /healthcheck
      addr: http://127.0.0.1:8081 # <-- CHANGEME: your upstream origin URL
      health_path: /health
  compression:
    enabled: true
    level: 1
  data:
    dump:
      enabled: true
      dump_dir: public/dump
      dump_name: cache.dump
      crc32_control_sum: true
      max_versions: 3
      gzip: false
    mock:
      enabled: false
      length: 1000000
  storage:
    mode: listing
    size: 53687091200
  admission:
    enabled: true
    capacity: 2000000
    sample_multiplier: 4
    shards: 256
    min_table_len_per_shard: 65536
    door_bits_per_counter: 12
  eviction:
    enabled: true
    replicas: 32
    soft_limit: 0.8
    hard_limit: 0.99
    check_interval: 100ms
  lifetime:
    enabled: true
    ttl: 2h
    on_ttl: refresh
    beta: 0.35
    rate: 1000
    replicas: 32
    coefficient: 0.25
  observability:
    enabled: true
    service_name: advCache.local
    service_version: dev
    service_tenant: star
    exporter: http
    endpoint: 127.0.0.1:4318 # <-- CHANGEME: your OTEL Collector (http/4318 or grpc/4317)
    insecure: true
    sampling_mode: ratio
    sampling_rate: 0.1
    export_batch_size: 512
    export_batch_timeout: 3s
    export_max_queue: 1024
  forceGC:
    enabled: true
    interval: 6m
  metrics:
    enabled: true
  k8s:
    probe:
      timeout: 5s
  rules:
    /api/v2/pagedata:
      cache_key:
        query:
          - project
          - language
          - timezone
        headers:
          - Accept-Encoding
      cache_value:
        headers:
          - Vary
          - Server
          - Content-Type
          - Content-Length
          - Content-Encoding
          - Cache-Control
```
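The `rules` block whitelists which query parameters and headers participate in the cache key, so irrelevant parameters don't fragment the cache. A hedged sketch of how such a key could be derived; the scheme here (FNV-1a over sorted `name=value` pairs) is an assumption for illustration, not necessarily advCache's actual hashing:

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// buildCacheKey hashes the path plus only the whitelisted query
// params and headers into a uint64 (compare /cache/entry?key=<uint64>).
func buildCacheKey(path string, query, headers map[string]string, keyQuery, keyHeaders []string) uint64 {
	parts := []string{path}
	for _, q := range keyQuery {
		if v, ok := query[q]; ok {
			parts = append(parts, "q:"+q+"="+v)
		}
	}
	for _, h := range keyHeaders {
		if v, ok := headers[h]; ok {
			parts = append(parts, "h:"+h+"="+v)
		}
	}
	sort.Strings(parts[1:]) // order-independent key
	h := fnv.New64a()
	for _, p := range parts {
		h.Write([]byte(p))
		h.Write([]byte{0}) // separator avoids ambiguous concatenations
	}
	return h.Sum64()
}

func main() {
	key := buildCacheKey(
		"/api/v2/pagedata",
		map[string]string{"project": "42", "language": "en", "debug": "1"},
		map[string]string{"Accept-Encoding": "gzip"},
		[]string{"project", "language", "timezone"},
		[]string{"Accept-Encoding"},
	)
	fmt.Println(key) // "debug" is ignored: it's not in the whitelist
}
```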
What to change first:

- `cache.api.port` — the port advCache listens on.
- `cache.upstream.backend.addr` — point to your origin.
- `cache.compression.enabled` — enable if latency budget allows (runtime toggle also available).
- `cache.observability.*` — set `enabled: true` and the `endpoint` of your OTEL Collector; adjust sampling.
- `cache.admission.enabled` — `true` to protect the hot set; TinyLFU/Doorkeeper details are in the main config comments.
- `cache.upstream.policy` — both `deny` and `await` are production-ready; choose behavior:
  - `deny` → fail fast under pressure (good for synthetic load / when back-pressure is handled elsewhere).
  - `await` → apply back-pressure (preferred default in many prod setups).
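The deny/await distinction can be illustrated with a plain buffered-channel semaphore capping in-flight upstream requests (a sketch under assumed semantics; advCache's actual limiter may work differently):

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

var errOverloaded = errors.New("upstream concurrency limit reached")

// limiter caps in-flight upstream requests, mirroring
// cache.upstream.backend.concurrency.
type limiter struct {
	slots chan struct{}
}

func newLimiter(n int) *limiter { return &limiter{slots: make(chan struct{}, n)} }

// acquireDeny fails fast when all slots are busy (policy: deny).
func (l *limiter) acquireDeny() error {
	select {
	case l.slots <- struct{}{}:
		return nil
	default:
		return errOverloaded
	}
}

// acquireAwait blocks until a slot frees up or the request
// gives up (policy: await = back-pressure).
func (l *limiter) acquireAwait(ctx context.Context) error {
	select {
	case l.slots <- struct{}{}:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

func (l *limiter) release() { <-l.slots }

func main() {
	l := newLimiter(1)
	_ = l.acquireDeny() // takes the only slot

	fmt.Println(l.acquireDeny()) // deny: fails immediately

	ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancel()
	fmt.Println(l.acquireAwait(ctx)) // await: blocks until the slot frees or the deadline hits

	l.release()
}
```

Under `deny`, overload surfaces instantly as errors a load balancer can route around; under `await`, callers queue and the latency budget (`timeout`/`max_timeout`) decides who gives up.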
Full field descriptions and advanced knobs are documented inline in the canonical `advcache.cfg.yaml`.
- Main: `GET /{any:*}`
- Health: `GET /k8s/probe`
- Metrics: `GET /metrics` (Prometheus/VictoriaMetrics)
- Bypass: `/cache/bypass`, `/on`, `/off`
- Compression: `/cache/http/compression`, `/on`, `/off`
- Config dump: `/cache/config`
- Entry by key: `/cache/entry?key=<uint64>`
- Clear (two‑step): `/cache/clear` → then `/cache/clear?token=<...>`
- Invalidate: `/cache/invalidate` (supports `X-Entries-Remove` for removing entries, and `_path` + any queries)
- Upstream policy: `/cache/upstream/policy`, `/await`, `/deny`
- Evictor: `/cache/eviction`, `/on`, `/off`, `/scale?to=<n>`
- Lifetime manager: `/cache/lifetime-manager`, `/on`, `/off`, `/rate?to=<n>`, `/scale?to=<n>`, `/policy`, `/policy/remove`, `/policy/refresh`
- Force GC: `/cache/force-gc`, `/on`, `/off`, `/call`
- Admission: `/cache/admission`, `/on`, `/off`
- Tracing: `/cache/observability`, `/on`, `/off`
- Spans: `ingress` (server), `upstream` (client on miss/proxy), `refresh` (background).
- When disabled: fast no‑op provider (atomic toggle only).
- When enabled: stdout exporter → sync; OTLP (`grpc`/`http`) → batch exporter.
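The "atomic toggle only" behavior can be sketched with a single atomic flag gating span creation, so a disabled tracer costs one atomic load on the hot path. The names below (`tracingEnabled`, `startSpan`) are illustrative, not advCache's internals:

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// tracingEnabled gates span creation; flipping it is what
// /cache/observability/on|off would do at runtime.
var tracingEnabled atomic.Bool

// startSpan returns a no-op closure when tracing is off, so the
// disabled path pays only one atomic load.
func startSpan(name string) (end func()) {
	if !tracingEnabled.Load() {
		return func() {} // no-op provider
	}
	// Real code would start an OTEL span here; we just log.
	fmt.Println("span start:", name)
	return func() { fmt.Println("span end:", name) }
}

func main() {
	end := startSpan("ingress") // disabled: prints nothing
	end()

	tracingEnabled.Store(true) // like GET /cache/observability/on
	end = startSpan("upstream")
	end()
}
```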
Enable quickly: set in YAML and/or toggle at runtime:

```
GET /cache/observability/on    # enable tracing now
GET /cache/observability/off   # disable tracing
```

```shell
go build -o advCache ./cmd/main.go
./advCache -cfg ./advcache.starter.yaml

# Docker
docker build -t advcache .
docker run --rm -p 8020:8020 -v "$PWD/public/dump:/app/public/dump" advcache -cfg /app/advcache.starter.yaml
```

Remember that the numbers below depend largely on the specifics of your load.
- Local (4–6 CPU, 1–16KB docs, 20–25GB store): 160–170k RPS steady.
- Bare‑metal (24 CPU, 50GB store, prod traffic): ~250k RPS sustained.
- Memory overhead at 50GB: 1.5–3GB (no traces) • ~7GB (100% sampling).
```shell
go test ./...
go test -bench . -benchmem ./...
```

Apache‑2.0 — see LICENSE.
Maintainer: Borislav Glazunov — [email protected] · Telegram @gl_c137