Unleash Edge — Deep Study

High-performance feature flag cache layer · Rust-built · Kubernetes-ready

UNLEASH · EDGE · RUST · FEATURE FLAGS · K8S · SDK INTEGRATION

1. What Is Unleash Edge?

Caching Proxy

Sits between your apps and the Unleash server. Serves feature flag evaluations from local cache.

🦀

Written in Rust

Replaces the old Node.js Unleash Proxy. Small binary (~5MB), sub-millisecond responses from cache, and roughly 10–30MB of RAM.

🔌

3 Modes

Edge (live upstream), Offline (static file), or Hybrid (bootstrap file plus live upstream). Adapts to any infra constraint.

☸️

K8s Native

Deploy as sidecar, DaemonSet, or standalone Deployment. Health probes, metrics, optional TLS.

Core idea: Instead of every app instance hitting the Unleash server to fetch flags, Edge acts as a local cache. Apps call Edge (same cluster, <1ms), and Edge syncs with upstream Unleash (SSE or polling). The Unleash server only sees Edge connections, not thousands of app connections.

Without Edge

All pods call Unleash server directly

100 pods × 5 SDK connections = 500 persistent connections to Unleash.

Unleash server is a bottleneck

Server must handle flag evaluation + polling + metrics from every app instance.

Network latency on flag reads

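Every flag fetch crosses the cluster boundary to the Unleash server. Server-side SDKs evaluate locally after fetching, but frontend clients that rely on server-side evaluation pay the 5–50ms round trip on every refresh.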

With Edge

Apps call local Edge

100 pods call Edge (same node or cluster). Edge = 1 connection to Unleash server.

Sub-millisecond flag evaluation

Edge serves from in-memory cache. Flags are pre-fetched and stored locally.

Continues working if upstream down

Cache serves stale flags if Unleash server is unreachable. Configurable stale TTL.

2. Why Unleash Edge Was Built (History)

📦

Unleash Proxy (old)

Node.js, deprecated

The original proxy was a Node.js app. Single-mode (online only), limited performance, required a separate deployment. Had no built-in metrics or chaining support.

🦀

Unleash Edge (new)

Rust, open source, v3+

Complete rewrite in Rust. Multi-mode, chainable, lower memory, built-in Prometheus metrics, offline mode, streaming support, and K8s health probes out of the box.

🌍

Multi-Region Scaling

The real driver

Companies running Unleash across multiple regions needed a local cache per region that could chain to a central Unleash instance without every app calling across continents.

🔌

Air-Gapped / Offline

GitOps flags

Some environments cannot call external servers (banking, gov, strict networking). Offline mode lets you ship a static flag snapshot as a file — no upstream connection needed.

3. Architecture — How Edge Fits In

Architecture diagram (described):
Unleash Server (:4242): Admin UI + API + PostgreSQL. Exposes /api/client/features and an SSE stream.
K8s Cluster, Region 1: Unleash Edge (:3063, in-memory cache) syncs upstream via SSE or polling and serves Service A (3 pods), Service B (5 pods), and Service C (2 pods).
K8s Cluster, Region 2: Unleash Edge (:3063, in-memory cache) syncs upstream independently and serves Service D (4 pods), Service E (6 pods), and Service F (3 pods).
Edge endpoints: metrics at /internal-backstage/prometheus (Grafana scrapes :3063/internal-backstage/prometheus), health at /health.
Legend: flag requests flow from apps to Edge; upstream sync (SSE stream or polling) flows from Edge to Unleash.
Key insight: Each Edge instance maintains exactly one upstream connection to Unleash. All apps in the cluster share that one connection via Edge's local cache. The Unleash server scales to N regions with just N connections instead of N × pods connections.
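As a rough illustration of that arithmetic, the sketch below counts connections reaching the Unleash server for the two regions in the architecture description. It assumes one upstream connection per pod without Edge and one per Edge instance with it; the function name and parameters are made up for this example.

```python
def upstream_connections(pods_per_region, connections_per_pod=1, use_edge=False):
    """Count connections that reach the Unleash server.

    pods_per_region: list with the pod count of each region/cluster.
    """
    if use_edge:
        # One Edge instance per region, one upstream connection each.
        return len(pods_per_region)
    return sum(pods_per_region) * connections_per_pod


# Pod counts from the architecture: Region 1 = 3 + 5 + 2, Region 2 = 4 + 6 + 3.
regions = [3 + 5 + 2, 4 + 6 + 3]
direct = upstream_connections(regions)
via_edge = upstream_connections(regions, use_edge=True)
print(direct, via_edge)  # 23 2
```

The same function reproduces the earlier "100 pods × 5 SDK connections" figure with `upstream_connections([100], connections_per_pod=5)`.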

4. Run Modes — Edge vs Offline vs Hybrid

🔗

Edge Mode

Online — connects to upstream Unleash or another Edge

Edge connects upstream, fetches all feature flags via SSE (Server-Sent Events) or polling, stores them in memory, and serves SDK requests locally. Supports client tokens and frontend tokens; admin calls are not proxied by Edge. Syncs continuously.

Step 1: Edge starts with --upstream-url

Connects to https://unleash-server:4242 or another Edge instance. Authenticates with an Edge token.

Step 2: Fetches all flags for registered tokens

For each client token that has connected to Edge, it fetches that token's feature flags from upstream. Lazy: fetches only when a client first connects.

Step 3: Maintains SSE connection for updates

Upstream pushes flag changes immediately via Server-Sent Events. Edge updates its cache within seconds of a flag change in the Unleash UI.

Step 4: Serves SDK requests from cache

All /api/client/features and /api/frontend calls are served from the in-memory cache, so no upstream round-trip is needed per request.

# Run Edge in online mode
unleash-edge edge \
  --upstream-url https://unleash.example.com \
  --port 3063 \
  --metrics-interval-seconds 60 \
  --features-refresh-interval-seconds 15

# Or via environment variables (recommended for K8s)
UPSTREAM_URL=https://unleash.example.com
PORT=3063
🔌

Offline Mode

Static JSON file — no upstream connection

Edge reads a pre-baked JSON file containing feature toggle definitions. No internet connection needed. Perfect for air-gapped, GitOps-driven environments, or testing. The JSON can be generated by unleash-edge offline prepare or committed to git.

Step 1: Generate bootstrap file

Run unleash-edge offline prepare --upstream-url ... --output toggles.json to download current state of all flags into a file.

Step 2: Ship file with deployment

Commit toggles.json to git, bake it into a ConfigMap, or mount as a volume. The file IS the flag state.

Step 3: Edge serves from file

Edge reads the file at startup and serves all SDK requests from it. No upstream connection is ever made.

Tradeoff: Flags are frozen at the time the file was generated. To update flags, you must regenerate the file, commit it, and re-deploy Edge (or reload via hot-reload if configured). This makes offline mode a GitOps-style workflow, not a real-time one.
# Step 1: Generate offline bootstrap file
unleash-edge offline prepare \
  --upstream-url https://unleash.example.com \
  --tokens my-client-token,my-frontend-token \
  --output ./toggles.json

# Step 2: Run in offline mode
unleash-edge offline \
  --bootstrap-file ./toggles.json \
  --port 3063 \
  --tokens my-client-token  # must match file
Hybrid mode: Start Edge with a bootstrap file (for immediate serving on boot) AND connect to upstream (to update flags in real-time). Best of both worlds: instant startup without waiting for the first upstream fetch, plus live flag updates.
# Hybrid: bootstrap file + live upstream sync
unleash-edge edge \
  --upstream-url https://unleash.example.com \
  --bootstrap-file ./toggles.json \
  --port 3063

# Startup sequence with hybrid mode:
# t=0ms   → Edge starts, loads toggles.json into cache
# t=0ms   → Edge begins serving SDK requests immediately
# t=500ms → Edge connects upstream, fetches fresh flags
# t=500ms → Cache updated with live data; stale entries replaced
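The hybrid startup sequence can be modelled in a few lines. This is a minimal sketch of the idea, not Edge's implementation: the `HybridCache` class, the `{"features": [...]}` file shape, and the `fetch_upstream` callable are all assumptions made for illustration.

```python
import json
import os
import tempfile


class HybridCache:
    """Serve from a bootstrap file first, then prefer upstream data."""

    def __init__(self, bootstrap_path):
        with open(bootstrap_path, encoding="utf-8") as fh:
            self.features = json.load(fh)["features"]
        self.source = "bootstrap"

    def sync(self, fetch_upstream):
        """Replace the cache with upstream data; keep old data on failure."""
        try:
            fresh = fetch_upstream()
        except OSError:
            # Upstream unreachable: keep serving the last-known state.
            return False
        self.features = fresh
        self.source = "upstream"
        return True


def unreachable():
    raise ConnectionError("upstream down")


# Demo with a throwaway bootstrap file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toggles.json")
    with open(path, "w", encoding="utf-8") as fh:
        json.dump({"features": [{"name": "my-feature", "enabled": False}]}, fh)
    cache = HybridCache(path)

served_at_boot = cache.features[0]["enabled"]      # value from the file
cache.sync(unreachable)                            # outage: cache unchanged
source_after_outage = cache.source
cache.sync(lambda: [{"name": "my-feature", "enabled": True}])
served_after_sync = cache.features[0]["enabled"]   # value from upstream
```

The point of the design is that a failed sync never clears the cache: the worst case is stale data, not missing data.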

Zero cold-start latency

Bootstrap file means Edge can serve flag requests immediately on pod restart — no waiting for upstream sync.

Resilient to upstream outage

If upstream becomes unreachable, Edge continues serving the last-known flag state from cache. No downtime for flag reads.

5. Token Types — Client, Frontend, Edge, Admin

🖥️

Client Token

Server-side SDKs

Used by server-side SDKs (Node.js, Java, Go, Python…). Receives all flag definitions + strategies. SDK evaluates flags locally. Format: project:environment.secret, where * as the project means all projects (for example *:production.secret).

🌐

Frontend Token

Browser / mobile SDKs

Used by browser/React/mobile SDKs. Receives pre-evaluated flag results for a specific context (userId, sessionId). Edge or server does the evaluation. Format: same project:environment.secret shape as a client token, but with a different secret.

🔗

Edge Token

Edge → Unleash auth

Special token that grants Edge access to all projects and environments. Edge uses this to authenticate upstream. Never expose this token to apps — it has full read access.

🔑

Admin Token

Management API

Full access to Unleash admin API. Edge does NOT proxy admin calls — only client/frontend endpoints. Admin tokens bypass Edge entirely and call Unleash server directly.

Token validation: Edge validates client tokens locally after the first upstream validation. It does not call Unleash on every request. Unknown tokens are validated upstream once and then cached. This means a revoked token may still work until Edge's cache expires (configurable).
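To make the revocation window concrete, here is a small model of "validate upstream once, then cache". It is a sketch under assumptions: the class name, the TTL parameter, and the injected clock are invented for illustration and do not describe Edge's internal code or its actual configuration options.

```python
import time


class TokenValidationCache:
    """Validate unknown tokens upstream once, then trust the cached answer."""

    def __init__(self, validate_upstream, ttl_seconds, clock=time.monotonic):
        self._validate = validate_upstream   # callable: token -> bool
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}                   # token -> (is_valid, checked_at)

    def is_valid(self, token: str) -> bool:
        now = self._clock()
        entry = self._entries.get(token)
        if entry is not None and now - entry[1] < self._ttl:
            return entry[0]                  # cached: no upstream call
        valid = self._validate(token)
        self._entries[token] = (valid, now)
        return valid


# Simulated upstream and clock so the revocation window is visible.
valid_upstream = {"*:production.my-client-token"}
now = [0.0]
cache = TokenValidationCache(valid_upstream.__contains__, ttl_seconds=60,
                             clock=lambda: now[0])

first = cache.is_valid("*:production.my-client-token")          # validated upstream
valid_upstream.clear()                                           # token revoked upstream
now[0] = 30.0
during_window = cache.is_valid("*:production.my-client-token")  # still cached as valid
now[0] = 61.0
after_expiry = cache.is_valid("*:production.my-client-token")   # re-validated, rejected
```

Between the revocation and the cache expiry the token keeps working, which is the tradeoff the paragraph describes.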
# Token format reference
# Client token:   project:environment.randomsecret
# Wildcard:       *:production.abc123   (all projects, production environment)
# Frontend token: different secret, same format
# Edge token:     generated via Unleash Admin → Access → Edge tokens

// How apps use tokens with Edge
// Server SDK (Node.js example):
const unleash = initialize({
  url: 'http://unleash-edge:3063/api', // ← point to Edge, not server
  appName: 'my-service',
  customHeaders: { Authorization: '*:production.my-client-token' }
});
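The project:environment.secret shape is easy to split mechanically. The helper below is a hypothetical sketch for illustration only (`Token` and `parse_token` are not part of any Unleash SDK), and it assumes neither the project nor the environment name contains ":" or ".".

```python
from typing import NamedTuple


class Token(NamedTuple):
    project: str       # "*" means all projects
    environment: str
    secret: str


def parse_token(raw: str) -> Token:
    """Split 'project:environment.secret' into its three parts."""
    project, sep, rest = raw.partition(":")
    environment, dot, secret = rest.partition(".")
    if not (sep and dot and project and environment and secret):
        raise ValueError(f"not a project:environment.secret token: {raw!r}")
    return Token(project, environment, secret)


token = parse_token("*:production.abc123")
print(token.project, token.environment)  # * production
```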

6. Data Flow — Request Lifecycle

Request lifecycle diagram (described):
App Pod: the SDK calls /api/client/features on Edge.
Unleash Edge components:
• Token Validator: checks the local cache, goes upstream if the token is unknown
• Feature Cache: HashMap<token, features>
• Metrics Aggregator: batches usage and sends it upstream
Unleash Server endpoints:
• /api/client/features: full flag payload (all strategies)
• /api/client/features/streaming (SSE): pushes changes in real time
• /api/client/metrics: receives batched usage stats
Flow:
① App → Edge: GET /api/client/features (always; sub-ms response from cache)
② Edge → App: cached JSON
③ Edge → Unleash: initial fetch, once per token
④ Unleash → Edge: SSE push of changes for real-time updates
⑤ Edge → Unleash: batched metrics flush
Metrics batching: Apps send flag usage metrics (how often each flag evaluated to enabled or disabled) to Edge. Edge batches these and forwards them to Unleash every N seconds. This prevents thousands of pods from hammering the metrics endpoint.
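A batching aggregator of this kind might look like the sketch below. The payload shape (yes/no counts per flag) and the `MetricsAggregator` name are simplifications chosen for illustration, not Edge's actual wire format.

```python
from collections import Counter


class MetricsAggregator:
    """Accumulate per-flag yes/no counts and flush them in one batch."""

    def __init__(self, send_upstream):
        self._send = send_upstream      # callable taking one dict
        self._counts = Counter()

    def record(self, flag: str, enabled: bool, count: int = 1):
        self._counts[(flag, "yes" if enabled else "no")] += count

    def flush(self):
        """Send everything gathered so far as one payload, then reset."""
        if not self._counts:
            return
        batch = {}
        for (flag, outcome), n in self._counts.items():
            batch.setdefault(flag, {"yes": 0, "no": 0})[outcome] = n
        self._counts.clear()
        self._send(batch)


sent = []
agg = MetricsAggregator(sent.append)
agg.record("new-checkout-flow", True, 40)   # reported by pod A
agg.record("new-checkout-flow", True, 2)    # reported by pod B
agg.record("new-checkout-flow", False, 8)   # reported by pod B
agg.flush()                                 # one upstream call, not three
```

In a real deployment `flush` would be driven by a timer matching the metrics interval.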
Cache warm-up: On first request from a new client token, Edge doesn't have that token's flags yet. It proxies the first request upstream (adds ~100ms), caches the result, then all subsequent requests are served from cache instantly.
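The warm-up behavior is a plain lazy cache keyed by token. The following sketch, with a hypothetical `LazyFeatureCache` class and a stubbed upstream fetch, shows why only the first request pays the upstream cost.

```python
class LazyFeatureCache:
    """Per-token cache that goes upstream only on the first request."""

    def __init__(self, fetch_upstream):
        self._fetch = fetch_upstream    # callable: token -> feature list
        self._by_token = {}
        self.upstream_calls = 0

    def get(self, token: str):
        if token not in self._by_token:
            # Cache miss: the slow path, taken once per token.
            self.upstream_calls += 1
            self._by_token[token] = self._fetch(token)
        return self._by_token[token]


cache = LazyFeatureCache(lambda token: [{"name": "my-feature", "enabled": True}])
for _ in range(1000):
    features = cache.get("*:production.my-client-token")
print(cache.upstream_calls)  # 1
```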

7. API Endpoints Exposed by Edge

Method | Path | Who calls it | Description
GET | /api/client/features | Server-side SDKs | All feature toggle definitions + strategies for client token. SDK evaluates locally.
GET | /api/client/features/streaming | Server-side SDKs (SSE) | SSE stream of flag changes. SDK holds open connection, receives diffs in real-time.
POST | /api/client/metrics | Server-side SDKs | SDK sends flag usage counts. Edge batches and forwards to Unleash.
POST | /api/client/register | Server-side SDKs | SDK registration (app name, strategies). Edge stores and forwards to Unleash.
GET | /api/frontend | Frontend / mobile SDKs | Pre-evaluated toggles for a context (userId, sessionId). Returns enabled/disabled per flag.
POST | /api/frontend/client/metrics | Frontend SDKs | Frontend impression tracking. Batched and forwarded.
GET | /api/proxy | Old Proxy clients | Legacy Unleash Proxy compatibility endpoint. Same as /api/frontend.
GET | /health | K8s liveness probe | Returns 200 OK when Edge is running. Does NOT check upstream connectivity.
GET | /ready | K8s readiness probe | Returns 200 OK only when cache is populated and Edge is ready to serve traffic.
GET | /internal-backstage/prometheus | Prometheus / Grafana | Prometheus metrics: request count, cache hit/miss, upstream latency, memory.
GET | /internal-backstage/tokens | Debug / admin | Lists tokens Edge knows about. Requires edge token auth. Debug use only.
Note: /health always returns 200 even if upstream is down. Use /ready for readiness probes — it returns 503 until the cache is populated. This distinction is critical for K8s to not route traffic before Edge has flags.
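The difference between the two probes fits in one small decision function. This only models the behavior stated in the note; `probe_status` and its arguments are made-up names, not Edge code.

```python
def probe_status(path: str, cache_populated: bool) -> int:
    """Return the HTTP status a probe would see for the given path."""
    if path == "/health":
        return 200                       # process is up; upstream not checked
    if path == "/ready":
        return 200 if cache_populated else 503
    return 404


# A freshly started pod with an empty cache is alive but not ready.
print(probe_status("/health", cache_populated=False))  # 200
print(probe_status("/ready", cache_populated=False))   # 503
```

Pointing the liveness probe at /ready would restart pods that are merely warming up, which is why the two probes are kept on separate paths.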

8. SDK Integration — Connecting Apps to Edge

// Node.js server-side SDK → Edge
import { initialize } from 'unleash-client';

const unleash = initialize({
  url: 'http://unleash-edge.unleash.svc.cluster.local:3063/api',
  appName: 'my-service',
  environment: 'production',
  customHeaders: { Authorization: process.env.UNLEASH_CLIENT_TOKEN },
  // SDK fetches flags from Edge, evaluates locally
  // refreshInterval: how often SDK re-fetches (Edge handles sync)
  // Note: the Node.js SDK takes both intervals in milliseconds
  refreshInterval: 15_000, // 15 seconds
  metricsInterval: 60_000, // send usage metrics every 60s
});

// Evaluate a flag
unleash.on('synchronized', () => {
  const enabled = unleash.isEnabled('my-feature', {
    userId: 'user-123',
    sessionId: 'sess-456',
    properties: { region: 'asia' },
  });
});
// React frontend SDK → Edge /api/frontend
import { FlagProvider, useFlag } from '@unleash/proxy-client-react';

const config = {
  url: 'https://edge.example.com/api/frontend',
  clientKey: process.env.REACT_APP_UNLEASH_FRONTEND_TOKEN,
  appName: 'web-app',
  context: {
    userId: currentUser.id,
    properties: { plan: currentUser.plan },
  },
  refreshInterval: 30, // seconds
};

// Wrap app
function App() {
  return (
    <FlagProvider config={config}>
      <MyApp />
    </FlagProvider>
  );
}

// Use a flag anywhere in the tree
function NewCheckout() {
  const enabled = useFlag('new-checkout-flow');
  return enabled ? <NewFlow /> : <OldFlow />;
}
Frontend SDK difference: The React SDK calls /api/frontend — Edge evaluates the flags server-side using the provided context and returns only { enabled: true/false } per flag. The SDK never sees strategy definitions. This is safer for browsers (strategies may contain PII-adjacent logic).
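A stripped-down model of that server-side evaluation is sketched below. It handles only the `default` and `userWithId` strategies and uses a simplified feature shape, so treat it as an illustration of the idea (strategies in, bare results out) rather than Edge's evaluation engine.

```python
def _strategy_matches(strategy, context):
    if strategy["name"] == "default":
        return True
    if strategy["name"] == "userWithId":
        allowed = [u.strip() for u in strategy["parameters"]["userIds"].split(",")]
        return context.get("userId") in allowed
    return False  # strategies this sketch does not know never match


def evaluate_for_frontend(features, context):
    """Evaluate each flag for one context; return only name and result."""
    results = []
    for feature in features:
        strategies = feature.get("strategies", [])
        enabled = feature["enabled"] and (
            not strategies or any(_strategy_matches(s, context) for s in strategies)
        )
        results.append({"name": feature["name"], "enabled": enabled})
    return results


features = [
    {
        "name": "new-checkout-flow",
        "enabled": True,
        "strategies": [
            {"name": "userWithId", "parameters": {"userIds": "user-123, user-999"}}
        ],
    },
    {"name": "dark-mode", "enabled": True, "strategies": [{"name": "default"}]},
]
payload = evaluate_for_frontend(features, {"userId": "user-123"})
```

The returned payload carries no strategy definitions, so the list of targeted user IDs never reaches the browser.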
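// Go SDK → Edge
package main

import (
	"log"
	"net/http"
	"os"
	"time"

	"github.com/Unleash/unleash-client-go/v3"
	"github.com/Unleash/unleash-client-go/v3/context"
)

func main() {
	err := unleash.Initialize(
		unleash.WithUrl("http://unleash-edge:3063/api"),
		unleash.WithAppName("go-service"),
		unleash.WithCustomHeaders(http.Header{
			"Authorization": []string{os.Getenv("UNLEASH_CLIENT_TOKEN")},
		}),
		unleash.WithRefreshInterval(15*time.Second),
		unleash.WithMetricsInterval(60*time.Second),
	)
	if err != nil {
		log.Fatalf("unleash init failed: %v", err)
	}

	// The SDK's Context type lives in its own context subpackage
	enabled := unleash.IsEnabled("my-feature",
		unleash.WithContext(context.Context{UserId: "user-123"}),
	)
	log.Printf("my-feature enabled: %v", enabled)
}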
// Java Spring Boot → Edge
UnleashConfig config = UnleashConfig.newBuilder()
    .appName("java-service")
    .unleashAPI("http://unleash-edge:3063/api")
    .customHttpHeader("Authorization", System.getenv("UNLEASH_CLIENT_TOKEN"))
    .fetchTogglesInterval(15) // seconds
    .sendMetricsInterval(60)
    .build();

Unleash unleash = new DefaultUnleash(config);

boolean enabled = unleash.isEnabled("my-feature",
    new UnleashContext.Builder()
        .userId("user-123")
        .build()
);

9. Kubernetes Deployment Patterns

Recommended for most cases: Deploy Edge as a separate Deployment (2–3 replicas) with a ClusterIP Service. All pods in the cluster call Edge via Service DNS. Simple, easy to scale, update independently.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: unleash-edge
  namespace: unleash
spec:
  replicas: 2  # HA: 2+ replicas
  selector:
    matchLabels: { app: unleash-edge }
  template:
    metadata:
      labels: { app: unleash-edge }
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/path: "/internal-backstage/prometheus"
        prometheus.io/port: "3063"
    spec:
      containers:
        - name: unleash-edge
          image: unleashorg/unleash-edge:latest
          args: ["edge"]
          ports:
            - containerPort: 3063
          env:
            - name: UPSTREAM_URL
              value: "http://unleash-server.unleash.svc:4242"
            - name: METRICS_INTERVAL_SECONDS
              value: "60"
            - name: FEATURES_REFRESH_INTERVAL_SECONDS
              value: "15"
          livenessProbe:
            httpGet: { path: /health, port: 3063 }
            initialDelaySeconds: 5
            periodSeconds: 10
          readinessProbe:
            httpGet: { path: /ready, port: 3063 }
            initialDelaySeconds: 2
            periodSeconds: 5
            failureThreshold: 3
          resources:
            requests: { cpu: 50m, memory: 64Mi }
            limits: { cpu: 200m, memory: 256Mi }
---
apiVersion: v1
kind: Service
metadata:
  name: unleash-edge
  namespace: unleash
spec:
  selector: { app: unleash-edge }
  ports:
    - port: 3063
      targetPort: 3063
Sidecar pattern: Run Edge as a sidecar container in every app pod. Apps call localhost:3063. Ultra-low latency (loopback). Tradeoff: one Edge process per pod, so higher total memory usage.
# App pod spec: Edge as sidecar
spec:
  containers:
    - name: my-app
      image: my-app:latest
      env:
        - name: UNLEASH_URL
          value: "http://localhost:3063/api"  # ← loopback!
    - name: unleash-edge
      image: unleashorg/unleash-edge:latest
      args: ["edge"]
      env:
        - name: UPSTREAM_URL
          value: "http://unleash-server:4242"
        - name: PORT
          value: "3063"
      resources:
        requests: { cpu: 10m, memory: 32Mi }
        limits: { cpu: 50m, memory: 64Mi }
      readinessProbe:
        httpGet: { path: /ready, port: 3063 }
Best for: Low-latency flag evaluation on the critical path (e.g., evaluating flags on every HTTP request). With sidecar, flag reads are loopback calls — essentially free in terms of network overhead.
DaemonSet pattern: One Edge pod per node. Apps call Edge via the node's internal IP or a hostPort. Middle ground: fewer total Edge processes than sidecar, lower latency than a central Deployment (same node = no inter-node traffic).
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: unleash-edge
spec:
  selector:
    matchLabels: { app: unleash-edge }
  template:
    metadata:
      labels: { app: unleash-edge }  # required: must match spec.selector
    spec:
      containers:
        - name: unleash-edge
          image: unleashorg/unleash-edge:latest
          args: ["edge"]
          ports:
            - containerPort: 3063
              hostPort: 3063  # accessible via node IP
          env:
            - name: UPSTREAM_URL
              value: "http://unleash-server:4242"

# Apps connect via downward API to get node IP (goes in the app container spec):
env:
  - name: NODE_IP
    valueFrom:
      fieldRef: { fieldPath: status.hostIP }
  - name: UNLEASH_URL
    value: "http://$(NODE_IP):3063/api"
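To compare the three patterns on footprint, a back-of-the-envelope calculation helps. The cluster size below (100 app pods on 10 nodes) is hypothetical, the per-instance figures for Deployment and sidecar are the memory requests from the manifests in this section, and the 64 MiB used for the DaemonSet is an assumption because that manifest sets no resources.

```python
def edge_instances(pattern: str, *, app_pods: int, nodes: int, replicas: int = 2) -> int:
    """Number of Edge processes each deployment pattern runs."""
    counts = {"deployment": replicas, "daemonset": nodes, "sidecar": app_pods}
    return counts[pattern]


def total_memory_request_mib(pattern: str, per_instance_mib: int, **cluster) -> int:
    """Total memory requested by all Edge instances under a pattern."""
    return edge_instances(pattern, **cluster) * per_instance_mib


cluster = {"app_pods": 100, "nodes": 10}   # hypothetical cluster size
deployment_mib = total_memory_request_mib("deployment", 64, **cluster)
daemonset_mib = total_memory_request_mib("daemonset", 64, **cluster)
sidecar_mib = total_memory_request_mib("sidecar", 32, **cluster)
print(deployment_mib, daemonset_mib, sidecar_mib)  # 128 640 3200
```

Each Edge instance also holds its own upstream sync, so the instance count is roughly the number of connections the upstream has to serve.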

10. Edge Chaining — Multi-Region & Hierarchical

Chaining: An Edge instance can use another Edge instance as its upstream instead of the Unleash server. This enables hierarchical topologies: central Edge → regional Edge → local Edge. Each hop caches independently.
Chaining topology (described):
Unleash Server (:4242)
  ↓ SSE
Central Edge (HQ): --upstream-url=unleash-server
  ↓
• Regional Edge EU: --upstream-url=central-edge ← EU apps connect here
• Regional Edge US: --upstream-url=central-edge ← US apps connect here
• Regional Edge APAC: --upstream-url=central-edge ← APAC apps connect here
Unleash server: 1 connection (from Central Edge). Regional edges: 3 connections (to Central Edge).
Benefits of chaining:
• Unleash server sees only 1 upstream connection (Central Edge)
• Regional Edges serve traffic with lowest possible latency (local)
• Central Edge is the single sync point — no thundering herd
• Any regional Edge can be taken down without affecting others
Propagation lag with chaining:
Unleash → Central Edge: SSE (near-instant)
Central → Regional: polling interval (e.g. 15s)
Total lag from flag change to regional app: up to 15–30s (up to 15s for the polling hop, plus up to 15s more if the app's SDK also refreshes every 15s)
Reduce by shortening --features-refresh-interval-seconds
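The worst-case lag is the sum of the polling intervals along the chain. The helper below is a rough sketch under the assumptions that SSE hops are effectively instant and that network and processing time are negligible; the function name is made up.

```python
def worst_case_lag_seconds(polling_hops, sdk_refresh_interval=0):
    """Upper bound on propagation delay, ignoring network and processing time.

    polling_hops: refresh interval in seconds for each hop that polls.
    Streaming (SSE) hops are treated as instant and left out of the list.
    """
    return sum(polling_hops) + sdk_refresh_interval


edge_only = worst_case_lag_seconds([15])                          # Central → Regional
with_sdk = worst_case_lag_seconds([15], sdk_refresh_interval=15)  # plus SDK polling
print(edge_only, with_sdk)  # 15 30
```

Adding another polling tier adds another term, so deep chains benefit most from shorter refresh intervals.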

11. Metrics & Observability

📊

Prometheus Endpoint

/internal-backstage/prometheus

Built-in Prometheus metrics. No extra exporter needed. Scrape directly with a ServiceMonitor or annotation.

📈

Key Metrics

What to alert on

edge_request_duration_seconds, edge_upstream_latency, edge_cache_hit_total, edge_upstream_errors_total, edge_tokens_validated_total.

🔔

Flag Usage Metrics

Forwarded to Unleash

Edge batches SDK-reported flag evaluations (impression data) and POSTs them to Unleash every interval. Unleash uses this for flag adoption dashboards.

🔍

Debug Endpoints

/internal-backstage/*

/internal-backstage/tokens shows cached tokens. /internal-backstage/features/{token} dumps cached flags for a token. Restrict access in prod.

# Prometheus ServiceMonitor (if using kube-prometheus-stack)
# Assumes the unleash-edge Service carries the label app: unleash-edge
# and names its 3063 port "http"; a ServiceMonitor selects Services by
# label and endpoints by port name.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: unleash-edge
spec:
  selector:
    matchLabels: { app: unleash-edge }
  endpoints:
    - port: http
      path: /internal-backstage/prometheus
      interval: 30s

# Key alerts to configure:
# 1. edge_upstream_errors_total rate > 0 → upstream connection broken
# 2. edge_cache_hit_total / total requests < 0.95 → unexpected cache misses
# 3. edge_request_duration_seconds p99 > 10ms → latency regression
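The three alert conditions can be expressed as plain threshold checks, which is handy for sanity-testing thresholds before writing real alerting rules. The function and alert names below are illustrative; in practice the inputs would come from Prometheus queries over the metrics listed above.

```python
def firing_alerts(upstream_error_rate, cache_hits, total_requests, p99_seconds):
    """Return the names of the alert conditions that would fire."""
    alerts = []
    if upstream_error_rate > 0:
        alerts.append("upstream_errors")
    if total_requests and cache_hits / total_requests < 0.95:
        alerts.append("low_cache_hit_ratio")
    if p99_seconds > 0.010:          # 10 ms expressed in seconds
        alerts.append("high_latency")
    return alerts


healthy = firing_alerts(0.0, 9_900, 10_000, 0.002)
degraded = firing_alerts(0.5, 9_000, 10_000, 0.025)
print(healthy)   # []
print(degraded)  # ['upstream_errors', 'low_cache_hit_ratio', 'high_latency']
```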

12. Offline Mode / GitOps Workflow

Step 1: Developer changes flag in Unleash UI

Flag state updated in Unleash server. All online Edge instances pick it up within seconds via SSE.

Step 2: CI/CD pipeline runs offline prepare

A CI job runs unleash-edge offline prepare --output toggles.json to snapshot current flag state. Commits the file to git (or artifact store).

Step 3: File mounted as ConfigMap in K8s

toggles.json stored as a ConfigMap, mounted into the Edge pod at /flags/toggles.json.

Step 4: Edge serves from file, no upstream needed

Air-gapped clusters, banking environments, strict egress rules: no problem. Edge runs fully isolated.

Step 5: Flag update = commit + redeploy

To update flags, regenerate the file, commit it, update the ConfigMap, rolling restart Edge. This is the GitOps cadence: flag changes are code-reviewed and auditable.

# K8s ConfigMap from file
kubectl create configmap unleash-flags \
  --from-file=toggles.json=./toggles.json \
  -n my-namespace

# Mount in pod
volumes:
  - name: unleash-flags
    configMap: { name: unleash-flags }
volumeMounts:
  - name: unleash-flags
    mountPath: /flags

# Run Edge in offline mode
# $(CLIENT_TOKEN) is expanded only if CLIENT_TOKEN is defined in the container's env
args: ["offline", "--bootstrap-file", "/flags/toggles.json", "--tokens", "$(CLIENT_TOKEN)"]
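One way to decide whether a regenerated snapshot actually requires a redeploy is to compare content hashes. This is an added technique, not something Edge provides: the sketch hashes a canonical JSON form so that key order and whitespace do not trigger a rollout, and the function names and snapshot shape are assumptions.

```python
import hashlib
import json


def snapshot_checksum(snapshot: dict) -> str:
    """Hash a flag snapshot independent of key order and whitespace."""
    canonical = json.dumps(snapshot, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def needs_redeploy(deployed: dict, regenerated: dict) -> bool:
    """True when the regenerated snapshot differs in content."""
    return snapshot_checksum(deployed) != snapshot_checksum(regenerated)


old = {"features": [{"name": "my-feature", "enabled": False}]}
same_reordered = {"features": [{"enabled": False, "name": "my-feature"}]}
flipped = {"features": [{"name": "my-feature", "enabled": True}]}

print(needs_redeploy(old, same_reordered))  # False
print(needs_redeploy(old, flipped))         # True
```

A CI job could write the checksum into a pod template annotation so that a changed snapshot triggers the rolling restart automatically.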

13. Security Considerations

🔒

Tokens are scoped per app and environment

Apps supply their own token in the Authorization header. Edge validates it upstream once, caches the validation. Apps can only access their own environment's flags.

🔒

Edge token is privileged — protect it

The Edge token (used by Edge to authenticate to Unleash) has full read access. Store in K8s Secret, inject via env. Never expose to apps.

🔒

Restrict /internal-backstage in prod

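These debug endpoints expose cached tokens and flag data. They share port 3063 with the app-facing API, and a NetworkPolicy cannot filter by URL path, so block /internal-backstage/* at the ingress or service mesh layer and let only Prometheus reach it.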

🔒

TLS between apps and Edge

Use a service mesh (Istio/Linkerd) for mTLS, or enable Edge's built-in TLS with a server certificate and key. In-cluster traffic should be encrypted.

# K8s Secret for Edge token
apiVersion: v1
kind: Secret
metadata:
  name: unleash-edge-token
type: Opaque
stringData:
  token: "*:*.edge-token-secret-here"
---
# NetworkPolicy: allow pods to reach Edge on port 3063.
# NetworkPolicy filters by pod, namespace, and port, not by URL path,
# so it cannot separate /api/* from /internal-backstage/*.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: unleash-edge-restrict
spec:
  podSelector:
    matchLabels: { app: unleash-edge }
  ingress:
    - from:
        # an empty podSelector matches all pods in this policy's namespace
        - podSelector: {}
      ports:
        - port: 3063

14. Edge vs Old Unleash Proxy

Feature | Old Proxy (Node.js) | Edge (Rust)
Language | Node.js | Rust
Memory usage | ~120MB typical | ~10–30MB
Startup time | 2–5s (JIT warm-up) | <100ms
Run modes | Online only | Edge + Offline + Hybrid
Upstream types | Unleash server only | Unleash server or another Edge
Built-in metrics | No | Prometheus native
Health/Ready probes | Manual | /health + /ready built-in
Token validation | Always upstream | Cached locally after first validation
Offline mode | Not supported | Full offline from JSON file
Status | Deprecated | Active (current)

15. Production Readiness Checklist

Deployment

2+ Edge replicas (HA)

At least 2 Edge pods behind a Service. PodDisruptionBudget minAvailable=1.

Readiness probe on /ready

Not /health. Edge is only added to Service endpoints after cache is populated.

Bootstrap file for fast startup

Mount a recent toggles.json snapshot. Edge serves immediately on pod restart.

Resource limits set

Edge is Rust: 50–200m CPU, 64–256Mi RAM is typically sufficient for thousands of flag requests/s.

topologySpreadConstraints

Spread Edge replicas across zones. Zone failure should not take out all Edge instances.

Operations

Prometheus alerts configured

Alert on: upstream errors, high latency (p99 > 10ms), low cache hit rate.

Edge token in K8s Secret

Never hardcode the Edge token. Rotate via Secret + rollout restart.

Restrict /internal-backstage access

NetworkPolicy or ingress rules to prevent app-level access to debug endpoints.

SDK points to Edge, not Unleash server

Every service's Unleash SDK URL must be Edge's Service DNS name. Audit with grep.

Validate flag propagation latency

Test that a disabled flag reaches all apps within your SLA window (default: seconds for online Edge).

# Quick audit: find services still pointing directly to Unleash server
grep -r "unleash-server\|:4242" ./k8s/ ./src/ \
  | grep -v "unleash-edge"
# Any hit is a candidate for a service bypassing Edge
# (Edge's own UPSTREAM_URL line can also match; ignore that one)

# Test Edge is serving correctly
curl -s -H "Authorization: *:production.my-token" \
  http://unleash-edge:3063/api/client/features | jq '.features | length'

# Check Edge is synced (upstream healthy)
curl -s http://unleash-edge:3063/internal-backstage/prometheus \
  | grep edge_upstream_errors
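For the propagation check in the list above, a small polling helper can time how long a flag change takes to become visible to an app. This is a sketch: `read_flag` stands in for whatever reads the flag in your service (for example an SDK `isEnabled` call), and the clock and sleep are injectable only so the example can run without waiting.

```python
import time


def wait_for_flag(read_flag, expected, timeout=30.0, interval=0.5,
                  clock=time.monotonic, sleep=time.sleep):
    """Poll read_flag() until it returns `expected`; return elapsed seconds.

    Returns None if the value is not seen before the timeout.
    """
    start = clock()
    while True:
        if read_flag() == expected:
            return clock() - start
        if clock() - start >= timeout:
            return None
        sleep(interval)


# Simulated app whose SDK picks up the change on the fourth read.
reads = iter([True, True, True, False])
fake_now = [0.0]


def fake_sleep(seconds):
    fake_now[0] += seconds


elapsed = wait_for_flag(lambda: next(reads), expected=False,
                        clock=lambda: fake_now[0], sleep=fake_sleep)
print(elapsed)  # 1.5
```

Run against a real service, the returned value can be compared directly with the SLA window.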