Unleash Edge — Deep Study: Architecture, Modes, K8s & SDK Integration

1. What Is Unleash Edge?

⚡

Caching Proxy

Sits between your apps and the Unleash server. Serves feature flag evaluations from local cache.

🦀

Written in Rust

Replaces the old Node.js Unleash Proxy. Small single binary, sub-millisecond responses from cache, and roughly 10–30MB of RAM (see the comparison in section 14).

🔌

3 Modes

Edge (live upstream), Offline (static file), or combined. Adapts to any infra constraint.

☸️

K8s Native

Deploy as sidecar, DaemonSet, or standalone Deployment. Health probes, Prometheus metrics, and optional built-in TLS.

Core idea: Instead of every app instance fetching flag data from the Unleash server, Edge acts as a local cache. Apps call Edge (same cluster, typically <1ms), and Edge syncs with upstream Unleash (SSE streaming or polling). The Unleash server only sees Edge connections, not thousands of app connections.

Without Edge

✗

All pods call Unleash server directly

100 pods × 5 SDK connections = 500 persistent connections to Unleash.

✗

Unleash server is a bottleneck

Server must handle flag evaluation + polling + metrics from every app instance.

✗

Network latency on flag reads

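Each flag fetch crosses the cluster boundary to the Unleash server, adding 5–50ms per round trip. Server-side SDKs evaluate in-process once they hold flag definitions, but frontend SDKs pay this cost on every refresh.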

With Edge

✓

Apps call local Edge

100 pods call Edge (same node or cluster). Edge = 1 connection to Unleash server.

✓

Sub-millisecond flag evaluation

Edge serves from in-memory cache. Flags are pre-fetched and stored locally.

✓

Continues working if upstream down

Cache serves stale flags if Unleash server is unreachable. Configurable stale TTL.

2. Why Unleash Edge Was Built (History)

📦

Unleash Proxy (old)

Node.js, deprecated

The original proxy was a Node.js app. Single-mode (online only), limited performance, required a separate deployment. Had no built-in metrics or chaining support.

🦀

Unleash Edge (new)

Rust, open source, v3+

Complete rewrite in Rust. Multi-mode, chainable, lower memory, built-in Prometheus metrics, offline mode, streaming support, and K8s health probes out of the box.

🌍

Multi-Region Scaling

The real driver

Companies running Unleash across multiple regions needed a local cache per region that could chain to a central Unleash instance without every app calling across continents.

🔌

Air-Gapped / Offline

GitOps flags

Some environments cannot call external servers (banking, gov, strict networking). Offline mode lets you ship a static flag snapshot as a file — no upstream connection needed.

3. Architecture — How Edge Fits In

Key insight: Each Edge instance maintains exactly one upstream connection to Unleash. All apps in the cluster share that one connection via Edge's local cache. The Unleash server scales to N regions with just N connections instead of N × pods connections.
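The arithmetic behind this claim can be sketched in a few lines of JavaScript. The function name and its parameters (`regions`, `podsPerRegion`, `edgePerRegion`) are illustrative assumptions, not part of Edge or any SDK.

```javascript
// Hypothetical helper: how many clients the Unleash server must serve.
function upstreamConnections({ regions, podsPerRegion, edgePerRegion }) {
  return {
    withoutEdge: regions * podsPerRegion, // every pod talks to Unleash
    withEdge: regions * edgePerRegion,    // only Edge instances talk to Unleash
  };
}

const estimate = upstreamConnections({ regions: 3, podsPerRegion: 100, edgePerRegion: 1 });
console.log(estimate); // { withoutEdge: 300, withEdge: 3 }
```

Running two Edge replicas per region for availability doubles `withEdge`, which is still far below the per-pod figure.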

4. Run Modes — Edge vs Offline vs Hybrid

🔗

Edge Mode

Online — connects to upstream Unleash or another Edge

Edge connects upstream, fetches feature flags via SSE (Server-Sent Events) or polling, stores them in memory, and serves SDK requests locally. Supports client tokens and frontend tokens; admin API calls are not proxied (see section 5). Syncs continuously.

Edge starts with --upstream-url

Connects to https://unleash-server:4242 or another Edge instance. Authenticates with an Edge token.

Fetches all flags for registered tokens

For each client token that has connected to Edge, it fetches that token's feature flags from upstream. Lazy: fetches only when a client first connects.

Maintains SSE connection for updates

Upstream pushes flag changes immediately via Server-Sent Events. Edge updates its cache within seconds of a flag change in the Unleash UI.

Serves SDK requests from cache

All /api/client/features and /api/frontend calls are served from the in-memory cache — no upstream round-trip needed per request.

# Run Edge in online mode
unleash-edge edge \
  --upstream-url https://unleash.example.com \
  --port 3063 \
  --metrics-interval-seconds 60 \
  --features-refresh-interval-seconds 15

# Or via environment variables (recommended for K8s)
UPSTREAM_URL=https://unleash.example.com
PORT=3063
    

🔌

Offline Mode

Static JSON file — no upstream connection

Edge reads a pre-baked JSON file containing feature toggle definitions. No upstream connection is needed, which suits air-gapped or GitOps-driven environments and testing. The JSON has the same shape as the upstream /api/client/features response, so it can be exported from a running Unleash instance and committed to git.

Generate bootstrap file

Export the current state of all flags to toggles.json, for example by saving the upstream /api/client/features response for the client token you plan to serve.

Ship file with deployment

Commit toggles.json to git, bake it into a ConfigMap, or mount as a volume. The file IS the flag state.

Edge serves from file

Edge reads the file at startup and serves all SDK requests from it. No upstream connection is ever made.

Tradeoff: Flags are frozen at the time the file was generated. To update flags, you must regenerate the file, commit it, and re-deploy Edge (or reload via hot-reload if configured). This makes offline mode a GitOps-style workflow, not a real-time one.

# Step 1: Generate offline bootstrap file
# (saves the upstream /api/client/features response; run where Unleash is reachable)
curl -s -H "Authorization: my-client-token" \
  https://unleash.example.com/api/client/features \
  -o ./toggles.json

# Step 2: Run in offline mode
unleash-edge offline \
  --bootstrap-file ./toggles.json \
  --port 3063 \
  --tokens my-client-token  # must match file
    

Hybrid mode: Start Edge with a bootstrap file (for immediate serving on boot) AND connect to upstream (to update flags in real-time). Best of both worlds: instant startup without waiting for the first upstream fetch, plus live flag updates.

# Hybrid: bootstrap file + live upstream sync
unleash-edge edge \
  --upstream-url    https://unleash.example.com \
  --bootstrap-file  ./toggles.json \
  --port            3063

# Startup sequence with hybrid mode:
# t=0ms  → Edge starts, loads toggles.json into cache
# t=0ms  → Edge begins serving SDK requests immediately
# t=500ms → Edge connects upstream, fetches fresh flags
# t=500ms → Cache updated with live data; stale entries replaced
    

✓

Zero cold-start latency

Bootstrap file means Edge can serve flag requests immediately on pod restart — no waiting for upstream sync.

✓

Resilient to upstream outage

If upstream becomes unreachable, Edge continues serving the last-known flag state from cache. No downtime for flag reads.
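The two properties above (serve from the bootstrap immediately, keep the last-known state when upstream fails) can be illustrated with a minimal sketch. This is not Edge's implementation: `createFlagCache` and the caller-supplied `fetchUpstream` function are hypothetical names used only for illustration.

```javascript
// Sketch only: `fetchUpstream` is a stand-in supplied by the caller, not an Edge API.
function createFlagCache(bootstrapFlags) {
  let flags = bootstrapFlags; // available to readers immediately at startup
  let stale = false;
  return {
    refresh(fetchUpstream) {
      try {
        flags = fetchUpstream(); // replace bootstrap data with live data
        stale = false;
      } catch (err) {
        stale = true;            // upstream unreachable: keep last-known flags
      }
    },
    get: () => flags,
    isStale: () => stale,
  };
}
```

A caller would invoke `refresh` on a timer; readers only ever call `get`, so a failed refresh never blocks or empties the cache.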

5. Token Types — Client, Frontend, Edge, Admin

🖥️

Client Token

Server-side SDKs

Used by server-side SDKs (Node.js, Java, Go, Python…). Receives all flag definitions + strategies. The SDK evaluates flags locally. Format: project:environment.secret, where * as the project means all projects.

🌐

Frontend Token

Browser / mobile SDKs

Used by browser/React/mobile SDKs. Receives pre-evaluated flag results for a specific context (userId, sessionId). Edge or the server does the evaluation. Format: the same project:environment.secret shape as client tokens, but issued as a separate frontend-type token.

🔗

Edge Token

Edge → Unleash auth

Special token that grants Edge access to all projects and environments. Edge uses this to authenticate upstream. Never expose this token to apps — it has full read access.

🔑

Admin Token

Management API

Full access to Unleash admin API. Edge does NOT proxy admin calls — only client/frontend endpoints. Admin tokens bypass Edge entirely and call Unleash server directly.

Token validation: Edge validates client tokens locally after the first upstream validation. It does not call Unleash on every request. Unknown tokens are validated upstream once and then cached. This means a revoked token may still work until Edge's cache expires (configurable).

# Token format reference
# Client token:   project:environment.randomsecret
# Wildcard:       *:production.abc123
#                 (all projects, production environment)

# Frontend token: different secret, same format
# Edge token:     generated via Unleash Admin → Access → Edge tokens

// How apps use tokens with Edge:
// Server SDK (Node.js example):
const { initialize } = require('unleash-client');

const unleash = initialize({
  url: 'http://unleash-edge:3063/api',   // point to Edge, not the Unleash server
  appName: 'my-service',
  customHeaders: { Authorization: '*:production.my-client-token' }
});
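To make the token format in the reference above concrete, here is a small parsing sketch. `parseToken` is a hypothetical helper written for this document; it only splits the `project:environment.secret` string and does not validate the token against Unleash.

```javascript
// Hypothetical helper: splits "project:environment.secret" into its parts.
function parseToken(token) {
  const colon = token.indexOf(":");
  const dot = token.indexOf(".", colon + 1);
  // Require a non-empty project, environment, and secret.
  if (colon < 1 || dot < colon + 2 || dot === token.length - 1) {
    throw new Error("expected project:environment.secret");
  }
  const project = token.slice(0, colon);
  return {
    project,
    environment: token.slice(colon + 1, dot),
    secret: token.slice(dot + 1),
    allProjects: project === "*", // wildcard token
  };
}

console.log(parseToken("*:production.abc123").environment); // production
```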
  

6. Data Flow — Request Lifecycle

Metrics batching: Apps send flag usage metrics (which flags were evaluated and how often each resolved to enabled or disabled) to Edge. Edge batches these and forwards them to Unleash every N seconds. This prevents thousands of pods from hammering the metrics endpoint.
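A minimal sketch of the batching idea follows. It assumes each SDK report is a map of flag name to `{ yes, no }` counts; that shape and the `createMetricsBatcher` name are assumptions for illustration, not Edge's internal data model.

```javascript
// Sketch: merge many small SDK reports into one batch per flush.
function createMetricsBatcher() {
  let pending = new Map();
  return {
    // report: { flagName: { yes: number, no: number } } (assumed shape)
    record(report) {
      for (const [flag, counts] of Object.entries(report)) {
        const total = pending.get(flag) ?? { yes: 0, no: 0 };
        total.yes += counts.yes;
        total.no += counts.no;
        pending.set(flag, total);
      }
    },
    // Returns everything accumulated so far and starts a new batch.
    flush() {
      const batch = Object.fromEntries(pending);
      pending = new Map();
      return batch;
    },
  };
}
```

In this sketch a timer would call `flush()` every N seconds and send the result upstream in a single request, however many reports arrived in between.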

Cache warm-up: On first request from a new client token, Edge doesn't have that token's flags yet. It proxies the first request upstream (adds ~100ms), caches the result, then all subsequent requests are served from cache instantly.
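The warm-up behaviour amounts to a lazy, per-token cache. The sketch below uses a synchronous, caller-supplied `fetchUpstream(token)` stand-in to keep the example short; a real proxy would do this asynchronously, and none of these names come from Edge itself.

```javascript
// Sketch: only the first request for a token reaches upstream.
function createTokenCache(fetchUpstream) {
  const byToken = new Map();
  return {
    featuresFor(token) {
      if (!byToken.has(token)) {
        // Cache miss: pay the upstream round trip once and remember the result.
        byToken.set(token, fetchUpstream(token));
      }
      return byToken.get(token);
    },
  };
}
```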

7. API Endpoints Exposed by Edge

Method	Path	Who calls it	Description
GET	/api/client/features	Server-side SDKs	All feature toggle definitions + strategies for client token. SDK evaluates locally.
GET	/api/client/streaming	Server-side SDKs (SSE)	SSE stream of flag changes. SDK holds open connection, receives diffs in real-time.
POST	/api/client/metrics	Server-side SDKs	SDK sends flag usage counts. Edge batches and forwards to Unleash.
POST	/api/client/register	Server-side SDKs	SDK registration (app name, strategies). Edge stores and forwards to Unleash.
GET	/api/frontend	Frontend / mobile SDKs	Pre-evaluated toggles for a context (userId, sessionId). Returns enabled/disabled per flag.
POST	/api/frontend/client/metrics	Frontend SDKs	Frontend usage metrics. Batched and forwarded.
GET	/api/proxy	Old Proxy clients	Legacy Unleash Proxy compatibility endpoint. Same as /api/frontend.
GET	/internal-backstage/health	K8s liveness probe	Returns 200 OK when Edge is running. Does NOT check upstream connectivity.
GET	/internal-backstage/ready	K8s readiness probe	Returns 200 OK only when cache is populated and Edge is ready to serve traffic.
GET	/internal-backstage/metrics	Prometheus / Grafana	Prometheus metrics: request count, cache hit/miss, upstream latency, memory.
GET	/internal-backstage/tokens	Debug / admin	Lists tokens Edge knows about. Requires edge token auth. Debug use only.

Note: /internal-backstage/health returns 200 whenever the Edge process is running, even if upstream is down. Use /internal-backstage/ready for readiness probes; it returns 503 until the cache is populated. This distinction keeps Kubernetes from routing traffic to Edge before it has flags.
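The difference between the two probes comes down to what each one inspects. A sketch of the semantics described above, with hypothetical function names and a plain `Map` standing in for Edge's cache:

```javascript
// Liveness: the process is up. Deliberately ignores upstream and cache state.
function livenessStatus() {
  return 200;
}

// Readiness: only report ready once at least one token's flags are cached.
function readinessStatus(cache) {
  return cache.size > 0 ? 200 : 503;
}

const cache = new Map();
console.log(livenessStatus(), readinessStatus(cache)); // 200 503
cache.set("*:production.abc123", []);
console.log(livenessStatus(), readinessStatus(cache)); // 200 200
```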

8. SDK Integration — Connecting Apps to Edge

// Node.js server-side SDK → Edge
import { initialize } from 'unleash-client';

const unleash = initialize({
  url: 'http://unleash-edge.unleash.svc.cluster.local:3063/api',
  appName: 'my-service',
  environment: 'production',
  customHeaders: {
    Authorization: process.env.UNLEASH_CLIENT_TOKEN
  },

  // SDK fetches flags from Edge, evaluates locally
  // refreshInterval: how often SDK re-fetches (Edge handles sync)
  refreshInterval: 15,      // seconds
  metricsInterval: 60,      // send usage metrics every 60s
});

// Evaluate a flag
unleash.on('synchronized', () => {
  const enabled = unleash.isEnabled('my-feature', {
    userId: 'user-123',
    sessionId: 'sess-456',
    properties: { region: 'asia' }
  });
});
    

// React frontend SDK → Edge /api/frontend
import { FlagProvider, useFlag } from '@unleash/proxy-client-react';

const config = {
  url: 'https://edge.example.com/api/frontend',
  clientKey: process.env.REACT_APP_UNLEASH_FRONTEND_TOKEN,
  appName: 'web-app',
  context: {
    userId: currentUser.id,
    properties: { plan: currentUser.plan }
  },
  refreshInterval: 30,   // seconds
};

// Wrap app
function App() {
  return (
    <FlagProvider config={config}>
      <MyApp />
    </FlagProvider>
  );
}

// Use a flag anywhere in the tree
function NewCheckout() {
  const enabled = useFlag('new-checkout-flow');
  return enabled ? <NewFlow /> : <OldFlow />;
}
    

Frontend SDK difference: The React SDK calls /api/frontend. Edge evaluates the flags server-side using the provided context and returns only the evaluated result per flag (enabled state and, where configured, the selected variant). The SDK never sees strategy definitions. This is safer for browsers (strategies may contain PII-adjacent logic).
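The following toy evaluator illustrates why the browser never sees targeting rules. It is not Unleash's strategy engine: the single `userIds` allow-list rule and the `evaluateForFrontend` name are assumptions made up for this example.

```javascript
// Toy evaluator: one hypothetical rule (a userIds allow-list) per flag.
function evaluateForFrontend(definitions, context) {
  return definitions.map((flag) => ({
    name: flag.name,
    // Enabled only if the flag is on and the user passes the allow-list (if any).
    enabled:
      flag.enabled &&
      (flag.userIds === undefined || flag.userIds.includes(context.userId)),
  }));
}

const definitions = [
  { name: "new-checkout-flow", enabled: true, userIds: ["user-123"] },
  { name: "beta-banner", enabled: false },
];
console.log(evaluateForFrontend(definitions, { userId: "user-123" }));
// [ { name: 'new-checkout-flow', enabled: true }, { name: 'beta-banner', enabled: false } ]
```

The response carries names and booleans only; the `userIds` list stays on the server side.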

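// Go SDK → Edge
package main

import (
  "log"
  "net/http"
  "os"
  "time"

  "github.com/Unleash/unleash-client-go/v3"
  "github.com/Unleash/unleash-client-go/v3/context"
)

func main() {
  err := unleash.Initialize(
    unleash.WithUrl("http://unleash-edge:3063/api"),
    unleash.WithAppName("go-service"),
    unleash.WithCustomHeaders(http.Header{
      "Authorization": []string{os.Getenv("UNLEASH_CLIENT_TOKEN")},
    }),
    unleash.WithRefreshInterval(15*time.Second),
    unleash.WithMetricsInterval(60*time.Second),
  )
  if err != nil {
    log.Fatalf("unleash init failed: %v", err)
  }
  unleash.WaitForReady() // block until the first fetch from Edge completes

  // The SDK's own context package carries userId, sessionId, etc.
  enabled := unleash.IsEnabled("my-feature",
    unleash.WithContext(context.Context{UserId: "user-123"}),
  )
  log.Printf("my-feature enabled: %v", enabled)
}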
    

// Java Spring Boot → Edge
UnleashConfig config = UnleashConfig.builder()
  .appName("java-service")
  .unleashAPI("http://unleash-edge:3063/api")
  .customHttpHeader("Authorization", System.getenv("UNLEASH_CLIENT_TOKEN"))
  .fetchTogglesInterval(15)     // seconds
  .sendMetricsInterval(60)
  .build();

Unleash unleash = new DefaultUnleash(config);

boolean enabled = unleash.isEnabled("my-feature",
  new UnleashContext.Builder()
    .userId("user-123")
    .build()
);
    

9. Kubernetes Deployment Patterns

Recommended for most cases: Deploy Edge as a separate Deployment (2–3 replicas) with a ClusterIP Service. All pods in the cluster call Edge via Service DNS. Simple, easy to scale, update independently.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: unleash-edge
  namespace: unleash
spec:
  replicas: 2                           # HA: 2+ replicas
  selector:
    matchLabels: { app: unleash-edge }
  template:
    metadata:
      labels: { app: unleash-edge }
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/path:   "/internal-backstage/metrics"
        prometheus.io/port:   "3063"
    spec:
      containers:
      - name: unleash-edge
        image: unleashorg/unleash-edge:latest
        args: ["edge"]
        ports:
        - containerPort: 3063
        env:
        - name: UPSTREAM_URL
          value: "http://unleash-server.unleash.svc:4242"
        - name: METRICS_INTERVAL_SECONDS
          value: "60"
        - name: FEATURES_REFRESH_INTERVAL_SECONDS
          value: "15"
        livenessProbe:
          httpGet: { path: /internal-backstage/health, port: 3063 }
          initialDelaySeconds: 5
          periodSeconds: 10
        readinessProbe:
          httpGet: { path: /internal-backstage/ready, port: 3063 }
          initialDelaySeconds: 2
          periodSeconds: 5
          failureThreshold: 3
        resources:
          requests: { cpu: 50m,  memory: 64Mi  }
          limits:   { cpu: 200m, memory: 256Mi }
---
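apiVersion: v1
kind: Service
metadata:
  name: unleash-edge
  namespace: unleash
  labels: { app: unleash-edge }   # lets a ServiceMonitor select this Service
spec:
  selector: { app: unleash-edge }
  ports:
  - name: http                    # named so monitors can reference the port
    port: 3063
    targetPort: 3063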
    

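Sidecar pattern: Run Edge as a sidecar container in every app pod. Apps call localhost:3063 for ultra-low latency (loopback). Tradeoff: one Edge process per pod means more total memory, and each sidecar opens its own upstream connection, so the connection savings described in section 3 shrink.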

# App pod spec — Edge as sidecar
spec:
  containers:
  - name: my-app
    image: my-app:latest
    env:
    - name: UNLEASH_URL
      value: "http://localhost:3063/api"  # ← loopback!

  - name: unleash-edge
    image: unleashorg/unleash-edge:latest
    args: ["edge"]
    env:
    - name: UPSTREAM_URL
      value: "http://unleash-server:4242"
    - name: PORT
      value: "3063"
    resources:
      requests: { cpu: 10m, memory: 32Mi }
      limits:   { cpu: 50m, memory: 64Mi }
    readinessProbe:
      httpGet: { path: /internal-backstage/ready, port: 3063 }
    

Best for: Low-latency flag evaluation on the critical path (e.g., evaluating flags on every HTTP request). With sidecar, flag reads are loopback calls — essentially free in terms of network overhead.

DaemonSet pattern: One Edge pod per node. Apps call Edge via the node's internal IP and a hostPort. Middle ground: fewer total Edge processes than sidecar, lower latency than a central Deployment (same node = no inter-node traffic).

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: unleash-edge
spec:
  selector:
    matchLabels: { app: unleash-edge }
  template:
    metadata:
      labels: { app: unleash-edge }   # must match spec.selector
    spec:
      containers:
      - name: unleash-edge
        image: unleashorg/unleash-edge:latest
        args: ["edge"]
        ports:
        - containerPort: 3063
          hostPort: 3063  # accessible via node IP
        env:
        - name: UPSTREAM_URL
          value: "http://unleash-server:4242"

# Apps connect via downward API to get node IP:
env:
- name: NODE_IP
  valueFrom:
    fieldRef: { fieldPath: status.hostIP }
- name: UNLEASH_URL
  value: "http://$(NODE_IP):3063/api"
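The three deployment patterns differ mainly in how many Edge processes (and therefore upstream connections) they create. A rough comparison, using a hypothetical helper and made-up cluster sizes:

```javascript
// Hypothetical helper: Edge process count for each deployment pattern.
function edgeProcessCount(pattern, { appPods, nodes, replicas }) {
  switch (pattern) {
    case "sidecar":    return appPods;   // one Edge per app pod
    case "daemonset":  return nodes;     // one Edge per node
    case "deployment": return replicas;  // fixed replica count
    default: throw new Error(`unknown pattern: ${pattern}`);
  }
}

const cluster = { appPods: 100, nodes: 10, replicas: 2 };
for (const pattern of ["sidecar", "daemonset", "deployment"]) {
  console.log(pattern, edgeProcessCount(pattern, cluster));
}
// sidecar 100
// daemonset 10
// deployment 2
```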
    

10. Edge Chaining — Multi-Region & Hierarchical

Chaining: An Edge instance can use another Edge instance as its upstream instead of the Unleash server. This enables hierarchical topologies: central Edge → regional Edge → local Edge. Each hop caches independently.

Benefits of chaining:
• Unleash server sees only 1 upstream connection (Central Edge)
• Regional Edges serve traffic with lowest possible latency (local)
• Central Edge is the single sync point — no thundering herd
• Any regional Edge can be taken down without affecting others

Propagation lag with chaining:
Unleash → Central Edge: SSE (near-instant)
Central → Regional: polling interval (e.g. 15s)
Total lag from flag change to regional app: up to 15–30s
Reduce by shortening --features-refresh-interval-seconds
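The upper bound above is the sum of the polling intervals along the chain, since in the worst case each hop has just finished a poll when the change arrives. A sketch with a hypothetical helper; the third hop assumes the 15-second SDK refresh interval used in the SDK examples earlier:

```javascript
// Worst-case propagation delay: add up the polling interval of every hop.
function worstCaseLagSeconds(hopIntervals) {
  return hopIntervals.reduce((total, seconds) => total + seconds, 0);
}

const lag = worstCaseLagSeconds([
  0,  // Unleash → Central Edge: SSE, treated as near-instant
  15, // Central → Regional Edge: polling interval
  15, // Regional Edge → SDK: SDK refresh interval (assumed)
]);
console.log(lag); // 30
```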

11. Metrics & Observability

📊

Prometheus Endpoint

/internal-backstage/metrics

Built-in Prometheus metrics. No extra exporter needed. Scrape directly with a ServiceMonitor or annotation.

📈

Key Metrics

What to alert on

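edge_request_duration_seconds, edge_upstream_latency, edge_cache_hit_total, edge_upstream_errors_total, edge_tokens_validated_total. Metric names can differ between Edge versions, so confirm them against the endpoint's actual output before writing alerts.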

🔔

Flag Usage Metrics

Forwarded to Unleash

Edge batches SDK-reported flag evaluation counts and POSTs them to Unleash every interval. Unleash uses this for flag adoption dashboards.

🔍

Debug Endpoints

/internal-backstage/*

/internal-backstage/tokens shows cached tokens. /internal-backstage/features/{token} dumps cached flags for a token. Restrict access in prod.

# Prometheus ServiceMonitor (if using kube-prometheus-stack)
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: unleash-edge
spec:
  selector:
    matchLabels: { app: unleash-edge }   # matches labels on the Service, not the pods
  endpoints:
  - port: http                           # must be the name of a Service port
    path: /internal-backstage/metrics
    interval: 30s

# Key alerts to configure:
# 1. edge_upstream_errors_total rate > 0  → upstream connection broken
# 2. edge_cache_hit_total / total requests < 0.95  → unexpected cache misses
# 3. edge_request_duration_seconds p99 > 10ms  → latency regression
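The three alert conditions listed above can be expressed as a small check. This sketch takes plain numbers as input rather than scraping Prometheus; the function and field names are hypothetical, and the thresholds are the ones stated in the comments above.

```javascript
// Returns the names of the alert conditions that currently fire.
function evaluateAlerts({ upstreamErrorRate, cacheHits, totalRequests, p99Ms }) {
  const alerts = [];
  if (upstreamErrorRate > 0) alerts.push("upstream-errors");
  // Guard against division by zero when there has been no traffic.
  if (totalRequests > 0 && cacheHits / totalRequests < 0.95) {
    alerts.push("low-cache-hit-ratio");
  }
  if (p99Ms > 10) alerts.push("latency-regression");
  return alerts;
}

console.log(evaluateAlerts({ upstreamErrorRate: 0, cacheHits: 900, totalRequests: 1000, p99Ms: 4 }));
// [ 'low-cache-hit-ratio' ]
```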
  

12. Offline Mode / GitOps Workflow

Developer changes flag in Unleash UI

Flag state updated in Unleash server. All online Edge instances pick it up within seconds via SSE.

CI/CD pipeline runs offline prepare

A CI job exports the current flag state to toggles.json (for example, by saving the upstream /api/client/features response) and commits the file to git (or an artifact store).

File mounted as ConfigMap in K8s

toggles.json stored as a ConfigMap, mounted into the Edge pod at /flags/toggles.json.

Edge serves from file, no upstream needed

Air-gapped clusters, banking environments, strict egress rules — no problem. Edge runs fully isolated.

Flag update = commit + redeploy

To update flags, regenerate the file, commit it, update the ConfigMap, rolling restart Edge. This is the GitOps cadence — flag changes are code-reviewed and auditable.

# K8s ConfigMap from file
kubectl create configmap unleash-flags \
  --from-file=toggles.json=./toggles.json \
  -n my-namespace

# Mount in pod
volumes:
- name: unleash-flags
  configMap: { name: unleash-flags }

volumeMounts:
- name: unleash-flags
  mountPath: /flags

# Run Edge in offline mode
args: ["offline", "--bootstrap-file", "/flags/toggles.json", "--tokens", "$(CLIENT_TOKEN)"]
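Before a snapshot is turned into a ConfigMap, a CI job could sanity-check it so that a truncated or malformed file never reaches the cluster. The sketch below assumes the file has a top-level `features` array whose entries each carry a `name`; `validateBootstrap` is a hypothetical helper, not an Edge command.

```javascript
// Throws on a malformed snapshot; returns the number of flags otherwise.
function validateBootstrap(jsonText) {
  const doc = JSON.parse(jsonText); // throws on invalid JSON
  if (!Array.isArray(doc.features)) {
    throw new Error("missing top-level features array");
  }
  const seen = new Set();
  for (const feature of doc.features) {
    if (typeof feature.name !== "string" || feature.name === "") {
      throw new Error("feature without a name");
    }
    if (seen.has(feature.name)) {
      throw new Error(`duplicate feature: ${feature.name}`);
    }
    seen.add(feature.name);
  }
  return seen.size;
}
```

In a pipeline this would run against the contents of toggles.json and fail the job on any thrown error, before the `kubectl create configmap` step.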
  

13. Security Considerations

🔒

Tokens are validated and scoped

Apps supply their own token in the Authorization header. Edge validates it upstream once, caches the validation. Apps can only access their own environment's flags.

🔒

Edge token is privileged — protect it

The Edge token (used by Edge to authenticate to Unleash) has full read access. Store in K8s Secret, inject via env. Never expose to apps.

🔒

Restrict /internal-backstage in prod

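These debug endpoints expose cached tokens and flag data. NetworkPolicy filters by pod and port, not by URL path, so block /internal-backstage/* for app traffic at an L7 layer (ingress controller or service mesh) and never expose it outside the cluster.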

🔒

TLS between apps and Edge

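Use a service mesh (Istio/Linkerd) for mTLS, or enable Edge's built-in TLS with a server certificate and key (--tls-enable, --tls-server-cert, --tls-server-key; confirm the option names with unleash-edge --help for your version). In-cluster traffic should be encrypted.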

# K8s Secret for Edge token
apiVersion: v1
kind: Secret
metadata:
  name: unleash-edge-token
type: Opaque
stringData:
  token: "*:*.edge-token-secret-here"

---
# NetworkPolicy: limit which pods can reach Edge's port.
# NetworkPolicy works at the pod/port level; it cannot tell /api/* apart
# from /internal-backstage/* (use an ingress or mesh rule for path filtering).
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: unleash-edge-restrict
spec:
  podSelector:
    matchLabels: { app: unleash-edge }
  ingress:
  - from:            # allow pods in the same namespace
    - podSelector: {}
    ports:
    - port: 3063
      

14. Edge vs Old Unleash Proxy

Feature	Old Proxy (Node.js)	Edge (Rust)
Language	Node.js	Rust
Memory usage	~120MB typical	~10–30MB
Startup time	2–5s (JIT warm-up)	<100ms
Run modes	Online only	Edge + Offline + Hybrid
Upstream types	Unleash server only	Unleash server or another Edge
Built-in metrics	No	Prometheus native
Health/Ready probes	Manual	/internal-backstage/health + /internal-backstage/ready built-in
Token validation	Always upstream	Cached locally after first validation
Offline mode	Not supported	Full offline from JSON file
Status	Deprecated	Active (current)

15. Production Readiness Checklist

Deployment

✓

2+ Edge replicas (HA)

At least 2 Edge pods behind a Service. PodDisruptionBudget minAvailable=1.

✓

Readiness probe on /internal-backstage/ready

Not the health endpoint. Edge is only added to Service endpoints after its cache is populated.

✓

Bootstrap file for fast startup

Mount a recent toggles.json snapshot. Edge serves immediately on pod restart.

✓

Resource limits set

Edge is Rust: 50–200m CPU, 64–256Mi RAM is typically sufficient for thousands of flag requests/s.

✓

topologySpreadConstraints

Spread Edge replicas across zones. Zone failure should not take out all Edge instances.

Operations

✓

Prometheus alerts configured

Alert on: upstream errors, high latency (p99 > 10ms), low cache hit rate.

✓

Edge token in K8s Secret

Never hardcode the Edge token. Rotate via Secret + rollout restart.

✓

Restrict /internal-backstage access

Ingress or service-mesh path rules to prevent app-level access to debug endpoints (NetworkPolicy alone cannot filter by path).

✓

SDK points to Edge, not Unleash server

Every service's Unleash SDK URL must be Edge's Service DNS name. Audit with grep.

✓

Validate propagation latency

Test that a disabled flag reaches all apps within your SLA window (default: seconds for online Edge).

# Quick audit: find services still pointing directly to Unleash server
grep -r "unleash-server\|:4242" ./k8s/ ./src/ \
  | grep -v "unleash-edge"
# Any hit = service bypassing Edge (fix it)

# Test Edge is serving correctly
curl -s -H "Authorization: *:production.my-token" \
  http://unleash-edge:3063/api/client/features | jq '.features | length'

# Check Edge is synced (upstream healthy)
curl -s http://unleash-edge:3063/internal-backstage/metrics \
  | grep edge_upstream_errors