The telemetry plugin (running in the enclave alongside your FastAPI application) collects custom metrics and pushes them using the OTLP protocol to an external endpoint (e.g. an OpenTelemetry Collector).
Autoscaling NitroEnclaveDeployments (NEDs)
This document describes how to enable autoscaling for NitroEnclaveDeployments (NEDs). NEDs allow you to deploy AWS Nitro Enclaves via a custom Kubernetes resource, and this guide explains how to scale them based on custom application metrics.
Nitro Enclaves run in an isolated environment, so standard pod metrics (such as CPU or memory) do not apply. Instead, autoscaling is driven by metrics from inside the enclave, which a telemetry plugin pushes using the OpenTelemetry (OTel) protocol.
Overview
NitroEnclaveDeployments (NEDs) run workloads inside AWS Nitro Enclaves, which operate in an isolated environment. Because traditional pod resource metrics (like CPU and memory) don’t reflect the actual workload running inside the enclave, autoscaling decisions are based on custom, application-specific metrics.
A telemetry plugin container running inside the enclave collects these metrics and exports them (via OTLP) to your chosen metrics pipeline. The metrics include a label (e.g. oblv_deployment_name) that identifies the specific NED instance. With these metrics available as external metrics, the Kubernetes Horizontal Pod Autoscaler (HPA) can adjust the number of NED replicas based on per-instance load.
Example NitroEnclaveDeployment Manifest
Below is an example NED manifest that includes the telemetry plugin configuration. In this example, the telemetry plugin is set to push its metrics to an OpenTelemetry Collector. The metrics produced will include the NED’s name (for instance, hello-fastapi) so that they can be filtered by the HPA.
apiVersion: k8s.oblv.com/v1alpha1
kind: NitroEnclaveDeployment
metadata:
  name: hello-fastapi
  namespace: default
spec:
  userPlugins:
    - name: fastapi
      image: public.ecr.aws/oblivious-ai/oblv-sample-fastapi:latest
      ports:
        - containerPort: 8001
          hostPort: 4455
      command:
        - "python"
        - "/app/uvicorn_runner.py"
  plugins:
    telemetry:
      image: public.ecr.aws/oblivious-ai/oblv-telemetry-dev:dev
      name: telemetry-plugin
      volumes:
        - containerPath: /etc/oblv
          readOnly: true
          source:
            configMap:
              name: fastapi-telemetry-configmap
              items:
                - key: config.yaml
                  path: config.yaml
          name: telemetry-config-vol
      env:
        - name: HOST_PORT
          value: '8100'
        - name: EXPORTER_URL
          value: "http://otel-collector.monitoring.svc.cluster.local:4318"
  outboundConnections:
    - fqdn:
        value: otel-collector.monitoring.svc.cluster.local
      port: 4318
      tls: false
      redirects: false
  replicas: 1
  serviceAccount: enclave-pod
  hugepages-1Gi: 12Gi
  enclaveCpuCount: 2
  ingress:
    enabled: true
    internetFacing: true
    dnsHostName: monitoring-test.oblv.com
    ingressTlsCertificate: oblv-ingress-tls
    ports:
      - port: 4455
        targetPort: 4455
  caCertDetails:
    enclaveCertType: ENCLAVE_GENERATED
Additionally, the telemetry plugin’s configuration is provided via a ConfigMap:
apiVersion: v1
kind: ConfigMap
metadata:
  name: fastapi-telemetry-configmap
  namespace: default
data:
  config.yaml: 'scrape_configs: []'
In this example, the telemetry plugin will tag exported metrics with the NED’s name (e.g. oblv_deployment_name: hello-fastapi), so that each deployment’s metrics can be distinguished.
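In the manifest above, the collector endpoint appears twice: once in the telemetry plugin's EXPORTER_URL and once under outboundConnections. Assuming the outboundConnections entry is what permits the enclave to reach the collector (the manifest suggests this but does not state it), the two must agree on host and port. The following is a minimal sketch of such a consistency check; the function name `exporter_is_allowed` is hypothetical, and the dictionary layout simply mirrors the manifest fields, with `port` assumed to be a sibling of `fqdn`.

```python
from urllib.parse import urlsplit


def exporter_is_allowed(exporter_url, outbound_connections):
    """Return True if the exporter URL's host and port match an outbound entry.

    Hypothetical helper: the entry layout mirrors the example manifest and is
    an assumption about the NED schema, not a documented API.
    """
    parts = urlsplit(exporter_url)
    # Fall back to the scheme's default port when the URL omits one.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    return any(
        conn["fqdn"]["value"] == parts.hostname and conn["port"] == port
        for conn in outbound_connections
    )


# Values copied from the example manifest.
outbound = [
    {
        "fqdn": {"value": "otel-collector.monitoring.svc.cluster.local"},
        "port": 4318,
        "tls": False,
        "redirects": False,
    }
]

allowed = exporter_is_allowed(
    "http://otel-collector.monitoring.svc.cluster.local:4318", outbound
)
print(allowed)
```

A check like this could run in CI before the manifest is applied, so that a changed collector address is not silently blocked at the enclave boundary.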
Configuring the HPA with External Metrics
With your telemetry metrics available through your external metrics pipeline, you can configure an HPA to autoscale the NitroEnclaveDeployment. Since the metrics include the NED’s name, the HPA is able to target the correct metric series for each deployment.
Below is an example HPA manifest that uses an external metric (for example, k8s_node_memory_usage_bytes). This manifest instructs the HPA to scale the NED based on the average value of the external metric, filtering by the oblv_deployment_name label:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: hello-fastapi-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: k8s.oblv.com/v1alpha1
    kind: NitroEnclaveDeployment
    name: hello-fastapi
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: External
      external:
        metric:
          name: k8s_node_memory_usage_bytes # Name of the external metric exposed by your adapter
          selector:
            matchLabels:
              oblv_deployment_name: hello-fastapi # Filter metrics for this specific NED instance
        target:
          type: AverageValue
          averageValue: "5" # Target average metric value per instance
How Autoscaling Works
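Your metrics pipeline (managed separately) receives the metrics pushed by the telemetry plugin and exposes them through the Kubernetes External Metrics API (external.metrics.k8s.io), which is the API that HPA metrics of type External are read from. This makes the metric data accessible to the Kubernetes HPA. Read more about telemetry in OBLV Deploy here.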
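Before relying on the HPA, it can help to confirm that the metric is actually being served to Kubernetes. HPA metrics of type External are read from the external.metrics.k8s.io API group, so the series can be queried directly with `kubectl get --raw`. The sketch below only builds the request path; the helper name `external_metric_path` is hypothetical, and the namespace, metric name, and label are taken from the examples in this document and should be replaced with whatever your adapter exposes.

```python
from urllib.parse import quote, urlencode


def external_metric_path(namespace, metric_name, labels):
    """Build the External Metrics API path for one metric and label selector."""
    # Label selectors use the usual key=value,key=value form.
    selector = ",".join(f"{key}={value}" for key, value in sorted(labels.items()))
    query = urlencode({"labelSelector": selector})
    return (
        "/apis/external.metrics.k8s.io/v1beta1"
        f"/namespaces/{quote(namespace)}/{quote(metric_name)}?{query}"
    )


path = external_metric_path(
    "default",
    "k8s_node_memory_usage_bytes",
    {"oblv_deployment_name": "hello-fastapi"},
)
# Paste the printed command into a shell with cluster access.
print(f"kubectl get --raw '{path}'")
```

If the command returns an empty item list or a 404, the HPA will not be able to compute a replica count either, which usually points at the adapter configuration or the label selector rather than at the NED itself.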
When autoscaling is enabled, an HPA is deployed that references your NED resource. The HPA queries the metric and adjusts the replica count (through the NED resource's scale subresource) based on the defined target.
When the metric exceeds the target value, the HPA increases the number of replicas; when it drops below the target, the HPA scales down. The replica count always stays between minReplicas and maxReplicas, so your enclave-based workload scales with demand within the bounds you set.
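The arithmetic behind this decision can be sketched as follows for an External metric with an AverageValue target. This is a simplified model of the documented HPA behaviour, not the controller's actual code: it assumes the values of all matching series are summed into `metric_total`, uses the default tolerance of 0.1, and ignores stabilization windows and scaling policies. The function name `desired_replicas` is illustrative.

```python
import math


def desired_replicas(metric_total, target_average, current_replicas,
                     min_replicas, max_replicas, tolerance=0.1):
    """Approximate the HPA replica calculation for an AverageValue target."""
    # Ratio of the current per-replica average to the target average.
    usage_ratio = metric_total / (target_average * current_replicas)
    if abs(usage_ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current replica count.
        desired = current_replicas
    else:
        # Enough replicas that the per-replica average does not exceed the target.
        desired = math.ceil(metric_total / target_average)
    # Clamp to the bounds set on the HPA.
    return max(min_replicas, min(max_replicas, desired))


# With averageValue "5", minReplicas 1 and maxReplicas 5 as in the example HPA:
scale_up = desired_replicas(18, 5, current_replicas=2, min_replicas=1, max_replicas=5)
scale_down = desired_replicas(4, 5, current_replicas=3, min_replicas=1, max_replicas=5)
capped = desired_replicas(100, 5, current_replicas=2, min_replicas=1, max_replicas=5)
steady = desired_replicas(10.4, 5, current_replicas=2, min_replicas=1, max_replicas=5)
print(scale_up, scale_down, capped, steady)
```

In this model a metric total of 18 against a target of 5 per instance yields 4 replicas, while a total of 100 is capped at the maxReplicas value of 5.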