Kubernetes
Lifecycle
This page describes the lifecycle of a Nitro Enclave. Nitro Enclaves follow a defined lifecycle, starting in the `PENDING` phase, moving through `INPROGRESS`, and finally reaching a `RUNNING` or an `ERROR` phase, depending on the outcome of the enclave's final stage.
While the enclave is running, the orchestrator inside the enclave can restart containers to handle some faults.
Enclaves are scheduled only once in their lifetime. Once an enclave is assigned to an EC2 instance, it runs on that instance until it stops or is terminated.
Enclave Lifetime
Enclaves are considered ephemeral entities. They are created, assigned a unique ID, and scheduled to EC2 instances, where they remain until termination. If an EC2 instance fails, the Nitro Enclave scheduled to it is marked for deletion after a timeout period. OBLV Deploy uses a higher-level abstraction, called a controller, that handles the work of managing these Nitro Enclaves.
A given Nitro Enclave is never rescheduled to a different EC2 instance.
Nitro Enclave Bootup
Since Nitro Enclaves are backed by EC2 instances, when a Nitro Enclave is created, an EC2 instance must first come up to host it. The lifecycle of this instance is maintained by the ACK EC2 Controller, which brings up the instance with the AMI ID supplied by the user (obtained from the marketplace). The OBLV Kubernetes controller then connects to the instance and brings up the enclave along with the base services inside it.
Once the Nitro Enclave is ready with the base services running, the controller syncs a manifest derived from the `NitroEnclaveDeployment` spec, which includes all the information the enclave needs to spin up the user containers or plugins. The orchestrator, or kubelet, running inside the enclave then spins up all the services as per the manifest.
Nitro Enclave Phases
The phase of a Nitro Enclave is a simple, high-level summary of where the Nitro Enclave is in its lifecycle. It is stored as `NitroEnclaveStatus` in the status object of the Nitro Enclave resource.
| Value | Description |
|---|---|
| `PENDING` | The Nitro Enclave has been accepted by the Kubernetes cluster, but the heartbeat of the base enclave services has not yet been observed. This includes the time a Nitro Enclave spends waiting to be scheduled to an EC2 instance, as well as the time spent bringing that instance up. |
| `INPROGRESS` | The orchestrator inside the Nitro Enclave is spinning up containers. These can be the base services themselves or the containers defined by the user in the `NitroEnclaveDeployment`. A Nitro Enclave usually enters this state twice: once while bringing up the base services, and again while pulling and starting the user-defined containers or plugins. |
| `RUNNING` | All the base services, user containers, and plugins are running as expected. |
| `ERROR` | One or more containers are down, or the Nitro Enclave health check from the Kubernetes cluster is failing. |
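Because the phase lives in the resource's status object, it can be read with standard kubectl tooling. A minimal sketch, assuming the `ne` short name (used later on this page) is registered for the Nitro Enclave resource and that the phase is surfaced at `.status.phase`:

```shell
# Print the current lifecycle phase of a single Nitro Enclave.
# The field path .status.phase and the resource name are assumptions;
# inspect the full object with `-o yaml` to confirm the exact schema.
kubectl get ne my-enclave -o jsonpath='{.status.phase}'

# Watch all Nitro Enclaves as they move through PENDING/INPROGRESS/RUNNING:
kubectl get ne --watch
```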
Manifest Sync States
In addition to the overall state of the Nitro Enclave, Kubernetes also tracks the state of the manifest sync. Manifest sync is the process by which, once the Nitro Enclave comes up, the OBLV Kubernetes controller syncs a manifest derived from the `NitroEnclaveDeployment` resource with the Nitro Enclave.
Only once the manifest sync is complete do the user-defined containers and plugins come up.
| Value | Description |
|---|---|
| `PENDING` | The controller is waiting for the base services inside the enclave to come up. |
| `SUCCESS` | The manifest is synced with the Nitro Enclave. |
| `FAILED` | The manifest sync failed. |
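To observe the manifest sync state alongside the phase, the status object can be queried directly. This is a sketch, not a confirmed API: the field path `.status.manifestSyncStatus` is an assumed name, so dump the full status first to find the real field:

```shell
# Dump the full status object to locate the manifest sync field:
kubectl get ne my-enclave -o yaml

# Then read it directly (field path is an assumption, adjust as needed):
kubectl get ne my-enclave -o jsonpath='{.status.manifestSyncStatus}'
```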
Handling Problems with Containers
The OBLV controller manages container failures within enclaves using the `restartPolicy` defined for a container or plugin in the `NitroEnclaveDeployment` resource. The restarts themselves are performed by the orchestrator within the enclave.
If the Nitro Enclave itself is unhealthy, it is removed from the list of healthy Nitro Enclaves, and no traffic is routed to it until it becomes healthy again.
If the application remains unhealthy for an extended period, the Nitro Enclave, along with its host EC2 instance, is terminated, and another one is brought up to take its place. This is handled at a higher level than the Nitro Enclave itself.
To investigate the root cause of a Nitro Enclave error, a user can:
- Inspect Events: Use `kubectl describe ne <name-of-Nitro-Enclave>` to see events for the Nitro Enclave, which can provide hints about configuration or runtime issues.
- Check Enclave Logs: Check the enclave orchestrator logs on your configured log backend. This is often the most direct way to diagnose the issue causing crashes.
- Check Controller Logs: Use `kubectl logs <name-of-the-oblv-controller-pod>` to check the controller's logs. The controller frequently runs health checks on the enclaves and logs any issues it finds.
- Review Configuration: Ensure that the `NitroEnclaveDeployment` configuration, including environment variables and mounted volumes, is correct and that all required external resources are available.
- Review Controller Permissions: OBLV Deploy relies on other controllers to spin up the enclaves and set up networking to them. Ensure that these controllers have the necessary permissions.
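The inspection steps above can be combined into a quick triage sequence for an enclave stuck in `ERROR`. The resource, pod, and namespace names below are placeholders, not values from the OBLV Deploy reference:

```shell
# Quick triage for an unhealthy Nitro Enclave (names are placeholders).
kubectl describe ne my-enclave                    # recent events and conditions
kubectl logs my-oblv-controller-pod -n oblv-system  # controller health-check logs (namespace assumed)
kubectl get ne my-enclave -o yaml                 # full status, including phase and sync state
```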
Termination of Nitro Enclaves
Termination of a Nitro Enclave happens either when the `NitroEnclaveDeployment` is deleted or when the user manually deletes the Nitro Enclave. Any Nitro Enclave marked for deletion is removed from the list of Nitro Enclaves that can take user traffic.
When the Nitro Enclave is deleted, Kubernetes deletes the underlying EC2 instance as well.
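Both termination paths map to standard kubectl deletions. A sketch, assuming the full lowercase resource name `nitroenclavedeployment` follows the usual Kubernetes CRD naming convention and `ne` is the short name used earlier on this page:

```shell
# Delete the parent resource; its Nitro Enclaves (and, via the
# controller, the backing EC2 instances) are torn down with it.
# The lowercase resource name is an assumption based on CRD conventions.
kubectl delete nitroenclavedeployment my-deployment

# Or delete a single Nitro Enclave directly:
kubectl delete ne my-enclave
```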