2019-01-17 | Technology

Health Check

Health checks are used to monitor the state of a running service. The D monitoring feature of Tencent's DNSPOD, for example, asks you to configure an access path and uses it to determine whether a website is reachable; that is a health check. When a health check fails, an email or text message is sent to notify the webmaster so the problem can be repaired.

Using readiness and liveness probes to configure health checks in K8S

In many modern distributed systems, users no longer access a single host but a cluster of hundreds of instances, with requests distributed to the instances by a load balancer. Load balancing relieves the access pressure on any single server and improves the system's availability, and health checks are commonly the criterion for judging whether an instance is currently "available": when an instance fails its health check, the load balancer stops directing traffic to it.

Today's cloud vendors, such as AWS, generally provide health checks for their load balancers. Kubernetes provides two probes for checking the status of containers, liveness and readiness. According to the official documentation, the liveness probe (livenessProbe) indicates whether the container is running, and the readiness probe (readinessProbe) indicates whether the container is ready to respond to requests.

In Kubernetes, a Pod is the smallest deployable unit of computing that Kubernetes creates and manages. A Pod consists of one or more containers (Docker, rkt, and so on) that share storage, the network, and a specification for how to run the containers.

In the Kubernetes context, liveness and readiness probes are what is meant by health checks. These probes are diagnostics that run periodically against a container, and their results (success, failure, or unknown) reflect the state of the container in Kubernetes. Based on these results, Kubernetes decides how to handle each container in order to ensure resilience, high availability, and longer uptime.

Readiness probes

The readiness probe lets Kubernetes know whether your app is ready to serve requests. Kubernetes forwards traffic to the Pod only when the readiness probe passes. If the readiness probe fails, Kubernetes stops sending traffic to the Pod until the probe passes again.

Liveness probes

The liveness probe lets Kubernetes know whether your app is alive. If the app is alive, Kubernetes leaves it running. If the app is dead, Kubernetes kills the container and starts a new one to replace it.

How they work

Let's look at two scenarios to see how readiness probes and liveness probes help us build more highly available systems.

Readiness probes

An app often needs some time to warm up and start. A back-end project may have to connect to the database and run a database migration at startup, for example, and a Spring project also has to wait for the Java virtual machine. Even after the process has started, the service cannot handle requests until its initialization has finished. The app should not receive traffic until it is fully ready, but by default Kubernetes starts sending traffic as soon as the process inside the container starts. With a readiness probe, Kubernetes waits until the application is fully started before allowing traffic to be sent to the new replica.

Figure: How the readiness probe works
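
As a rough sketch of this scenario, a readiness probe might be declared on a container as shown below. The Pod name, the image, the /ready path, and port 8080 are hypothetical placeholders chosen for illustration, not values from this article.

```yaml
# Minimal sketch: the container only receives Service traffic
# once its readiness probe succeeds.
apiVersion: v1
kind: Pod
metadata:
  name: demo-app                        # hypothetical name
  labels:
    app: demo-app                       # label a Service could select on
spec:
  containers:
    - name: app
      image: example.com/demo-app:1.0   # hypothetical image
      ports:
        - containerPort: 8080
      readinessProbe:
        httpGet:
          path: /ready                  # hypothetical endpoint, succeeds once startup is done
          port: 8080
        periodSeconds: 10               # probe every 10 seconds
```

Until the probe succeeds, the Pod is reported as not ready and is left out of the endpoints of any Service that selects it, so a replica that is still running migrations receives no requests.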

Liveness probes

Now imagine another situation: after starting successfully, our app "goes down" for some reason, or runs into a deadlock that leaves it unable to respond to user requests. By default, Kubernetes keeps sending requests to the Pod. With a liveness probe, Kubernetes detects that the service cannot process requests (request error or timeout) within a limited period of time and restarts the problematic container.

Figure: How the liveness probe works
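
A liveness probe for this situation could look roughly like the following sketch. The name, image, /healthz path, and the specific numbers are assumptions for illustration only.

```yaml
# Minimal sketch: restart the container when it stops answering health requests.
apiVersion: v1
kind: Pod
metadata:
  name: demo-app                        # hypothetical name
spec:
  restartPolicy: Always                 # restart the container after it is killed
  containers:
    - name: app
      image: example.com/demo-app:1.0   # hypothetical image
      livenessProbe:
        httpGet:
          path: /healthz                # hypothetical health endpoint
          port: 8080
        periodSeconds: 10               # probe every 10 seconds
        timeoutSeconds: 2               # no answer within 2 seconds counts as a failure
        failureThreshold: 3             # restart after 3 consecutive failures
```

A deadlocked process typically keeps its port open but never answers, which is why the timeout matters here: the probe fails on the missing response, not on a refused connection.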

Probe types

The probe type is the way in which a health check is carried out. K8S has three types of probes: HTTP, command, and TCP.

HTTP

HTTP probing is probably the most common type. Even if the app is not an HTTP service, you can create a lightweight HTTP server to respond to probes. Kubernetes requests a URL over HTTP; if the status code is at least 200 and below 400, the container is marked as healthy, otherwise it is marked as unhealthy.

Command

For command probing, Kubernetes runs a command inside the container. If the command exits with code 0, the container is marked as healthy; otherwise, it is marked as unhealthy.

TCP

The last type is the TCP probe: Kubernetes attempts to establish a TCP connection on the specified port. If the connection can be established, the container is considered healthy; if not, it is considered unhealthy. This is often used for probing gRPC or FTP services.

The Kubernetes documentation covers each probe type in more detail.
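
To make the three probe types concrete, here is a sketch of one Pod whose containers each use a different type. All names, images, ports, and the /tmp/healthy file are hypothetical; only the field names httpGet, exec, and tcpSocket come from the Kubernetes probe API.

```yaml
# Sketch: one container per probe type.
apiVersion: v1
kind: Pod
metadata:
  name: probe-types-demo                   # hypothetical name
spec:
  containers:
    - name: http-app
      image: example.com/http-app:1.0      # hypothetical image
      livenessProbe:
        httpGet:                           # HTTP probe: GET a path on a port
          path: /healthz
          port: 8080
    - name: worker
      image: example.com/worker:1.0        # hypothetical image
      livenessProbe:
        exec:                              # command probe: exit code 0 means healthy
          command: ["cat", "/tmp/healthy"] # assumes the app maintains this file
    - name: tcp-service
      image: example.com/tcp-service:1.0   # hypothetical image
      livenessProbe:
        tcpSocket:                         # TCP probe: healthy if the port accepts a connection
          port: 2121
```

The same three handlers can be used under readinessProbe as well; the probe type only determines how the check is made, not what Kubernetes does with the result.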

Initial probe delay

We can configure how often a K8S health check runs, the conditions for success or failure, and the timeout for the response; refer to the documentation on configuring probes.

A failed liveness probe causes the container to be restarted, so it is important to set the initial delay, initialDelaySeconds, so that the probe does not start until the application is ready. Otherwise, the app will restart indefinitely! I recommend using the p99 startup time as initialDelaySeconds, or the average startup time plus a buffer. This value should also be updated as the application's startup time changes.
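
The timing fields mentioned above can be sketched as a fragment that would sit under a container entry. The 60-second delay and the /healthz endpoint are illustrative assumptions; the defaults in the comments are the ones Kubernetes documents for these fields.

```yaml
# Fragment of a container spec: the timing knobs of a probe.
livenessProbe:
  httpGet:
    path: /healthz            # hypothetical endpoint
    port: 8080
  initialDelaySeconds: 60     # wait before the first probe (default 0); e.g. p99 startup time
  periodSeconds: 10           # how often to probe (default 10)
  timeoutSeconds: 1           # how long to wait for a response (default 1)
  successThreshold: 1         # consecutive successes needed (default 1; must be 1 for liveness)
  failureThreshold: 3         # consecutive failures before the container is restarted (default 3)
```

With settings like these, the first restart cannot happen before the initial delay has elapsed and three consecutive probes have failed, which is the buffer that protects a slow-starting app from a restart loop.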


Take, for example, a K8S probe configuration with the following behavior:

  • 120s after the container starts (initialDelaySeconds), K8S sends an HTTP request to /actuator/health on port 8080; if the response takes longer than 10s (timeoutSeconds) or the status code is outside 200 to 399, the readiness check fails
  • Similarly, while the Pod is running, K8S keeps probing /actuator/health on port 8080 every 5s (periodSeconds)
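
The configuration these bullets describe is not shown in the text. A reconstruction sketch based only on the values in the two bullets might look like the following. The Pod name and image are hypothetical, and attaching the 5-second period to a liveness probe on the same endpoint is an assumption, since the bullets do not say which probe it belongs to.

```yaml
# Reconstruction sketch from the bullets above; not the article's original manifest.
apiVersion: v1
kind: Pod
metadata:
  name: spring-app                        # hypothetical name
spec:
  containers:
    - name: app
      image: example.com/spring-app:1.0   # hypothetical image
      ports:
        - containerPort: 8080
      readinessProbe:
        httpGet:
          path: /actuator/health
          port: 8080
        initialDelaySeconds: 120          # first probe 120s after the container starts
        timeoutSeconds: 10                # a response slower than 10s counts as a failure
        periodSeconds: 5                  # assumed to apply to this probe as well
      livenessProbe:                      # assumption: the bullets do not name this probe
        httpGet:
          path: /actuator/health
          port: 8080
        initialDelaySeconds: 120          # assumed to match the readiness probe
        periodSeconds: 5                  # probe every 5s while the Pod runs
```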




When reprinting, please credit the original source: Baiyuan's Blog >> https://wangbaiyuan.cn/en/k8s-health-examination-with-ready-survival-probes-2.html
