If you’re running workloads on GKE Autopilot and see your Ingress resources stuck on UNHEALTHY, you’re not alone. This problem can be frustrating, especially when your pods and services appear to be healthy, but your Google Cloud Load Balancer refuses to send traffic to your app. In this article, we’ll break down what’s happening, why it happens, and how to fix it — with diagrams and clear explanations.
Table of Contents
- What’s the Problem?
- The GKE Load Balancing Path (with Diagram)
- Why Does This Happen on GKE Autopilot?
- The Role of Health Checks (Layer 4 vs Layer 7)
- How to Fix: Adding a Layer 7 (HTTP) Health Check
- Example: Working Ingress & BackendConfig YAML
- Debugging Tips
- Takeaways & Further Reading
What’s the Problem?
When you deploy an Ingress in GKE (especially Autopilot clusters), you might see your Service endpoints marked as UNHEALTHY under the annotations of your Ingress:
```
ingress.kubernetes.io/backends:
  {"...default-your-service-80-...":"UNHEALTHY"}
```
Your app’s pods are up, they respond on /healthz, and the endpoints look good, but the Load Balancer says "NOPE" — 502 errors, failed TLS handshakes, or "Error syncing to GCP: your Ingress will not be able to serve any traffic."
Why?
The GKE Load Balancing Path (with Diagram)
Let’s follow the packet from the outside to your pod:
```
              ┌─────────────┐
 Internet ──▶ │   Google    │
              │    Load     │
              │  Balancer   │
              └─────┬───────┘
                    │ (HTTPS/TCP/HTTP)
                    ▼
              ┌─────────────┐
              │   Ingress   │
              │  Controller │
              └─────┬───────┘
                    │
                    ▼
              ┌─────────────┐
              │   Service   │
              └─────┬───────┘
                    │
                    ▼
              ┌─────────────┐
              │    Pods     │
              └─────────────┘
```
- The Google Load Balancer (L7) is provisioned via the GKE Ingress Controller.
- It needs to know whether the backends (your Pods) are alive and able to respond — not just "reachable," but actually able to process HTTP(S) requests.
- The health check is performed from outside your pods, using a separate probe path (by default, GCP uses `/` or `/healthz`, depending on your config).
Why Does This Happen on GKE Autopilot?
On GKE Autopilot, Google manages the nodes and network plumbing. You cannot modify the node pools or certain low-level firewall settings. The Load Balancer strictly requires that your pods pass a Layer 7 (HTTP) health check.
In "regular" (Standard) GKE clusters, a default health check might be created for you, or TCP (Layer 4) health checks might suffice. But on Autopilot, unless you explicitly define a Layer 7 health check in your BackendConfig, GKE cannot verify your app’s health, and your Ingress will be stuck on UNHEALTHY.
The Role of Health Checks (Layer 4 vs Layer 7)
- Layer 4 (Transport, TCP/UDP): Checks if the port is open and can accept TCP connections. But it does not know if your app is actually alive — just that something is listening.
- Layer 7 (Application, HTTP/S): Checks if your app can answer real HTTP requests (like `GET /healthz` with `200 OK`). This confirms your app is fully functional.
On Autopilot, the load balancer requires a successful HTTP(S) response from your pods to consider them healthy.
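What a passing Layer 7 check requires of your app is simple: return HTTP 200 on the probe path and port. As a minimal sketch (using Python’s standard library — your real app will use its own framework and language, the point is only the 200 response), an endpoint like this would satisfy the check:

```python
# Minimal /healthz endpoint sketch using only the Python standard library.
# Assumption: your real app just needs an equivalent route that returns
# HTTP 200 on the port named in your BackendConfig (8080 in this article).
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/healthz":
            body = b"ok"
            self.send_response(200)  # the 200 status is what satisfies the L7 check
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, fmt, *args):  # keep per-probe access logs quiet
        pass

def serve(port: int = 8080) -> HTTPServer:
    """Start the health server in a daemon thread and return the server object."""
    server = HTTPServer(("", port), HealthHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

In a real container you would run this (or your framework’s equivalent) as the main process; the only contract with the load balancer is "200 OK on the configured path and port."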
How to Fix: Adding a Layer 7 (HTTP) Health Check
You must create a BackendConfig with an HTTP health check, and link it to your Ingress:
- Create a BackendConfig YAML: specify the path and port where your app responds with HTTP 200.

```yaml
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: kreminder-backend-config
spec:
  healthCheck:
    checkIntervalSec: 15
    timeoutSec: 5
    healthyThreshold: 1
    unhealthyThreshold: 3
    port: 8080
    type: HTTP
    requestPath: /healthz
```
- Reference it in your Service (annotation):

```yaml
metadata:
  name: kreminder-service
  annotations:
    cloud.google.com/backend-config: '{"default": "kreminder-backend-config"}'
```
- Ensure your Ingress also uses HTTPS and the correct ManagedCertificate (if you use TLS).
- Make sure your app actually responds to `/healthz` on the correct port with HTTP 200!
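To build intuition for the BackendConfig threshold fields, here is a small, hypothetical simulation of how consecutive probe results flip a backend between HEALTHY and UNHEALTHY. It mirrors the `healthyThreshold`/`unhealthyThreshold` semantics (interval and timeout omitted for brevity) but is an illustration, not GCP’s actual prober:

```python
# Illustrative simulation of health-check threshold semantics.
# NOT GCP's actual prober -- just a sketch of how streaks of probe
# results (True = HTTP 200) move a backend between states.
from typing import Iterable, List

def classify(probes: Iterable[bool], healthy_threshold: int = 1,
             unhealthy_threshold: int = 3, start_healthy: bool = False) -> List[str]:
    """Return the backend state after each probe result."""
    state = "HEALTHY" if start_healthy else "UNHEALTHY"
    ok_streak = fail_streak = 0
    history = []
    for ok in probes:
        if ok:
            ok_streak, fail_streak = ok_streak + 1, 0
            if ok_streak >= healthy_threshold:
                state = "HEALTHY"
        else:
            fail_streak, ok_streak = fail_streak + 1, 0
            if fail_streak >= unhealthy_threshold:
                state = "UNHEALTHY"
        history.append(state)
    return history
```

With `healthyThreshold: 1`, a single 200 is enough to mark the backend HEALTHY; with `unhealthyThreshold: 3`, it takes three consecutive failures to mark it UNHEALTHY again — so brief probe blips don’t cause flapping.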
Example: Working Ingress & BackendConfig YAML
Here’s a complete minimal example for GKE Autopilot:
backendconfig.yaml

```yaml
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: kreminder-backend-config
spec:
  healthCheck:
    port: 8080
    type: HTTP
    requestPath: /healthz
```
service.yaml

```yaml
apiVersion: v1
kind: Service
metadata:
  name: kreminder-service
  annotations:
    cloud.google.com/backend-config: '{"default": "kreminder-backend-config"}'
spec:
  selector:
    app: kreminder
  ports:
    - port: 80
      targetPort: 8080
```
ingress.yaml

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: kreminder-ingress
  annotations:
    networking.gke.io/managed-certificates: kreminder-cert
    cloud.google.com/backend-config: '{"default":"kreminder-backend-config"}'
    kubernetes.io/ingress.allow-http: "false"
    kubernetes.io/ingress.global-static-ip-name: "kreminder-ip"
spec:
  tls:
    - hosts:
        - kreminder.example.com
  rules:
    - host: kreminder.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: kreminder-service
                port:
                  number: 80
```
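The three manifests are coupled at a few exact points: the Service annotation must name the BackendConfig, the health check port must match a Service `targetPort`, and the Ingress backend must match the Service name and port. As a sketch of a sanity check (plain dicts mirror the YAML above to stay stdlib-only; in a real pipeline you would load the files with a YAML parser such as PyYAML):

```python
# Sanity-check the coupling points between BackendConfig, Service, and Ingress.
# The dicts below mirror the example YAML in this article.
import json

backend_config = {"metadata": {"name": "kreminder-backend-config"},
                  "spec": {"healthCheck": {"port": 8080, "type": "HTTP",
                                           "requestPath": "/healthz"}}}
service = {"metadata": {"name": "kreminder-service",
                        "annotations": {"cloud.google.com/backend-config":
                                        '{"default": "kreminder-backend-config"}'}},
           "spec": {"ports": [{"port": 80, "targetPort": 8080}]}}
ingress_backend = {"service": {"name": "kreminder-service", "port": {"number": 80}}}

def check_wiring(bc, svc, ing_backend):
    """Return a list of mismatches; an empty list means the wiring is consistent."""
    problems = []
    annot = json.loads(svc["metadata"]["annotations"]["cloud.google.com/backend-config"])
    if annot.get("default") != bc["metadata"]["name"]:
        problems.append("Service annotation does not reference the BackendConfig")
    if bc["spec"]["healthCheck"]["port"] not in (p["targetPort"] for p in svc["spec"]["ports"]):
        problems.append("healthCheck.port does not match any Service targetPort")
    if ing_backend["service"]["name"] != svc["metadata"]["name"]:
        problems.append("Ingress backend service name does not match the Service")
    if ing_backend["service"]["port"]["number"] not in (p["port"] for p in svc["spec"]["ports"]):
        problems.append("Ingress backend port does not match any Service port")
    return problems
```

A mismatch at any of these points (a typo in the annotation, or a health check port that no Service `targetPort` exposes) is enough to leave the backends UNHEALTHY.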
Debugging Tips
- Run
kubectl describe ingress ...and check for „UNHEALTHY“ backends in annotations. - Run
kubectl get endpoints <your-service>to ensure the Endpoints are correct. - Use
kubectl logson your app pod to see if/healthzis being hit. - Use
gcloud compute backend-services get-health ...for the real health check status in GCP.
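Since the `ingress.kubernetes.io/backends` annotation value is plain JSON, you can also filter it programmatically instead of eyeballing it. A small sketch (the annotation value below is a made-up example in the abbreviated style shown earlier):

```python
# Filter the ingress.kubernetes.io/backends annotation for unhealthy backends.
# The annotation value maps GCP backend names to a health string.
import json

def unhealthy_backends(annotation: str):
    """Return the backend names whose reported state is not HEALTHY."""
    return [name for name, state in json.loads(annotation).items()
            if state != "HEALTHY"]

# Hypothetical example value, abbreviated like the one at the top of the article:
annotation = '{"k8s-be-80--example":"UNHEALTHY","k8s-be-443--example":"HEALTHY"}'
```

Feed it the annotation value from `kubectl get ingress <name> -o json` (under `.metadata.annotations`) to get a quick list of which backends the load balancer is rejecting.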
If you see errors like:

"invalid configuration: both HTTP and HTTPS are disabled (kubernetes.io/ingress.allow-http is false and there is no valid TLS configuration); your Ingress will not be able to serve any traffic"

Or:

"Error syncing to GCP: … health check failed …"
It’s almost always a missing or misconfigured Layer 7 health check.
Takeaways & Further Reading
- On GKE Autopilot, Layer 7 health checks are mandatory for HTTP/S Ingress traffic.
- Missing a proper BackendConfig will leave your Ingress stuck with „UNHEALTHY“ backends, and no traffic to your app.
- Always make sure your app can respond to the health check endpoint (like `/healthz`) with HTTP 200, and that the path and port match your BackendConfig.
- Remember: deleting and reapplying the Ingress/Service is not enough if your BackendConfig or health check path is wrong!