If you’re running workloads on GKE Autopilot and see your Ingress resources stuck on UNHEALTHY, you’re not alone. This problem can be frustrating, especially when your pods and services appear to be healthy, but your Google Cloud Load Balancer refuses to send traffic to your app. In this article, we’ll break down what’s happening, why it happens, and how to fix it — with diagrams and clear explanations.
Table of Contents
- What’s the Problem?
- The GKE Load Balancing Path (with Diagram)
- Why Does This Happen on GKE Autopilot?
- The Role of Health Checks (Layer 4 vs Layer 7)
- How to Fix: Adding a Layer 7 (HTTP) Health Check
- Example: Working Ingress & BackendConfig YAML
- Debugging Tips
- Takeaways & Further Reading
What’s the Problem?
When you deploy an Ingress in GKE (especially Autopilot clusters), you might see your Service endpoints marked as UNHEALTHY under the annotations of your Ingress:
```
ingress.kubernetes.io/backends:
  {"...default-your-service-80-...":"UNHEALTHY"}
```
Your app’s pods are up, they respond on /healthz, and the endpoints look good, but the Load Balancer says "NOPE" — 502 errors, failed TLS handshakes, or "Error syncing to GCP: your Ingress will not be able to serve any traffic."
Why?
The GKE Load Balancing Path (with Diagram)
Let’s follow the packet from the outside to your pod:
```
              ┌─────────────┐
 Internet ──▶ │   Google    │
              │    Load     │
              │  Balancer   │
              └─────┬───────┘
                    │ (HTTPS/TCP/HTTP)
                    ▼
              ┌─────────────┐
              │   Ingress   │
              │  Controller │
              └─────┬───────┘
                    │
                    ▼
              ┌─────────────┐
              │   Service   │
              └─────┬───────┘
                    │
                    ▼
              ┌─────────────┐
              │    Pods     │
              └─────────────┘
```
- The Google Load Balancer (L7) is provisioned via the GKE Ingress Controller.
- It needs to know whether the backends (your Pods) are alive and able to respond — not just "reachable," but actually able to process HTTP(S) requests.
- The health check is performed from outside your pods, using a separate probe path (by default, GCP uses `/` or `/healthz`, depending on your config).
Why Does This Happen on GKE Autopilot?
On GKE Autopilot, Google manages the nodes and network plumbing. You cannot modify the node pools or certain low-level firewall settings. The Load Balancer strictly requires that your pods pass a Layer 7 (HTTP) health check.
In "regular" (Standard) GKE clusters, a default health check might be created for you, or TCP (Layer 4) health checks might suffice. But on Autopilot, unless you explicitly define a Layer 7 health check in your BackendConfig, GKE cannot verify your app’s health, and your Ingress will be stuck on UNHEALTHY.
The Role of Health Checks (Layer 4 vs Layer 7)
- Layer 4 (Transport, TCP/UDP): Checks if the port is open and can accept TCP connections. But it does not know if your app is actually alive — just that something is listening.
- Layer 7 (Application, HTTP/S): Checks if your app can answer real HTTP requests (like `GET /healthz` with `200 OK`). This confirms your app is fully functional.
On Autopilot, the load balancer requires a successful HTTP(S) response from your pods to consider them healthy.
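What a passing Layer 7 check requires of your app is simple: return HTTP 200 on the probe path and port. As a minimal sketch (using Python’s standard library — your real app will use its own framework and language, the point is only the 200 response), an endpoint like this would satisfy the check:

```python
# Minimal /healthz endpoint sketch using only the Python standard library.
# Assumption: your real app just needs an equivalent route that returns
# HTTP 200 on the port named in your BackendConfig (8080 in this article).
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/healthz":
            body = b"ok"
            self.send_response(200)  # the 200 status is what satisfies the L7 check
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, fmt, *args):  # keep per-probe access logs quiet
        pass

def serve(port: int = 8080) -> HTTPServer:
    """Start the health server in a daemon thread and return the server object."""
    server = HTTPServer(("", port), HealthHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

In a real container you would run this (or your framework’s equivalent) as the main process; the only contract with the load balancer is "200 OK on the configured path and port."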
How to Fix: Adding a Layer 7 (HTTP) Health Check
You must create a BackendConfig with an HTTP health check, and link it to your Ingress:
- Create a BackendConfig YAML: specify the path and port where your app responds with HTTP 200.

```yaml
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: kreminder-backend-config
spec:
  healthCheck:
    checkIntervalSec: 15
    timeoutSec: 5
    healthyThreshold: 1
    unhealthyThreshold: 3
    port: 8080
    type: HTTP
    requestPath: /healthz
```
- Reference it in your Service (annotation):

```yaml
metadata:
  name: kreminder-service
  annotations:
    cloud.google.com/backend-config: '{"default": "kreminder-backend-config"}'
```
- Ensure your Ingress also uses HTTPS and the correct ManagedCertificate (if you use TLS).
- Make sure your app actually responds to `/healthz` on the correct port with HTTP 200!
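To build intuition for the BackendConfig threshold fields, here is a small, hypothetical simulation of how consecutive probe results flip a backend between HEALTHY and UNHEALTHY. It mirrors the `healthyThreshold`/`unhealthyThreshold` semantics (interval and timeout omitted for brevity) but is an illustration, not GCP’s actual prober:

```python
# Illustrative simulation of health-check threshold semantics.
# NOT GCP's actual prober -- just a sketch of how streaks of probe
# results (True = HTTP 200) move a backend between states.
from typing import Iterable, List

def classify(probes: Iterable[bool], healthy_threshold: int = 1,
             unhealthy_threshold: int = 3, start_healthy: bool = False) -> List[str]:
    """Return the backend state after each probe result."""
    state = "HEALTHY" if start_healthy else "UNHEALTHY"
    ok_streak = fail_streak = 0
    history = []
    for ok in probes:
        if ok:
            ok_streak, fail_streak = ok_streak + 1, 0
            if ok_streak >= healthy_threshold:
                state = "HEALTHY"
        else:
            fail_streak, ok_streak = fail_streak + 1, 0
            if fail_streak >= unhealthy_threshold:
                state = "UNHEALTHY"
        history.append(state)
    return history
```

With `healthyThreshold: 1`, a single 200 is enough to mark the backend HEALTHY; with `unhealthyThreshold: 3`, it takes three consecutive failures to mark it UNHEALTHY again — so brief probe blips don’t cause flapping.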
Example: Working Ingress & BackendConfig YAML
Here’s a complete minimal example for GKE Autopilot:
backendconfig.yaml

```yaml
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: kreminder-backend-config
spec:
  healthCheck:
    port: 8080
    type: HTTP
    requestPath: /healthz
```
service.yaml

```yaml
apiVersion: v1
kind: Service
metadata:
  name: kreminder-service
  annotations:
    cloud.google.com/backend-config: '{"default": "kreminder-backend-config"}'
spec:
  selector:
    app: kreminder
  ports:
    - port: 80
      targetPort: 8080
```
ingress.yaml

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: kreminder-ingress
  annotations:
    networking.gke.io/managed-certificates: kreminder-cert
    cloud.google.com/backend-config: '{"default":"kreminder-backend-config"}'
    kubernetes.io/ingress.allow-http: "false"
    kubernetes.io/ingress.global-static-ip-name: "kreminder-ip"
spec:
  tls:
    - hosts:
        - kreminder.example.com
  rules:
    - host: kreminder.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: kreminder-service
                port:
                  number: 80
```
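The three manifests are coupled at a few exact points: the Service annotation must name the BackendConfig, the health check port must match a Service `targetPort`, and the Ingress backend must match the Service name and port. As a sketch of a sanity check (plain dicts mirror the YAML above to stay stdlib-only; in a real pipeline you would load the files with a YAML parser such as PyYAML):

```python
# Sanity-check the coupling points between BackendConfig, Service, and Ingress.
# The dicts below mirror the example YAML in this article.
import json

backend_config = {"metadata": {"name": "kreminder-backend-config"},
                  "spec": {"healthCheck": {"port": 8080, "type": "HTTP",
                                           "requestPath": "/healthz"}}}
service = {"metadata": {"name": "kreminder-service",
                        "annotations": {"cloud.google.com/backend-config":
                                        '{"default": "kreminder-backend-config"}'}},
           "spec": {"ports": [{"port": 80, "targetPort": 8080}]}}
ingress_backend = {"service": {"name": "kreminder-service", "port": {"number": 80}}}

def check_wiring(bc, svc, ing_backend):
    """Return a list of mismatches; an empty list means the wiring is consistent."""
    problems = []
    annot = json.loads(svc["metadata"]["annotations"]["cloud.google.com/backend-config"])
    if annot.get("default") != bc["metadata"]["name"]:
        problems.append("Service annotation does not reference the BackendConfig")
    if bc["spec"]["healthCheck"]["port"] not in (p["targetPort"] for p in svc["spec"]["ports"]):
        problems.append("healthCheck.port does not match any Service targetPort")
    if ing_backend["service"]["name"] != svc["metadata"]["name"]:
        problems.append("Ingress backend service name does not match the Service")
    if ing_backend["service"]["port"]["number"] not in (p["port"] for p in svc["spec"]["ports"]):
        problems.append("Ingress backend port does not match any Service port")
    return problems
```

A mismatch at any of these points (a typo in the annotation, or a health check port that no Service `targetPort` exposes) is enough to leave the backends UNHEALTHY.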
Debugging Tips
- Run
kubectl describe ingress ...and check for „UNHEALTHY“ backends in annotations. - Run
kubectl get endpoints <your-service>to ensure the Endpoints are correct. - Use
kubectl logson your app pod to see if/healthzis being hit. - Use
gcloud compute backend-services get-health ...for the real health check status in GCP.
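Since the `ingress.kubernetes.io/backends` annotation value is plain JSON, you can also filter it programmatically instead of eyeballing it. A small sketch (the annotation value below is a made-up example in the abbreviated style shown earlier):

```python
# Filter the ingress.kubernetes.io/backends annotation for unhealthy backends.
# The annotation value maps GCP backend names to a health string.
import json

def unhealthy_backends(annotation: str):
    """Return the backend names whose reported state is not HEALTHY."""
    return [name for name, state in json.loads(annotation).items()
            if state != "HEALTHY"]

# Hypothetical example value, abbreviated like the one at the top of the article:
annotation = '{"k8s-be-80--example":"UNHEALTHY","k8s-be-443--example":"HEALTHY"}'
```

Feed it the annotation value from `kubectl get ingress <name> -o json` (under `.metadata.annotations`) to get a quick list of which backends the load balancer is rejecting.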
If you see errors like:

"invalid configuration: both HTTP and HTTPS are disabled (kubernetes.io/ingress.allow-http is false and there is no valid TLS configuration); your Ingress will not be able to serve any traffic"

Or:

"Error syncing to GCP: … health check failed …"
It’s almost always a missing or misconfigured Layer 7 health check.
Takeaways & Further Reading
- On GKE Autopilot, Layer 7 health checks are mandatory for HTTP/S Ingress traffic.
- Missing a proper BackendConfig will leave your Ingress stuck with „UNHEALTHY“ backends, and no traffic to your app.
- Always make sure your app can respond to the health check endpoint (like `/healthz`) with HTTP 200, and that the path and port match your BackendConfig.
- Remember: deleting and reapplying the Ingress/Service is not enough if your BackendConfig or health check path is wrong!