Debugging Kubernetes Ingress controllers for HTTP 502 or 504 errors involves a systematic approach to identify the root cause. These HTTP status codes typically indicate communication issues between the Ingress controller and the backend services or upstream servers. Here’s a detailed step-by-step guide to troubleshoot these errors:
1. Understand HTTP 502 and 504 Errors
- HTTP 502 (Bad Gateway): The Ingress controller, acting as a gateway, received an invalid response from the backend server or could not complete the connection at all (for example, the connection was refused or reset).
- HTTP 504 (Gateway Timeout): The Ingress controller did not receive a response from the backend server within the expected time frame.
2. Check the Ingress Resource Configuration
- Inspect the Ingress manifest:
bash
kubectl describe ingress <ingress-name>
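Ensure that:
  - The host and path rules match your desired configuration.
  - The backend service name and port are correctly defined (service.name and service.port in networking.k8s.io/v1; serviceName and servicePort in the older v1beta1 APIs).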
- Verify the annotations (e.g., for timeouts, load balancing, etc.), as these can affect behavior:
yaml
annotations:
  nginx.ingress.kubernetes.io/proxy-connect-timeout: "60"
  nginx.ingress.kubernetes.io/proxy-read-timeout: "60"
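For reference, the fields and annotations above can be seen together in one complete manifest. The following is a minimal sketch, assuming the ingress-nginx controller and the networking.k8s.io/v1 API; every name in it (example-ingress, example.local, example-service, port 80, the nginx ingress class) is a placeholder rather than a value from a real cluster.

```bash
# Print a minimal networking.k8s.io/v1 Ingress manifest with explicit timeouts.
# All names and ports below are hypothetical placeholders.
print_example_ingress() {
  cat <<'EOF'
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-ingress
  annotations:
    nginx.ingress.kubernetes.io/proxy-connect-timeout: "60"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "60"
spec:
  ingressClassName: nginx
  rules:
    - host: example.local
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: example-service
                port:
                  number: 80
EOF
}
```

To validate the manifest without creating anything, it can be piped into a client-side dry run: print_example_ingress | kubectl apply --dry-run=client -f -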
3. Check the Backend Service
- Verify the associated service configuration:
bash
kubectl get service <service-name> -o yaml
- Ensure:
  - The service type (ClusterIP, NodePort, etc.) is appropriate.
  - The service port matches the one configured in the Ingress resource, and its targetPort matches the port the container actually listens on.
- If the service uses a selector, ensure it matches the labels of the target pods.
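A quick way to confirm that the selector actually matches ready pods is to look at the Service's endpoints: if the list is empty, the controller has nothing to forward to. The helper below is a sketch under that assumption; the function name is made up for illustration, and it relies on the Endpoints API, which newer clusters may report as deprecated in favor of EndpointSlice.

```bash
# Report whether a Service has any ready endpoints.
# Usage: check_service_endpoints <service-name> [namespace]
check_service_endpoints() {
  local svc="$1" ns="${2:-default}" ips
  # Collect the IPs of all ready endpoint addresses behind the Service.
  ips=$(kubectl get endpoints "$svc" -n "$ns" \
    -o jsonpath='{.subsets[*].addresses[*].ip}') || return 2
  if [ -z "$ips" ]; then
    echo "no ready endpoints for $svc in $ns: check the selector and pod readiness"
    return 1
  fi
  echo "ready endpoints for $svc in $ns: $ips"
}
```

An empty result usually points to a selector/label mismatch or to pods that are failing their readiness probes.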
4. Inspect the Backend Pods
- Check the status of the backend pods:
bash
kubectl get pods -l <label-selector>
- Ensure:
  - The pods are running and ready.
  - The pods’ containers are healthy (check readiness/liveness probes).
- View pod logs to identify any issues:
bash
kubectl logs <pod-name>
- If the application is exposing a specific port, test connectivity within the cluster (this requires curl to be available in the container image):
bash
kubectl exec -it <pod-name> -- curl http://<service-name>:<service-port>
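The readiness check in this step can be scripted so that only problem pods are listed. This is a rough sketch, not a definitive tool: the function name is hypothetical, and it treats a pod as unready if any container reports ready=false or if no container status has been published yet.

```bash
# List pods matching a label selector whose containers are not all ready.
# Usage: list_unready_pods <label-selector> [namespace]
list_unready_pods() {
  local selector="$1" ns="${2:-default}"
  # One line per pod: the pod name, then the ready flag of each container.
  kubectl get pods -n "$ns" -l "$selector" \
    -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.containerStatuses[*].ready}{"\n"}{end}' |
    awk 'NF == 0 { next }
         {
           bad = (NF < 2)                      # no container status reported yet
           for (i = 2; i <= NF; i++) if ($i != "true") bad = 1
           if (bad) print $1
         }'
}
```

Any pod printed here is a candidate for kubectl describe pod and kubectl logs.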
5. Inspect the Ingress Controller
- Get logs for the Ingress controller pod:
bash
kubectl logs -n <ingress-namespace> <ingress-controller-pod>
Look for errors or warnings related to the HTTP 502/504.
- Check the Ingress controller’s deployment and configuration:
bash
kubectl describe deployment -n <ingress-namespace> <ingress-controller-deployment>
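Controller logs are often noisy, so it can help to filter them down to the lines that usually accompany gateway errors. The sketch below assumes an NGINX-based controller: it matches common NGINX upstream error messages and access-log lines where the status code directly follows the quoted request line. The function name is made up, and a controller with a customized log format would need a different pattern.

```bash
# Keep only controller log lines that usually accompany 502/504 responses.
# Reads log text on stdin, for example piped from `kubectl logs`.
filter_upstream_errors() {
  grep -E 'upstream timed out|connect\(\) failed|upstream prematurely closed|no live upstreams|" 50[24] '
}
```

A typical invocation would be: kubectl logs -n <ingress-namespace> <ingress-controller-pod> --since=15m | filter_upstream_errors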
6. Check Networking and DNS
- Ensure the DNS resolution is working correctly for the backend services:
bash
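kubectl exec -it <pod-name> -- nslookup <service-name>
- Test connectivity from the Ingress controller pod to the backend service. The controller usually runs in a different namespace, so qualify the service name with its namespace:
bash
kubectl exec -n <ingress-namespace> <ingress-controller-pod> -- curl http://<service-name>.<service-namespace>:<service-port>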
7. Verify Load Balancer and Firewall Rules
- If using an external load balancer:
- Check that it is forwarding traffic to the Ingress controller.
- Ensure health checks on the load balancer are passing.
- Verify any firewall or network policies that might block traffic between the Ingress controller and the backend services.
8. Check Timeouts
- HTTP 504 errors occur when the backend takes longer to respond than the Ingress controller's proxy timeouts allow. If the backend legitimately needs more time, increase the timeout values in the Ingress annotations, for example:
yaml
annotations:
  nginx.ingress.kubernetes.io/proxy-read-timeout: "120"
  nginx.ingress.kubernetes.io/proxy-send-timeout: "120"
- Check the application’s processing time to ensure it can respond within the timeout window.
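To compare the application's processing time against the configured window, the annotation can be read back from the cluster. The helpers below are a sketch with hypothetical names; the fallback of 60 seconds is an assumption about the ingress-nginx default and should be checked against your controller's configuration.

```bash
# Print the proxy-read-timeout (in seconds) configured on an Ingress.
# Falls back to 60, assumed here to be the ingress-nginx default.
# Usage: get_read_timeout <ingress-name> [namespace]
get_read_timeout() {
  local ing="$1" ns="${2:-default}" value
  value=$(kubectl get ingress "$ing" -n "$ns" \
    -o jsonpath='{.metadata.annotations.nginx\.ingress\.kubernetes\.io/proxy-read-timeout}') || return 2
  echo "${value:-60}"
}

# Succeed if a measured response time (seconds, may be fractional)
# is greater than the given timeout.
# Usage: exceeds_timeout <elapsed-seconds> <timeout-seconds>
exceeds_timeout() {
  awk -v t="$1" -v limit="$2" 'BEGIN { exit !(t + 0 > limit + 0) }'
}
```

For example, exceeds_timeout 75.2 "$(get_read_timeout <ingress-name>)" succeeds when a 75.2-second request would outlast the configured read timeout.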
9. Debug with Tools
- Use kubectl port-forward to bypass the Ingress and access the backend service directly. The port-forward command blocks, so run curl in a second terminal:
bash
kubectl port-forward svc/<service-name> <local-port>:<service-port>
curl http://localhost:<local-port>
- Use curl with detailed output to see the response headers and status:
bash
curl -v http://<ingress-host>/<path>
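When the same request has to be repeated many times, a compact probe that prints only the status code and total time is easier to read than full verbose output. This is a minimal sketch: both function names are hypothetical, and the 130-second cap is an arbitrary choice meant to sit just above the 120-second example timeout used earlier.

```bash
# Request a URL and print "<status> <total-seconds>", discarding the body.
# Usage: probe_url <url>
probe_url() {
  curl -s -o /dev/null --max-time 130 -w '%{http_code} %{time_total}\n' "$1"
}

# Turn a probe result into a hint about where to look next.
# Usage: interpret_probe <status> <total-seconds>
interpret_probe() {
  case "$1" in
    502) echo "502 after ${2}s: backend refused, reset, or sent an invalid response" ;;
    504) echo "504 after ${2}s: backend exceeded a proxy timeout" ;;
    *)   echo "$1 after ${2}s: not a 502/504" ;;
  esac
}
```

A 504 whose total time sits right at a configured timeout points to a slow backend, while a 502 that returns almost immediately usually means the connection to the backend failed outright.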
10. Inspect Custom Configurations
- If you are using a controller other than ingress-nginx (e.g., Traefik, HAProxy, Istio), review its own configuration and logs, since the NGINX annotations shown above do not apply to it.
- Some controllers may require specific annotations or CRDs to behave as expected.
11. Monitor Metrics
- Enable monitoring for the Ingress controller using tools like Prometheus and Grafana.
- Look for metrics that indicate high latency, connection errors, or dropped requests.
12. Test with Simplified Configuration
- Create a minimal Ingress resource and a simple backend service (e.g., an nginx pod) to rule out complex configurations as the cause of the issue.
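One way to set up such a baseline is with kubectl's imperative commands. The sketch below assumes a kubectl version that provides kubectl create ingress; the resource name debug-nginx, the default ingress class nginx, and both function names are placeholders chosen for illustration.

```bash
# Create a throwaway nginx Deployment, Service, and Ingress for comparison.
# Usage: create_debug_backend <host> [namespace] [ingress-class]
create_debug_backend() {
  local host="$1" ns="${2:-default}" class="${3:-nginx}"
  kubectl create deployment debug-nginx --image=nginx -n "$ns" &&
    kubectl expose deployment debug-nginx --port=80 -n "$ns" &&
    kubectl create ingress debug-nginx --class="$class" \
      --rule="$host/*=debug-nginx:80" -n "$ns"
}

# Remove everything created by create_debug_backend.
# Usage: delete_debug_backend [namespace]
delete_debug_backend() {
  local ns="${1:-default}"
  kubectl delete ingress,service,deployment debug-nginx -n "$ns"
}
```

If requests to the debug host succeed while the real application still returns 502 or 504, the controller and network path are likely fine and the problem lies in the application's Service, pods, or timeouts.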
13. Check for Known Issues
- Review the documentation and GitHub issues for your specific Ingress controller (e.g., NGINX, Traefik, HAProxy) for any known bugs or limitations.
14. Work with Load Balancer Logs
- If using a cloud provider’s load balancer, inspect its logs for errors or misconfigurations.
By systematically following these steps, you should be able to identify and resolve the root cause of the HTTP 502 or 504 errors in your Kubernetes Ingress setup.