Today we stumbled upon an interesting case that I want to share, as it might help you in your own debugging journey.
Let's assume you have the following infrastructure setup:
- A Kubernetes cluster on Google Kubernetes Engine (GKE).
- Workload Identity enabled.
- Egress network policies in use.
Problem
You might think everything is fine, but your service that communicates with Google APIs (for example through the Google Cloud Storage client libraries) complains with something like:
```
Could not load the default credentials.
Browse to https://cloud.google.com/docs/authentication/getting-started for more information.
```
The first reaction is: how can this be the case when Workload Identity is configured exactly as recommended? Looking behind the curtain, however, it turns out the culprit is likely not Workload Identity at all.
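Because Application Default Credentials on GKE are served by the GKE metadata server, a good first debugging step is to check whether that server is reachable from the affected namespace at all. The following probe pod is a minimal sketch (the pod name is made up, and curlimages/curl is simply a convenient image that ships curl):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: metadata-probe   # hypothetical name
  namespace: default     # use the namespace of the affected workload
spec:
  restartPolicy: Never
  containers:
  - name: probe
    image: curlimages/curl:8.8.0
    # Query the metadata server the same way the client libraries do.
    command: ["curl"]
    args:
      - "-sS"
      - "--max-time"
      - "5"
      - "-H"
      - "Metadata-Flavor: Google"
      - "http://169.254.169.254/computeMetadata/v1/instance/service-accounts/default/email"
```

If `kubectl logs metadata-probe` shows a timeout instead of a response, egress to the metadata server is blocked; even an authentication error would at least prove that the network path is open.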
Cause
In our case, the root cause was a restrictive network policy that blocked egress traffic to the GKE metadata server.
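To make this concrete, here is a hypothetical sketch of such a restrictive policy (the name and the allowed ports are placeholders): it whitelists DNS and nothing else, so requests to the metadata server are dropped and Application Default Credentials can never be fetched.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: restrict-egress   # hypothetical name
  namespace: default
spec:
  podSelector: {}          # applies to all pods in the namespace
  policyTypes:
  - Egress
  egress:
  # Only DNS is allowed; all other egress, including traffic to the
  # GKE metadata server, is dropped.
  - ports:
    - port: 53
      protocol: UDP
    - port: 53
      protocol: TCP
```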
Solution
The solution depends on the GKE version you're running, since the address under which the GKE metadata server is reached changed in 1.21.0-gke.1000. In both cases, the fix is an additional egress rule that explicitly allows this traffic.
1.21.0-gke.1000 and later
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: your-network-policy
  namespace: default
spec:
  podSelector: {}   # required field; an empty selector matches all pods in the namespace
  egress:
  # Allow egress to the GKE metadata server.
  - ports:
    - port: 988
      protocol: TCP
    to:
    - ipBlock:
        cidr: 169.254.169.252/32
  policyTypes:
  - Egress
```
Prior versions
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: your-network-policy
  namespace: default
spec:
  podSelector: {}   # required field; an empty selector matches all pods in the namespace
  egress:
  # Allow egress to the metadata server, reached via 127.0.0.1 on these versions.
  - ports:
    - port: 988
      protocol: TCP
    to:
    - ipBlock:
        cidr: 127.0.0.1/32
  policyTypes:
  - Egress
```
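Once the rule matching your cluster version has been applied with `kubectl apply`, the probe pod shown above should get an answer from the metadata server again, and the client libraries should pick up their credentials as expected.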