# I am the Watcher. I am your guide through this vast new twtiverse.
#
# Usage:
# https://watcher.sour.is/api/plain/users View list of users and latest twt date.
# https://watcher.sour.is/api/plain/twt View all twts.
# https://watcher.sour.is/api/plain/mentions?uri=:uri View all mentions for uri.
# https://watcher.sour.is/api/plain/conv/:hash View all twts for a conversation subject.
#
# Options:
# uri Filter to show a specific user's twts.
# offset Start index for query.
# limit Count of items to return (going back in time).
#
# twt range = 1 13
# self = https://watcher.sour.is/conv/cjv32ca
Spent the better part of the day debugging sporadic network failures in a kubernetes cluster.
TIL:
- k8s uses lots of iptables magic under the hood.
- iptables has a mechanism to apply rules *based on probability* and that’s how k8s does load balancing (e.g., if you have a service that points to several pods): https://man.archlinux.org/man/iptables-extensions.8#statistic
- The root cause of our sporadic failures was stale iptables rules: some of them pointed to pods that no longer existed (but because probabilities are involved, they didn’t always trigger; see the sketch after this twt).
- This isn’t Sparta, this is madness. And probably a k8s bug.
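For anyone wondering how the probability trick adds up: below is a tiny Python simulation of a KUBE-SVC-style rule chain with one stale endpoint left in it. This is not kube-proxy's actual code, and the endpoint addresses and sample count are made up for illustration; it only mirrors the statistic-mode idea that rule i out of n matches with probability 1/(n-i).

```python
import random
from collections import Counter

# Hypothetical setup: three endpoints behind one Service; the last one
# stands in for a stale KUBE-SEP rule that still points at a dead pod.
endpoints = ["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.99:8080 (stale)"]

def pick_endpoint(endpoints):
    # One rule per endpoint, evaluated in order; rule i out of n matches
    # with probability 1/(n - i) and the last rule is an unconditional
    # jump, so each endpoint gets an overall 1/n share of new connections.
    n = len(endpoints)
    for i, ep in enumerate(endpoints):
        if i == n - 1 or random.random() < 1.0 / (n - i):
            return ep

counts = Counter(pick_endpoint(endpoints) for _ in range(30_000))
for ep, c in counts.most_common():
    print(f"{ep:25s} {c / 30_000:.1%}")
# Roughly a third of new connections land on the stale endpoint, which is
# why the failures only showed up some of the time.
```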
Well, the question is: What’s the root cause of the root cause? Why did those rules become stale? I’ll never know.
@movq it surely looks like a k8s bug. One would expect residual iptables rules to be flushed once pods are destroyed. If they are not, that's a bug. Sadly, I don't know much about k8s. Learning about it is on my TODO list.
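To make the "residual rules" point checkable, here is a rough diagnostic sketch: it cross-checks the DNAT targets of the node's nat table against the pod IPs the API server currently knows about, and flags leftovers as candidate stale rules. Assumptions (not from the thread): kube-proxy in iptables mode, a root shell on the node, a configured kubectl, and the 10.52.x.x address in the comment is invented.

```python
import json
import re
import subprocess

# Dump the nat table and pull out every DNAT target. kube-proxy endpoint
# rules look roughly like:
#   -A KUBE-SEP-XXXX -p tcp -m tcp -j DNAT --to-destination 10.52.1.23:8080
rules = subprocess.run(["iptables-save", "-t", "nat"],
                       capture_output=True, text=True, check=True).stdout
rule_targets = set(re.findall(r"--to-destination (\d+\.\d+\.\d+\.\d+)", rules))

# Collect the pod IPs that currently exist according to the API server.
pods = subprocess.run(
    ["kubectl", "get", "pods", "--all-namespaces", "-o", "json"],
    capture_output=True, text=True, check=True).stdout
live_ips = {p["status"].get("podIP")
            for p in json.loads(pods)["items"]} - {None}

# Anything DNAT points at that no live pod owns is a candidate stale rule
# (other DNAT users, e.g. hostPorts, can also show up here, so treat this
# as a hint rather than proof).
for ip in sorted(rule_targets - live_ips):
    print("possibly stale DNAT target:", ip)
```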
@movq what cni plugin are you using? calico, cilium, etc.?
rules should flush when the masters get in sync. if you have any drift between the masters and/or latency/divergence in state, this can happen.
k8s is a nasty bit of kit. i do quite a bit of this at the dayjob
@david worth learning if you use it or are simply interested in distributed computing :-)
@mutefall I’ll have to check tomorrow. This is “managed kubernetes” at Google, so I don’t know if I know which plugin it is. :-) We have zero control over the master(s?).
@movq that's one of the main gripes i have with managed k8s. no true control of your masters.
@movq what'd you make of this? any progress?