DEV Community

mikkergimenez


What :really: is Keda?

A while ago, I wrote an article called "Why is Prometheus Pull-Based". The inspiration for the article was this: Prometheus calls itself pull-based, yet a common pattern with Prometheus is to push metrics up to another server in a hierarchical manner. In other words, if Prometheus can both push and pull metrics, what specifically about the code makes Prometheus "pull-based"?

So, as I take on a project to implement Keda, I want to ask a similar question as I learn the software. Under the hood, what really is Keda? The summary on Keda's GitHub repo says:

"KEDA is a Kubernetes-based Event Driven Autoscaling component. It provides event driven scale for any container running in Kubernetes."

So, a few background notes. Keda is written in Go, and can be deployed to Kubernetes via a Helm chart. Keda has its own custom resources, mainly the "ScaledObject" and "ScaledJob", which monitor some type of infrastructure component (like a RabbitMQ queue, called a "trigger" in Keda parlance) and scale a workload horizontally accordingly. It does this by generating a custom metric, hosted by the "keda-operator-metrics-apiserver" component. Keda then creates a Horizontal Pod Autoscaler which monitors this metric. Here's an architecture diagram from the website:

[Image: Keda architecture diagram]
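To make the last step concrete, here is a minimal sketch of how a Horizontal Pod Autoscaler turns a metric into a replica count. It uses the proportional rule from the Kubernetes HPA documentation, and it leaves out the tolerance band, stabilization windows, and min/max replica bounds. The function name and the numbers are my own illustration, not Keda code.

```go
package main

import (
	"fmt"
	"math"
)

// desiredReplicas applies the proportional rule documented for the Kubernetes
// Horizontal Pod Autoscaler:
//   ceil(currentReplicas * currentMetric / targetMetric)
// It ignores the tolerance band, stabilization windows, and min/max bounds.
func desiredReplicas(currentReplicas int, currentMetric, targetMetric float64) int {
	if targetMetric <= 0 {
		return currentReplicas // avoid dividing by zero; keep the current size
	}
	return int(math.Ceil(float64(currentReplicas) * currentMetric / targetMetric))
}

func main() {
	// Hypothetical numbers: 2 pods, 30 queued messages per pod, target of 10 per pod.
	fmt.Println(desiredReplicas(2, 30, 10)) // prints 6
}
```

The point is that the HPA does the arithmetic; Keda's job is to supply the metric that goes into it.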

My instinct is that Keda is a relatively simple piece of software. There's basically one file each for managing the ScaledObject and the ScaledJob.

The metrics server exposes one endpoint, the GetMetrics endpoint, which doesn't even seem to store metrics data. It just exposes values directly from the ScaledObject, which probably caches the data in etcd? That raises an interesting architectural question about Kubernetes: how much data can a custom resource store in etcd?

Caching:

So, I'm curious about how Keda defines caching. Is there a Redis cluster backing it? Or is this just an in-memory cache? I do find it somehow amusing that "in-memory cache" is just a fancy term for "some hashmap I fill with values I can retrieve later." I'm not sure what I expect it to be, but I can recall hearing all these fancy terms early in my tech career and thinking, there must be more to it than this.

So, if I follow the Keda source code, I get to the data structure for caches, which essentially uses the name/namespace of the ScaledObject/ScaledJob as its key.
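As a rough sketch of that idea, here is an in-memory cache keyed by namespace and name. The key format, the type names, and the value type (a list of trigger names) are all assumptions for illustration. Keda's real cache entries hold much more than this.

```go
package main

import (
	"fmt"
	"sync"
)

// cacheKey builds a "namespace/name" style key. The exact key format Keda
// uses is not reproduced here; this one is an illustrative assumption.
func cacheKey(namespace, name string) string {
	return namespace + "/" + name
}

// scalerCache is a hypothetical stand-in for a per-object cache:
// a plain map guarded by a mutex so several goroutines can share it.
type scalerCache struct {
	mu      sync.RWMutex
	entries map[string][]string // key -> trigger names for that object
}

func newScalerCache() *scalerCache {
	return &scalerCache{entries: make(map[string][]string)}
}

// put stores (or overwrites) the entry for one scaled object.
func (c *scalerCache) put(namespace, name string, triggers []string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries[cacheKey(namespace, name)] = triggers
}

// get returns the entry and whether it was present.
func (c *scalerCache) get(namespace, name string) ([]string, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	triggers, ok := c.entries[cacheKey(namespace, name)]
	return triggers, ok
}

func main() {
	c := newScalerCache()
	c.put("default", "orders-consumer", []string{"rabbitmq"})
	triggers, ok := c.get("default", "orders-consumer")
	fmt.Println(triggers, ok)
}
```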

The value stored in the cache is defined in the scalers_cache code here:
https://github.com/kedacore/keda/blob/main/pkg/scaling/cache/scalers_cache.go#L37. For reference when browsing the documentation, a scaler is what Keda also calls a "trigger".

In the "scalehandler" there's a MetricsCache. The code there basically says: if the metric is in the cache, pull it from the cache, else pull it from the scaler itself. And the MetricsCache itself looks to be just a map that stores the metric in memory.
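That "check the cache, else ask the scaler" flow can be sketched in a few lines. This is not Keda's code: the type, the method, and the `fetch` callback standing in for a scaler query are hypothetical. Storing the value on a miss is also my assumption, and the sketch is not safe for concurrent use.

```go
package main

import "fmt"

// metricsCache is a hypothetical in-memory store: metric name -> last value.
type metricsCache struct {
	values map[string]float64
}

// getMetric returns the cached value when present. Otherwise it calls fetch
// (standing in for a query to the scaler/trigger) and remembers the result.
func (m *metricsCache) getMetric(name string, fetch func() (float64, error)) (float64, error) {
	if v, ok := m.values[name]; ok {
		return v, nil // cache hit: the scaler is not contacted
	}
	v, err := fetch()
	if err != nil {
		return 0, err // nothing is cached on failure
	}
	m.values[name] = v
	return v, nil
}

func main() {
	cache := &metricsCache{values: make(map[string]float64)}
	calls := 0
	queueLength := func() (float64, error) {
		calls++ // count how often the "scaler" is actually queried
		return 42, nil
	}
	cache.getMetric("queue-length", queueLength)         // miss: queries the scaler
	v, _ := cache.getMetric("queue-length", queueLength) // hit: served from the map
	fmt.Println(v, calls)                                // prints 42 1
}
```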

Conclusion

I'm glad to dig into Keda, to learn more about how it works, but I don't know if there's an opportunity to really hack it to get a clearer understanding. Based on what I've learned I suppose these are a few of the projects you could undertake:

  1. Write a custom trigger that looks at something fun, say the current temperature or the number of people logged into Fortnite. Those feel like normal use cases, though, not really digging into the internals of Keda.

  2. Use Keda as a monitoring system. That's basically what it is, and you could set up a number of triggers, then write an integration for Grafana to visualize them. This might be interesting for a deeper understanding of how the cache works.

  3. This would be a bigger project, but I guess at its core, Keda is a tool that automates or orchestrates infrastructure based on a metric. Rather than orchestrating an HPA, you could swap it out for a Vertical Pod Autoscaler? Or there is probably no reason you couldn't orchestrate, say, database failovers.

So, in summary what really is Keda? I'd say it has three prongs.

  1. It monitors services / APIs to get some kind of datapoint.
  2. It exposes those metrics using a very lightweight internal monitoring system.
  3. It manages a service (natively a HorizontalPodAutoscaler) that can use those metrics to operate.
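The three prongs above can be sketched as a tiny poll, record, act loop. Everything here is hypothetical (the `source` and `actuator` interfaces, the `reconcile` function, the numbers), and none of it is Keda's actual API. It only shows that the "act" step is pluggable, which is what would let you swap the HPA for something else.

```go
package main

import (
	"fmt"
	"math"
)

// source is prong 1: something that can be polled for a datapoint.
type source interface {
	Poll() (float64, error)
}

// actuator is prong 3: something that acts on the datapoint.
type actuator interface {
	Apply(value float64) string
}

// fixedSource is a stub trigger that always reports the same value.
type fixedSource struct{ value float64 }

func (s fixedSource) Poll() (float64, error) { return s.value, nil }

// replicaActuator turns a value into a replica count.
// It assumes perReplica is greater than zero.
type replicaActuator struct{ perReplica float64 }

func (a replicaActuator) Apply(value float64) string {
	return fmt.Sprintf("scale to %d replicas", int(math.Ceil(value/a.perReplica)))
}

// reconcile runs one pass: poll the source, record the metric in a map
// (prong 2, the lightweight metrics store), then let the actuator decide.
func reconcile(src source, metrics map[string]float64, name string, act actuator) (string, error) {
	value, err := src.Poll()
	if err != nil {
		return "", err
	}
	metrics[name] = value
	return act.Apply(metrics[name]), nil
}

func main() {
	metrics := make(map[string]float64)
	decision, err := reconcile(fixedSource{value: 25}, metrics, "queue-length", replicaActuator{perReplica: 10})
	if err != nil {
		fmt.Println("poll failed:", err)
		return
	}
	fmt.Println(decision) // prints "scale to 3 replicas"
}
```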

Top comments (1)

Shafayet Hossain

Nice article, but just to let you know that when I tried to go to your website, it says "ERR_SSL_PROTOCOL_ERROR".