I have read about Kubernetes CronJobs, but I'm looking for a more flexible scheduling solution (I'm using GKE). In particular, I have a web app where, when a user ticks a checkbox on a dashboard, I want to trigger some service every X minutes. If the user clears the checkbox, the trigger will stop. I was hoping a ready-made service for this exists. What's the best approach here?
CodePudding user response:
I want to trigger some service every X minutes. If the user clears the checkbox, the trigger will stop
The simplest way to do this would be to have your web app create a CronJob through the Kubernetes API when the user enables the option, and delete that object when they disable it.
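As a rough illustration of that create/delete approach, here is a minimal Python sketch that only builds the CronJob manifest the web app would submit. The function name, the naming scheme `user-<id>-trigger`, the labels and the image are all hypothetical; it also assumes the user ID is already a valid lowercase Kubernetes name fragment.

```python
def build_cronjob_manifest(user_id: str, every_minutes: int, image: str) -> dict:
    """Return a batch/v1 CronJob manifest that runs `image` every N minutes.

    Assumption: `user_id` is lowercase alphanumeric, so it is safe to embed
    in the object name and in a label value.
    """
    # A "*/N" minute field only supports 1..59; values that do not divide 60
    # give an uneven gap at the top of each hour.
    if not 1 <= every_minutes <= 59:
        raise ValueError("every_minutes must be between 1 and 59")
    return {
        "apiVersion": "batch/v1",
        "kind": "CronJob",
        "metadata": {
            "name": f"user-{user_id}-trigger",  # hypothetical naming scheme
            "labels": {"app": "user-trigger", "user": user_id},
        },
        "spec": {
            "schedule": f"*/{every_minutes} * * * *",
            # Skip a run if the previous one is still in progress.
            "concurrencyPolicy": "Forbid",
            "jobTemplate": {
                "spec": {
                    "template": {
                        "spec": {
                            "restartPolicy": "Never",
                            "containers": [{"name": "trigger", "image": image}],
                        }
                    }
                }
            },
        },
    }
```

With the official Kubernetes Python client, such a manifest could presumably be submitted with `BatchV1Api().create_namespaced_cron_job(namespace, body)` when the checkbox is ticked and removed with `delete_namespaced_cron_job(name, namespace)` when it is cleared. The web app's service account would need RBAC permission to create and delete CronJobs in that namespace.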
Though I'm not sure something like this would scale very well; it depends on your app. With Kubernetes CronJobs, each run creates a Pod, which means allocating resources, pulling the image, starting the container, running the task and terminating. Some of that overhead could be avoided, so depending on what you're doing this may or may not make sense. Another way to do this would be to implement a job queue in your application.
E.g. in Node.js, I would use something like bee-queue, bull or kue. A single "worker" could then process jobs from multiple users, in parallel and/or with some concurrency limit. A timer (e.g. node-schedule) could trigger the jobs. The web frontend deals with enabling or disabling timers on behalf of users, and each user's selection may be kept in whatever SQL or NoSQL database you have available. Or even in a ConfigMap (mind the size limit: its data cannot exceed 1 MiB!).
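The timer-plus-queue idea can be sketched in a few lines. This is a simplified, in-process Python stand-in, not how bee-queue or bull work internally: `TimerScheduler` and `drain` are hypothetical names, state lives in memory instead of a database, and `queue.Queue` replaces the Redis-backed queue that would let several worker pods share jobs.

```python
import queue
from concurrent.futures import ThreadPoolExecutor


class TimerScheduler:
    """Tracks which users have the checkbox enabled and when they are next due."""

    def __init__(self):
        self._timers = {}  # user_id -> (interval_seconds, next_due_time)

    def enable(self, user_id, every_seconds, now):
        # Called when the user ticks the checkbox.
        self._timers[user_id] = (every_seconds, now + every_seconds)

    def disable(self, user_id):
        # Called when the user clears the checkbox; unknown users are ignored.
        self._timers.pop(user_id, None)

    def tick(self, now, jobs):
        """Enqueue one job per enabled user whose timer expired, then re-arm it."""
        for user_id, (interval, next_due) in list(self._timers.items()):
            if next_due <= now:
                jobs.put(user_id)
                self._timers[user_id] = (interval, now + interval)


def drain(jobs, handler, max_concurrency=4):
    """Run every queued job through `handler`, at most `max_concurrency` at once."""
    pending = []
    while True:
        try:
            pending.append(jobs.get_nowait())
        except queue.Empty:
            break
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        return list(pool.map(handler, pending))
```

A periodic loop would call `tick(time.time(), jobs)` and a worker would call `drain(jobs, handler)`. Passing the current time in explicitly keeps the sketch easy to test with simulated clocks.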
With a couple of workers (running as a Deployment or StatefulSet) and some master/slave Redis setup, I should be able to deal with lots of different jobs. Maybe add a HorizontalPodAutoscaler, so that workers are added or removed depending on their CPU or memory usage.
Whereas if I were to create a Kubernetes CronJob for each user requesting something, that could make for a lot of Pods to schedule, potentially wasting resources or testing my cluster's limits.
CodePudding user response:
Scheduled triggers are a typical use case for Google Cloud Functions, which is the serverless approach. I think it's also more cost-effective than GKE. Take a look at these docs:
https://cloud.google.com/scheduler/docs/tut-pub-sub
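For orientation, a Pub/Sub-triggered function in the style of that tutorial might look roughly like the Python sketch below. The `(event, context)` signature and the base64-encoded `event["data"]` follow the background-function convention for Pub/Sub triggers; everything else (`handle_tick`, `trigger_service`, and the in-memory `ENABLED_USERS` set standing in for a database lookup of ticked checkboxes) is hypothetical.

```python
import base64

# Stand-in for a database query returning users whose checkbox is ticked.
ENABLED_USERS = {"alice"}


def trigger_service(user_id: str, message: str) -> str:
    """Placeholder for the real call to the service being triggered."""
    return f"{user_id}:{message}"


def handle_tick(event: dict, context=None) -> list:
    """Entry point invoked for each message the scheduler job publishes.

    `event["data"]` carries the base64-encoded message body.
    """
    message = base64.b64decode(event.get("data", "")).decode("utf-8")
    # Trigger the service once per enabled user; users who cleared the
    # checkbox are simply absent from the lookup.
    return [trigger_service(user, message) for user in sorted(ENABLED_USERS)]
```

In this arrangement a single scheduler job fires every X minutes, and enabling or disabling a user only changes what the lookup returns, so no per-user schedule has to be created or deleted.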
You might use a Cloud Function to trigger a GKE CronJob, or to create a Kubernetes ReplicaSet with 1 replica from an image for the scheduled job. That image might be a Spring Boot microservice using @Scheduled, with the actual schedule loaded from parameters. To disable the schedule, you scale it down to 0 replicas.
Remember that to reach the VPC of your GKE nodes you need a Serverless VPC Access connector, because Cloud Functions are serverless and run outside your VPC.
Either way, you can see that GKE is a cumbersome and costly approach for this.