Kubernetes Workloads
Today, we're tackling Kubernetes workloads: what they are, which workload resources Kubernetes offers, and when to use each. Stay tuned.
When describing the Kubernetes architecture, we often hear the term pods. A pod logically groups one or more containers that are deployed together on the cluster. These logically grouped containers are then managed by controllers, which monitor and maintain the state of Kubernetes resources; any change required in the cluster is requested through a controller.
A workload, in turn, is an application running in one or more Kubernetes pods. Workloads rely on controllers to run the application by making sure the correct pods are up and running.
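As a minimal sketch of this grouping (the pod name, images, and shared volume here are purely illustrative), a single pod can run an application container alongside a log-reading sidecar that shares its volume:
<pre class="codeWrap"><code>apiVersion: v1
kind: Pod
metadata:
  name: web-with-log-agent   # hypothetical name for illustration
spec:
  volumes:
    - name: logs             # scratch volume both containers can see
      emptyDir: {}
  containers:
    - name: web
      image: nginx:1.25
      ports:
        - containerPort: 80
      volumeMounts:
        - name: logs
          mountPath: /var/log/nginx
    - name: log-agent        # sidecar reading the same volume
      image: busybox:1.36
      command: ["sh", "-c", "touch /var/log/nginx/access.log; tail -f /var/log/nginx/access.log"]
      volumeMounts:
        - name: logs
          mountPath: /var/log/nginx
</code></pre>
Both containers are scheduled together, share the pod's network namespace, and live and die as one unit.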
Let’s look at a few use cases of workload resources:
- ReplicaSets and Deployments are a good fit for stateless applications such as API gateways. A Deployment manages its ReplicaSets through declarative updates.
- StatefulSets work for stateful applications that need multiple pods with stable identities, such as the RabbitMQ messaging service.
- DaemonSets are used for node monitoring and log collection, typically with log collection stacks such as Elasticsearch, Fluentd, and Kibana.
- Jobs and CronJobs suit applications that need scheduled runs, such as a batch pipeline that updates databases.
- Custom resources add support for managing new resource types, for instance installing a certificate manager to enable HTTPS and TLS support.
Deployments
A Deployment provisions ReplicaSets, which in turn create pods to match the desired state. When you change the Deployment, a new ReplicaSet is rolled out to replace the previous one. Because Deployments are meant for stateless applications, the swap is smooth and the application faces no downtime. Stateless applications don't save any information about previous operations and always start from scratch.
Let's look at an example of a Deployment workload:
<pre class="codeWrap"><code>apiVersion: apps/v1
</code></pre>
kind: Deployment
metadata:
labels:
app: grafana
name: grafana
spec:
selector:
matchLabels:
app: grafana
template:
metadata:
labels:
app: grafana
spec:
securityContext:
fsGroup: 472
supplementalGroups:
- 0
containers:
- name: grafana
image: grafana/grafana:7.5.2
imagePullPolicy: IfNotPresent
ports:
- containerPort: 3000
name: http-grafana
protocol: TCP
readinessProbe:
failureThreshold: 3
httpGet:
path: /robots.txt
port: 3000
scheme: HTTP
initialDelaySeconds: 10
periodSeconds: 30
successThreshold: 1
timeoutSeconds: 2
livenessProbe:
failureThreshold: 3
initialDelaySeconds: 30
periodSeconds: 10
successThreshold: 1
tcpSocket:
port: 3000
timeoutSeconds: 1
resources:
requests:
cpu: 250m
memory: 750Mi
volumeMounts:
- mountPath: /var/lib/grafana
name: grafana-pv
volumes:
- name: grafana-pv
persistentVolumeClaim:
claimName: grafana-pvc
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: grafana-pvc
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
In a Deployment, the workload itself stays stateless: user and application data lives in an attached database or volume rather than in the pods themselves. Hosting Grafana in Kubernetes as a Deployment, as above, is a common example.
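To watch the ReplicaSet swap described earlier, a rough sketch (assuming the manifest above is saved as grafana-deployment.yaml; the 7.5.3 tag is just an illustrative newer version):
<pre class="codeWrap"><code># Create the Deployment and wait for the rollout to finish
kubectl apply -f grafana-deployment.yaml
kubectl rollout status deployment/grafana

# Changing the pod template (e.g. the image) creates a new ReplicaSet;
# the old one is scaled down gradually, so the app stays available
kubectl set image deployment/grafana grafana=grafana/grafana:7.5.3
kubectl get replicasets -l app=grafana
</code></pre>
After the update, the ReplicaSet listing shows the old set scaled to zero and the new one holding the desired replica count.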
StatefulSet
A StatefulSet assigns a persistent identifier to each of its pods. The pods are created from the same specification, but because of these identifiers they can't be used interchangeably when rescheduled. The identifiers also let Kubernetes match each pod back to its volumes after a restart, a rescheduling, or a failure.
StatefulSets are deployed, scaled, and deleted in an ordered manner. Deleting a StatefulSet does not delete its volumes, which keeps the data safe; the pods themselves, however, are terminated.
Let's take an example of a StatefulSet workload resource:
<pre class="codeWrap"><code>---
</code></pre>
apiVersion: v1
kind: Service
metadata:
# Expose the management HTTP port on each node
name: rabbitmq-management
labels:
app: rabbitmq
spec:
ports:
- port: 15672
name: http
selector:
app: rabbitmq
type: NodePort # Or LoadBalancer in production w/ proper security
---
apiVersion: v1
kind: Service
metadata:
# The required headless service for StatefulSets
name: rabbitmq
labels:
app: rabbitmq
spec:
ports:
- port: 5672
name: amqp
- port: 4369
name: epmd
- port: 25672
name: rabbitmq-dist
clusterIP: None
selector:
app: rabbitmq
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: rabbitmq
spec:
serviceName: "rabbitmq"
replicas: 3
selector:
matchLabels:
app: rabbitmq
template:
metadata:
labels:
app: rabbitmq
spec:
containers:
- name: rabbitmq
image: rabbitmq:3.6.6-management-alpine
lifecycle:
postStart:
exec:
command:
- /bin/sh
- -c
- >
if [ -z "$(grep rabbitmq /etc/resolv.conf)" ]; then
sed "s/^search ([^ ]+)/search rabbitmq.1 1/" /etc/resolv.conf > /etc/resolv.conf.new;
cat /etc/resolv.conf.new > /etc/resolv.conf;
rm /etc/resolv.conf.new;
fi;
until rabbitmqctl node_health_check; do sleep 1; done;
if [[ "$HOSTNAME" != "rabbitmq-0" && -z "$(rabbitmqctl cluster_status | grep rabbitmq-0)" ]]; then
rabbitmqctl stop_app;
rabbitmqctl join_cluster rabbit@rabbitmq-0;
rabbitmqctl start_app;
fi;
rabbitmqctl set_policy ha-all "." '{"ha-mode":"exactly","ha-params":3,"ha-sync-mode":"automatic"}'
env:
- name: RABBITMQ_ERLANG_COOKIE
valueFrom:
secretKeyRef:
name: rabbitmq-config
key: erlang-cookie
ports:
- containerPort: 5672
name: amqp
volumeMounts:
- name: rabbitmq
mountPath: /var/lib/rabbitmq
volumeClaimTemplates:
- metadata:
name: rabbitmq
annotations:
volume.beta.kubernetes.io/storage-class: openebs-jiva-default
spec:
accessModes: [ "ReadWriteOnce" ]
resources:
requests:
storage: 5G
In comparison with Deployments, StatefulSets can form clusters: the stable pod identifiers let the pods find and join each other. Stateful applications can store user information and actions. Clustering brings benefits such as application resilience and data safety, since data is synchronized across all nodes, and a degree of self-healing, because the cluster keeps functioning even if a single pod is deleted. A popular example of an application that forms clusters is RabbitMQ.
RabbitMQ is a messaging service used for communication between different services within a Kubernetes cluster. In production, RabbitMQ runs as a cluster to avoid a single point of failure; if it ran on a single node and something went wrong, we would face downtime. Many other stateful applications run as stable clusters under StatefulSets for the same reason.
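A quick sketch of this ordered behavior with the manifest above (pod and PVC names follow the StatefulSet naming convention):
<pre class="codeWrap"><code># Pods get stable, ordered names: rabbitmq-0, rabbitmq-1, rabbitmq-2
kubectl get pods -l app=rabbitmq

# Scaling adds pods one at a time, in order: rabbitmq-3, then rabbitmq-4
kubectl scale statefulset rabbitmq --replicas=5

# Deleting the StatefulSet terminates the pods, but the PVCs created from
# volumeClaimTemplates (rabbitmq-rabbitmq-0, ...) are left intact
kubectl delete statefulset rabbitmq
kubectl get pvc
</code></pre>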
DaemonSet
A DaemonSet runs a copy of a pod on every node, or on a subset of nodes selected through labels and tolerations. When a new node joins the cluster, a pod is scheduled onto it; when a node is removed, its pod is garbage-collected. DaemonSets are popular for collecting logs and metrics across a cluster. The downside is that deleting a DaemonSet also deletes all the pods it created.
DaemonSet workload example:
<pre class="codeWrap"><code>apiVersion: v1
</code></pre>
kind: Namespace
metadata:
name: monitoring
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
labels:
app.kubernetes.io/component: exporter
app.kubernetes.io/name: node-exporter
name: node-exporter
namespace: monitoring
spec:
selector:
matchLabels:
app.kubernetes.io/component: exporter
app.kubernetes.io/name: node-exporter
updateStrategy:
type: RollingUpdate
rollingUpdate:
maxUnavailable: 2
template:
metadata:
labels:
app.kubernetes.io/component: exporter
app.kubernetes.io/name: node-exporter
annotations:
prometheus.io/scrape: "true"
prometheus.io/path: '/metrics'
prometheus.io/port: "9100"
spec:
hostPID: true
hostIPC: true
hostNetwork: true
enableServiceLinks: false
containers:
- name: node-exporter
image: prom/node-exporter
imagePullPolicy: IfNotPresent
securityContext:
privileged: true
args:
- '--path.sysfs=/host/sys'
- '--path.rootfs=/root'
- --collector.filesystem.ignored-mount-points=^/(dev|proc|sys|var/lib/docker/.+|var/lib/kubelet/pods/.+)($|/)
- --collector.netclass.ignored-devices=^(veth.*)$
ports:
- containerPort: 9100
protocol: TCP
resources:
limits:
cpu: 100m
memory: 100Mi
requests:
cpu: 50m
memory: 50Mi
volumeMounts:
- name: sys
mountPath: /host/sys
mountPropagation: HostToContainer
- name: root
mountPath: /root
mountPropagation: HostToContainer
tolerations:
- operator: Exists
effect: NoSchedule
volumes:
- name: sys
hostPath:
path: /sys
- name: root
hostPath:
path: /
The node exporter collects node metrics such as memory consumption, disk IOPS, and CPU usage.
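To check that the DaemonSet placed one exporter per node, something like the following should work (NODE_IP is a placeholder for one of your node addresses):
<pre class="codeWrap"><code># One node-exporter pod should be running on each node
kubectl get pods -n monitoring -o wide -l app.kubernetes.io/name=node-exporter

# Because the pod uses hostNetwork, the metrics endpoint is reachable on the node itself
curl -s http://$NODE_IP:9100/metrics | head
</code></pre>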
Job and CronJob
A Job runs pods to completion: it creates one or more pods and keeps retrying until it reaches the required number of successful runs. A CronJob, in turn, creates Jobs on a recurring schedule.
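The example below covers the CronJob side; for a plain Job, here is a minimal sketch along the lines of the classic pi-computation example from the Kubernetes docs:
<pre class="codeWrap"><code>apiVersion: batch/v1
kind: Job
metadata:
  name: pi
spec:
  completions: 3     # the Job needs three successful pod runs
  backoffLimit: 4    # retry failed pods up to four times before giving up
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: pi
          image: perl:5.34
          command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
</code></pre>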
The CronJob workload looks like this:
<pre class="codeWrap"><code>apiVersion: batch/v1beta1
</code></pre>
kind: CronJob
metadata:
name: database-backup-job
namespace: default
spec:
schedule: "0 0 * * *"
jobTemplate:
spec:
template:
spec:
restartPolicy: OnFailure
containers:
- name: db-backup
image: "ghcr.io/omegion/db-backup:v0.9.0"
imagePullPolicy: IfNotPresent
args:
- s3
- export
- --host=$DB_HOST
- --port=$DB_PORT
- --databases=$DATABASE_NAMES
- --username=$DB_USER
- --password=$DB_PASS
- --bucket=$BUCKET_NAME
env:
- name: DB_HOST
value: ""
- name: DB_PORT
value: ""
- name: DB_USER
value: ""
- name: DB_PASS
value: ""
- name: DATABASE_NAMES
value: ""
- name: BUCKET_NAME
value: ""
- name: AWS_ACCESS_KEY_ID
value: ""
- name: AWS_SECRET_ACCESS_KEY
value: ""
- name: AWS_REGION
value: ""
The above CronJob backs up the listed databases to an S3 bucket every day at midnight, per the "0 0 * * *" schedule.
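Once applied, you can watch the schedule spawn Jobs, or trigger an off-schedule run for testing (the manual-backup name is arbitrary):
<pre class="codeWrap"><code># Inspect the schedule and the Jobs it spawns
kubectl get cronjob database-backup-job
kubectl get jobs --watch

# Trigger an off-schedule run for testing
kubectl create job manual-backup --from=cronjob/database-backup-job
</code></pre>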
Custom Resources
Custom resources are extensions of the Kubernetes API. The API stores objects such as DaemonSets; custom resources let you add object types that aren't available in the default installation. You can add custom resources to a cluster, delete them again, and access them with kubectl.
For instance, a certificate manager such as cert-manager, which ships its own custom resources, can be installed with the following command:
<pre class="codeWrap"><code>kubectl apply -f https://github.com/jetstack/cert-manager/releases/download/v1.4.0/cert-manager.yaml
</code></pre>
You can view the installed certificate resources with the kubectl get certificates command. New resource types are defined through custom resource definitions (CRDs), and custom controllers can be added to act on them. Combining custom resources with controllers creates a declarative API, just like existing workloads such as Deployments and DaemonSets.
This lets users declare the desired state, while the controller's control loop compares it against the actual state and allocates resources to close the gap. Custom controllers work with any type of resource and can be added or removed independently of the cluster itself.
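As a minimal sketch of how such a type is declared (the example.com group and Backup kind are made up for illustration):
<pre class="codeWrap"><code>apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  # the name must be the plural followed by the group
  name: backups.example.com
spec:
  group: example.com
  scope: Namespaced
  names:
    plural: backups
    singular: backup
    kind: Backup
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                schedule:
                  type: string
</code></pre>
After applying this CRD, kubectl get backups works just like it does for any built-in resource.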
Kubernetes Workloads Best Practices
Let’s discuss some of the best practices for using Kubernetes workloads:
- Use smaller container images where you can. Instead of a mainstream Linux distribution image, start from a small Alpine image, which can be around 80% smaller, and add only the packages you need.
- Run an up-to-date Kubernetes version so you get the latest features and security patches.
- Set resource requests and limits from the start. Limiting the CPU and memory a container can use keeps workloads from starving each other, and Kubernetes offers Vertical Pod Autoscaling to adjust these values dynamically based on measured consumption, so resources aren't wasted.
- Use readinessProbe and livenessProbe checks to determine the health of the pods and control traffic to them.
- Define access policies using Role-Based Access Control (RBAC), which lets you manage who can do what with cluster resources; see the sketch after this list.
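As an example of the last point, a minimal Role/RoleBinding sketch (the pod-reader role and the user jane are illustrative):
<pre class="codeWrap"><code>apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: default
rules:
  - apiGroups: [""]          # "" means the core API group
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: default
subjects:
  - kind: User
    name: jane
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
</code></pre>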
Conclusion
Kubernetes workloads play a key role in managing applications and resources on clusters. The use cases discussed in this article, along with the best practices, can help reduce application downtime, enable effective monitoring, and improve the overall architecture. In the dynamic world of microservices, Kubernetes is a powerful tool for managing applications, and understanding its workloads is vital to fully utilize its capabilities.