EKS cluster helper helm chart

Helper required for EKS

shubham kumar singh
DevOps-Journey

--

I have been using EKS for a while now, and even though it is one of the most widely used managed Kubernetes platforms in the cloud, it still feels a little incomplete. We have set up Terraform to spin up the cluster, which solves one of the problems. I will structure this document in three stages: problem, impact, and solution. Let's start with the problem.

Problem & Impact

  1. There is no dashboard to check basic details, such as how many workloads are running. GKE solves this out of the box.
  2. There is no clear way to understand networking requirements without a cloud-native dashboard. This is a requirement I don't have a solution for yet: we need to check load balancing in a separate section of the EC2 console, and there is no clear way to know which Service refers to which load balancer.
  3. Understanding basic metric needs and the Kubernetes event system. We need to capture the output of kubectl get events and use it to debug certain problems. I am not sure this is solved even in GKE.
  4. Cluster autoscaler: one chooses the cloud for elasticity and expects it to be available out of the box. That is not the case with EKS; we need to deploy the cluster-autoscaler ourselves.
  5. DNS resolution: we have seen DNS issues while running a huge cluster that makes a lot of network calls (40k/s). At that scale, DNS resolution sometimes takes up to 2 seconds. The fix is to deploy NodeLocal DNSCache (nodelocaldns), which is not provided by default. I suppose this is not an EKS issue but an application-specific networking requirement.
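The event-capture idea from point 3 can be sketched with standard kubectl options. The fallback sample lines below are made up so the filtering step can run even without a live cluster:

```shell
# Snapshot all events, oldest first; fall back to a canned sample when
# no cluster is reachable, so the filtering step below still runs.
kubectl get events --all-namespaces \
  --sort-by=.metadata.creationTimestamp > events.log 2>/dev/null \
  || printf 'default Normal Scheduled pod/a\ndefault Warning BackOff pod/b\n' > events.log

# Keep only Warning-type events; these usually point at the problem.
grep 'Warning' events.log | tail -n 20
```

Persisting these snapshots (or shipping them to a log store) matters because events are garbage-collected after roughly an hour by default, so the evidence is often gone by the time you debug.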

Solution

The real solution should come from AWS; meanwhile, we are trying to solve this with a Helm chart that deploys a stack containing the cluster-autoscaler, metrics server, and dashboard. We are not considering nodelocaldns at this stage.

I have set up a git repo for this which contains the dependent charts. The objective is to make cluster setup as easy as possible. We could run all these commands separately, but running them from a single Helm chart is simply easier.
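A minimal install flow might look like the following; the release name, namespace, and cluster name are examples, and the chart is assumed to sit at the repo root with its dependencies vendored under helper_charts/:

```shell
# Clone the chart repo and resolve its (local) dependencies.
git clone https://github.com/shubhamitc/eks-helm-helper
cd eks-helm-helper
helm dependency update .

# Install or upgrade the whole helper stack in one command;
# clusterName must match the name used in the autoscaler's ASG tags.
helm upgrade --install eks-helper . \
  --namespace kube-system \
  --set clusterAutoscaler.clusterName=my-eks-cluster
```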

Chart components

The chart is pretty simple: it pulls in its requirements from other charts. We refer to the Kubernetes Dashboard chart and the metrics-server chart and call them from our chart.

Requirements.yaml

Below is a simple requirements file referring to the other charts. I keep local copies of the charts here to ensure that I am always using the same versions and that no internet access is required once everything is packaged.

```yaml
dependencies:
  - name: kubernetes-dashboard
    version: "2.7.1"
    repository: "file://helper_charts/kubernetes-dashboard"
  - name: metrics-server
    version: "2.11.2"
    repository: "file://helper_charts/metrics-server"
```

Values.yaml

We use the values.yaml below to override parameters in the dependent charts.

```yaml
# Default values for eks-helper.
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.
ingress:
  enabled: false
metrics-server:
  rbac:
    create: true
  replicas: 1
kubernetes-dashboard: {}
clusterAutoscaler:
  clusterName: eks-default-clustername
  image: k8s.gcr.io/autoscaling/cluster-autoscaler:v1.17.3
  updateEKSCommandOption: true
  annotations:
    cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
```
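For per-cluster installs, a small override file is usually cleaner than a string of --set flags. The file name and values below are hypothetical:

```yaml
# my-values.yaml -- hypothetical per-cluster override
clusterAutoscaler:
  clusterName: prod-eks-us-east-1   # must match the ASG auto-discovery tags
metrics-server:
  replicas: 2
```

It would then be applied with `helm upgrade --install eks-helper . -f my-values.yaml`.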

ClusterAutoscaler

I was not able to find a decent chart for the cluster-autoscaler, so I wrote it into the template myself.

```yaml
---
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: {{ template "eks-helper.name" . }}-cluster-autoscaler
  name: {{ template "eks-helper.name" . }}-cluster-autoscaler
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: {{ template "eks-helper.name" . }}-cluster-autoscaler
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: {{ template "eks-helper.name" . }}-cluster-autoscaler
rules:
  - apiGroups: [""]
    resources: ["events", "endpoints"]
    verbs: ["create", "patch"]
  - apiGroups: [""]
    resources: ["pods/eviction"]
    verbs: ["create"]
  - apiGroups: [""]
    resources: ["pods/status"]
    verbs: ["update"]
  - apiGroups: [""]
    resources: ["endpoints"]
    resourceNames: ["cluster-autoscaler"]
    verbs: ["get", "update"]
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["watch", "list", "get", "update"]
  - apiGroups: [""]
    resources:
      - "pods"
      - "services"
      - "replicationcontrollers"
      - "persistentvolumeclaims"
      - "persistentvolumes"
    verbs: ["watch", "list", "get"]
  - apiGroups: ["extensions"]
    resources: ["replicasets", "daemonsets"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["policy"]
    resources: ["poddisruptionbudgets"]
    verbs: ["watch", "list"]
  - apiGroups: ["apps"]
    resources: ["statefulsets", "replicasets", "daemonsets"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses", "csinodes"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["batch", "extensions"]
    resources: ["jobs"]
    verbs: ["get", "list", "watch", "patch"]
  - apiGroups: ["coordination.k8s.io"]
    resources: ["leases"]
    verbs: ["create"]
  - apiGroups: ["coordination.k8s.io"]
    resourceNames: ["cluster-autoscaler"]
    resources: ["leases"]
    verbs: ["get", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: {{ template "eks-helper.name" . }}-cluster-autoscaler
  namespace: kube-system
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: {{ template "eks-helper.name" . }}-cluster-autoscaler
rules:
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["create", "list", "watch"]
  - apiGroups: [""]
    resources: ["configmaps"]
    resourceNames: ["cluster-autoscaler-status", "cluster-autoscaler-priority-expander"]
    verbs: ["delete", "get", "update", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: {{ template "eks-helper.name" . }}-cluster-autoscaler
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: {{ template "eks-helper.name" . }}-cluster-autoscaler
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: {{ template "eks-helper.name" . }}-cluster-autoscaler
subjects:
  - kind: ServiceAccount
    name: {{ template "eks-helper.name" . }}-cluster-autoscaler
    namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: {{ template "eks-helper.name" . }}-cluster-autoscaler
  namespace: kube-system
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: {{ template "eks-helper.name" . }}-cluster-autoscaler
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: {{ template "eks-helper.name" . }}-cluster-autoscaler
subjects:
  - kind: ServiceAccount
    name: {{ template "eks-helper.name" . }}-cluster-autoscaler
    namespace: kube-system
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ template "eks-helper.name" . }}-cluster-autoscaler
  {{- if .Values.clusterAutoscaler.annotations }}
  annotations:
{{ toYaml .Values.clusterAutoscaler.annotations | indent 4 }}
  {{- end }}
  namespace: kube-system
  labels:
    app: cluster-autoscaler
spec:
  replicas: 1
  selector:
    matchLabels:
      app: {{ template "eks-helper.name" . }}-cluster-autoscaler
  template:
    metadata:
      labels:
        app: {{ template "eks-helper.name" . }}-cluster-autoscaler
      annotations:
        prometheus.io/scrape: 'true'
        prometheus.io/port: '8085'
    spec:
      serviceAccountName: {{ template "eks-helper.name" . }}-cluster-autoscaler
      containers:
        - image: k8s.gcr.io/autoscaling/cluster-autoscaler:v1.17.3
          name: cluster-autoscaler
          resources:
            limits:
              cpu: 100m
              memory: 300Mi
            requests:
              cpu: 100m
              memory: 300Mi
          command:
            - ./cluster-autoscaler
            - --v=4
            - --stderrthreshold=info
            - --cloud-provider=aws
            - --skip-nodes-with-local-storage=false
            - --expander=least-waste
            {{- if .Values.clusterAutoscaler.updateEKSCommandOption }}
            - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/{{ .Values.clusterAutoscaler.clusterName }}
            - --balance-similar-node-groups
            - --skip-nodes-with-system-pods=false
            {{- end }}
          volumeMounts:
            - name: ssl-certs
              mountPath: /etc/ssl/certs/ca-certificates.crt
              readOnly: true
          imagePullPolicy: "Always"
      volumes:
        - name: ssl-certs
          hostPath:
            path: "/etc/ssl/certs/ca-bundle.crt"
```
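The --node-group-auto-discovery flag in the template only finds Auto Scaling groups carrying the two well-known cluster-autoscaler tags, so the node ASGs must be tagged to match. A sketch with the standard AWS CLI (the ASG name and cluster name are placeholders):

```shell
# Tag the node ASG so the autoscaler's auto-discovery can find it.
# "my-node-asg" and "my-eks-cluster" are example names; the cluster
# name must match clusterAutoscaler.clusterName in values.yaml.
aws autoscaling create-or-update-tags --tags \
  "ResourceId=my-node-asg,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/enabled,Value=true,PropagateAtLaunch=true" \
  "ResourceId=my-node-asg,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/my-eks-cluster,Value=owned,PropagateAtLaunch=true"
```

If Terraform creates the node groups, the same tags can be added there instead, which keeps them from drifting.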

Repo link

https://github.com/shubhamitc/eks-helm-helper
