Author: lorenzomicheli

Description: Setup EKS cluster on Fargate to run your application at scale!

Repository: git://github.com/lorenzomicheli/pimp-your-eks.git
Created: 2020-04-01T09:15:41Z
Project home: https://github.com/lorenzomicheli/pimp-your-eks

License: Apache License 2.0


Pimp your EKS cluster on Fargate


Introduction

Setting up a Kubernetes cluster is not an easy task. Amazon EKS greatly simplifies the creation and management of a Kubernetes (k8s) cluster and, used together with AWS Fargate, removes the need to provision and manage the servers that are part of the cluster.

The purpose of this post is to guide you step by step in the creation of the cluster and the configuration of the following components:

  • OIDC Provider (k8s services accounts integrated with AWS IAM)
  • ALB Ingress Controller (k8s ingresses integrated with AWS ALB)
  • External DNS (k8s services and ingresses integrated with Amazon Route53)
  • Horizontal Pod Autoscaler (automatically scales the number of pods according to CPU usage)
  • Pod Monitoring for EKS Fargate with Prometheus and Grafana (monitoring and alerting for k8s infrastructure and applications)
  • Pod Logging for EKS Fargate (logging integration for k8s applications with Amazon CloudWatch Logs)

Prerequisites

In order to follow this guide you will need the following tools:

  • kubectl - Kubernetes CLI tool
  • eksctl - a simple CLI tool for creating clusters on EKS
  • aws-iam-authenticator - a tool to use AWS IAM credentials to authenticate to a Kubernetes cluster
  • helm - package manager for Kubernetes
  • jq - a lightweight and flexible command-line JSON processor
  • [OPTIONAL] k9s - a terminal UI to interact with your Kubernetes clusters

Once Helm is installed, add the stable chart repository and refresh the local chart index:

    helm repo add stable https://kubernetes-charts.storage.googleapis.com
    helm repo update

Cluster creation

Let’s start creating our cluster. To run our containers on AWS Fargate, we need to create one or more Fargate profiles and define which namespaces are going to be deployed with these profiles.

Create a file named cluster.yaml where we define the name of the cluster, the region where it will be deployed, and the Fargate profiles of our choice.

    apiVersion: eksctl.io/v1alpha5
    kind: ClusterConfig
    metadata:
      name: eks-cluster
      region: eu-west-1
    fargateProfiles:
      - name: default
        selectors:
          - namespace: default
      - name: kube-system
        selectors:
          - namespace: kube-system

To create our cluster and the profiles we use the eksctl tool as follows:

    eksctl create cluster -f cluster.yaml

By default, eksctl will create a new VPC and a number of subnets used by your Fargate containers.

In some scenarios, you might want to use your own VPC and subnets which you have already configured:

  • You need at least 2 public and 2 private subnets.

  • Tag the VPC subnets you specify with the key
    kubernetes.io/cluster/<cluster-name> and the value shared so that
    Kubernetes can discover them. <cluster-name> must match the name of
    your Amazon EKS cluster as defined in cluster.yaml, and the shared
    value allows more than one cluster to use the subnet.

  • Tag the private subnets with kubernetes.io/role/internal-elb: 1 so
    that Kubernetes knows it can use them for internal load balancers.

  • Tag the public subnets with kubernetes.io/role/elb: 1 so that
    Kubernetes uses only those subnets for external load balancers.

  • Add the following block to cluster.yaml, specifying the ids of your
    public and private subnets in the respective AZs, before running the
    eksctl command:

    vpc:
      subnets:
        private:
          eu-west-1a: { id: subnet-0ff156e0c4a6d300c }
          eu-west-1b: { id: subnet-0549cdab573695c03 }
        public:
          eu-west-1a: { id: subnet-0ff156e0c4a6d300b }
          eu-west-1b: { id: subnet-0549cdab573695c43 }
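
The subnet tagging described above can be applied with the aws CLI. The following sketch only prints the create-tags commands for the private subnets (the subnet ids are the example ids from this guide; substitute your own and pipe the output to sh to actually apply them — public subnets take kubernetes.io/role/elb instead of internal-elb):

```shell
# Print (without running) the tagging commands for the private subnets.
# Subnet ids below are the example ids used in this guide.
CLUSTER=eks-cluster
cmds=$(for subnet in subnet-0ff156e0c4a6d300c subnet-0549cdab573695c03; do
  echo aws ec2 create-tags --resources "$subnet" \
    --tags "Key=kubernetes.io/cluster/${CLUSTER},Value=shared" \
           "Key=kubernetes.io/role/internal-elb,Value=1"
done)
echo "$cmds"
```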

ALB Ingress Controller on Amazon EKS

In order to trigger the creation of an Application Load Balancer (currently the only load balancer type supported by EKS on Fargate) and the necessary supporting AWS resources whenever an Ingress resource is created on the cluster, we need to configure the ALB Ingress Controller for Kubernetes.

Create the following iam-policy.json file:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "acm:DescribeCertificate",
            "acm:ListCertificates",
            "acm:GetCertificate"
          ],
          "Resource": "*"
        },
        {
          "Effect": "Allow",
          "Action": [
            "ec2:AuthorizeSecurityGroupIngress",
            "ec2:CreateSecurityGroup",
            "ec2:CreateTags",
            "ec2:DeleteTags",
            "ec2:DeleteSecurityGroup",
            "ec2:DescribeAccountAttributes",
            "ec2:DescribeAddresses",
            "ec2:DescribeInstances",
            "ec2:DescribeInstanceStatus",
            "ec2:DescribeInternetGateways",
            "ec2:DescribeNetworkInterfaces",
            "ec2:DescribeSecurityGroups",
            "ec2:DescribeSubnets",
            "ec2:DescribeTags",
            "ec2:DescribeVpcs",
            "ec2:ModifyInstanceAttribute",
            "ec2:ModifyNetworkInterfaceAttribute",
            "ec2:RevokeSecurityGroupIngress"
          ],
          "Resource": "*"
        },
        {
          "Effect": "Allow",
          "Action": [
            "elasticloadbalancing:AddListenerCertificates",
            "elasticloadbalancing:AddTags",
            "elasticloadbalancing:CreateListener",
            "elasticloadbalancing:CreateLoadBalancer",
            "elasticloadbalancing:CreateRule",
            "elasticloadbalancing:CreateTargetGroup",
            "elasticloadbalancing:DeleteListener",
            "elasticloadbalancing:DeleteLoadBalancer",
            "elasticloadbalancing:DeleteRule",
            "elasticloadbalancing:DeleteTargetGroup",
            "elasticloadbalancing:DeregisterTargets",
            "elasticloadbalancing:DescribeListenerCertificates",
            "elasticloadbalancing:DescribeListeners",
            "elasticloadbalancing:DescribeLoadBalancers",
            "elasticloadbalancing:DescribeLoadBalancerAttributes",
            "elasticloadbalancing:DescribeRules",
            "elasticloadbalancing:DescribeSSLPolicies",
            "elasticloadbalancing:DescribeTags",
            "elasticloadbalancing:DescribeTargetGroups",
            "elasticloadbalancing:DescribeTargetGroupAttributes",
            "elasticloadbalancing:DescribeTargetHealth",
            "elasticloadbalancing:ModifyListener",
            "elasticloadbalancing:ModifyLoadBalancerAttributes",
            "elasticloadbalancing:ModifyRule",
            "elasticloadbalancing:ModifyTargetGroup",
            "elasticloadbalancing:ModifyTargetGroupAttributes",
            "elasticloadbalancing:RegisterTargets",
            "elasticloadbalancing:RemoveListenerCertificates",
            "elasticloadbalancing:RemoveTags",
            "elasticloadbalancing:SetIpAddressType",
            "elasticloadbalancing:SetSecurityGroups",
            "elasticloadbalancing:SetSubnets",
            "elasticloadbalancing:SetWebACL"
          ],
          "Resource": "*"
        },
        {
          "Effect": "Allow",
          "Action": [
            "iam:CreateServiceLinkedRole",
            "iam:GetServerCertificate",
            "iam:ListServerCertificates"
          ],
          "Resource": "*"
        },
        {
          "Effect": "Allow",
          "Action": ["cognito-idp:DescribeUserPoolClient"],
          "Resource": "*"
        },
        {
          "Effect": "Allow",
          "Action": [
            "waf-regional:GetWebACLForResource",
            "waf-regional:GetWebACL",
            "waf-regional:AssociateWebACL",
            "waf-regional:DisassociateWebACL"
          ],
          "Resource": "*"
        },
        {
          "Effect": "Allow",
          "Action": ["tag:GetResources", "tag:TagResources"],
          "Resource": "*"
        },
        {
          "Effect": "Allow",
          "Action": ["waf:GetWebACL"],
          "Resource": "*"
        },
        {
          "Effect": "Allow",
          "Action": [
            "shield:DescribeProtection",
            "shield:GetSubscriptionState",
            "shield:DeleteProtection",
            "shield:CreateProtection",
            "shield:DescribeSubscription",
            "shield:ListProtections"
          ],
          "Resource": "*"
        }
      ]
    }

Create an IAM OIDC provider and associate it with your cluster:

    eksctl utils associate-iam-oidc-provider --region eu-west-1 --cluster eks-cluster \
      --approve

Create an IAM policy called ALBIngressControllerIAMPolicy for the ALB Ingress Controller pod that allows it to make calls to AWS APIs on your behalf:

    aws iam create-policy --policy-name ALBIngressControllerIAMPolicy \
      --policy-document file://iam-policy.json

We need to define a Kubernetes service account named alb-ingress-controller in the kube-system namespace, a cluster role, and a cluster role binding for the ALB Ingress Controller. Create a file named rbac-role.yaml as follows:

    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      labels:
        app.kubernetes.io/name: alb-ingress-controller
      name: alb-ingress-controller
    rules:
      - apiGroups:
          - ""
          - extensions
        resources:
          - configmaps
          - endpoints
          - events
          - ingresses
          - ingresses/status
          - services
          - pods/status
        verbs:
          - create
          - get
          - list
          - update
          - watch
          - patch
      - apiGroups:
          - ""
          - extensions
        resources:
          - nodes
          - pods
          - secrets
          - services
          - namespaces
        verbs:
          - get
          - list
          - watch
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      labels:
        app.kubernetes.io/name: alb-ingress-controller
      name: alb-ingress-controller
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: alb-ingress-controller
    subjects:
      - kind: ServiceAccount
        name: alb-ingress-controller
        namespace: kube-system
    ---
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      labels:
        app.kubernetes.io/name: alb-ingress-controller
      name: alb-ingress-controller
      namespace: kube-system

And apply it with kubectl:

    kubectl apply -f rbac-role.yaml

Create an IAM role for the ALB Ingress Controller and attach it to the service account created in the previous step. Replace <ACCOUNT_ID> with the id of your AWS account and <AWS_REGION> with the region where the cluster has been created:

    eksctl create iamserviceaccount --region <AWS_REGION> \
      --name alb-ingress-controller \
      --namespace kube-system \
      --cluster eks-cluster \
      --attach-policy-arn arn:aws:iam::<ACCOUNT_ID>:policy/ALBIngressControllerIAMPolicy \
      --override-existing-serviceaccounts \
      --approve

Create a file alb-ingress-controller.yaml to define the alb-ingress-controller deployment, replacing <CLUSTER_NAME> with the name of your cluster, <VPC_ID> with the id of the VPC used by the cluster, and <AWS_REGION> with the region where the cluster has been created:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      labels:
        app.kubernetes.io/name: alb-ingress-controller
      name: alb-ingress-controller
      namespace: kube-system
    spec:
      selector:
        matchLabels:
          app.kubernetes.io/name: alb-ingress-controller
      template:
        metadata:
          labels:
            app.kubernetes.io/name: alb-ingress-controller
        spec:
          containers:
            - name: alb-ingress-controller
              args:
                - --ingress-class=alb
                - --cluster-name=<CLUSTER_NAME>
                - --aws-vpc-id=<VPC_ID>
                - --aws-region=<AWS_REGION>
                # - --aws-api-debug
                # - --aws-max-retries=10
              env:
                # - name: AWS_ACCESS_KEY_ID
                #   value: KEYVALUE
                # - name: AWS_SECRET_ACCESS_KEY
                #   value: SECRETVALUE
              image: docker.io/amazon/aws-alb-ingress-controller:v1.1.6
          serviceAccountName: alb-ingress-controller
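
Rather than editing the placeholders by hand, they can be filled in with sed. A small sketch of the substitution, run here against an inline copy of the relevant args (the VPC id is a hypothetical example; the same sed invocation can be run against alb-ingress-controller.yaml):

```shell
# Fill the <...> placeholders used by the manifest; ids are illustrative.
CLUSTER_NAME=eks-cluster
VPC_ID=vpc-0123456789abcdef0
AWS_REGION=eu-west-1
snippet='- --cluster-name=<CLUSTER_NAME>
- --aws-vpc-id=<VPC_ID>
- --aws-region=<AWS_REGION>'
filled=$(printf '%s\n' "$snippet" | sed \
  -e "s/<CLUSTER_NAME>/$CLUSTER_NAME/" \
  -e "s/<VPC_ID>/$VPC_ID/" \
  -e "s/<AWS_REGION>/$AWS_REGION/")
echo "$filled"
```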

Deploy the ALB Ingress Controller with the following command:

    kubectl apply -f alb-ingress-controller.yaml

Verify that the alb-ingress-controller is running and that there are no errors in its logs (kubectl logs -n kube-system deployment/alb-ingress-controller):

    kubectl get pods -n kube-system | grep alb-ingress-controller
    alb-ingress-controller-688dd94984-m6554   1/1   Running   0   4d22h

Deploy a sample application

We deploy the game 2048 as a sample application to verify that the ALB Ingress Controller creates an Application Load Balancer in response to the Ingress object.

First we create a new Fargate profile for this application:

    eksctl create fargateprofile --cluster eks-cluster --region eu-west-1 \
      --name 2048-game --namespace 2048-game

Create a 2048.yaml to define our namespace, deployment, service and
ingress:

    ---
    apiVersion: v1
    kind: Namespace
    metadata:
      name: "2048-game"
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: "2048-deployment"
      namespace: "2048-game"
    spec:
      selector:
        matchLabels:
          app: "2048"
      replicas: 5
      template:
        metadata:
          labels:
            app: "2048"
        spec:
          containers:
            - image: alexwhen/docker-2048
              imagePullPolicy: Always
              name: "2048"
              ports:
                - containerPort: 80
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: "service-2048"
      namespace: "2048-game"
    spec:
      ports:
        - port: 80
          targetPort: 80
          protocol: TCP
      type: NodePort
      selector:
        app: "2048"
    ---
    apiVersion: extensions/v1beta1
    kind: Ingress
    metadata:
      name: "2048-ingress"
      namespace: "2048-game"
      annotations:
        kubernetes.io/ingress.class: alb
        alb.ingress.kubernetes.io/scheme: internet-facing
        alb.ingress.kubernetes.io/target-type: ip
      labels:
        app: 2048-ingress
    spec:
      rules:
        - http:
            paths:
              - path: /*
                backend:
                  serviceName: "service-2048"
                  servicePort: 80

Apply the manifest file:

    kubectl apply -f 2048.yaml

After a few minutes, verify that the Ingress resource was created with the following command.

    kubectl get ingress/2048-ingress -n 2048-game
    NAME           HOSTS   ADDRESS                                                                   PORTS   AGE
    2048-ingress   *       example-2048game-2048ingr-6fa0-352729433.region-code.elb.amazonaws.com   80      24h

Open a browser and navigate to the ADDRESS URL from the previous command output to see the sample application.

When you finish experimenting with your sample application, delete it with the following command.

    kubectl delete -f 2048.yaml

Route53 Integration

external-dns provisions DNS records based on the host information of services and ingresses. It sets up and manages records in Route 53 that point to the ALBs deployed by the ingress controller.

Create a private hosted zone on Route 53 and associate it with the VPC where EKS is running (https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/hosted-zone-private-creating.html).

Create the service account:

    eksctl create iamserviceaccount --name external-dns --namespace kube-system \
      --cluster eks-cluster \
      --attach-policy-arn arn:aws:iam::aws:policy/AmazonRoute53FullAccess \
      --override-existing-serviceaccounts --approve --region eu-west-1

Create a file external-dns.yaml as follows:

    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: external-dns
      namespace: kube-system
    ---
    apiVersion: rbac.authorization.k8s.io/v1beta1
    kind: ClusterRole
    metadata:
      name: external-dns
    rules:
      - apiGroups: [""]
        resources: ["services"]
        verbs: ["get", "watch", "list"]
      - apiGroups: [""]
        resources: ["pods"]
        verbs: ["get", "watch", "list"]
      - apiGroups: ["extensions"]
        resources: ["ingresses"]
        verbs: ["get", "watch", "list"]
      - apiGroups: [""]
        resources: ["nodes"]
        verbs: ["list"]
    ---
    apiVersion: rbac.authorization.k8s.io/v1beta1
    kind: ClusterRoleBinding
    metadata:
      name: external-dns-viewer
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: external-dns
    subjects:
      - kind: ServiceAccount
        name: external-dns
        namespace: kube-system
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: external-dns
      namespace: kube-system
    spec:
      selector:
        matchLabels:
          app: external-dns
      strategy:
        type: Recreate
      template:
        metadata:
          labels:
            app: external-dns
        spec:
          serviceAccountName: external-dns
          containers:
            - name: external-dns
              # image: registry.opensource.zalan.do/teapot/external-dns:v0.5.18
              image: eu.gcr.io/k8s-artifacts-prod/external-dns/external-dns:v0.7.0
              args:
                - --source=service
                - --source=ingress
                - --domain-filter=my-domain.com # makes ExternalDNS see only the hosted zones matching the provided domain; omit to process all available hosted zones
                - --provider=aws
                - --policy=upsert-only # prevents ExternalDNS from deleting any records; omit to enable full synchronization
                - --aws-zone-type=private # only look at private hosted zones (valid values are public, private, or no value for both)
                - --registry=txt
                - --txt-owner-id=my-identifier
          securityContext:
            fsGroup: 65534
          volumes:
            - name: token-vol
              projected:
                sources:
                  - serviceAccountToken:
                      path: token

Edit the --domain-filter flag to include your hosted zone(s) and deploy external-dns:

    kubectl apply -f external-dns.yaml

Verify that it deployed correctly:

    kubectl get deployment external-dns -n kube-system

In order to check if the integration works, you can add the external-dns.alpha.kubernetes.io/hostname annotation in the 2048 ingress annotations using the domain created in the private hosted zone:

    apiVersion: extensions/v1beta1
    kind: Ingress
    metadata:
      name: "2048-ingress"
      namespace: "2048-game"
      annotations:
        kubernetes.io/ingress.class: alb
        alb.ingress.kubernetes.io/scheme: internet-facing
        alb.ingress.kubernetes.io/target-type: ip
        external-dns.alpha.kubernetes.io/hostname: 2048.my-domain.com # add this new annotation
      labels:
        app: 2048-ingress
    spec:
      rules:
        - http:
            paths:
              - path: /*
                backend:
                  serviceName: "service-2048"
                  servicePort: 80

Then apply it once again with kubectl. You can then check the external-dns pod logs (kubectl logs -n kube-system deployment/external-dns) or the Route53 private hosted zone for the new DNS records.


Horizontal Pod Autoscaler

The Kubernetes Horizontal Pod Autoscaler automatically scales the number of pods in a deployment, replication controller, or replica set based on that resource's CPU utilization. This can help your applications scale out to meet increased demand or scale in when resources are not needed, thus freeing up your worker nodes for other applications. When you set a target CPU utilization percentage, the Horizontal Pod Autoscaler scales your application in or out to try to meet that target.
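
Under the hood, the controller computes desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization). A quick sketch of that arithmetic in shell, with made-up utilization figures:

```shell
# HPA scaling rule: desired = ceil(current * utilization / target)
current_replicas=3
current_utilization=80   # observed average CPU, percent (example value)
target_utilization=50    # target set on the autoscaler, percent
# integer ceiling division
desired=$(( (current_replicas * current_utilization + target_utilization - 1) / target_utilization ))
echo "$desired"   # ceil(3 * 80 / 50) = ceil(4.8) = 5
```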

Install the Kubernetes Metrics Server:

    DOWNLOAD_URL=$(curl -Ls "https://api.github.com/repos/kubernetes-sigs/metrics-server/releases/latest" | jq -r .tarball_url)
    DOWNLOAD_VERSION=$(grep -o '[^/v]*$' <<< $DOWNLOAD_URL)
    curl -Ls $DOWNLOAD_URL -o metrics-server-$DOWNLOAD_VERSION.tar.gz
    mkdir metrics-server-$DOWNLOAD_VERSION
    tar -xzf metrics-server-$DOWNLOAD_VERSION.tar.gz --directory metrics-server-$DOWNLOAD_VERSION --strip-components 1
    kubectl apply -f metrics-server-$DOWNLOAD_VERSION/deploy/1.8+/
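
The grep on the second line keeps everything after the final "/v" of the tarball URL, i.e. the bare version number. For example (the URL below is a hypothetical sample of what the GitHub API returns):

```shell
# Extract the version number from a GitHub tarball URL.
url="https://api.github.com/repos/kubernetes-sigs/metrics-server/tarball/v0.3.6"
version=$(echo "$url" | grep -o '[^/v]*$')
echo "$version"
```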

Verify that the metrics-server deployment is running the desired number of pods with the following command.

    kubectl get deployment metrics-server -n kube-system

Test the autoscaler

To test the autoscaler configuration, let’s run an Apache web server and a stress test against it.

First, run the web server and expose it as a service (the CPU request is needed so that the autoscaler can compute utilization):

    kubectl run httpd --image=httpd --requests=cpu=100m --expose --port 80

Then configure the autoscaler to target 50% CPU utilization, with a minimum of 1 pod and a maximum of 5:

    kubectl autoscale deployment httpd --cpu-percent=50 --min=1 --max=5
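
For reference, kubectl autoscale is shorthand for creating a HorizontalPodAutoscaler object; a sketch of the equivalent manifest:

```yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: httpd
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: httpd
  minReplicas: 1
  maxReplicas: 5
  targetCPUUtilizationPercentage: 50
```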

Describe the autoscaler to crosscheck the configuration:

    kubectl describe hpa/httpd

Run a load test against it:

    kubectl run apache-bench -i --tty --rm --image=httpd -- ab -n 500000 -c 1000 http://httpd.default.svc.cluster.local/

Verify that the pods are scaling out:

    kubectl get horizontalpodautoscaler.autoscaling/httpd

And cleanup:

    kubectl delete deployment.apps/httpd service/httpd horizontalpodautoscaler.autoscaling/httpd

Monitoring with Prometheus and Grafana

We will use Prometheus to record real-time metrics from our cluster and Grafana to create dashboards that display these metrics.

So far we have created an EKS cluster backed by the Fargate engine. As of now, one limitation of AWS Fargate is the lack of persistent storage on either EBS or EFS. If we spin up a Prometheus container on Fargate, it will collect metrics locally in the container, and if the container terminates we will lose all our metrics. The same holds for Grafana: if we deploy a Grafana container on Fargate, our fancy dashboards won't survive a container termination.

In most cases this behavior is not acceptable. To overcome this limitation, we will configure an EKS node group, which creates an Auto Scaling group spanning the VPC AZs with one desired EC2 instance. We will also create an EFS volume to persist our data and make it available in all AZs.

Prometheus

Prometheus is an open-source systems monitoring and alerting toolkit. It records real-time metrics in a time series database built using a HTTP pull model, with flexible queries and real-time alerting.

Option 1: Setup Prometheus without persistence

This setup will deploy a Prometheus container on Fargate, which means that the metrics database will not survive a container termination.

Let's start by creating a Fargate profile and a namespace:

    eksctl create fargateprofile --cluster eks-cluster --region eu-west-1 \
      --name prometheus --namespace prometheus

    kubectl create namespace prometheus

Install Prometheus from Helm Hub without any persistence:

    helm install prometheus stable/prometheus \
      --set alertmanager.persistentVolume.enabled="false" \
      --set server.persistentVolume.enabled="false" \
      -n prometheus

Enable port forwarding:

    kubectl port-forward -n prometheus deploy/prometheus-server 8080:9090

Verify the installation by navigating with your browser to http://localhost:8080/targets.

Option 2: Setup Prometheus with persistence

EKS node group definition

In an eksctl-nodegroup.yaml file we define the new managed node group:

    apiVersion: eksctl.io/v1alpha5
    kind: ClusterConfig
    metadata:
      name: eks-cluster
      region: eu-west-1
    managedNodeGroups:
      - name: ng-1-monitoring
        labels: { role: monitoring }
        instanceType: t3.small
        desiredCapacity: 1
        minSize: 1
        maxSize: 2

And we create the managed node group:

    eksctl create nodegroup --config-file=eksctl-nodegroup.yaml

Verify that the node group is up and running with the following command:

    eksctl get nodegroup --cluster eks-cluster --region eu-west-1

EFS Configuration

Next, we are going to create an EFS filesystem. It will be used by the Prometheus container to store real-time metric data:

    aws efs create-file-system \
      --tags Key=Name,Value=PrometheusFs \
      --region eu-west-1

You will get a response similar to the following:

    {
      "OwnerId": "174112236391",
      "CreationToken": "41d3ee19-9f1d-4b60-859d-937d2e2e1586",
      "FileSystemId": "fs-98132853",
      "CreationTime": "2020-03-31T09:33:57+02:00",
      "LifeCycleState": "available",
      "Name": "PrometheusFs",
      "NumberOfMountTargets": 0,
      "SizeInBytes": {
        "Value": 6144,
        "ValueInIA": 0,
        "ValueInStandard": 6144
      },
      "PerformanceMode": "generalPurpose",
      "Encrypted": false,
      "ThroughputMode": "bursting",
      "Tags": [
        {
          "Key": "Name",
          "Value": "PrometheusFs"
        }
      ]
    }
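
In practice you will want the FileSystemId in a variable for the next steps. Since jq is already a prerequisite of this guide, the response can be parsed as follows (shown here against a trimmed copy of the example response above):

```shell
# Parse the FileSystemId out of the create-file-system response.
# In real use: FS_ID=$(aws efs create-file-system ... | jq -r .FileSystemId)
resp='{"FileSystemId": "fs-98132853", "LifeCycleState": "available", "Name": "PrometheusFs"}'
FS_ID=$(printf '%s' "$resp" | jq -r .FileSystemId)
echo "$FS_ID"
```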

Then we need to create a mount target for each private subnet, specifying the security group of the EKS cluster created with eksctl:

    aws eks describe-cluster --name eks-cluster --region eu-west-1 | jq ".cluster.resourcesVpcConfig.clusterSecurityGroupId"
    "sg-029325a242a081e08"

    aws efs create-mount-target \
      --file-system-id fs-98132853 \
      --subnet-id "subnet-09be91ec7cb1ddbd6" \
      --security-group "sg-029325a242a081e08" \
      --region eu-west-1

    aws efs create-mount-target \
      --file-system-id fs-98132853 \
      --subnet-id "subnet-0e5d1f9e4e203ce75" \
      --security-group "sg-029325a242a081e08" \
      --region eu-west-1

EFS Provisioner

Now that the node group and the EFS filesystem have been created, we move on to the k8s setup. Create a prometheus namespace:

    kubectl create namespace prometheus

In order to have our prometheus container mount the EFS volume we have just created, we need to deploy an efs-provisioner.

The efs-provisioner allows you to mount EFS storage as PersistentVolumes in Kubernetes. It consists of a container that has access to an AWS EFS resource. The container reads a ConfigMap which contains the EFS filesystem ID, the AWS region and the name you want to use for your efs-provisioner. This name will be used later when you create a storage class.

To do that we create a prometheus-efs-manifest.yaml as follows:

    ---
    kind: PersistentVolumeClaim
    apiVersion: v1
    metadata:
      name: prometheus-efs
      namespace: prometheus
      annotations:
        volume.beta.kubernetes.io/storage-class: "prometheus-efs"
    spec:
      accessModes:
        - ReadWriteMany
      resources:
        requests:
          storage: 1Mi
    ---
    kind: StorageClass
    apiVersion: storage.k8s.io/v1
    metadata:
      name: prometheus-efs
    provisioner: prometheus/aws-efs
    ---
    kind: ClusterRole
    apiVersion: rbac.authorization.k8s.io/v1
    metadata:
      name: efs-provisioner-runner
    rules:
      - apiGroups: [""]
        resources: ["persistentvolumes"]
        verbs: ["get", "list", "watch", "create", "delete"]
      - apiGroups: [""]
        resources: ["persistentvolumeclaims"]
        verbs: ["get", "list", "watch", "update"]
      - apiGroups: ["storage.k8s.io"]
        resources: ["storageclasses"]
        verbs: ["get", "list", "watch"]
      - apiGroups: [""]
        resources: ["events"]
        verbs: ["create", "update", "patch"]
    ---
    kind: ClusterRoleBinding
    apiVersion: rbac.authorization.k8s.io/v1
    metadata:
      name: run-efs-provisioner
    subjects:
      - kind: ServiceAccount
        name: efs-provisioner
        # replace with the namespace where the provisioner is deployed
        namespace: prometheus
    roleRef:
      kind: ClusterRole
      name: efs-provisioner-runner
      apiGroup: rbac.authorization.k8s.io
    ---
    kind: Role
    apiVersion: rbac.authorization.k8s.io/v1
    metadata:
      name: leader-locking-efs-provisioner
      namespace: prometheus
    rules:
      - apiGroups: [""]
        resources: ["endpoints"]
        verbs: ["get", "list", "watch", "create", "update", "patch"]
    ---
    kind: RoleBinding
    apiVersion: rbac.authorization.k8s.io/v1
    metadata:
      name: leader-locking-efs-provisioner
      namespace: prometheus
    subjects:
      - kind: ServiceAccount
        name: efs-provisioner
        # replace with the namespace where the provisioner is deployed
        namespace: prometheus
    roleRef:
      kind: Role
      name: leader-locking-efs-provisioner
      apiGroup: rbac.authorization.k8s.io

Then we create a prometheus-efs-deployment.yaml file defining a ConfigMap and a Deployment. In the ConfigMap, replace <FS_ID> in the field file.system.id with your EFS filesystem id (fs-98132853 in our example) and <AWS_REGION> with the region of our cluster; in the Deployment, replace <EFS_URL> with the DNS name of the EFS filesystem, which has the format fs_id.efs.region.amazonaws.com.

    ---
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: efs-provisioner
      namespace: prometheus
    data:
      file.system.id: <FS_ID>
      aws.region: <AWS_REGION>
      provisioner.name: prometheus/aws-efs
      dns.name: ""
    ---
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: efs-provisioner
      namespace: prometheus
    ---
    kind: Deployment
    apiVersion: apps/v1
    metadata:
      name: efs-provisioner
      namespace: prometheus
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: efs-provisioner
      strategy:
        type: Recreate
      template:
        metadata:
          labels:
            app: efs-provisioner
        spec:
          serviceAccount: efs-provisioner
          containers:
            - name: efs-provisioner
              image: quay.io/external_storage/efs-provisioner:latest
              env:
                - name: FILE_SYSTEM_ID
                  valueFrom:
                    configMapKeyRef:
                      name: efs-provisioner
                      key: file.system.id
                - name: AWS_REGION
                  valueFrom:
                    configMapKeyRef:
                      name: efs-provisioner
                      key: aws.region
                - name: DNS_NAME
                  valueFrom:
                    configMapKeyRef:
                      name: efs-provisioner
                      key: dns.name
                      optional: true
                - name: PROVISIONER_NAME
                  valueFrom:
                    configMapKeyRef:
                      name: efs-provisioner
                      key: provisioner.name
              volumeMounts:
                - name: pv-volume
                  mountPath: /persistentvolumes
          volumes:
            - name: pv-volume
              nfs:
                server: <EFS_URL>
                path: /
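
The <EFS_URL> value does not need to be looked up separately; it can be derived from the filesystem id and region already used in this guide:

```shell
# Build the NFS server hostname for the EFS volume from its id and region.
FS_ID=fs-98132853
AWS_REGION=eu-west-1
EFS_URL="${FS_ID}.efs.${AWS_REGION}.amazonaws.com"
echo "$EFS_URL"
```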

And we apply our configuration:

    kubectl apply -f prometheus-efs-manifest.yaml,prometheus-efs-deployment.yaml

Verify the correctness of the deployment:

    kubectl get deployment efs-provisioner -n prometheus
    NAME              READY   UP-TO-DATE   AVAILABLE   AGE
    efs-provisioner   1/1     1            1           3d18h

Prometheus deployment

Once we have our efs-provisioner up and running, we create a
prometheus-helm-config.yaml:

    alertmanager:
      persistentVolume:
        enabled: true
        existingClaim: prometheus-efs
    server:
      persistentVolume:
        enabled: true
        existingClaim: prometheus-efs

Install Prometheus using helm with the configuration above:

    helm install prometheus stable/prometheus -f prometheus-helm-config.yaml -n prometheus

Enable port forwarding:

    kubectl port-forward -n prometheus deploy/prometheus-server 8080:9090

Verify the installation by navigating with your browser to http://localhost:8080/targets.

Leave Prometheus running for some time so that it collects metrics from the cluster. Then, if you kill the pod, the new pod will use the same database and write-ahead log persisted on the shared EFS filesystem.

Grafana

Grafana is an open source analytics and monitoring solution that we will use to display the metrics stored in Prometheus.

Likewise, deploying Grafana on EKS on Fargate will not persist your dashboards after a pod restart or termination. If that works for you, continue with Option 1; otherwise follow Option 2.

Option 1: Setup Grafana without persistence

Let's start by creating a namespace:

    kubectl create namespace grafana

Create a grafana-config.yaml file to configure the Prometheus datasource and two sample dashboards:

    persistence:
      enabled: false
    datasources:
      datasources.yaml:
        apiVersion: 1
        datasources:
          - name: Prometheus
            type: prometheus
            url: http://prometheus-server.prometheus.svc.cluster.local
            access: proxy
            isDefault: true
    dashboardProviders:
      dashboardproviders.yaml:
        apiVersion: 1
        providers:
          - name: "default"
            orgId: 1
            folder: ""
            type: file
            disableDeletion: false
            editable: true
            options:
              path: /var/lib/grafana/dashboards/default
    dashboards:
      default:
        kube:
          gnetId: 8588
          revision: 1
          datasource: Prometheus
        prometheus-stats:
          gnetId: 2
          revision: 2
          datasource: Prometheus
Install Grafana from the Helm Hub without any persistence:

    helm install grafana stable/grafana --namespace grafana -f grafana-config.yaml --set adminPassword='admin123'

Deploy a Grafana ingress:

    apiVersion: extensions/v1beta1
    kind: Ingress
    metadata:
      name: "grafana"
      namespace: "grafana"
      annotations:
        kubernetes.io/ingress.class: alb
        alb.ingress.kubernetes.io/scheme: internet-facing
        alb.ingress.kubernetes.io/target-type: ip
        external-dns.alpha.kubernetes.io/hostname: grafana.my-domain.com
      labels:
        app: grafana
    spec:
      rules:
        - http:
            paths:
              - path: /*
                backend:
                  serviceName: grafana
                  servicePort: 80

Get the ingress properties for Grafana:

    kubectl get ingress -n grafana

The output shows the URL of the ALB:

    NAME      HOSTS   ADDRESS                                                               PORTS   AGE
    grafana   *       6318d69f-grafana-grafana-fdb1-895535036.eu-west-1.elb.amazonaws.com   80      3d17h

Option 2: Setup Grafana with persistence

This setup is exactly the same as the one done for Prometheus Option 2. The only difference is that there is no need to create another EKS node group, as both Prometheus and Grafana will be deployed in the same one.

Create a new k8s namespace for Grafana as follows:

    kubectl create namespace grafana

Then you will need to create another EFS filesystem and deploy another EFS Provisioner to mount the new filesystem.

Follow the EFS Configuration and EFS Provisioner steps described in the Prometheus section, replacing every occurrence of the string “prometheus“ with “grafana“ in both the YAML files and the command lines. Then create and apply grafana-efs-manifest.yaml and grafana-efs-deployment.yaml using the grafana namespace and the grafana EFS filesystem.

Create a grafana-config.yaml file to configure the datasource and add a couple of dashboards:

    replicas: 1
    persistence:
      enabled: true
      existingClaim: grafana-efs
    datasources:
      datasources.yaml:
        apiVersion: 1
        datasources:
          - name: Prometheus
            type: prometheus
            url: http://prometheus-server.prometheus.svc.cluster.local
            access: proxy
            isDefault: true
    dashboardProviders:
      dashboardproviders.yaml:
        apiVersion: 1
        providers:
          - name: "default"
            orgId: 1
            folder: ""
            type: file
            disableDeletion: false
            editable: true
            options:
              path: /var/lib/grafana/dashboards/default
    dashboards:
      default:
        kube:
          gnetId: 8588
          revision: 1
          datasource: Prometheus
        prometheus-stats:
          gnetId: 2
          revision: 2
          datasource: Prometheus

Install Grafana from the Helm Hub, this time with persistence enabled:

    helm install grafana stable/grafana --namespace grafana -f grafana-config.yaml --set adminPassword='admin123'

Deploy a Grafana ingress:

    apiVersion: extensions/v1beta1
    kind: Ingress
    metadata:
      name: "grafana"
      namespace: "grafana"
      annotations:
        kubernetes.io/ingress.class: alb
        alb.ingress.kubernetes.io/scheme: internet-facing
        alb.ingress.kubernetes.io/target-type: ip
        external-dns.alpha.kubernetes.io/hostname: grafana.my-domain.com
      labels:
        app: grafana
    spec:
      rules:
        - http:
            paths:
              - path: /*
                backend:
                  serviceName: grafana
                  servicePort: 80

Get the ingress properties for Grafana:

    kubectl get ingress -n grafana

The output shows the URL of the ALB:

    NAME      HOSTS   ADDRESS                                                               PORTS   AGE
    grafana   *       6318d69f-grafana-grafana-fdb1-895535036.eu-west-1.elb.amazonaws.com   80      3d17h

Logging on CloudWatch Logs

Currently EKS on Fargate does not support Kubernetes DaemonSets, therefore CloudWatch Container Insights cannot be used. To send container logs to CloudWatch with EKS on Fargate, we need to create a new Service Account and attach an IAM policy that allows it to do so.

    eksctl create iamserviceaccount --name cwl-fargate \
      --namespace default --cluster eks-cluster \
      --attach-policy-arn arn:aws:iam::aws:policy/CloudWatchFullAccess \
      --approve --region eu-west-1

On top of that, we will need a forwarder to forward logs from k8s to CloudWatch.

Fluent Bit is an open source, multi-platform log processor and forwarder that collects data and logs from different sources, unifies them, and sends them to multiple destinations. It’s fully compatible with Docker and Kubernetes environments.

First, let’s create a fluentbit-config.yaml holding the ConfigMap that configures Fluent Bit. Replace <AWS_REGION> with the region of the cluster, and replace <LOG_GROUP_NAME> and <LOG_STREAM_PREFIX> with a log group name and a log stream prefix of your choice, respectively:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: fluentbit-config
    data:
      # Configuration files: server, input, filters and output
      # ======================================================
      fluent-bit.conf: |
        [INPUT]
            Name              tail
            Tag               *.logs
            Path              /var/log/*.log
            DB                /var/log/logs.db
            Mem_Buf_Limit     5MB
            Skip_Long_Lines   On
            Refresh_Interval  10
        [OUTPUT]
            Name              cloudwatch
            Match             *
            region            <AWS_REGION>
            log_group_name    <LOG_GROUP_NAME>
            log_stream_prefix <LOG_STREAM_PREFIX>
            auto_create_group true

Next we configure a test pod that uses the fluent-bit ConfigMap above. The main container writes its output both to standard output and to a log file, and a sidecar container running the fluent-bit image forwards that file to CloudWatch:

fluentbit-sidecar.yaml

    apiVersion: v1
    kind: Pod
    metadata:
      name: counter
    spec:
      serviceAccountName: cwl-fargate
      containers:
        - name: count
          image: busybox
          args:
            - /bin/sh
            - -c
            - >
              i=0;
              while true;
              do
                echo "$i: $(date) this is an app log" 2>&1 | tee -a /var/log/app.log;
                echo "$(date) $(uname -r) $i" 2>&1 | tee -a /var/log/system.log;
                i=$((i+1));
                sleep 60;
              done
          volumeMounts:
            - name: varlog
              mountPath: /var/log
        - name: count-agent
          image: amazon/aws-for-fluent-bit:latest
          imagePullPolicy: Always
          ports:
            - containerPort: 2020
          env:
            - name: FLUENTD_HOST
              value: "fluentd"
            - name: FLUENTD_PORT
              value: "24224"
          volumeMounts:
            - name: varlog
              mountPath: /var/log
            - name: fluentbit-config
              mountPath: /fluent-bit/etc/
      terminationGracePeriodSeconds: 10
      volumes:
        - name: varlog
          emptyDir: {}
        - name: fluentbit-config
          configMap:
            name: fluentbit-config
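Note the dual-write trick in the count container: tee -a sends each line to standard output (visible via kubectl logs) and appends it to the shared log file that the Fluent Bit sidecar tails. A minimal local sketch of the pattern, using a temporary file in place of /var/log/app.log:

```shell
# Each echo goes to stdout AND is appended to the log file by `tee -a`,
# mirroring what the "count" container does with /var/log/app.log.
LOG=$(mktemp)
for i in 0 1; do
  echo "$i: $(date) this is an app log" | tee -a "$LOG"
done
grep -c '' "$LOG"   # two lines were appended to the file
```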

Apply:

    kubectl apply -f fluentbit-config.yaml,fluentbit-sidecar.yaml

After a few minutes, check the CloudWatch console: you should see the new log group and the logs generated by the test app.


AWS Secrets Manager Integration

Most likely your application needs secrets (API keys, database credentials, tokens). Kubernetes does not provide a full-featured secrets management system out of the box, so we need to integrate an external one.

AWS Secrets Manager is a secrets management system used to store, rotate, monitor, and control access to secrets.

At the time of this writing, there is no integration between EKS and AWS Secrets Manager. Fortunately GoDaddy wrote their own integration and open sourced it on GitHub.

For each secret in AWS Secrets Manager, we need to define an ExternalSecret. An External Secrets controller deployed in our cluster fetches the secret from AWS Secrets Manager and translates it into a Secret entity. This process is totally transparent to the Pod, which can access the Secret normally.


Let’s start by creating a new secret in AWS Secrets Manager:

    aws secretsmanager create-secret --region eu-west-1 --name secret-consumer-service/credentials \
      --secret-string '{"username":"admin","password":"1234"}'
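To double-check what was stored, you can read the secret back with `aws secretsmanager get-secret-value` and pick a field out of the JSON with jq (already a prerequisite). Since the AWS call needs live credentials, the jq step is shown here on the literal secret string:

```shell
# On a live account you would pipe the output of:
#   aws secretsmanager get-secret-value --region eu-west-1 \
#     --secret-id secret-consumer-service/credentials \
#     --query SecretString --output text
# into jq. The extraction itself, on the literal secret string:
SECRET_STRING='{"username":"admin","password":"1234"}'
echo "$SECRET_STRING" | jq -r .username   # prints: admin
```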

Then we define a scm-iam-policy.json to allow access to our secrets:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "secretsmanager:GetResourcePolicy",
            "secretsmanager:GetSecretValue",
            "secretsmanager:DescribeSecret",
            "secretsmanager:ListSecretVersionIds"
          ],
          "Resource": ["arn:aws:secretsmanager:eu-west-1:<ACCOUNT_ID>:secret:*"]
        }
      ]
    }
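A small sanity check before handing the document to IAM: jq exits non-zero on malformed JSON, so piping the file through it catches copy/paste errors early. The heredoc below recreates the policy so the snippet is self-contained (use your own account id in place of the <ACCOUNT_ID> placeholder):

```shell
# Recreate the policy file and validate it; jq fails loudly on bad JSON.
cat > scm-iam-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "secretsmanager:GetResourcePolicy",
        "secretsmanager:GetSecretValue",
        "secretsmanager:DescribeSecret",
        "secretsmanager:ListSecretVersionIds"
      ],
      "Resource": ["arn:aws:secretsmanager:eu-west-1:<ACCOUNT_ID>:secret:*"]
    }
  ]
}
EOF
jq -e '.Statement[0].Action | length' scm-iam-policy.json   # prints: 4
```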

And create the IAM policy

    aws iam create-policy --policy-name EKSSecretsManagerIAMPolicy \
      --policy-document file://scm-iam-policy.json

Create a ServiceAccount secrets-manager-sa.yaml:

    apiVersion: v1
    kind: ServiceAccount
    metadata:
      labels:
        app.kubernetes.io/name: secrets-manager
      name: secrets-manager
      namespace: kube-system

Apply it:

    kubectl apply -f secrets-manager-sa.yaml

Create an IAM role for the external secrets controller and attach it to the service account created in the previous step. Replace <ACCOUNT_ID> with the id of your AWS account and <AWS_REGION> with the region where the cluster has been created:

    eksctl create iamserviceaccount --region <AWS_REGION> \
      --name secrets-manager \
      --namespace kube-system \
      --cluster eks-cluster \
      --attach-policy-arn arn:aws:iam::<ACCOUNT_ID>:policy/EKSSecretsManagerIAMPolicy \
      --override-existing-serviceaccounts \
      --approve

Add the kubernetes-external-secrets repository to helm:

    helm repo add external-secrets https://godaddy.github.io/kubernetes-external-secrets/
    helm repo update

Create a kubernetes-external-secrets-config.yaml to configure kubernetes-external-secrets:

    serviceAccount:
      create: false
      name: secrets-manager
    env:
      AWS_REGION: eu-west-1
    securityContext:
      fsGroup: 65534

and deploy kubernetes-external-secrets:

    helm install external-secrets external-secrets/kubernetes-external-secrets \
      -f ./kubernetes-external-secrets-config.yaml

Now that we have configured kubernetes-external-secrets, let’s see how to use it.

We create an external-secrets.yaml file to map our ExternalSecrets to the secrets defined in AWS Secrets Manager:

    apiVersion: "kubernetes-client.io/v1"
    kind: ExternalSecret
    metadata:
      name: secret-consumer-service
    spec:
      backendType: secretsManager
      data:
        - key: secret-consumer-service/credentials
          name: password
          property: password
        - key: secret-consumer-service/credentials
          name: username
          property: username

and apply it:

    kubectl apply -f external-secrets.yaml

You can verify that the secret has been correctly imported by running:

    kubectl get secret secret-consumer-service -o=yaml
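Note that kubectl shows Secret values base64-encoded. To recover the plaintext of a single key you can combine jsonpath with base64 -d; the decode step is demonstrated below on a literal value, since the kubectl call needs a live cluster:

```shell
# On the cluster you would run:
#   kubectl get secret secret-consumer-service -o jsonpath='{.data.username}' | base64 -d
# The encode/decode round-trip itself, on a literal value:
encoded=$(printf 'admin' | base64)
printf '%s' "$encoded" | base64 -d   # prints: admin
```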

We create a test pod secrets-consumer-pod.yaml that logs our secrets to standard output (do not do this in production code). It reads the Secret keys username and password from secret-consumer-service and maps them to two environment variables to be used by our application:

    apiVersion: v1
    kind: Pod
    metadata:
      name: secrets-consumer
    spec:
      containers:
        - name: secrets-consumer
          image: busybox
          args:
            - /bin/sh
            - -c
            - >
              while true;
              do
                echo "Username: $SECRET_USERNAME";
                echo "Password: $SECRET_PASSWORD";
                sleep 60;
              done
          env:
            - name: SECRET_USERNAME
              valueFrom:
                secretKeyRef:
                  name: secret-consumer-service
                  key: username
            - name: SECRET_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: secret-consumer-service
                  key: password
      restartPolicy: Never
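Inside the container, the secretKeyRef values arrive as ordinary environment variables, so the busybox loop only has to expand them. A local sketch of what the container sees (values hardcoded here purely for illustration):

```shell
# Simulate the injected environment and run the same expansion the pod performs.
export SECRET_USERNAME=admin SECRET_PASSWORD=1234
line=$(sh -c 'echo "Username: $SECRET_USERNAME"')
echo "$line"   # prints: Username: admin
```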

Apply it:

    kubectl apply -f secrets-consumer-pod.yaml

If you check the logs with kubectl logs secrets-consumer, you should see the secrets:

    secrets-consumer Username: admin
    secrets-consumer Password: 1234