Project author: Fred78290

Project description:
Kubernetes autoscaler for vSphere
Language: Go
Repository: git://github.com/Fred78290/kubernetes-vmware-autoscaler.git
Created: 2018-12-26T20:15:52Z
Project community: https://github.com/Fred78290/kubernetes-vmware-autoscaler

License: Apache License 2.0

kubernetes-vmware-autoscaler

Kubernetes autoscaler for vSphere/ESXi, including a custom resource controller that creates managed nodes without writing code

Supported releases

  • 1.26.11
    • This version supports Kubernetes v1.26 and the k3s, rke2, and external Kubernetes distributions
  • 1.27.9
    • This version supports Kubernetes v1.27 and the k3s, rke2, and external Kubernetes distributions
  • 1.28.4
    • This version supports Kubernetes v1.28 and the k3s, rke2, and external Kubernetes distributions
  • 1.29.0
    • This version supports Kubernetes v1.29 and the k3s, rke2, and external Kubernetes distributions

How it works

This tool drives vSphere to deploy VMs on demand. The cluster autoscaler deployment uses either the vanilla cluster-autoscaler or my enhanced version of cluster-autoscaler.

This version uses gRPC to communicate with the cloud provider hosted outside the pod. A Docker image is available here: cluster-autoscaler

A sample cluster-autoscaler deployment is available at examples/cluster-autoscaler.yaml. You must fill in the values between <>.

Before you start, you must create a Kubernetes cluster on vSphere.

You can do it from scratch, or you can use the scripts from the autoscaled-masterkube-vmware project to create a Kubernetes cluster with a single control plane or in HA mode with 3 control planes.

Commandline arguments

| Parameter | Description |
| --- | --- |
| version | Display version and exit |
| save | Tell the tool to save its state in this file |
| config | Tell the tool to use this config file |
| log-format | The format in which log messages are printed (default: text, options: text, json) |
| log-level | Set the level of logging (default: info, options: panic, debug, info, warning, error, fatal) |
| debug | Debug mode |
| distribution | Which Kubernetes distribution to use: kubeadm, k3s, rke2, external |
| use-vanilla-grpc | Use the vanilla autoscaler externalgrpc cloud provider |
| use-controller-manager | Use the vSphere controller manager |
| use-external-etcd | Use an external etcd service (overridden by the config file if defined) |
| src-etcd-ssl-dir | Location of the source etcd SSL files (overridden by the config file if defined) |
| dst-etcd-ssl-dir | Location of the destination etcd SSL files (overridden by the config file if defined) |
| kubernetes-pki-srcdir | Location of the source Kubernetes PKI files (overridden by the config file if defined) |
| kubernetes-pki-dstdir | Location of the destination Kubernetes PKI files (overridden by the config file if defined) |
| server | The Kubernetes API server to connect to (default: auto-detect) |
| kubeconfig | Retrieve the target cluster configuration from a Kubernetes configuration file (default: auto-detect) |
| request-timeout | Request timeout when calling Kubernetes APIs. 0s means no timeout |
| deletion-timeout | Timeout when deleting a node. 0s means no timeout |
| node-ready-timeout | Time to wait for a node to become ready. 0s means no timeout |
| max-grace-period | Maximum time evicted pods will be given to terminate gracefully |
| min-cpus | Limits: minimum cpu (default: 1) |
| max-cpus | Limits: max cpu (default: 24) |
| min-memory | Limits: minimum memory in MB (default: 1G) |
| max-memory | Limits: max memory in MB (default: 24G) |
| min-managednode-cpus | Managed node: minimum cpu (default: 2) |
| max-managednode-cpus | Managed node: max cpu (default: 32) |
| min-managednode-memory | Managed node: minimum memory in MB (default: 2G) |
| max-managednode-memory | Managed node: max memory in MB (default: 24G) |
| min-managednode-disksize | Managed node: minimum disk size in MB (default: 10MB) |
| max-managednode-disksize | Managed node: max disk size in MB (default: 1T) |
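
To make the table more concrete, here is a minimal sketch of how a few of these parameters could be declared and checked with Go's standard flag package. It is not the tool's actual argument handling, which may use a different library. The Options struct, the parseOptions function, the "kubeadm" default for distribution (the table gives none), and the min/max sanity check are all assumptions made for illustration.

```go
package main

import (
	"flag"
	"fmt"
)

// Options holds an illustrative subset of the parameters in the table above.
// The struct and its field names are hypothetical, not the tool's own types.
type Options struct {
	Distribution string
	LogFormat    string
	LogLevel     string
	MinCPUs      int
	MaxCPUs      int
}

// parseOptions parses args with the standard flag package and applies two
// sanity checks: a known distribution and min-cpus <= max-cpus.
func parseOptions(args []string) (Options, error) {
	var o Options
	fs := flag.NewFlagSet("vmware-autoscaler", flag.ContinueOnError)
	// The table gives no default for distribution; "kubeadm" is an assumption.
	fs.StringVar(&o.Distribution, "distribution", "kubeadm", "kubeadm, k3s, rke2 or external")
	fs.StringVar(&o.LogFormat, "log-format", "text", "text or json")
	fs.StringVar(&o.LogLevel, "log-level", "info", "panic, debug, info, warning, error or fatal")
	fs.IntVar(&o.MinCPUs, "min-cpus", 1, "minimum cpu")
	fs.IntVar(&o.MaxCPUs, "max-cpus", 24, "max cpu")
	if err := fs.Parse(args); err != nil {
		return o, err
	}
	switch o.Distribution {
	case "kubeadm", "k3s", "rke2", "external":
	default:
		return o, fmt.Errorf("unsupported distribution %q", o.Distribution)
	}
	if o.MinCPUs > o.MaxCPUs {
		return o, fmt.Errorf("min-cpus %d exceeds max-cpus %d", o.MinCPUs, o.MaxCPUs)
	}
	return o, nil
}

func main() {
	o, err := parseOptions([]string{"--distribution=rke2", "--max-cpus=8"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", o)
}
```

The flag package accepts both the -name and --name forms, so either spelling works with this sketch.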

Build

The build process uses a Makefile. The simplest way to build is: make container

New features

Use k3s, rke2 or external as kubernetes distribution method

Instead of using kubeadm as the Kubernetes distribution method, it is possible to use k3s, rke2, or external.

external allows you to use a custom shell script to join the cluster.

Samples provided here

Use the vanilla autoscaler with the external gRPC cloud provider

You can also use the vanilla autoscaler with the externalgrpc cloud provider.

Samples of the cluster-autoscaler deployment with the vanilla autoscaler. You must fill in the values between <>.

Use external kubernetes distribution

When you use a custom method to create your cluster, you must provide a shell script that vmware-autoscaler runs to join the cluster. The script uses a YAML config created by vmware-autoscaler at the given path.

config: /etc/default/vmware-autoscaler-config.yaml

    provider-id: vsphere://42373f8d-b72d-21c0-4299-a667a18c9fce
    max-pods: 110
    node-name: vmware-dev-rke2-woker-01
    server: 192.168.1.120:9345
    token: K1060b887525bbfa7472036caa8a3c36b550fbf05e6f8e3dbdd970739cbd7373537
    disable-cloud-controller: false

If you declare the use of an external etcd service:

    datastore-endpoint: https://1.2.3.4:2379
    datastore-cafile: /etc/ssl/etcd/ca.pem
    datastore-certfile: /etc/ssl/etcd/etcd.pem
    datastore-keyfile: /etc/ssl/etcd/etcd-key.pem

You can also provide extra configuration for this file:

    {
      "external": {
        "join-command": "/usr/local/bin/join-cluster.sh",
        "config-path": "/etc/default/vmware-autoscaler-config.yaml",
        "extra-config": {
          "mydata": {
            "extra": "ball"
          },
          "...": "..."
        }
      }
    }
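
As a rough illustration of the shape of this section, the sketch below decodes it with Go's encoding/json. The ExternalConfig and fileConfig types and the parseExternal function are hypothetical names, and treating join-command as required is an assumption. Only the JSON keys come from the sample above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ExternalConfig mirrors the keys of the "external" section shown above.
// The Go type and field names are hypothetical.
type ExternalConfig struct {
	JoinCommand string                 `json:"join-command"`
	ConfigPath  string                 `json:"config-path"`
	ExtraConfig map[string]interface{} `json:"extra-config"`
}

// fileConfig wraps the section under its top-level "external" key.
type fileConfig struct {
	External ExternalConfig `json:"external"`
}

// sample is a strict-JSON version of the example, without the elided entry.
const sample = `{
  "external": {
    "join-command": "/usr/local/bin/join-cluster.sh",
    "config-path": "/etc/default/vmware-autoscaler-config.yaml",
    "extra-config": {"mydata": {"extra": "ball"}}
  }
}`

// parseExternal decodes the section and rejects an empty join-command
// (assumption: the join command is mandatory for the external distribution).
func parseExternal(data []byte) (ExternalConfig, error) {
	var cfg fileConfig
	if err := json.Unmarshal(data, &cfg); err != nil {
		return ExternalConfig{}, err
	}
	if cfg.External.JoinCommand == "" {
		return ExternalConfig{}, fmt.Errorf("external.join-command is required")
	}
	return cfg.External, nil
}

func main() {
	cfg, err := parseExternal([]byte(sample))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(cfg.JoinCommand, cfg.ConfigPath, cfg.ExtraConfig)
}
```

Keeping extra-config as a generic map means arbitrary user data passes through without the decoder having to know its schema.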

Your script is responsible for setting the correct kubelet flags, such as max-pods=110, provider-id=vsphere://42373f8d-b72d-21c0-4299-a667a18c9fce, cloud-provider=external, …

Annotations requirements

If you plan to use vmware-autoscaler on an already deployed Kubernetes cluster, you must add some annotations to the existing nodes.

Also, don’t forget to create an image that vmware-autoscaler can use to scale up the cluster: create-image.sh

| Annotation | Description | Value |
| --- | --- | --- |
| cluster-autoscaler.kubernetes.io/scale-down-disabled | Avoid scale down for this node | true |
| cluster.autoscaler.nodegroup/name | Node group name | vmware-dev-rke2 |
| cluster.autoscaler.nodegroup/autoprovision | Whether the node is provisioned by vmware-autoscaler | false |
| cluster.autoscaler.nodegroup/instance-id | The VM UUID | 42373f8d-b72d-21c0-4299-a667a18c9fce |
| cluster.autoscaler.nodegroup/managed | Whether the node is managed by vmware-autoscaler but not autoscaled | false |
| cluster.autoscaler.nodegroup/node-index | The node index, will be set if missing | 0 |

Sample master node

    cluster-autoscaler.kubernetes.io/scale-down-disabled: "true"
    cluster.autoscaler.nodegroup/autoprovision: "false"
    cluster.autoscaler.nodegroup/instance-id: 42373f8d-b72d-21c0-4299-a667a18c9fce
    cluster.autoscaler.nodegroup/managed: "false"
    cluster.autoscaler.nodegroup/name: vmware-dev-rke2
    cluster.autoscaler.nodegroup/node-index: "0"

Sample first worker node

    cluster-autoscaler.kubernetes.io/scale-down-disabled: "true"
    cluster.autoscaler.nodegroup/autoprovision: "false"
    cluster.autoscaler.nodegroup/instance-id: 42370879-d4f7-eab0-a1c2-918a97ac6856
    cluster.autoscaler.nodegroup/managed: "false"
    cluster.autoscaler.nodegroup/name: vmware-dev-rke2
    cluster.autoscaler.nodegroup/node-index: "1"

Sample autoscaled worker node

    cluster-autoscaler.kubernetes.io/scale-down-disabled: "false"
    cluster.autoscaler.nodegroup/autoprovision: "true"
    cluster.autoscaler.nodegroup/instance-id: 3d25c629-3f1d-46b3-be9f-b95db2a64859
    cluster.autoscaler.nodegroup/managed: "false"
    cluster.autoscaler.nodegroup/name: vmware-dev-rke2
    cluster.autoscaler.nodegroup/node-index: "2"
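
The three samples above differ only in a handful of values, so the annotation set can be produced from a small record per node. The sketch below shows one possible way to do that. The NodeInfo type and nodeAnnotations function are hypothetical helpers. Only the annotation keys and sample values come from this document.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
)

// NodeInfo is a hypothetical record describing one existing node.
type NodeInfo struct {
	NodeGroup         string
	InstanceID        string // the VM UUID
	Index             int
	AutoProvisioned   bool
	Managed           bool
	ScaleDownDisabled bool
}

// nodeAnnotations builds the annotation map listed in the table above.
// Annotation values are always strings, so booleans and ints are formatted.
func nodeAnnotations(n NodeInfo) map[string]string {
	return map[string]string{
		"cluster-autoscaler.kubernetes.io/scale-down-disabled": strconv.FormatBool(n.ScaleDownDisabled),
		"cluster.autoscaler.nodegroup/name":                    n.NodeGroup,
		"cluster.autoscaler.nodegroup/autoprovision":           strconv.FormatBool(n.AutoProvisioned),
		"cluster.autoscaler.nodegroup/instance-id":             n.InstanceID,
		"cluster.autoscaler.nodegroup/managed":                 strconv.FormatBool(n.Managed),
		"cluster.autoscaler.nodegroup/node-index":              strconv.Itoa(n.Index),
	}
}

func main() {
	master := NodeInfo{
		NodeGroup:         "vmware-dev-rke2",
		InstanceID:        "42373f8d-b72d-21c0-4299-a667a18c9fce",
		Index:             0,
		ScaleDownDisabled: true,
	}
	ann := nodeAnnotations(master)
	// Sort the keys so the output order is stable.
	keys := make([]string, 0, len(ann))
	for k := range ann {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Printf("%s=%s\n", k, ann[k])
	}
}
```

Each printed key=value pair is in the form that kubectl annotate node accepts, so the output can be passed to it for an existing node.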

Node labels

These labels will be added

| Label | Description | Value |
| --- | --- | --- |
| node-role.kubernetes.io/control-plane | Tell if the node is control-plane | true |
| node-role.kubernetes.io/master | Tell if the node is master | true |
| node-role.kubernetes.io/worker | Tell if the node is worker | true |

Network

It is now possible to disable the default DHCP routes and to declare custom routes.

VMWare CPI compliant

Versions 1.24.6, 1.25.2, and above are compliant with the vSphere cloud provider: they build a provider-id that conforms to the syntax vsphere://<VM UUID>
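
The provider-id syntax is simple enough to illustrate with a pair of helpers. This is a minimal sketch, assuming only the vsphere://<VM UUID> form stated above. The function names buildProviderID and parseProviderID are hypothetical, and no validation of the UUID format itself is attempted.

```go
package main

import (
	"fmt"
	"strings"
)

// providerIDPrefix is the scheme part of the documented syntax.
const providerIDPrefix = "vsphere://"

// buildProviderID prepends the scheme to a VM UUID.
func buildProviderID(vmUUID string) string {
	return providerIDPrefix + vmUUID
}

// parseProviderID extracts the VM UUID, rejecting other schemes and empty UUIDs.
func parseProviderID(id string) (string, error) {
	if !strings.HasPrefix(id, providerIDPrefix) {
		return "", fmt.Errorf("provider-id %q does not start with %q", id, providerIDPrefix)
	}
	uuid := strings.TrimPrefix(id, providerIDPrefix)
	if uuid == "" {
		return "", fmt.Errorf("provider-id %q has an empty VM UUID", id)
	}
	return uuid, nil
}

func main() {
	id := buildProviderID("42373f8d-b72d-21c0-4299-a667a18c9fce")
	uuid, err := parseProviderID(id)
	fmt.Println(id, uuid, err)
}
```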

CRD controller

This release includes a CRD controller that lets you create Kubernetes nodes without using govc or code. Just by applying a configuration file, you can create nodes on the fly.

As an example, take a look at artifacts/examples/example.yaml and execute the following command to create a new node:

    kubectl apply -f artifacts/examples/example.yaml

If you want to delete the node, just delete the custom resource with:

    kubectl delete -f artifacts/examples/example.yaml

You can also create a control plane instead of a worker:

    kubectl apply -f artifacts/examples/controlplane.yaml

The resource is cluster-scoped, so you don’t need a namespace. The name of the resource is not the name of the managed node.

The minimal resource declaration

    apiVersion: "nodemanager.aldunelabs.com/v1alpha1"
    kind: "ManagedNode"
    metadata:
      name: "vmware-ca-k8s-managed-01"
    spec:
      nodegroup: vmware-ca-k8s
      vcpus: 2
      memorySizeInMb: 2048
      diskSizeInMb: 10240

The fully qualified resource, including a networks declaration that overrides the default controller network management and adds some node labels and annotations. If you declare the managed node as a control plane, you can also allow the control plane to accept deployments like a worker node.

    apiVersion: "nodemanager.aldunelabs.com/v1alpha1"
    kind: "ManagedNode"
    metadata:
      name: "vmware-ca-k8s-managed-01"
    spec:
      nodegroup: vmware-ca-k8s
      controlPlane: false
      allowDeployment: false
      vcpus: 2
      memorySizeInMb: 2048
      diskSizeInMb: 10240
      labels:
      - demo-label.acme.com=demo
      - sample-label.acme.com=sample
      annotations:
      - demo-annotation.acme.com=demo
      - sample-annotation.acme.com=sample
      networks:
        -
          network: "VM Network"
          address: 10.0.0.80
          netmask: 255.255.255.0
          gateway: 10.0.0.1
          use-dhcp-routes: false
          routes:
            - to: x.x.0.0/16
              via: 10.0.0.253
              metric: 100
            - to: y.y.y.y/8
              via: 10.0.0.253
              metric: 500
        -
          network: "VM Private"
          address: 192.168.1.80
          netmask: 255.255.255.0
          use-dhcp-routes: false

Declare additional routes and disable default DHCP routes

Release 1.24 and above allow you to add additional routes per interface, and also to disable the default routes declared by the DHCP server.
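
A route in this document is described by three fields: to, via, and metric. As a small sketch of what checking such an entry could look like before it is applied, the example below validates one route with Go's net package. The Route type and validateRoute function are hypothetical, and rejecting negative metrics is an assumption.

```go
package main

import (
	"fmt"
	"net"
)

// Route is a hypothetical Go view of one "routes" entry (to, via, metric).
type Route struct {
	To     string
	Via    string
	Metric int
}

// validateRoute checks that the destination is a CIDR, the gateway is an
// IP address, and the metric is not negative.
func validateRoute(r Route) error {
	if _, _, err := net.ParseCIDR(r.To); err != nil {
		return fmt.Errorf("route destination %q is not a CIDR: %w", r.To, err)
	}
	if net.ParseIP(r.Via) == nil {
		return fmt.Errorf("route gateway %q is not an IP address", r.Via)
	}
	if r.Metric < 0 {
		return fmt.Errorf("route metric %d is negative", r.Metric)
	}
	return nil
}

func main() {
	routes := []Route{
		{To: "172.30.0.0/16", Via: "10.0.0.5", Metric: 500},
		{To: "x.x.0.0/16", Via: "10.0.0.253", Metric: 100}, // placeholder, rejected
	}
	for _, r := range routes {
		if err := validateRoute(r); err != nil {
			fmt.Println("invalid:", err)
			continue
		}
		fmt.Printf("ok: %s via %s metric %d\n", r.To, r.Via, r.Metric)
	}
}
```

Placeholder destinations such as x.x.0.0/16 fail this check, which is a reminder to replace them with real networks.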

The following example configuration was generated by the autoscaled-masterkube-vmware scripts:

    {
      "use-external-etcd": false,
      "src-etcd-ssl-dir": "/etc/etcd/ssl",
      "dst-etcd-ssl-dir": "/etc/kubernetes/pki/etcd",
      "kubernetes-pki-srcdir": "/etc/kubernetes/pki",
      "kubernetes-pki-dstdir": "/etc/kubernetes/pki",
      "distribution": "rke2",
      "network": "unix",
      "listen": "/var/run/cluster-autoscaler/vmware.sock",
      "cert-private-key": "/etc/ssl/client-cert/tls.key",
      "cert-public-key": "/etc/ssl/client-cert/tls.crt",
      "cert-ca": "/etc/ssl/client-cert/ca.crt",
      "secret": "vmware",
      "minNode": 0,
      "maxNode": 9,
      "maxNode-per-cycle": 2,
      "node-name-prefix": "autoscaled",
      "managed-name-prefix": "managed",
      "controlplane-name-prefix": "master",
      "nodePrice": 0,
      "podPrice": 0,
      "image": "jammy-kubernetes-cni-flannel-v1.27.8-containerd-amd64",
      "optionals": {
        "pricing": false,
        "getAvailableMachineTypes": false,
        "newNodeGroup": false,
        "templateNodeInfo": false,
        "createNodeGroup": false,
        "deleteNodeGroup": false
      },
      "kubeadm": {
        "address": "192.168.1.120:6443",
        "token": "h1g55p.hm4rg52ymloax182",
        "ca": "sha256:c7a86a7a9a03a628b59207f4f3b3e038ebd03260f3ad5ba28f364d513b01f542",
        "extras-args": [
          "--ignore-preflight-errors=All"
        ]
      },
      "k3s": {
        "address": "192.168.1.120:6443",
        "token": "h1g55p.hm4rg52ymloax182",
        "datastore-endpoint": "https://1.2.3.4:2379",
        "extras-commands": []
      },
      "external": {
        "address": "192.168.1.120:6443",
        "token": "h1g55p.hm4rg52ymloax182",
        "datastore-endpoint": "https://1.2.3.4:2379",
        "join-command": "/usr/local/bin/join-cluster.sh",
        "config-path": "/etc/default/vmware-autoscaler-config.yaml",
        "extra-config": {
          "...": "..."
        }
      },
      "default-machine": "large",
      "machines": {
        "tiny": {
          "memsize": 2048,
          "vcpus": 2,
          "disksize": 10240
        },
        "small": {
          "memsize": 4096,
          "vcpus": 2,
          "disksize": 20480
        },
        "medium": {
          "memsize": 4096,
          "vcpus": 4,
          "disksize": 20480
        },
        "large": {
          "memsize": 8192,
          "vcpus": 4,
          "disksize": 51200
        },
        "xlarge": {
          "memsize": 16384,
          "vcpus": 4,
          "disksize": 102400
        },
        "2xlarge": {
          "memsize": 16384,
          "vcpus": 8,
          "disksize": 102400
        },
        "4xlarge": {
          "memsize": 32768,
          "vcpus": 8,
          "disksize": 102400
        }
      },
      "node-labels": [
        "topology.kubernetes.io/region=home",
        "topology.kubernetes.io/zone=office",
        "topology.csi.vmware.com/k8s-region=home",
        "topology.csi.vmware.com/k8s-zone=office"
      ],
      "cloud-init": {
        "package_update": false,
        "package_upgrade": false,
        "runcmd": [
          "echo 1 > /sys/block/sda/device/rescan",
          "growpart /dev/sda 1",
          "resize2fs /dev/sda1",
          "echo '192.168.1.120 vmware-ca-k8s-masterkube vmware-ca-k8s-masterkube.acme.com' >> /etc/hosts"
        ]
      },
      "ssh-infos": {
        "user": "kubernetes",
        "ssh-private-key": "/root/.ssh/id_rsa"
      },
      "autoscaling-options": {
        "scaleDownUtilizationThreshold": 0.5,
        "scaleDownGpuUtilizationThreshold": 0.5,
        "scaleDownUnneededTime": "1m",
        "scaleDownUnreadyTime": "1m"
      },
      "vmware": {
        "vmware-ca-k8s": {
          "url": "https://administrator@acme.com:mySecret@vsphere.acme.com/sdk",
          "uid": "administrator@vsphere.acme.com",
          "password": "mySecret",
          "insecure": true,
          "dc": "DC01",
          "datastore": "datastore1",
          "resource-pool": "ACME/Resources/FR",
          "vmFolder": "HOME",
          "timeout": 300,
          "template-name": "jammy-kubernetes-cni-flannel-v1.26.0-containerd-amd64",
          "template": false,
          "linked": false,
          "customization": "",
          "network": {
            "domain": "acme.com",
            "dns": {
              "search": [
                "acme.com"
              ],
              "nameserver": [
                "10.0.0.1"
              ]
            },
            "interfaces": [
              {
                "primary": false,
                "exists": true,
                "network": "VM Network",
                "adapter": "vmxnet3",
                "mac-address": "generate",
                "nic": "eth0",
                "dhcp": true,
                "use-dhcp-routes": true,
                "routes": [
                  {
                    "to": "172.30.0.0/16",
                    "via": "10.0.0.5",
                    "metric": 500
                  }
                ]
              },
              {
                "primary": true,
                "exists": true,
                "network": "VM Private",
                "adapter": "vmxnet3",
                "mac-address": "generate",
                "nic": "eth1",
                "dhcp": true,
                "use-dhcp-routes": false,
                "address": "192.168.1.124",
                "gateway": "10.0.0.1",
                "netmask": "255.255.255.0",
                "routes": []
              }
            ]
          }
        }
      }
    }
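
To show how the default-machine and machines keys of such a configuration relate, here is a minimal sketch that decodes them with encoding/json and looks up a machine type. The Machine and machineConfig types and the resolveMachine function are hypothetical. Falling back to default-machine when no name is given is an assumption about the intended behaviour, not something this document states.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Machine mirrors one entry of the "machines" map (sizes as in the sample).
type Machine struct {
	MemSize  int `json:"memsize"`
	VCPUs    int `json:"vcpus"`
	DiskSize int `json:"disksize"`
}

// machineConfig picks only the two keys this sketch needs; other keys
// in the file are ignored by the decoder.
type machineConfig struct {
	DefaultMachine string             `json:"default-machine"`
	Machines       map[string]Machine `json:"machines"`
}

// machinesSample is a shortened excerpt of the configuration above.
const machinesSample = `{
  "default-machine": "large",
  "machines": {
    "tiny":  {"memsize": 2048, "vcpus": 2, "disksize": 10240},
    "large": {"memsize": 8192, "vcpus": 4, "disksize": 51200}
  }
}`

// resolveMachine returns the named machine type, or the default one when
// name is empty (assumed fallback).
func resolveMachine(data []byte, name string) (Machine, error) {
	var cfg machineConfig
	if err := json.Unmarshal(data, &cfg); err != nil {
		return Machine{}, err
	}
	if name == "" {
		name = cfg.DefaultMachine
	}
	m, ok := cfg.Machines[name]
	if !ok {
		return Machine{}, fmt.Errorf("unknown machine type %q", name)
	}
	return m, nil
}

func main() {
	m, err := resolveMachine([]byte(machinesSample), "")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", m)
}
```

Note that encoding/json is strict: a trailing comma before a closing brace or bracket is reported as a syntax error, so a file read this way must be valid JSON.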

Unmaintained releases

All releases before 1.26.11 are unmaintained.