Kubernetes Cluster Autoscaler - cluster-autoscaler-1.29.0


Deprecations

  • The --ignore-taint flag and the ignore-taint.cluster-autoscaler.kubernetes.io/ taint prefix are now deprecated. Instead use:
  • the --status-taint flag or the status-taint.cluster-autoscaler.kubernetes.io/ taint prefix for taints that denote node status;
  • the --startup-taint flag or the startup-taint.cluster-autoscaler.kubernetes.io/ taint prefix for taints used to prevent pods from scheduling before the node is fully initialized (e.g. when using a DaemonSet to install a device plugin).
  • For backward compatibility, the --ignore-taint flag and the ignore-taint.cluster-autoscaler.kubernetes.io/ prefix continue to work, with behavior identical to startup taints (the same behavior they had before).
  • Please see the FAQ for more details. - #6132, #6218
  • Flags that were unused in the code (i.e. setting them had no effect) have been deprecated and will be removed in a future release. The affected flags are --node-autoprovisioning-enabled and --max-autoprovisioned-node-group-count.
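For illustration, a migration sketch from the deprecated taint flag to its replacements (the taint keys and node name below are hypothetical examples):

```shell
# Deprecated (still works, treated as a startup taint):
#   cluster-autoscaler --ignore-taint=example.com/gpu-not-ready ...

# New: startup taints, for taints removed once the node is initialized.
cluster-autoscaler --startup-taint=example.com/gpu-not-ready ...

# New: status taints, for taints that denote node status.
cluster-autoscaler --status-taint=example.com/under-maintenance ...

# Alternatively, use the taint-key prefix directly, with no flag needed:
kubectl taint nodes my-node \
  startup-taint.cluster-autoscaler.kubernetes.io/gpu-not-ready=true:NoSchedule
```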

General

  • Added a new flag, --bypassed-scheduler-names, with an empty default value to maintain the original behaviour.
    If the flag is set to a non-empty list, CA will not wait for the schedulers listed in the flag value to mark pods as unschedulable, and will evaluate unprocessed pods. Furthermore, if the bypassed-scheduler list is non-empty, CA will not wait for pods to reach a certain age before scaling up, effectively ignoring unschedulablePodTimeBuffer. - #6235
  • Enabling this feature can improve autoscaling latency (CA will react to pods faster), but it can also increase the load on CA during very large scale-ups (thousands of pending pods). This is because limited scheduler throughput can effectively act as a rate limiter, protecting CA from having to process a scale-up of too many pods at once. We believe this change will be beneficial in the vast majority of environments, but since CA scalability varies greatly between cloud providers, we recommend testing this feature before enabling it in large clusters.
  • A new flag (--drain-priority-config) has been introduced that allows users to configure drain behavior during scale-down based on pod priority. The new flag is mutually exclusive with --max-graceful-termination-sec, which can still be used if the new configuration options are not needed. The default behavior (simple config, the default value of --max-graceful-termination-sec) is preserved. - #6139
  • Added the --dynamic-node-delete-delay-after-taint-enabled flag. Enabling it changes the delay between tainting and draining a node from a constant delay to a dynamic one based on Kubernetes API server latency. This minimizes the risk of race conditions when the API server connection is slow and improves scale-down throughput when it is fast. - #6019
  • Added structured logging support via --logging-format=json. - #6035
  • Introduced a new node_group_target_count metric that keeps track of target sizes of each NodeGroup. This metric is only available if --emit-per-nodegroup-metrics flag is enabled. - #6361
  • Introduced a new node_taints_count metric tracking different types of taints in the cluster. - #6201
  • Added a new command-line option, --kube-api-content-type, to specify the content type used to communicate with the apiserver. This change also switches the default content type from "application/json" to "application/vnd.kubernetes.protobuf". - #6114
  • Fixed a bug where resource requests of restartable init containers were not included in utilization calculation. - #6225
  • Fixed a bug where CA might have created fewer nodes than desired while logging "Capping binpacking after exceeding threshold of 4 nodes", even though it then didn't actually add four new nodes. - #6165
  • Fixed support for --feature-gates=ContextualLogging=true. - #6162
  • Fixed a bug where scale down may have failed with "daemonset.apps not found". - #6122
  • Optimized CA memory usage. - #6159, #6110
  • Disambiguated wording in the log messages related to node removal ineligibility caused by high resource allocation. - #6223
  • Pods with the "cluster-autoscaler.kubernetes.io/safe-to-evict": "false" annotation will now always report an annotation-related warning message if they block scale-down (where previously they might have reported e.g. a message about not being replicated). - #6077
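Several of the new flags above can be combined in one invocation; a configuration sketch (the scheduler name my-batch-scheduler is hypothetical, and values should be adapted to your cluster):

```shell
# Sketch of a cluster-autoscaler 1.29 invocation using the new flags.
# my-batch-scheduler is a hypothetical custom scheduler name.
cluster-autoscaler \
  --bypassed-scheduler-names=my-batch-scheduler \
  --logging-format=json \
  --kube-api-content-type=application/vnd.kubernetes.protobuf \
  --emit-per-nodegroup-metrics \
  --dynamic-node-delete-delay-after-taint-enabled
```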

AWS

  • Added the c7a, r7i, and mac2-m2 families, a new i4i size, and c7i.metal, r7a.metal, and r7iz.metal to the static Amazon EC2 instance list. - #6347
  • Added p5.48xlarge - #6131
  • Updated cloudprovider/aws/aws-sdk-go to 1.48.7 in order to support dynamic auth token. - #6325
  • Fixed an issue where the capacityType label inferred from an empty AWS ManagedNodeGroup did not match the same label on the nodes after it scaled from 0 -> 1. - #6261
  • Introduced caching to reduce volume of DescribeLaunchTemplateVersions API calls made by Cluster Autoscaler. - #6245
  • Nodes annotated with k8s.io/cluster-autoscaler-enabled=false will be skipped by CA and will no longer produce spammy logs about missing AWS instances. - #6265, #6301
  • Added additional log output when updating ASG information from AWS. - #6282
  • Fixed a bug where CA might have tried to remove an instance that was already in the Terminated state. - #6166
  • Scale-up from 0 now works with an existing AWS EBS CSI PersistentVolume without having to add a tag to the ASG. - #6090

Azure

  • Removed AKS vmType. - #6186

Civo

  • Introduced support for scaling NodeGroup from 0. - #6322

Cluster API

  • Users of Cluster API can override the default architecture to consider in the templates for autoscaling from zero so that pods requesting non-amd64 nodes in their node selector terms can trigger the scale-up in non-amd64 single-arch clusters. - #6066

Equinix Metal

  • The packet provider and its configuration parameters are now deprecated in favor of equinixmetal. - #6085
  • The cluster-autoscaler --cloud-provider flag should now be set to equinixmetal. For backward compatibility, "--cloud-provider=packet" continues to work
  • "METAL_AUTH_TOKEN" replaces "PACKET_AUTH_TOKEN". For backward compatibility, the latter still works.
  • "EQUINIX_METAL_MANAGER" replaces "PACKET_MANAGER". For backward compatibility, the latter still works.
  • Each node managed by the equinixmetal cloud provider will be labeled with the label defined by METAL_CONTROLLER_NODE_IDENTIFIER_LABEL. For backward compatibility, PACKET_CONTROLLER_NODE_IDENTIFIER_LABEL still works.
  • We now use metros in the Equinix Metal (Packet) cloudprovider. Facilities support has been removed. - #6078
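During migration, the environment-variable renames can be bridged in a wrapper script; a minimal sketch, assuming the token value is a placeholder (in practice it would come from your secret store):

```shell
# Placeholder value standing in for the old variable being set elsewhere.
PACKET_AUTH_TOKEN="example-token"

# Prefer the new variable names, falling back to the deprecated ones so the
# script works in both old and new environments.
export METAL_AUTH_TOKEN="${METAL_AUTH_TOKEN:-$PACKET_AUTH_TOKEN}"
export EQUINIX_METAL_MANAGER="${EQUINIX_METAL_MANAGER:-$PACKET_MANAGER}"

# The autoscaler itself would then be started with the new provider name:
#   cluster-autoscaler --cloud-provider=equinixmetal ...
echo "$METAL_AUTH_TOKEN"
```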

GCE

  • The --gce-expander-ephemeral-storage-support flag is now deprecated. Ephemeral-storage support is always enabled, and the flag is ignored.
  • Added support for paginated MIG instance listing. - #6376
  • Improved reporting of errors related to GCE Reservations. - #6093

gRPC

  • The timeout of gRPC calls can now be specified through cloud-config. - #6373
  • gRPC-based cloud providers can now return gRPC error code 12 (Unimplemented) to signal that they do not implement optional methods. - #5937
  • Fixed a bug where cluster-autoscaler treated a newly scaled-up node group using the externalgrpc provider as having a MaxNodeProvisionTime of 0 seconds, expecting the new node to be registered within 0-10 seconds instead of the default 15 minutes. See https://github.com/kubernetes/autoscaler/issues/5935 for more info. - #5936

Hetzner

  • Fixed a bug where failed servers were kept for longer than necessary. - #6364
  • Fixed a bug where too many requests were sent to the Hetzner Cloud API, causing rate-limit issues. - #6308
  • Each node pool can now have different init configs. - #6184

Kwok

  • Introduced a new kwok cloud provider (see https://github.com/kubernetes/autoscaler/blob/kwok-poc/cluster-autoscaler/cloudprovider/kwok/README.md for more info).

Images

  • registry.k8s.io/autoscaling/cluster-autoscaler:v1.29.0
  • registry.k8s.io/autoscaling/cluster-autoscaler-arm64:v1.29.0
  • registry.k8s.io/autoscaling/cluster-autoscaler-amd64:v1.29.0
  • registry.k8s.io/autoscaling/cluster-autoscaler-s390x:v1.29.0

Details

date: Dec. 27, 2023, 5:25 p.m.
name: Cluster Autoscaler 1.29.0
type: Minor