dapr - v1.12.4


Dapr 1.12.4

This update includes bug fixes:

Mitigate race condition in placement table during sidecar restarts

Problem

When restarting a large deployment on Kubernetes (more than 20 pods), many pods are removed from the placement table. When the restarts happen concurrently with leadership changes in the placement service, the placement table can get into a corrupt state that requires restarting the placement service (#7311).

Impact

Impacts users running Dapr 1.12.0-1.12.3 who use actors and have deployments with many sidecar instances.

Root cause

Race condition between placement leadership changes and placement table updates due to pods being terminated.

Solution

Mitigated by reducing the chance of leadership changes in the placement service and by fixing a bug that could cause terminated pods to remain in the placement table forever.

Fixes in service invocation when target app has multiple replicas on Kubernetes

Problem

When performing service invocation to another app using Dapr (using either HTTP or gRPC), the caller sidecar establishes a gRPC connection with the target sidecar. When the target app is deployed with multiple replicas (scaled horizontally), a connection is established with each replica and requests are load-balanced across all replicas.

In Dapr 1.12.3 and lower running on Kubernetes, due to the way this logic was implemented, if the connection with one of the replicas became idle, the connections between the caller sidecar and all replicas were severed, possibly interrupting other in-flight service invocation calls.

Impact

This issue impacts users running Dapr 1.12.3 and lower, running on Kubernetes, that use service invocation to invoke apps that are scaled horizontally.

This issue does not impact users who are running outside of Kubernetes or who are using other Dapr name resolution components (including mDNS or Consul).

Root cause

On Kubernetes, the gRPC connection was established with the DNS name of the target Service, and we allowed the gRPC-Go library to perform name resolution with the Kubernetes DNS server and establish a connection with each replica. When the connection with one of the replicas became idle, the caller app received a "GOAWAY" message, which was interpreted as a signal to terminate all connections with all replicas of the app.

Solution

We have changed the internals of service invocation so that DNS resolution is always performed in the caller Dapr sidecar, including on Kubernetes. The gRPC-Go library now receives an individual IP address to connect to, so "GOAWAY" messages only impact the connection with an individual replica.
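
The idea behind this change can be illustrated with a minimal sketch. It is not Dapr's actual implementation (Dapr is written in Go and uses gRPC-Go); the names `resolve_replica_targets` and `ReplicaConnections`, and the port number used in the usage note, are hypothetical and chosen only for illustration. The sketch resolves a hostname in the caller, produces one target per resolved address, and tracks each connection separately so that a "GOAWAY" on one target leaves the others open.

```python
import socket


def resolve_replica_targets(hostname: str, port: int) -> list[str]:
    """Resolve a hostname in the caller and return one "ip:port" target
    per distinct address (hypothetical helper, for illustration only)."""
    infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    targets: list[str] = []
    for family, _type, _proto, _canon, sockaddr in infos:
        ip = sockaddr[0]
        # IPv6 literals are bracketed when combined with a port.
        target = f"[{ip}]:{port}" if family == socket.AF_INET6 else f"{ip}:{port}"
        if target not in targets:
            targets.append(target)
    return targets


class ReplicaConnections:
    """Tracks one logical connection per replica address
    (hypothetical stand-in for per-IP gRPC connections)."""

    def __init__(self, targets: list[str]) -> None:
        self.open = set(targets)

    def on_goaway(self, target: str) -> None:
        # Only the connection that received GOAWAY is dropped;
        # connections to the other replicas stay open.
        self.open.discard(target)
```

For example, with two replica targets such as `10.0.0.1:50002` and `10.0.0.2:50002` (illustrative addresses), calling `on_goaway("10.0.0.1:50002")` removes only that entry, which mirrors the behavior described above: an idle connection to one replica no longer affects the others.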


Details

date: Jan. 17, 2024, 9:05 p.m.
name: Dapr Runtime v1.12.4
type: Patch