Cortex - v1.10.0-rc.0
Highlights
- Cortex chunks storage has been deprecated and is now in maintenance mode.
- Added many new limits, to help protect your installation against overload.
- The sharding feature in Alertmanager is now considered complete.
- Added support for ingesting and querying of exemplars.
- Release now has ARM binaries and packages (but not container images, yet).
Cortex
- [CHANGE] Enable strict JSON unmarshal for the `pkg/util/validation.Limits` struct. The custom `UnmarshalJSON()` will now fail if the input has unknown fields. #4298
- [CHANGE] Cortex chunks storage has been deprecated and is now in maintenance mode: all Cortex users are encouraged to migrate to the blocks storage. No new features will be added to the chunks storage. The default Cortex configuration still runs the chunks engine; please check out the blocks storage doc on how to configure Cortex to run with the blocks storage. #4268
- [CHANGE] The example Kubernetes manifests (stored at `k8s/`) have been removed due to a lack of proper support and maintenance. #4268
- [CHANGE] Querier / ruler: deprecated the `-store.query-chunk-limit` CLI flag (and its respective YAML config option `max_chunks_per_query`) in favour of `-querier.max-fetched-chunks-per-query` (and its respective YAML config option `max_fetched_chunks_per_query`). The new limit specifies the maximum number of chunks that can be fetched in a single query from ingesters and long-term storage. Because the limit is applied independently when querying ingesters and long-term storage, the total number of chunks actually fetched could be up to 2x the limit. #4125
- [CHANGE] Alertmanager: the experimental receivers firewall can now be configured on a per-tenant basis. The following CLI flags (and their respective YAML config options) have been renamed and moved to the limits config section: #4143
  - `-alertmanager.receivers-firewall.block.cidr-networks` renamed to `-alertmanager.receivers-firewall-block-cidr-networks`
  - `-alertmanager.receivers-firewall.block.private-addresses` renamed to `-alertmanager.receivers-firewall-block-private-addresses`
- [CHANGE] Change default value of `-server.grpc.keepalive.min-time-between-pings` from `5m` to `10s` and `-server.grpc.keepalive.ping-without-stream-allowed` to `true`. #4168
- [CHANGE] Ingester: Change default value of `-ingester.active-series-metrics-enabled` to `true`. This incurs a small increase in memory usage, between 1.2% and 1.6% as measured on ingesters with 1.3M active series. #4257
- [CHANGE] Dependency: update go-redis from v8.2.3 to v8.9.0. #4236
- [FEATURE] Querier: Added new `-querier.max-fetched-series-per-query` flag. When Cortex is running with blocks storage, the max series per query limit is enforced in the querier and applies to unique series received from ingesters and store-gateway (long-term storage). #4179
- [FEATURE] Querier/Ruler: Added new `-querier.max-fetched-chunk-bytes-per-query` flag. When Cortex is running with blocks storage, the max chunk bytes limit is enforced in the querier and ruler and limits the total size, in bytes, of all chunks returned from ingesters and storage for a query. #4216
- [FEATURE] Alertmanager: support negative matchers and time-based muting (see the upstream release notes). #4237
- [FEATURE] Alertmanager: Added rate limits to notifiers. Rate limits used by all integrations can be configured using `-alertmanager.notification-rate-limit`, while per-integration rate limits can be specified via the `-alertmanager.notification-rate-limit-per-integration` parameter. Both shared and per-integration limits can be overridden using the overrides mechanism. These limits are applied on individual (per-tenant) alertmanagers. Rate-limited notifications count as failed notifications and can be monitored via the new `cortex_alertmanager_notification_rate_limited_total` metric. #4135 #4163
- [FEATURE] Alertmanager: Added `-alertmanager.max-config-size-bytes` limit to control the size of configuration files that Cortex users can upload to Alertmanager via API. This limit is configurable per-tenant. #4201
- [FEATURE] Alertmanager: Added `-alertmanager.max-templates-count` and `-alertmanager.max-template-size-bytes` options to control the number and size of templates uploaded to Alertmanager via API. These limits are configurable per-tenant. #4223
- [FEATURE] Added flag `-debug.block-profile-rate` to enable goroutine blocking events profiling. #4217
- [FEATURE] Alertmanager: The experimental sharding feature is now considered complete. Detailed information about the configuration options can be found in the configuration reference for the alertmanager and for the alertmanager storage. To use the feature: #3925 #4020 #4021 #4031 #4084 #4110 #4126 #4127 #4141 #4146 #4161 #4162 #4222
  - Ensure that a remote storage backend is configured for Alertmanager to store state using `-alertmanager-storage.backend`, and flags related to the backend. Note that the `local` and `configdb` storage backends are not supported.
  - Ensure that a ring store is configured using `-alertmanager.sharding-ring.store`, and set the flags relevant to the chosen store type.
  - Enable the feature using `-alertmanager.sharding-enabled`.
  - Note the prior addition of a new configuration option `-alertmanager.persist-interval`. This sets the interval between persisting the current alertmanager state (notification log and silences) to object storage. See the configuration file reference for more information.
- [ENHANCEMENT] Alertmanager: Clean up persisted state objects from remote storage when a tenant configuration is deleted. #4167
- [ENHANCEMENT] Storage: Added the ability to disable OpenCensus within the GCS client (e.g. `-gcs.enable-opencensus=false`). #4219
- [ENHANCEMENT] Etcd: Added username and password to etcd config. #4205
- [ENHANCEMENT] Alertmanager: introduced new metrics to monitor operation when using `-alertmanager.sharding-enabled`: #4149
  - `cortex_alertmanager_state_fetch_replica_state_total`
  - `cortex_alertmanager_state_fetch_replica_state_failed_total`
  - `cortex_alertmanager_state_initial_sync_total`
  - `cortex_alertmanager_state_initial_sync_completed_total`
  - `cortex_alertmanager_state_initial_sync_duration_seconds`
  - `cortex_alertmanager_state_persist_total`
  - `cortex_alertmanager_state_persist_failed_total`
- [ENHANCEMENT] Blocks storage: support ingesting and querying exemplars. Enabled by setting the new CLI flag `-blocks-storage.tsdb.max-exemplars=<n>` or the config option `blocks_storage.tsdb.max_exemplars` to a positive value. #4124 #4181
- [ENHANCEMENT] Distributor: Added distributors ring status section in the admin page. #4151
- [ENHANCEMENT] Added zone-awareness support to alertmanager for use when sharding is enabled. When zone-awareness is enabled, alerts will be replicated across availability zones. #4204
- [ENHANCEMENT] Added `tenant_ids` tag to tracing spans. #4186
- [ENHANCEMENT] Ring, query-frontend: Avoid using automatic private IPs (APIPA) when discovering the IP address from the interface during the registration of the instance in the ring, or by query-frontend when used with query-scheduler. APIPA is still used as a last resort, with logging indicating its usage. #4032
- [ENHANCEMENT] Memberlist: introduced new metrics to aid troubleshooting tombstone convergence: #4231
  - `memberlist_client_kv_store_value_tombstones`
  - `memberlist_client_kv_store_value_tombstones_removed_total`
  - `memberlist_client_messages_to_broadcast_dropped_total`
- [ENHANCEMENT] Alertmanager: Added `-alertmanager.max-dispatcher-aggregation-groups` option to control the max number of active dispatcher groups in Alertmanager (per tenant, also overridable). When the limit is reached, the Dispatcher produces a log message and increases the `cortex_alertmanager_dispatcher_aggregation_group_limit_reached_total` metric. #4254
- [ENHANCEMENT] Alertmanager: Added `-alertmanager.max-alerts-count` and `-alertmanager.max-alerts-size-bytes` to control the max number of alerts and the total size of alerts that a single user can have in Alertmanager's memory. Adding more alerts will fail with a log message and increment the `cortex_alertmanager_alerts_insert_limited_total` metric (per-user). These limits can be overridden using per-tenant overrides. Current values are tracked in the `cortex_alertmanager_alerts_limiter_current_alerts` and `cortex_alertmanager_alerts_limiter_current_alerts_size_bytes` metrics. #4253
- [ENHANCEMENT] Store-gateway: added `-store-gateway.sharding-ring.wait-stability-min-duration` and `-store-gateway.sharding-ring.wait-stability-max-duration` support to store-gateway, to wait for ring stability at startup. #4271
- [ENHANCEMENT] Ruler: added `rule_group` label to metrics `cortex_prometheus_rule_group_iterations_total` and `cortex_prometheus_rule_group_iterations_missed_total`. #4121
- [ENHANCEMENT] Ruler: added new metrics for tracking the total number of queries and push requests sent to ingester, as well as failed queries and push requests. Failures are only counted for internal errors, not for user errors such as limits or invalid queries. This is in contrast to the existing `cortex_prometheus_rule_evaluation_failures_total`, which is also incremented when a query or samples appending fails due to user errors. #4281
  - `cortex_ruler_write_requests_total`
  - `cortex_ruler_write_requests_failed_total`
  - `cortex_ruler_queries_total`
  - `cortex_ruler_queries_failed_total`
- [ENHANCEMENT] Ingester: Added option `-ingester.ignore-series-limit-for-metric-names` with a comma-separated list of metric names that will be ignored in the max series per metric limit. #4302
- [ENHANCEMENT] Added instrumentation to Redis client, with the following metrics: #3976
  - `cortex_rediscache_request_duration_seconds`
- [BUGFIX] Purger: fix `Invalid null value in condition for column range` caused by `nil` value in range for WriteBatch query. #4128
- [BUGFIX] Ingester: fixed infrequent panic caused by a race condition between TSDB mmap-ed head chunks truncation and queries. #4176
- [BUGFIX] Alertmanager: fix Alertmanager status page if clustering via gossip is disabled or sharding is enabled. #4184
- [BUGFIX] Ruler: fix the `/ruler/rule_groups` endpoint not working when used with object store. #4182
- [BUGFIX] Ruler: Honor the evaluation delay for the `ALERTS` and `ALERTS_FOR_STATE` series. #4227
- [BUGFIX] Make multiple Get requests instead of MGet on Redis Cluster. #4056
- [BUGFIX] Ingester: fix issue where runtime limits erroneously override default limits. #4246
- [BUGFIX] Ruler: fix startup in single-binary mode when the new `ruler_storage` is used. #4252
- [BUGFIX] Querier: fix queries failing with "at least 1 healthy replica required, could only find 0" error right after scaling up store-gateways until they're ACTIVE in the ring. #4263
- [BUGFIX] Store-gateway: when blocks sharding is enabled, do not load all blocks in each store-gateway in case of a cold startup, but load only blocks owned by the store-gateway replica. #4271
- [BUGFIX] Memberlist: fix setting the default configuration value for `-memberlist.retransmit-factor` when not provided. This should improve propagation delay of the ring state (including, but not limited to, tombstones). Note that if the configuration is already explicitly given, this fix has no effect. #4269
- [BUGFIX] Querier: Fix issue where samples in a chunk might get skipped by batch iterator. #4218
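The strict JSON unmarshal change for `pkg/util/validation.Limits` (#4298) means that limits JSON containing a misspelled or unknown field is now rejected instead of being silently ignored. The following is a minimal Go sketch of that behaviour, not Cortex's actual code: the `exampleLimits` struct, its single field, and the `parseLimits` helper are hypothetical stand-ins chosen for illustration.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// exampleLimits is a hypothetical stand-in for a limits struct.
// It is NOT the real pkg/util/validation.Limits type.
type exampleLimits struct {
	MaxFetchedChunksPerQuery int `json:"max_fetched_chunks_per_query"`
}

// UnmarshalJSON decodes strictly: any unknown field in the input is an error.
func (l *exampleLimits) UnmarshalJSON(data []byte) error {
	// The local alias type has no methods, so decoding into it does not
	// call this UnmarshalJSON again (which would recurse forever).
	type plain exampleLimits
	dec := json.NewDecoder(bytes.NewReader(data))
	dec.DisallowUnknownFields()
	return dec.Decode((*plain)(l))
}

// parseLimits is a hypothetical helper wrapping json.Unmarshal.
func parseLimits(data []byte) (exampleLimits, error) {
	var l exampleLimits
	err := json.Unmarshal(data, &l)
	return l, err
}

func main() {
	// A known field decodes normally.
	ok, err := parseLimits([]byte(`{"max_fetched_chunks_per_query": 2000000}`))
	fmt.Println(ok.MaxFetchedChunksPerQuery, err)

	// A misspelled field ("chunk" instead of "chunks") is rejected.
	_, err = parseLimits([]byte(`{"max_fetched_chunk_per_query": 2000000}`))
	fmt.Println(err != nil)
}
```

With strict decoding, a typo in a limit name surfaces as an error at load time rather than leaving the default limit in place unnoticed.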
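The note on `-querier.max-fetched-chunks-per-query` (#4125) says the limit is applied independently to ingesters and long-term storage, so a single query may fetch up to twice the configured number of chunks. The sketch below illustrates only that arithmetic; `checkFetchedChunks` and `errTooManyChunks` are hypothetical names and this is not how Cortex implements the check.

```go
package main

import (
	"errors"
	"fmt"
)

// errTooManyChunks is a hypothetical error used only in this sketch.
var errTooManyChunks = errors.New("max fetched chunks limit exceeded")

// checkFetchedChunks applies the same limit to each source on its own,
// mirroring the "independently applied" wording of the changelog entry.
// It never looks at the combined total.
func checkFetchedChunks(limit, fromIngesters, fromStorage int) error {
	if fromIngesters > limit || fromStorage > limit {
		return errTooManyChunks
	}
	return nil
}

func main() {
	const limit = 1000

	// Each source is at the limit, so 2000 chunks in total are accepted.
	fmt.Println(checkFetchedChunks(limit, 1000, 1000))

	// One source exceeds the limit, so the query is rejected even though
	// the total (1001) is far below 2x the limit.
	fmt.Println(checkFetchedChunks(limit, 1001, 0))
}
```

When sizing the limit, it may therefore be worth budgeting querier memory for roughly twice the configured value.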
Blocksconvert
- [ENHANCEMENT] Scanner: add support for DynamoDB (v9 schema only). #3828
- [ENHANCEMENT] Add Cassandra support. #3795
- [ENHANCEMENT] Scanner: retry failed uploads. #4188
Details
- date: July 21, 2021, 2:28 p.m.
- name: Cortex 1.10.0-rc.0
- type: Pre-release