Commit 2cc8d07

gouthamve, pstibrany, and jtlisi authored
Release 0.5.0 (#1963)
* Release 0.5.0-rc0
  Signed-off-by: Goutham Veeramachaneni <[email protected]>
* Fix changelog to point to 0.5.0-rc.0
  Signed-off-by: Goutham Veeramachaneni <[email protected]>
* Validate incoming labels for order and duplicate names. (#1964)
  * Validate incoming labels for order and duplicate names.
    Signed-off-by: Peter Štibraný <[email protected]>
  * Document that cortex rejects requests with incorrectly ordered or duplicate labels.
    Signed-off-by: Peter Štibraný <[email protected]>
  * Ignore empty metric name. {__name__=""} will be shown formatted as {__name__=""} instead of just empty string.
    Signed-off-by: Peter Štibraný <[email protected]>
  * As we rely on sorted labels, sort them before using them.
    Signed-off-by: Peter Štibraný <[email protected]>
  * Updated CHANGELOG.md to reflect latest changes to PR.
    Signed-off-by: Peter Štibraný <[email protected]>
  * Put back redundant aliases.
    Signed-off-by: Peter Štibraný <[email protected]>
* wrap migration commands (#1980)
  * wrap migration commands
    Signed-off-by: Jacob Lisi <[email protected]>
  * update changelog
    Signed-off-by: Jacob Lisi <[email protected]>
  * fix missing semicolon
    Signed-off-by: Jacob Lisi <[email protected]>
  * Fix typo to make lint pass.
    Signed-off-by: Goutham Veeramachaneni <[email protected]>
  Co-authored-by: Goutham Veeramachaneni <[email protected]>
* Call out breaking changes better.
  Signed-off-by: Goutham Veeramachaneni <[email protected]>
* Label this version as -rc1
  Signed-off-by: Goutham Veeramachaneni <[email protected]>
* We're abandoning the `0.5.0` release.
  Signed-off-by: Goutham Veeramachaneni <[email protected]>

Co-authored-by: Peter Štibraný <[email protected]>
Co-authored-by: Jacob Lisi <[email protected]>
1 parent 8af6675 commit 2cc8d07

8 files changed: +190 -30 lines changed

CHANGELOG.md

Lines changed: 21 additions & 2 deletions
@@ -2,8 +2,12 @@
 
 ## master / unreleased
 
+Note that the ruler flags need to be changed in this upgrade. You're moving from a single node ruler to something that might need to be sharded.
+If you are running with a high `-ruler.num-workers` and if you're not able to execute all your rules in `-ruler.evaluation-interval`, then you'll need to shard.
+Further, if you're using the configs service, we've upgraded the migration library and this requires some manual intervention. See full
+instructions below to upgrade your Postgres.
+
 * [CHANGE] The frontend component now does not cache results if it finds a `Cache-Control` header and if one of its values is `no-store`. #1974
-* [ENHANCEMENT] metric `cortex_ingester_flush_reasons` gets a new `reason` value: `Spread`, when `-ingester.spread-flushes` option is enabled.
 * [CHANGE] Flags changed with transition to upstream Prometheus rules manager:
   * `ruler.client-timeout` is now `ruler.configs.client-timeout` in order to match `ruler.configs.url`
   * `ruler.group-timeout`has been removed
@@ -15,6 +19,7 @@
 * [CHANGE] Deprecated `-distributor.limiter-reload-period` flag. #1766
 * [CHANGE] Ingesters now write only normalised tokens to the ring, although they can still read denormalised tokens used by other ingesters. `-ingester.normalise-tokens` is now deprecated, and ignored. If you want to switch back to using denormalised tokens, you need to downgrade to Cortex 0.4.0. Previous versions don't handle claiming tokens from normalised ingesters correctly. #1809
 * [CHANGE] Overrides mechanism has been renamed to "runtime config", and is now separate from limits. Runtime config is simply a file that is reloaded by Cortex every couple of seconds. Limits and now also multi KV use this mechanism.<br />New arguments were introduced: `-runtime-config.file` (defaults to empty) and `-runtime-config.reload-period` (defaults to 10 seconds), which replace previously used `-limits.per-user-override-config` and `-limits.per-user-override-period` options. Old options are still used if `-runtime-config.file` is not specified. This change is also reflected in YAML configuration, where old `limits.per_tenant_override_config` and `limits.per_tenant_override_period` fields are replaced with `runtime_config.file` and `runtime_config.period` respectively. #1749
+* [CHANGE] Cortex now rejects data with duplicate labels. Previously, such data was accepted, with duplicate labels removed with only one value left. #1964
 * [CHANGE] Changed the default value for `-distributor.ha-tracker.prefix` from `collectors/` to `ha-tracker/` in order to not clash with other keys (ie. ring) stored in the same key-value store. #1940
 * [FEATURE] The distributor can now drop labels from samples (similar to the removal of the replica label for HA ingestion) per user via the `distributor.drop-label` flag. #1726
 * [FEATURE] Added flag `debug.mutex-profile-fraction` to enable mutex profiling #1969
@@ -23,11 +28,13 @@
 * [FEATURE] Added readiness probe endpoint`/ready` to queriers. #1934
 * [FEATURE] EXPERIMENTAL: Added `/series` API endpoint support with TSDB blocks storage. #1830
 * [FEATURE] Added "multi" KV store that can interact with two other KV stores, primary one for all reads and writes, and secondary one, which only receives writes. Primary/secondary store can be modified in runtime via runtime-config mechanism (previously "overrides"). #1749
-* [ENHANCEMENT] Added `password` and `enable_tls` options to redis cache configuration. Enables usage of Microsoft Azure Cache for Redis service.
+* [ENHANCEMENT] metric `cortex_ingester_flush_reasons` gets a new `reason` value: `Spread`, when `-ingester.spread-flushes` option is enabled. #1978
+* [ENHANCEMENT] Added `password` and `enable_tls` options to redis cache configuration. Enables usage of Microsoft Azure Cache for Redis service. #1923
 * [ENHANCEMENT] Experimental TSDB: Open existing TSDB on startup to prevent ingester from becoming ready before it can accept writes. #1917
   * `--experimental.tsdb.max-tsdb-opening-concurrency-on-startup`
 * [BUGFIX] Fixed unnecessary CAS operations done by the HA tracker when the jitter is enabled. #1861
 * [BUGFIX] Fixed #1904 ingesters getting stuck in a LEAVING state after coming up from an ungraceful exit. #1921
+* [BUGFIX] Reduce memory usage when ingester Push() errors. #1922
 * [BUGFIX] TSDB: Fixed handling of out of order/bound samples in ingesters with the experimental TSDB blocks storage. #1864
 * [BUGFIX] TSDB: Fixed querying ingesters in `LEAVING` state with the experimental TSDB blocks storage. #1854
 * [BUGFIX] TSDB: Fixed error handling in the series to chunks conversion with the experimental TSDB blocks storage. #1837
@@ -36,6 +43,18 @@
 * [BUGFIX] TSDB: Fixed `cortex_ingester_queried_samples` and `cortex_ingester_queried_series` metrics when using block storage. #1981
 * [BUGFIX] TSDB: Fixed `cortex_ingester_memory_series` and `cortex_ingester_memory_users` metrics when using with the experimental TSDB blocks storage. #1982
 
+### Upgrading Postgres (if you're using configs service)
+
+Reference: https://github.com/golang-migrate/migrate/tree/master/database/postgres#upgrading-from-v1
+
+1. Install the migrate package cli tool: https://github.com/golang-migrate/migrate/tree/master/cmd/migrate#installation
+2. Drop the `schema_migrations` table: `DROP TABLE schema_migrations;`.
+2. Run the migrate command:
+
+```bash
+migrate -path <absolute_path_to_cortex>/cmd/cortex/migrations -database postgres://localhost:5432/database force 2
+```
+
 ## 0.4.0 / 2019-12-02
 
 * [CHANGE] The frontend component has been refactored to be easier to re-use. When upgrading the frontend, cache entries will be discarded and re-created with the new protobuf schema. #1734

RELEASE.md

Lines changed: 2 additions & 1 deletion
@@ -12,7 +12,8 @@ Our goal is to provide a new minor release every 4 weeks. This is a new process
 | v0.2.0 | 2019-08-28 | Goutham Veeramachaneni (Github: @gouthamve) |
 | v0.3.0 | 2019-10-09 | Bryan Boreham (@bboreham) |
 | v0.4.0 | 2019-11-13 | Tom Wilkie (@tomwilkie) |
-| v0.5.0 | 2019-12-11 | **searching for volunteer** |
+| v0.5.0 | 2020-01-08 | _Abandoned_ |
+| v0.6.0 | 2020-01-20 | **searching for a volunteer** |
 
 ## Release shepherd responsibilities

cmd/cortex/migrations/002_immutable_configs.up.sql

Lines changed: 6 additions & 0 deletions
@@ -2,6 +2,10 @@
 -- process (as exemplified in users/db/migrations/006 to 010), but currently
 -- there are no data in production and only one row in dev.
 
+-- https://github.com/mattes/migrate/tree/master/database/postgres#upgrading-from-v1
+-- Wrap all commands in BEGIN and COMMIT to accommodate upgrade
+BEGIN;
+
 -- The existing id, type columns are the id & type of the entity that owns the
 -- config.
 ALTER TABLE configs RENAME COLUMN id TO owner_id;
@@ -12,3 +16,5 @@ ALTER TABLE configs ADD COLUMN id SERIAL;
 
 ALTER TABLE configs DROP CONSTRAINT configs_pkey;
 ALTER TABLE configs ADD PRIMARY KEY (id, owner_id, owner_type, subsystem);
+
+COMMIT;

pkg/distributor/distributor.go

Lines changed: 33 additions & 9 deletions
@@ -3,7 +3,6 @@ package distributor
 import (
 	"context"
 	"flag"
-	"fmt"
 	"net/http"
 	"sort"
 	"strings"
@@ -227,7 +226,7 @@ func (d *Distributor) Stop() {
 
 func (d *Distributor) tokenForLabels(userID string, labels []client.LabelAdapter) (uint32, error) {
 	if d.cfg.ShardByAllLabels {
-		return shardByAllLabels(userID, labels)
+		return shardByAllLabels(userID, labels), nil
 	}
 
 	metricName, err := extract.MetricNameFromLabelAdapters(labels)
@@ -244,19 +243,15 @@ func shardByMetricName(userID string, metricName string) uint32 {
 	return h
 }
 
-func shardByAllLabels(userID string, labels []client.LabelAdapter) (uint32, error) {
+// This function generates different values for different order of same labels.
+func shardByAllLabels(userID string, labels []client.LabelAdapter) uint32 {
 	h := client.HashNew32()
 	h = client.HashAdd32(h, userID)
-	var lastLabelName string
 	for _, label := range labels {
-		if strings.Compare(lastLabelName, label.Name) >= 0 {
-			return 0, fmt.Errorf("labels not sorted")
-		}
 		h = client.HashAdd32(h, label.Name)
 		h = client.HashAdd32(h, label.Value)
-		lastLabelName = label.Name
 	}
-	return h, nil
+	return h
 }
 
 // Remove the label labelname from a slice of LabelPairs if it exists.
@@ -375,6 +370,13 @@ func (d *Distributor) Push(ctx context.Context, req *client.WriteRequest) (*clie
 			continue
 		}
 
+		// We rely on sorted labels in different places:
+		// 1) When computing token for labels, and sharding by all labels. Here different order of labels returns
+		// different tokens, which is bad.
+		// 2) In validation code, when checking for duplicate label names. As duplicate label names are rejected
+		// later in the validation phase, we ignore them here.
+		sortLabelsIfNeeded(ts.Labels)
+
 		// Generate the sharding token based on the series labels without the HA replica
 		// label and dropped labels (if any)
 		key, err := d.tokenForLabels(userID, ts.Labels)
@@ -450,6 +452,28 @@ func (d *Distributor) Push(ctx context.Context, req *client.WriteRequest) (*clie
 	return &client.WriteResponse{}, lastPartialErr
 }
 
+func sortLabelsIfNeeded(labels []client.LabelAdapter) {
+	// no need to run sort.Slice, if labels are already sorted, which is most of the time.
+	// we can avoid extra memory allocations (mostly interface-related) this way.
+	sorted := true
+	last := ""
+	for _, l := range labels {
+		if strings.Compare(last, l.Name) > 0 {
+			sorted = false
+			break
+		}
+		last = l.Name
+	}
+
+	if sorted {
+		return
+	}
+
+	sort.Slice(labels, func(i, j int) bool {
+		return strings.Compare(labels[i].Name, labels[j].Name) < 0
+	})
+}
+
 func (d *Distributor) sendSamples(ctx context.Context, ingester ring.IngesterDesc, timeseries []client.PreallocTimeseries) error {
 	h, err := d.ingesterPool.GetClientFor(ingester.Addr)
 	if err != nil {
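
The hunks above make `shardByAllLabels` order-sensitive and move the sorting into `Push` via `sortLabelsIfNeeded`. The following is a minimal standalone sketch of that idea, not the Cortex implementation: `Label`, `hashLabels`, and `sortIfNeeded` are hypothetical names, and the hash is FNV-1a from the Go standard library, swapped in for Cortex's `client.HashNew32`/`client.HashAdd32` helpers.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// Label is a hypothetical stand-in for client.LabelAdapter.
type Label struct {
	Name, Value string
}

// hashLabels is an order-sensitive hash over the user ID and all labels.
// Assumption: FNV-1a is used here for illustration only.
func hashLabels(userID string, labels []Label) uint32 {
	h := fnv.New32a()
	h.Write([]byte(userID))
	for _, l := range labels {
		h.Write([]byte(l.Name))
		h.Write([]byte(l.Value))
	}
	return h.Sum32()
}

// sortIfNeeded scans the slice first and only calls sort.Slice when a label
// name is out of order, so the common already-sorted case does no sorting work.
func sortIfNeeded(labels []Label) {
	last := ""
	sorted := true
	for _, l := range labels {
		if last > l.Name {
			sorted = false
			break
		}
		last = l.Name
	}
	if sorted {
		return
	}
	sort.Slice(labels, func(i, j int) bool { return labels[i].Name < labels[j].Name })
}

func main() {
	a := []Label{{Name: "__name__", Value: "foo"}, {Name: "bar", Value: "baz"}, {Name: "sample", Value: "1"}}
	b := []Label{{Name: "__name__", Value: "foo"}, {Name: "sample", Value: "1"}, {Name: "bar", Value: "baz"}}

	// Same labels in a different order: the two hashes are computed over
	// different byte sequences, so they should not be expected to match.
	fmt.Println("equal before sorting:", hashLabels("test", a) == hashLabels("test", b))

	// After sorting, both slices hold the same sequence, so the hashes agree.
	sortIfNeeded(b)
	fmt.Println("equal after sorting:", hashLabels("test", a) == hashLabels("test", b))
}
```

Sorting before hashing is what keeps a series on the same shard regardless of the label order a client sends.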

pkg/distributor/distributor_test.go

Lines changed: 34 additions & 9 deletions
@@ -8,6 +8,7 @@ import (
 	"net/http"
 	"sort"
 	"strconv"
+	"strings"
 	"sync"
 	"testing"
 	"time"
@@ -905,7 +906,7 @@ func (i *mockIngester) Push(ctx context.Context, req *client.WriteRequest, opts
 
 	for j := range req.Timeseries {
 		series := req.Timeseries[j]
-		hash, _ := shardByAllLabels(orgid, series.Labels)
+		hash := shardByAllLabels(orgid, series.Labels)
 		existing, ok := i.timeseries[hash]
 		if !ok {
 			// Make a copy because the request Timeseries are reused
@@ -1183,22 +1184,46 @@ func TestRemoveReplicaLabel(t *testing.T) {
 	}
 }
 
-func TestShardByAllLabelsChecksForSortedLabelNames(t *testing.T) {
-	val, err := shardByAllLabels("test", []client.LabelAdapter{
+// This is not great, but we deal with unsorted labels when validating labels.
+func TestShardByAllLabelsReturnsWrongResultsForUnsortedLabels(t *testing.T) {
+	val1 := shardByAllLabels("test", []client.LabelAdapter{
 		{Name: "__name__", Value: "foo"},
 		{Name: "bar", Value: "baz"},
 		{Name: "sample", Value: "1"},
 	})
 
-	assert.NotZero(t, val)
-	assert.NoError(t, err)
-
-	val, err = shardByAllLabels("test", []client.LabelAdapter{
+	val2 := shardByAllLabels("test", []client.LabelAdapter{
 		{Name: "__name__", Value: "foo"},
 		{Name: "sample", Value: "1"},
 		{Name: "bar", Value: "baz"},
 	})
 
-	assert.Zero(t, val)
-	assert.Error(t, err)
+	assert.NotEqual(t, val1, val2)
+}
+
+func TestSortLabels(t *testing.T) {
+	sorted := []client.LabelAdapter{
+		{Name: "__name__", Value: "foo"},
+		{Name: "bar", Value: "baz"},
+		{Name: "cluster", Value: "cluster"},
+		{Name: "sample", Value: "1"},
+	}
+
+	// no allocations if input is already sorted
+	require.Equal(t, 0.0, testing.AllocsPerRun(100, func() {
+		sortLabelsIfNeeded(sorted)
+	}))
+
+	unsorted := []client.LabelAdapter{
+		{Name: "__name__", Value: "foo"},
+		{Name: "sample", Value: "1"},
+		{Name: "cluster", Value: "cluster"},
+		{Name: "bar", Value: "baz"},
+	}
+
+	sortLabelsIfNeeded(unsorted)
+
+	sort.SliceIsSorted(unsorted, func(i, j int) bool {
+		return strings.Compare(unsorted[i].Name, unsorted[j].Name) < 0
+	})
 }
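
Note that `TestSortLabels` calls `sort.SliceIsSorted` but discards its boolean result, so as written the final statement cannot fail the test. A minimal sketch of checking the result instead is shown here, under the assumption that a plain `Label` struct stands in for `client.LabelAdapter`; `sortedByName` and `sortByName` are hypothetical helpers, not functions from the commit.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Label is a hypothetical stand-in for client.LabelAdapter.
type Label struct {
	Name, Value string
}

// sortedByName reports whether labels are ordered by name.
// sort.SliceIsSorted only returns a bool; the caller has to act on it.
func sortedByName(labels []Label) bool {
	return sort.SliceIsSorted(labels, func(i, j int) bool {
		return strings.Compare(labels[i].Name, labels[j].Name) < 0
	})
}

// sortByName sorts labels in place by name.
func sortByName(labels []Label) {
	sort.Slice(labels, func(i, j int) bool {
		return strings.Compare(labels[i].Name, labels[j].Name) < 0
	})
}

func main() {
	unsorted := []Label{
		{Name: "__name__", Value: "foo"},
		{Name: "sample", Value: "1"},
		{Name: "cluster", Value: "cluster"},
		{Name: "bar", Value: "baz"},
	}

	fmt.Println(sortedByName(unsorted)) // false
	sortByName(unsorted)
	fmt.Println(sortedByName(unsorted)) // true
}
```

In a test using testify, the equivalent would be to wrap the `sort.SliceIsSorted` call in `require.True(t, ...)`.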

pkg/ingester/client/compat.go

Lines changed: 3 additions & 0 deletions
@@ -190,6 +190,9 @@ func fromLabelMatchers(matchers []*LabelMatcher) ([]*labels.Matcher, error) {
 // FromLabelAdaptersToLabels casts []LabelAdapter to labels.Labels.
 // It uses unsafe, but as LabelAdapter == labels.Label this should be safe.
 // This allows us to use labels.Labels directly in protos.
+//
+// Note: while resulting labels.Labels is supposedly sorted, this function
+// doesn't enforce that. If input is not sorted, output will be wrong.
 func FromLabelAdaptersToLabels(ls []LabelAdapter) labels.Labels {
 	return *(*labels.Labels)(unsafe.Pointer(&ls))
 }
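
The new comment warns that the cast preserves whatever order the input has. A minimal sketch of why that follows from the technique: reinterpreting a slice header through `unsafe.Pointer` copies nothing and reorders nothing. The types `LabelAdapter`, `Label`, and `Labels` below are local stand-ins with identical memory layout, assumed for illustration; they are not the Cortex or Prometheus types.

```go
package main

import (
	"fmt"
	"unsafe"
)

// LabelAdapter and Label are hypothetical stand-ins with the same layout.
type LabelAdapter struct {
	Name, Value string
}

type Label struct {
	Name, Value string
}

// Labels mimics a label-set type that callers assume is sorted by name.
type Labels []Label

// toLabels reinterprets the slice header in place. It relies on both element
// types having the same memory layout, and it performs no sorting or checks.
func toLabels(ls []LabelAdapter) Labels {
	return *(*Labels)(unsafe.Pointer(&ls))
}

func main() {
	in := []LabelAdapter{{Name: "b", Value: "2"}, {Name: "a", Value: "1"}}
	out := toLabels(in)

	// The input order is kept as-is, so unsorted input gives unsorted output.
	fmt.Println(out[0].Name, out[1].Name) // b a

	// Both slices share the same backing array.
	in[0].Value = "changed"
	fmt.Println(out[0].Value) // changed
}
```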

pkg/util/validation/validate.go

Lines changed: 53 additions & 9 deletions
@@ -1,7 +1,9 @@
 package validation
 
 import (
+	"fmt"
 	"net/http"
+	"strings"
 	"time"
 
 	"github.com/cortexproject/cortex/pkg/ingester/client"
@@ -14,14 +16,16 @@ import (
 const (
 	discardReasonLabel = "reason"
 
-	errMissingMetricName = "sample missing metric name"
-	errInvalidMetricName = "sample invalid metric name: %.200q"
-	errInvalidLabel      = "sample invalid label: %.200q metric %.200q"
-	errLabelNameTooLong  = "label name too long: %.200q metric %.200q"
-	errLabelValueTooLong = "label value too long: %.200q metric %.200q"
-	errTooManyLabels     = "sample for '%s' has %d label names; limit %d"
-	errTooOld            = "sample for '%s' has timestamp too old: %d"
-	errTooNew            = "sample for '%s' has timestamp too new: %d"
+	errMissingMetricName  = "sample missing metric name"
+	errInvalidMetricName  = "sample invalid metric name: %.200q"
+	errInvalidLabel       = "sample invalid label: %.200q metric %.200q"
+	errLabelNameTooLong   = "label name too long: %.200q metric %.200q"
+	errLabelValueTooLong  = "label value too long: %.200q metric %.200q"
+	errTooManyLabels      = "sample for '%s' has %d label names; limit %d"
+	errTooOld             = "sample for '%s' has timestamp too old: %d"
+	errTooNew             = "sample for '%s' has timestamp too new: %d"
+	errDuplicateLabelName = "duplicate label name: %.200q metric %.200q"
+	errLabelsNotSorted    = "labels not sorted: %.200q metric %.200q"
 
 	// ErrQueryTooLong is used in chunk store and query frontend.
 	ErrQueryTooLong = "invalid query, length > limit (%s > %s)"
@@ -31,6 +35,8 @@ const (
 	tooFarInFuture = "too_far_in_future"
 	invalidLabel = "label_invalid"
 	labelNameTooLong = "label_name_too_long"
+	duplicateLabelNames = "duplicate_label_names"
+	labelsNotSorted = "labels_not_sorted"
 	labelValueTooLong = "label_value_too_long"
 
 	// RateLimited is one of the values for the reason to discard samples.
@@ -102,6 +108,7 @@ func ValidateLabels(cfg LabelValidationConfig, userID string, ls []client.LabelA
 
 	maxLabelNameLength := cfg.MaxLabelNameLength(userID)
 	maxLabelValueLength := cfg.MaxLabelValueLength(userID)
+	lastLabelName := ""
 	for _, l := range ls {
 		var errTemplate string
 		var reason string
@@ -118,11 +125,48 @@ func ValidateLabels(cfg LabelValidationConfig, userID string, ls []client.LabelA
 			reason = labelValueTooLong
 			errTemplate = errLabelValueTooLong
 			cause = l.Value
+		} else if cmp := strings.Compare(lastLabelName, l.Name); cmp >= 0 {
+			if cmp == 0 {
+				reason = duplicateLabelNames
+				errTemplate = errDuplicateLabelName
+				cause = l.Name
+			} else {
+				reason = labelsNotSorted
+				errTemplate = errLabelsNotSorted
+				cause = l.Name
+			}
 		}
 		if errTemplate != "" {
 			DiscardedSamples.WithLabelValues(reason, userID).Inc()
-			return httpgrpc.Errorf(http.StatusBadRequest, errTemplate, cause, client.FromLabelAdaptersToMetric(ls).String())
+			return httpgrpc.Errorf(http.StatusBadRequest, errTemplate, cause, formatLabelSet(ls))
 		}
+		lastLabelName = l.Name
 	}
 	return nil
 }
+
+// this function formats label adapters as a metric name with labels, while preserving
+// label order, and keeping duplicates. If there are multiple "__name__" labels, only
+// first one is used as metric name, other ones will be included as regular labels.
+func formatLabelSet(ls []client.LabelAdapter) string {
+	metricName, hasMetricName := "", false
+
+	labelStrings := make([]string, 0, len(ls))
+	for _, l := range ls {
+		if l.Name == model.MetricNameLabel && !hasMetricName && l.Value != "" {
+			metricName = l.Value
+			hasMetricName = true
+		} else {
+			labelStrings = append(labelStrings, fmt.Sprintf("%s=%q", l.Name, l.Value))
+		}
+	}
+
+	if len(labelStrings) == 0 {
+		if hasMetricName {
+			return metricName
+		}
+		return "{}"
+	}
+
+	return fmt.Sprintf("%s{%s}", metricName, strings.Join(labelStrings, ", "))
+}
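
The new branch in `ValidateLabels` detects both problems in a single pass by comparing each label name with the previous one: an equal name means a duplicate, a smaller name means the labels are out of order. Here is a minimal standalone sketch of that comparison, assuming a hypothetical `Label` type and `checkOrder` helper; the error strings are simplified and do not use the Cortex templates, metrics, or `httpgrpc` errors.

```go
package main

import (
	"fmt"
	"strings"
)

// Label is a hypothetical stand-in for client.LabelAdapter.
type Label struct {
	Name, Value string
}

// checkOrder walks the labels once, comparing each name with the previous one.
// Assumption: names are non-empty (in the diff, earlier checks in the same
// loop reject invalid names before this comparison is reached).
func checkOrder(ls []Label) error {
	last := ""
	for _, l := range ls {
		cmp := strings.Compare(last, l.Name)
		if cmp == 0 {
			return fmt.Errorf("duplicate label name: %q", l.Name)
		}
		if cmp > 0 {
			return fmt.Errorf("labels not sorted: %q", l.Name)
		}
		last = l.Name
	}
	return nil
}

func main() {
	sorted := []Label{{Name: "__name__", Value: "m"}, {Name: "a", Value: "a"}, {Name: "b", Value: "b"}}
	unsorted := []Label{{Name: "__name__", Value: "m"}, {Name: "b", Value: "b"}, {Name: "a", Value: "a"}}
	dup := []Label{{Name: "__name__", Value: "a"}, {Name: "a", Value: "a"}, {Name: "a", Value: "a"}}

	fmt.Println(checkOrder(sorted))   // <nil>
	fmt.Println(checkOrder(unsorted)) // labels not sorted: "a"
	fmt.Println(checkOrder(dup))      // duplicate label name: "a"
}
```

Adjacent comparison only catches duplicates that sit next to each other, which is why it depends on the labels being sorted first.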

pkg/util/validation/validate_test.go

Lines changed: 38 additions & 0 deletions
@@ -80,3 +80,41 @@ func TestValidateLabels(t *testing.T) {
 		assert.Equal(t, c.err, err, "wrong error")
 	}
 }
+
+func TestValidateLabelOrder(t *testing.T) {
+	var cfg validateLabelsCfg
+	cfg.maxLabelNameLength = 10
+	cfg.maxLabelNamesPerSeries = 10
+	cfg.maxLabelValueLength = 10
+
+	userID := "testUser"
+
+	err := ValidateLabels(cfg, userID, []client.LabelAdapter{
+		{Name: model.MetricNameLabel, Value: "m"},
+		{Name: "b", Value: "b"},
+		{Name: "a", Value: "a"},
+	})
+	assert.Equal(t, httpgrpc.Errorf(http.StatusBadRequest, errLabelsNotSorted, "a", `m{b="b", a="a"}`), err)
+}
+
+func TestValidateLabelDuplication(t *testing.T) {
+	var cfg validateLabelsCfg
+	cfg.maxLabelNameLength = 10
+	cfg.maxLabelNamesPerSeries = 10
+	cfg.maxLabelValueLength = 10
+
+	userID := "testUser"
+
+	err := ValidateLabels(cfg, userID, []client.LabelAdapter{
+		{Name: model.MetricNameLabel, Value: "a"},
+		{Name: model.MetricNameLabel, Value: "b"},
+	})
+	assert.Equal(t, httpgrpc.Errorf(http.StatusBadRequest, errDuplicateLabelName, "__name__", `a{__name__="b"}`), err)
+
+	err = ValidateLabels(cfg, userID, []client.LabelAdapter{
+		{Name: model.MetricNameLabel, Value: "a"},
+		{Name: "a", Value: "a"},
+		{Name: "a", Value: "a"},
+	})
+	assert.Equal(t, httpgrpc.Errorf(http.StatusBadRequest, errDuplicateLabelName, "a", `a{a="a", a="a"}`), err)
+}
