Commit 27c6f85

Merge remote-tracking branch 'upstream/master' into wal
Signed-off-by: Ganesh Vernekar <[email protected]>
2 parents 037d4b3 + 8355623 commit 27c6f85

1,082 files changed (+121950, -15376 lines)

.circleci/config.yml

Lines changed: 8 additions & 1 deletion
@@ -3,7 +3,7 @@ version: 2.1
 # https://circleci.com/blog/circleci-hacks-reuse-yaml-in-your-circleci-config-with-yaml/
 defaults: &defaults
   docker:
-    - image: cortexproject/build-image:master-d74af5958
+    - image: quay.io/cortexproject/build-image:update-lint-ae47f740a
   working_directory: /go/src/github.com/cortexproject/cortex

 workflows:
@@ -68,12 +68,19 @@ jobs:
     - run:
         name: Lint
         command: make BUILD_IN_CONTAINER=false lint
+    # fails to run everything first time - see https://github.com/golangci/golangci-lint/issues/866
+    - run:
+        name: Lint again
+        command: make BUILD_IN_CONTAINER=false lint
     - run:
         name: Check vendor directory is consistent.
         command: make BUILD_IN_CONTAINER=false mod-check
     - run:
         name: Check protos are consistent.
         command: make BUILD_IN_CONTAINER=false check-protos
+    - run:
+        name: Check generated documentation is consistent.
+        command: make BUILD_IN_CONTAINER=false check-doc

   test:
     <<: *defaults

.errcheck-exclude

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+io/ioutil.WriteFile
+io/ioutil.ReadFile
+(github.com/go-kit/kit/log.Logger).Log

.golangci.yml

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+output:
+  format: line-number
+
+linters-settings:
+  errcheck:
+    # path to a file containing a list of functions to exclude from checking
+    # see https://github.com/kisielk/errcheck#excluding-functions for details
+    exclude: ./.errcheck-exclude

CHANGELOG.md

Lines changed: 42 additions & 1 deletion
@@ -2,6 +2,12 @@

 ## master / unreleased

+Note that the ruler flags need to be changed in this upgrade. You're moving from a single node ruler to something that might need to be sharded.
+If you are running with a high `-ruler.num-workers` and if you're not able to execute all your rules in `-ruler.evaluation-interval`, then you'll need to shard.
+Further, if you're using the configs service, we've upgraded the migration library and this requires some manual intervention. See full
+instructions below to upgrade your Postgres.
+
+* [CHANGE] The frontend component now does not cache results if it finds a `Cache-Control` header and if one of its values is `no-store`. #1974
 * [CHANGE] Flags changed with transition to upstream Prometheus rules manager:
   * `ruler.client-timeout` is now `ruler.configs.client-timeout` in order to match `ruler.configs.url`
   * `ruler.group-timeout`has been removed
@@ -12,11 +18,44 @@
 * [CHANGE] Use relative links from /ring page to make it work when used behind reverse proxy. #1896
 * [CHANGE] Deprecated `-distributor.limiter-reload-period` flag. #1766
 * [CHANGE] Ingesters now write only normalised tokens to the ring, although they can still read denormalised tokens used by other ingesters. `-ingester.normalise-tokens` is now deprecated, and ignored. If you want to switch back to using denormalised tokens, you need to downgrade to Cortex 0.4.0. Previous versions don't handle claiming tokens from normalised ingesters correctly. #1809
+* [CHANGE] Overrides mechanism has been renamed to "runtime config", and is now separate from limits. Runtime config is simply a file that is reloaded by Cortex every couple of seconds. Limits and now also multi KV use this mechanism.<br />New arguments were introduced: `-runtime-config.file` (defaults to empty) and `-runtime-config.reload-period` (defaults to 10 seconds), which replace previously used `-limits.per-user-override-config` and `-limits.per-user-override-period` options. Old options are still used if `-runtime-config.file` is not specified. This change is also reflected in YAML configuration, where old `limits.per_tenant_override_config` and `limits.per_tenant_override_period` fields are replaced with `runtime_config.file` and `runtime_config.period` respectively. #1749
+* [CHANGE] Cortex now rejects data with duplicate labels. Previously, such data was accepted, with duplicate labels removed with only one value left. #1964
+* [CHANGE] Changed the default value for `-distributor.ha-tracker.prefix` from `collectors/` to `ha-tracker/` in order to not clash with other keys (ie. ring) stored in the same key-value store. #1940
 * [FEATURE] The distributor can now drop labels from samples (similar to the removal of the replica label for HA ingestion) per user via the `distributor.drop-label` flag. #1726
+* [FEATURE] Added flag `debug.mutex-profile-fraction` to enable mutex profiling #1969
 * [FEATURE] Added `global` ingestion rate limiter strategy. Deprecated `-distributor.limiter-reload-period` flag. #1766
 * [FEATURE] Added support for Microsoft Azure blob storage to be used for storing chunk data. #1913
+* [FEATURE] Added readiness probe endpoint`/ready` to queriers. #1934
+* [FEATURE] EXPERIMENTAL: Added `/series` API endpoint support with TSDB blocks storage. #1830
+* [FEATURE] Added "multi" KV store that can interact with two other KV stores, primary one for all reads and writes, and secondary one, which only receives writes. Primary/secondary store can be modified in runtime via runtime-config mechanism (previously "overrides"). #1749
+* [ENHANCEMENT] metric `cortex_ingester_flush_reasons` gets a new `reason` value: `Spread`, when `-ingester.spread-flushes` option is enabled. #1978
+* [ENHANCEMENT] Added `password` and `enable_tls` options to redis cache configuration. Enables usage of Microsoft Azure Cache for Redis service. #1923
+* [ENHANCEMENT] Experimental TSDB: Open existing TSDB on startup to prevent ingester from becoming ready before it can accept writes. #1917
+  * `--experimental.tsdb.max-tsdb-opening-concurrency-on-startup`
+* [ENHANCEMENT] Experimental TSDB: Added `cortex_ingester_shipper_dir_syncs_total`, `cortex_ingester_shipper_dir_sync_failures_total`, `cortex_ingester_shipper_uploads_total` and `cortex_ingester_shipper_upload_failures_total` metrics from TSDB shipper component. #1983
 * [BUGFIX] Fixed unnecessary CAS operations done by the HA tracker when the jitter is enabled. #1861
-
+* [BUGFIX] Fixed #1904 ingesters getting stuck in a LEAVING state after coming up from an ungraceful exit. #1921
+* [BUGFIX] Reduce memory usage when ingester Push() errors. #1922
+* [BUGFIX] TSDB: Fixed handling of out of order/bound samples in ingesters with the experimental TSDB blocks storage. #1864
+* [BUGFIX] TSDB: Fixed querying ingesters in `LEAVING` state with the experimental TSDB blocks storage. #1854
+* [BUGFIX] TSDB: Fixed error handling in the series to chunks conversion with the experimental TSDB blocks storage. #1837
+* [BUGFIX] TSDB: Fixed TSDB creation conflict with blocks transfer in a `JOINING` ingester with the experimental TSDB blocks storage. #1818
+* [BUGFIX] TSDB: `experimental.tsdb.ship-interval` of <=0 treated as disabled instead of allowing panic. #1975
+* [BUGFIX] TSDB: Fixed `cortex_ingester_queried_samples` and `cortex_ingester_queried_series` metrics when using block storage. #1981
+* [BUGFIX] TSDB: Fixed `cortex_ingester_memory_series` and `cortex_ingester_memory_users` metrics when using with the experimental TSDB blocks storage. #1982
+
+### Upgrading Postgres (if you're using configs service)
+
+Reference: https://github.com/golang-migrate/migrate/tree/master/database/postgres#upgrading-from-v1
+
+1. Install the migrate package cli tool: https://github.com/golang-migrate/migrate/tree/master/cmd/migrate#installation
+2. Drop the `schema_migrations` table: `DROP TABLE schema_migrations;`.
+2. Run the migrate command:
+
+```bash
+migrate -path <absolute_path_to_cortex>/cmd/cortex/migrations -database postgres://localhost:5432/database force 2
+```
+
 ## 0.4.0 / 2019-12-02

 * [CHANGE] The frontend component has been refactored to be easier to re-use. When upgrading the frontend, cache entries will be discarded and re-created with the new protobuf schema. #1734
@@ -36,6 +75,7 @@
 * [FEATURE] EXPERIMENTAL: Use TSDB in the ingesters & flush blocks to S3/GCS ala Thanos. This will let us use an Object Store more efficiently and reduce costs. #1695
 * [FEATURE] Allow Query Frontend to log slow queries with `frontend.log-queries-longer-than`. #1744
 * [FEATURE] Add HTTP handler to trigger ingester flush & shutdown - used when running as a stateful set with the WAL enabled. #1746
+* [FEATURE] EXPERIMENTAL: Added GCS support to TSDB blocks storage. #1772
 * [ENHANCEMENT] Reduce memory allocations in the write path. #1706
 * [ENHANCEMENT] Consul client now follows recommended practices for blocking queries wrt returned Index value. #1708
 * [ENHANCEMENT] Consul client can optionally rate-limit itself during Watch (used e.g. by ring watchers) and WatchPrefix (used by HA feature) operations. Rate limiting is disabled by default. New flags added: `--consul.watch-rate-limit`, and `--consul.watch-burst-size`. #1708
@@ -45,6 +85,7 @@
 * [BUGFIX] Fix bug where duplicate labels can be returned through metadata APIs. #1790
 * [BUGFIX] Fix reading of old, v3 chunk data. #1779
 * [BUGFIX] Now support IAM roles in service accounts in AWS EKS. #1803
+* [BUGFIX] Fixed duplicated series returned when querying both ingesters and store with the experimental TSDB blocks storage. #1778

 In this release we updated the following dependencies:
 - gRPC v1.25.0 (resulted in a drop of 30% CPU usage when compression is on)
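
Review note on the `[CHANGE]` entry for #1974 above: the entry says the frontend skips caching when a `Cache-Control` header carries the value `no-store`. The following is a minimal sketch of such a check, not the Cortex implementation. The helper name `shouldCache` is hypothetical, and the case-insensitive comparison and comma splitting are assumptions about how the header values are interpreted.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// shouldCache reports whether a response may be cached, given the values of
// its Cache-Control header lines. Hypothetical helper for illustration only.
func shouldCache(cacheControl []string) bool {
	for _, line := range cacheControl {
		// One header line may carry several comma-separated directives.
		for _, directive := range strings.Split(line, ",") {
			// Assumption: directive names are compared case-insensitively.
			if strings.EqualFold(strings.TrimSpace(directive), "no-store") {
				return false
			}
		}
	}
	return true
}

func main() {
	h := http.Header{}
	h.Add("Cache-Control", "max-age=60, no-store")
	// Values returns every line stored under the canonical header name.
	fmt.Println(shouldCache(h.Values("Cache-Control"))) // prints "false"
}
```

Taking the header values as a plain slice keeps the check independent of where the header comes from, so the same function could be applied to an upstream response or to a test fixture.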

GOVERNANCE.md

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@ This document defines project governance for the project.

 ## Voting

-The Cortex project employs voting to ensure no single member can dominate the project. Any maintainer may cast a vote.
+The Cortex project employs voting to ensure no single member can dominate the project. Any maintainer may cast a vote. To avoid having a single company dominate the project, at most two votes from maintainers working for the same company will count.

 For formal votes, a specific statement of what is being voted on should be added to the relevant github issue or PR, and a link to that issue or PR added to the maintainers meeting agenda document.
 Maintainers should indicate their yes/no vote on that issue or PR, and after a suitable period of time, the votes will be tallied and the outcome noted.

MAINTAINERS

Lines changed: 3 additions & 2 deletions
@@ -1,7 +1,8 @@
 Bryan Boreham, Weaveworks <[email protected]> (@bboreham)
-Chris Marchbanks, independent <[email protected]> (@csmarchbanks)
-Cody Boggs, independent <[email protected]> (@cboggs)
+Chris Marchbanks, Splunk <[email protected]> (@csmarchbanks)
+Cody Boggs, Splunk <[email protected]> (@cboggs)
 Jacob Lisi, Grafana Labs <[email protected]> (@jtlisi)
 Ken Haines, Microsoft <[email protected]> (@khaines)
+Marco Pracucci, Grafana Labs <[email protected]> (@pracucci)
 Tom Wilkie, Grafana Labs <[email protected]> (@tomwilkie)
 Goutham Veeramachaneni, Grafana Labs <[email protected]> (@gouthamve)

Makefile

Lines changed: 19 additions & 14 deletions
@@ -80,27 +80,24 @@ GO_FLAGS := -ldflags "-extldflags \"-static\" -s -w" -tags netgo

 ifeq ($(BUILD_IN_CONTAINER),true)

+GOVOLUMES= -v $(shell pwd)/.cache:/go/cache:delegated \
+	-v $(shell pwd)/.pkg:/go/pkg:delegated \
+	-v $(shell pwd):/go/src/github.com/cortexproject/cortex:delegated
+
 exes $(EXES) protos $(PROTO_GOS) lint test shell mod-check check-protos web-build web-pre web-deploy: build-image/$(UPTODATE)
 	@mkdir -p $(shell pwd)/.pkg
 	@mkdir -p $(shell pwd)/.cache
 	@echo
 	@echo ">>>> Entering build container: $@"
-	@$(SUDO) time docker run $(RM) $(TTY) -i \
-		-v $(shell pwd)/.cache:/go/cache \
-		-v $(shell pwd)/.pkg:/go/pkg \
-		-v $(shell pwd):/go/src/github.com/cortexproject/cortex \
-		$(BUILD_IMAGE) $@;
+	@$(SUDO) time docker run $(RM) $(TTY) -i $(GOVOLUMES) $(BUILD_IMAGE) $@;

 configs-integration-test: build-image/$(UPTODATE)
 	@mkdir -p $(shell pwd)/.pkg
 	@mkdir -p $(shell pwd)/.cache
 	@DB_CONTAINER="$$(docker run -d -e 'POSTGRES_DB=configs_test' postgres:9.6)"; \
 	echo ; \
 	echo ">>>> Entering build container: $@"; \
-	$(SUDO) docker run $(RM) $(TTY) -i \
-		-v $(shell pwd)/.cache:/go/cache \
-		-v $(shell pwd)/.pkg:/go/pkg \
-		-v $(shell pwd):/go/src/github.com/cortexproject/cortex \
+	$(SUDO) docker run $(RM) $(TTY) -i $(GOVOLUMES) \
 		-v $(shell pwd)/cmd/cortex/migrations:/migrations \
 		--workdir /go/src/github.com/cortexproject/cortex \
 		--link "$$DB_CONTAINER":configs-db.cortex.local \
@@ -123,11 +120,8 @@ protos: $(PROTO_GOS)
 	protoc -I $(GOPATH)/src:./vendor:./$(@D) --gogoslick_out=plugins=grpc,Mgoogle/protobuf/any.proto=github.com/gogo/protobuf/types,:./$(@D) ./$(patsubst %.pb.go,%.proto,$@)

 lint:
-	./tools/lint -notestpackage -novet -ignorespelling queriers -ignorespelling Queriers .
-
-	# -stdmethods=false disables checks for non-standard signatures for methods with familiar names.
-	# This is needed because the Prometheus storage interface requires a non-standard Seek() method.
-	go vet -stdmethods=false ./pkg/...
+	misspell -error docs
+	golangci-lint run --new-from-rev ed7c302fd968 --build-tags netgo --timeout=5m --enable golint --enable misspell --enable gofmt

 test:
 	./tools/test -netgo
@@ -194,5 +188,16 @@ prime-minikube: save-images
 		fi \
 	done

+# Generates the config file documentation.
+doc:
+	cp ./docs/configuration/config-file-reference.template ./docs/configuration/config-file-reference.md
+	go run ./tools/doc-generator/ >> ./docs/configuration/config-file-reference.md
+
+clean-doc:
+	rm -f ./docs/configuration/config-file-reference.md
+
+check-doc: clean-doc doc
+	@git diff --exit-code -- ./docs/configuration/config-file-reference.md
+
 web-serve:
 	cd website && hugo --config config.toml -v server

README.md

Lines changed: 3 additions & 2 deletions
@@ -25,8 +25,9 @@ Read the [getting started guide](docs/getting_started.md) if you're new to the
 project. Before deploying Cortex with a permanent storage backend you
 should read:
 1. [An overview of Cortex's architecture](docs/architecture.md)
-1. [A general guide to running Cortex](docs/running.md)
-1. [Information regarding configuring Cortex](docs/arguments.md)
+1. [A general guide to running Cortex](docs/guides/running.md)
+1. [Information regarding configuring Cortex](docs/configuration/arguments.md)
+1. [Steps to run Cortex with Cassandra](docs/guides/cortex-with-cassandra.md)

 For a guide to contributing to Cortex, see the [contributor guidelines](CONTRIBUTING.md).

RELEASE.md

Lines changed: 2 additions & 1 deletion
@@ -12,7 +12,8 @@ Our goal is to provide a new minor release every 4 weeks. This is a new process
 | v0.2.0 | 2019-08-28 | Goutham Veeramachaneni (Github: @gouthamve) |
 | v0.3.0 | 2019-10-09 | Bryan Boreham (@bboreham) |
 | v0.4.0 | 2019-11-13 | Tom Wilkie (@tomwilkie) |
-| v0.5.0 | 2019-12-11 | **searching for volunteer** |
+| v0.5.0 | 2020-01-08 | _Abandoned_ |
+| v0.6.0 | 2020-01-20 | **searching for a volunteer** |

 ## Release shepherd responsibilities

build-image/Dockerfile

Lines changed: 1 addition & 8 deletions
@@ -4,9 +4,6 @@ RUN apt-get update && apt-get install -y curl python-requests python-yaml file j
 RUN curl -sL https://deb.nodesource.com/setup_6.x | sh -
 RUN apt-get install -y nodejs npm && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
 RUN npm install -g postcss-cli autoprefixer
-RUN go clean -i net && \
-	go install -tags netgo std && \
-	go install -race -tags netgo std
 ENV HUGO_VERSION=v0.59.1
 RUN git clone https://github.com/gohugoio/hugo.git --branch ${HUGO_VERSION} --depth 1 && \
 	cd hugo && go install --tags extended && cd ../ && \
@@ -15,17 +12,13 @@ RUN curl -fsSLo shfmt https://github.com/mvdan/sh/releases/download/v1.3.0/shfmt
 	echo "b1925c2c405458811f0c227266402cf1868b4de529f114722c2e3a5af4ac7bb2 shfmt" | sha256sum -c && \
 	chmod +x shfmt && \
 	mv shfmt /usr/bin
+RUN curl -sfL https://raw.githubusercontent.com/golangci/golangci-lint/master/install.sh| sh -s -- -b /usr/bin v1.21.0
 RUN GO111MODULE=on go get -tags netgo \
-	github.com/fzipp/gocyclo \
-	golang.org/x/lint/golint \
-	github.com/kisielk/[email protected] \
 	github.com/client9/misspell/cmd/[email protected] \
 	github.com/golang/protobuf/[email protected] \
 	github.com/gogo/protobuf/[email protected] \
 	github.com/gogo/protobuf/[email protected] && \
 	rm -rf /go/pkg /go/src
-RUN curl -Ls https://github.com/golang/dep/releases/download/v0.5.0/dep-linux-amd64 -o $GOPATH/bin/dep && \
-	chmod +x $GOPATH/bin/dep

 ENV NODE_PATH=/usr/lib/node_modules
 COPY build.sh /

cmd/cortex/main.go

Lines changed: 10 additions & 4 deletions
@@ -27,14 +27,20 @@ func init() {

 func main() {
 	var (
-		cfg             cortex.Config
-		configFile      = ""
-		eventSampleRate int
-		ballastBytes    int
+		cfg                  cortex.Config
+		configFile           = ""
+		eventSampleRate      int
+		ballastBytes         int
+		mutexProfileFraction int
 	)
 	flag.StringVar(&configFile, "config.file", "", "Configuration file to load.")
 	flag.IntVar(&eventSampleRate, "event.sample-rate", 0, "How often to sample observability events (0 = never).")
 	flag.IntVar(&ballastBytes, "mem-ballast-size-bytes", 0, "Size of memory ballast to allocate.")
+	flag.IntVar(&mutexProfileFraction, "debug.mutex-profile-fraction", 0, "Fraction at which mutex profile vents will be reported, 0 to disable")
+
+	if mutexProfileFraction > 0 {
+		runtime.SetMutexProfileFraction(mutexProfileFraction)
+	}

 	flagext.RegisterFlags(&cfg)
 	flag.Parse()
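
Review note on the hunk above: `mutexProfileFraction` is tested before `flag.Parse()` runs, so at that point it still holds its default of 0 and the `runtime.SetMutexProfileFraction` call cannot be reached from the command-line flag. The following is a minimal sketch of an ordering that would make the flag take effect, assuming the rest of `main` is unchanged. The helper names and the private `FlagSet` are hypothetical and exist only to keep the example self-contained.

```go
package main

import (
	"flag"
	"fmt"
	"os"
	"runtime"
)

// parseMutexProfileFraction registers the flag on a private FlagSet, parses
// args, and returns the value seen after parsing. Hypothetical helper.
func parseMutexProfileFraction(args []string) (int, error) {
	fs := flag.NewFlagSet("cortex-sketch", flag.ContinueOnError)
	fraction := fs.Int("debug.mutex-profile-fraction", 0,
		"Fraction at which mutex profile events will be reported, 0 to disable")
	if err := fs.Parse(args); err != nil {
		return 0, err
	}
	return *fraction, nil
}

// applyMutexProfileFraction enables mutex profiling for positive values and
// reports whether it did so. It must be called after the flags are parsed.
func applyMutexProfileFraction(fraction int) bool {
	if fraction <= 0 {
		return false
	}
	runtime.SetMutexProfileFraction(fraction)
	return true
}

func main() {
	fraction, err := parseMutexProfileFraction(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	enabled := applyMutexProfileFraction(fraction)
	fmt.Printf("mutex profiling enabled=%t fraction=%d\n", enabled, fraction)
}
```

Run with `-debug.mutex-profile-fraction=5` to see the setting applied. In the commit's `main`, the equivalent change would be to move the `if` block below `flag.Parse()`.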

cmd/cortex/migrations/002_immutable_configs.up.sql

Lines changed: 6 additions & 0 deletions
@@ -2,6 +2,10 @@
 -- process (as exemplified in users/db/migrations/006 to 010), but currently
 -- there are no data in production and only one row in dev.

+-- https://github.com/mattes/migrate/tree/master/database/postgres#upgrading-from-v1
+-- Wrap all commands in BEGIN and COMMIT to accommodate upgrade
+BEGIN;
+
 -- The existing id, type columns are the id & type of the entity that owns the
 -- config.
 ALTER TABLE configs RENAME COLUMN id TO owner_id;
@@ -12,3 +16,5 @@ ALTER TABLE configs ADD COLUMN id SERIAL;

 ALTER TABLE configs DROP CONSTRAINT configs_pkey;
 ALTER TABLE configs ADD PRIMARY KEY (id, owner_id, owner_type, subsystem);
+
+COMMIT;

docs/_index.md

Lines changed: 1 addition & 1 deletion
@@ -24,7 +24,7 @@ Prometheus sources in a single cluster, allowing untrusted parties to share the
 - **Long term storage:** Cortex supports Amazon DynamoDB, Google Bigtable, Cassandra, S3 and GCS for long term storage of metric data. This allows you to durably store data for longer than the lifetime of any single machine, and use this data for long term capacity planning.

 Cortex is a [CNCF](https://cncf.io) sandbox project used in several production systems including [Weave Cloud](https://cloud.weave.works) and [Grafana Cloud](https://grafana.com/cloud).
-Cortex is primarily used used as a [remote write](https://prometheus.io/docs/operating/configuration/#remote_write) destination for Prometheus, exposing a Prometheus-compatible query API.
+Cortex is primarily used as a [remote write](https://prometheus.io/docs/operating/configuration/#remote_write) destination for Prometheus, exposing a Prometheus-compatible query API.

 ## Documentation
