API endpoints for the series deletion API using block storage #4370


Open · wants to merge 28 commits into base: master

Conversation

@ilangofman (Contributor) commented Jul 19, 2021

What this PR does:
This PR implements the API endpoints for the time series deletion API for block storage. Based on the proposal for the series deletion API here.

The work done in this PR includes creating, getting and canceling deletion requests.

  • When a delete request is received, a tombstone is created to track the deletion information and state.
  • Getting the delete request information requires reading the tombstone files.
  • Cancelling a deletion is done by deleting the tombstone. The cancellation can only be done while the -purger.delete-request-cancel-period hasn't passed since the delete request was originally made.

For the API to be enabled, set --purger.enable to true.

The new endpoints are located within the purger, where the existing tenant deletion API is also located. However, the files live under the chunks subdirectory even though they contain APIs for blocks storage. A future PR will be required to move the purger service out of the chunks subdirectory.

To make it easier to review and get incremental feedback, the implementation of the proposal is going to be split across roughly four PRs:

  • API endpoints (this PR)
  • Permanent deletion processing
  • Query time filtering using tombstones before data is deleted
  • Cache invalidation

Please let me know if you have any questions or concerns. Any feedback is appreciated. Thank you for taking a look at this PR and helping out!

Which issue(s) this PR fixes:
Initial PR of multiple to address #4267

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

Signed-off-by: ilangofman <[email protected]>
@jeromeinsf (Contributor)
Can you please also review this PR @alolita @Aneurysm9

@bboreham (Contributor)

I'm a bit confused why files like pkg/chunk/purger/tenant_deletion_api.go disappeared, and why the blocks purger is under pkg/chunk.

Did any functionality disappear? Could the blocks functionality live under pkg/storage/tsdb ?

@ilangofman (Contributor, Author) commented Aug 11, 2021

I'm a bit confused why files disappeared like pkg/chunk/purger/tenant_deletion_api.go and the why the blocks purger is under pkg/chunk.

Did any functionality disappear?

@bboreham No functionality disappeared. The pkg/chunk/purger/tenant_deletion_api.go file was the blocks implementation of the tenant deletion endpoints. Since I created more endpoints for series deletion with blocks storage, I combined both under one file and called it the blocks purger. So all the previous code in tenant_deletion_api.go is now in pkg/chunk/purger/blocks_purger.go. I can undo that and create individual files for them if you would like.

Could the blocks functionality live under pkg/storage/tsdb ?

I agree, it is a bit confusing that some of the blocks functionality is under the chunks folder. I can try moving it to the suggested folder. If the answer to my previous question is to keep pkg/chunk/purger/tenant_deletion_api.go separate, I'm guessing it would make sense to move that file to this folder as well, since it is only for blocks storage. WDYT?

@bboreham (Contributor)

I just combined both under one file and called it the blocks purger

This would be a good thing to explain in the PR description, to aid reviewers.

Also you could organise the change into commits where one is a pure refactor, just moving code, and then subsequent ones modify the behaviour. The current set of 22 commits shows the journey that you went on, but is not a good breakdown for reviewing.

@bboreham (Contributor)

I expect when we remove chunks support (which is deprecated), then we can delete pkg/chunks, so don't add anything blocks-related under there.

@ilangofman (Contributor, Author)

Can you explain the distinction between "delete request" and "tombstone" as used in this code?

The way I have been thinking about it is that the tombstone is just a file that keeps track of a series deletion request. Getting the delete request information means reading the tombstone for it. Admittedly, I have been using the two terms synonymously, so I'll update the code to be more consistent. If you see any issues with that, please let me know.

Signed-off-by: ilangofman <[email protected]>
@jeromeinsf (Contributor)

@ilangofman Are you updating the code to be more consistent or should someone take over?

@ilangofman (Contributor, Author)

@ilangofman Are you updating the code to be more consistent or should someone take over?

Hi @jeromeinsf. I took a couple of weeks off, but I am going to get back to this. I think I should be able to finish this PR on my own.

For the rest of the deletion PRs, I am not sure how much time I will have available to work on them, with university starting this week. I would still like to contribute when I can, but it might take some time to make changes. To speed things up, if someone wants to take over the other PRs or collaborate, I'm open to it!

@alvinlin123 (Contributor) left a comment

Here is the first batch of review comments.

hash.Write(bufStart)
hash.Write(bufEnd)

sort.Strings(selectors)
Contributor:

Is hashing matchers the right way to create a "deletion ID"? For example, I can have matchers like match[]=process_start_time_seconds{job="prometheus"} and match[]=process_start_time_seconds{job= "prometheus"} (note the extra whitespace in the second query). They are logically the same matcher, but they will end up with different hash values.

Is it intended that deletion-request de-duplication is best effort?

Contributor (Author):
The reason will be displayed to describe this comment to others. Learn more.

I've updated it so it parses the matchers and then creates the hash based on the parsed matchers, so any logically equivalent deletion requests should now result in the same hash. The example you provided would produce the same hash. Also, split-up matchers like match[]=process_start_time_seconds&match[]={job= "prometheus"} would result in the same hash as your example (since a deletion request is the "AND" of all the matchers provided).
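
The canonicalize-then-hash idea could look roughly like this; it uses a toy Matcher type in place of the Prometheus matcher parser, so all names here are illustrative:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// Matcher is a stand-in for a parsed label matcher; the real code would use
// the Prometheus parser, which already discards insignificant whitespace.
type Matcher struct {
	Name, Type, Value string
}

// hashMatchers builds a deletion request ID from the parsed, sorted matchers,
// so logically equivalent requests map to the same ID regardless of how the
// raw query string was formatted.
func hashMatchers(matchers []Matcher) string {
	canonical := make([]string, 0, len(matchers))
	for _, m := range matchers {
		// NUL separators avoid ambiguity between adjacent fields.
		canonical = append(canonical, m.Name+"\x00"+m.Type+"\x00"+m.Value)
	}
	sort.Strings(canonical)
	h := sha256.New()
	for _, s := range canonical {
		h.Write([]byte(s))
		h.Write([]byte{0xff}) // separator between matchers
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	a := []Matcher{{"__name__", "=", "process_start_time_seconds"}, {"job", "=", "prometheus"}}
	b := []Matcher{{"job", "=", "prometheus"}, {"__name__", "=", "process_start_time_seconds"}}
	fmt.Println(hashMatchers(a) == hashMatchers(b)) // prints "true": matcher order no longer matters
}
```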

// if the request was previously cancelled, we need to remove the cancelled tombstone before adding the pending one
if err := tManager.RemoveCancelledStateIfExists(ctx, requestID); err != nil {
level.Error(util_log.Logger).Log("msg", "removing cancelled tombstone state if it exists", "err", err)
http.Error(w, err.Error(), http.StatusInternalServerError)
Contributor:

We will need to be careful about just propagating errors back to the caller; we might inadvertently expose details that should not be exposed.

Contributor (Author):

I changed it to a message instead of propagating the error.

Comment on lines +142 to +146
userID, err := tenant.TenantID(ctx)
if err != nil {
http.Error(w, err.Error(), http.StatusUnauthorized)
return
}
Contributor:

This is pretty common code. I think existing Cortex code uses middleware.AuthenticateUser; is there any reason we can't use the same function for the delete series APIs?

http.Error(w, fmt.Sprintf("Error marshalling response: %v", err), http.StatusInternalServerError)
}

w.WriteHeader(http.StatusOK)
Contributor:

According to https://pkg.go.dev/net/http#ResponseWriter, since you called json.NewEncoder(w), which I assume calls ResponseWriter.Write(), a status code of 200 is automatically written for you together with a "guessed" content type. So you don't really need to set the status on this line.

However, I think we should set the content type and the status header ourselves for clarity.

Also, I think the if block on line 155 is missing a return statement, right? I would expect http.Error() to set the status to 500 already, so this w.WriteHeader(http.StatusOK) may have undefined behaviour.

Contributor (Author):

Makes sense, should be fixed now.

}

params := r.URL.Query()
requestID := params.Get("request_id")
Contributor:

Shouldn't we do input validation here? What if request_id is not given?

return
}

if deleteRequest.State == cortex_tsdb.StateProcessed {
Contributor:

Do we need a "processing" state? Or we can rely on the deleteRequestCancelPeriod to infer the "processing" state?

Contributor (Author):

Yes, we can use the deleteRequestCancelPeriod to infer the processing state. Whenever a tombstone with statePending is older than deleteRequestCancelPeriod, it is in the processing state.

return err
}

// If the newer state found comes later in the order, then we replace the state in the map.
Contributor:

In this case, I think it might be better to name the variables newerStateOrder and prevStateOrder; this does imply that renaming state to newerState would improve readability too :)

return false
}

func (s BlockDeleteRequestState) GetStateOrder() (int, error) {
Contributor:

I don't think it's wise to export this method. It looks a lot like an implementation detail that should not be exposed. I don't think we want anyone outside the tsdb package to depend on the fact that the "order" of StatePending is 0, StateProcessed is 2, etc., right? What matters is the relation between the orders, not the absolute values.

If we really want to expose something, consider a function like IsGreaterThan(a, b BlockDeleteRequestState), but I don't think we should in this case.

Contributor (Author):

Not sure why it was exported in the first place. I don't think the logic is required anywhere else. I made it private now.
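
An unexported version of the ordering, with a comparison helper along the lines the reviewer suggested, might look like this; the set of states and their order values here are illustrative:

```go
package main

import "fmt"

type BlockDeleteRequestState string

const (
	StatePending   BlockDeleteRequestState = "pending"
	StateCancelled BlockDeleteRequestState = "cancelled"
	StateProcessed BlockDeleteRequestState = "processed"
)

// stateOrder stays unexported: only the relative ordering matters, and no
// caller outside the package should depend on the absolute values.
func stateOrder(s BlockDeleteRequestState) int {
	switch s {
	case StatePending:
		return 0
	case StateCancelled:
		return 1
	case StateProcessed:
		return 2
	}
	return -1 // unknown state
}

// isLaterState is the kind of comparison helper that could be exported
// instead, if anything needs to be exposed at all.
func isLaterState(a, b BlockDeleteRequestState) bool {
	return stateOrder(a) > stateOrder(b)
}

func main() {
	fmt.Println(isLaterState(StateProcessed, StatePending)) // prints "true"
}
```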

}

if err = m.DeleteTombstoneFile(ctx, t.RequestID, t.State); err != nil {
level.Error(m.logger).Log("msg", "Created file with updated state but unable to delete previous state. Will retry next time tombstones are loaded", "err", err)
Contributor:

Would be nice to log the old and new state.


err := m.WriteTombstone(ctx, newT)
if err != nil {
level.Error(m.logger).Log("msg", "error creating tombstone file with the updated state", "err", err)
Contributor:

Would be nice to log the new state.


for _, state := range AllDeletionStates {
filename := getTombstoneFileName(requestID, state)
exists, err := m.TombstoneExists(ctx, requestID, state)
Contributor:

I wonder if we really need this extra check, which causes an extra network call (I believe m.TombstoneExists translates to an S3 HEAD HTTP call).

Would it be possible to just do m.ReadTombstone and differentiate if the returned err is NotFound or other errors, then handle them differently?

Contributor (Author):

Great catch! We can definitely do without it.


stale bot commented Feb 20, 2022

This issue has been automatically marked as stale because it has not had any activity in the past 60 days. It will be closed in 15 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Feb 20, 2022
@ilangofman (Contributor, Author)

Sorry everyone, there has been no activity from me on this. I have been really busy with school and unfortunately haven't had time to work on this. I do plan to come back to this once I get a bit more time available, but if someone else wants to help out, that would be great too.

@stale stale bot removed the stale label Mar 2, 2022
@aughr mentioned this pull request Mar 21, 2022

stale bot commented Jun 11, 2022


@stale stale bot added the stale label Jun 11, 2022
@alvinlin123 (Contributor)

@ilangofman may start working on this again soon :) so not stale.

@stale stale bot removed the stale label Jun 11, 2022

stale bot commented Oct 16, 2022


@stale stale bot added the stale label Oct 16, 2022
@jeromeinsf (Contributor)

/remove stale

@stale stale bot closed this Nov 1, 2022
@alvinlin123 reopened this Nov 1, 2022
@alvinlin123 added the keepalive (Skipped by stale bot) label Nov 1, 2022
@stale stale bot removed the stale label Nov 1, 2022
@CharlieTLe (Member)

Hello @ilangofman, thank you for opening this PR.

There is a release in progress. As such, please rebase your CHANGELOG entry on top of the master branch and move the CHANGELOG entry to the top under ## master / unreleased.

Thanks,
Charlie

Labels: keepalive (Skipped by stale bot), size/XXL
Projects: None yet
6 participants