This issue has been automatically marked as stale because it has not had any activity in the past 60 days. It will be closed in 15 days if no further activity occurs. Thank you for your contributions.
The number of S3 calls was reduced dramatically by the implementation of the bucket index.
The high number of S3 GET errors occurred because Cortex, for performance reasons, issued a GET on objects directly without first checking that they exist, since that is faster. These errors were also counted as errors in the metrics exposed by Cortex, but this case was later excluded from the count, so the dashboards no longer report it as a problem.
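To illustrate the pattern described above, here is a minimal sketch in Python (not Cortex's actual Go code). It reads an object directly, treats "not found" as an expected outcome, and keeps that outcome out of the failure count. `InMemoryStore`, `InstrumentedStore`, `ObjectNotFound`, and the keys are all hypothetical names made up for this example.

```python
class ObjectNotFound(Exception):
    """Raised by the hypothetical store when a key does not exist."""


class InMemoryStore:
    """Stand-in for an object store such as S3 (illustration only)."""

    def __init__(self, objects):
        self._objects = dict(objects)

    def get(self, key):
        try:
            return self._objects[key]
        except KeyError:
            raise ObjectNotFound(key) from None


class InstrumentedStore:
    """Wraps a store and counts requests and genuine failures."""

    def __init__(self, store):
        self._store = store
        self.requests = 0
        self.failures = 0

    def get_if_exists(self, key):
        # One round trip: GET directly instead of "exists?" followed by GET.
        self.requests += 1
        try:
            return self._store.get(key)
        except ObjectNotFound:
            # Expected when the object is absent, so not counted as a failure.
            return None
        except Exception:
            # Anything else is a real error and is counted.
            self.failures += 1
            raise


store = InstrumentedStore(InMemoryStore({"example/present.json": b"{}"}))
found = store.get_if_exists("example/present.json")
missing = store.get_if_exists("example/absent.json")
```

The storage service still records the missing-object request as a 4xx on its side, which would explain why the AWS console shows errors even when the client considers the call successful.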
Regarding all the calls going to the same single S3 endpoint: the issue was raised by the AWS S3 team itself, but we could not confirm on our side that every S3 call hit the same IP.
Considering the above points and after deploying newer versions of Cortex, the issues mentioned in this jira don't seem to be a problem anymore.
Closing this.
Describe the bug
Using block storage on AWS using S3
To Reproduce
Steps to reproduce the behavior:
Expected behavior
There are no 4xx errors from S3 service.
Environment:
Storage Engine
Additional Context

As you can see, the GET requests per second are aligned with the 4xx errors in the AWS console.
This might be the cause of #3753
For issue number 2, different nslookup calls from the same Cortex pod return different IP addresses for the S3 endpoint, so DNS resolution is working properly in the k8s cluster. If the IP of the S3 endpoint never changes over time in Cortex, it seems to be due to DNS caching issues.
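The repeated nslookup check can also be scripted. The sketch below is a rough Python equivalent, assuming the S3 endpoint hostname is passed as a command-line argument; `resolve_ipv4` and `distinct_addresses` are hypothetical helper names. It only shows what the system resolver returns inside the pod, not which address the Cortex process actually connects to.

```python
import socket
import sys


def resolve_ipv4(host):
    """Return the set of IPv4 addresses the system resolver gives for host."""
    _name, _aliases, addresses = socket.gethostbyname_ex(host)
    return set(addresses)


def distinct_addresses(host, attempts=5, resolve=resolve_ipv4):
    """Resolve host several times and collect every address seen."""
    seen = set()
    for _ in range(attempts):
        seen |= resolve(host)
    return seen


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_dns.py <s3-endpoint-hostname>
    print(sorted(distinct_addresses(sys.argv[1])))
```

If the printed set contains more than one address, the resolver is rotating answers as expected, and any "same IP" behaviour would have to come from the client side.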