Skip to content

Content is read from 304 response #6970

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
andrask opened this issue Sep 3, 2019 · 11 comments
Closed

Content is read from 304 response #6970

andrask opened this issue Sep 3, 2019 · 11 comments
Labels
auto-locked Outdated issues that have been locked by automation C: cache Dealing with cache and files in it project: vendored dependency Related to a vendored dependency type: bug A confirmed bug or unintended behavior

Comments

@andrask
Copy link

andrask commented Sep 3, 2019

Environment

  • pip version: 19.2.3
  • Python version: 2.7, 3.6
  • OS: Linux

Pypi server: local Artifactory
Highly concurrent tox/pip sessions.

Description
pip receives a 304 response and consumes the response as if it were valid data.

The log shows that the 304 response with size 0 is received.

Created temporary directory: /tmp/pip-unpack-a3hgm2dy
Looking up "pypiserver.local/api/pypi/python/packages/packages/b8/12/d6e2a96db63c1bf43158e3ce4502c349dbcdcc338998976fedb402cce031/oslo.i18n-3.19.0-py2.py3-none-any.whl" in the cache
Current age based on date: 1
pypiserver.local:443 "GET /api/pypi/python/packages/packages/b8/12/d6e2a96db63c1bf43158e3ce4502c349dbcdcc338998976fedb402cce031/oslo.i18n-3.19.0-py2.py3-none-any.whl HTTP/1.1" 304 0
Downloading pypiserver.local/api/pypi/python/packages/packages/b8/12/d6e2a96db63c1bf43158e3ce4502c349dbcdcc338998976fedb402cce031/oslo.i18n-3.19.0-py2.py3-none-any.whl
Downloading from URL pypiserver.local/api/pypi/python/packages/packages/b8/12/d6e2a96db63c1bf43158e3ce4502c349dbcdcc338998976fedb402cce031/oslo.i18n-3.19.0-py2.py3-none-any.whl#sha256=7f58f6b5865313442948132361122b1972599ea4985af61e8d4b63e78093d5f8 (from pypiserver.local/api/pypi/python/simple/oslo-i18n/)

Then the execution goes on to download the content and the error is raised when the installation is attempted. The error is always something like:

ERROR: THESE PACKAGES DO NOT MATCH THE HASHES FROM THE REQUIREMENTS FILE. If you have updated the package versions, please update the hashes. Otherwise, examine the package contents carefully; someone may have tampered with them.
    oslo.i18n===3.19.0 from https://pypiserver.local/api/pypi/python/packages/packages/b8/12/d6e2a96db63c1bf43158e3ce4502c349dbcdcc338998976fedb402cce031/oslo.i18n-3.19.0-py2.py3-none-any.whl#sha256=7f58f6b5865313442948132361122b1972599ea4985af61e8d4b63e78093d5f8:
        Expected sha256 7f58f6b5865313442948132361122b1972599ea4985af61e8d4b63e78093d5f8
             Got        e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855

The erroneous SHA value of e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 indicates NULL content.

The error is sporadic and random but due to the sheer number of pip sessions, the result is a very high error rate.

Expected behavior

Normally, the log messages should show that the cache was used:

  <server> "GET <uri> 304 0
  Using cached <url>
  Downloading from URL <url>#sha256=<sha>
  Added <package> from <url> to build tracker '<tracker>'
  Removed <package> from <url>#sha256=<sha> from build tracker '<tracker>'

In the erroneous case, the messages are:

<server> "GET <uri> 304 0
Downloading <url>
Downloading from URL <url>#sha256=<sha>

The expected behavior is that on 304 cache miss (that can obviously happen for some reason) the package is still downloaded and the cache is updated.

How to Reproduce

Given cachemiss_server.py

from http.server import BaseHTTPRequestHandler, HTTPServer


class CacheMissServer(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.requestline == 'GET /certifi/ HTTP/1.1':
            self.send_response(200, self.responses[200])
            self.send_header("Content-Type", "text/html; charset=UTF-8")
            self.end_headers()
            self.wfile.write(index_resp)
        else:
            self.send_response(304)
            self.end_headers()

index_resp = '''<html><head><title>Simple Index</title><meta name="api-version" value="2" /></head><body>
<a href="../../packages/packages/55/bf/e1094f052b5932a452b12ac5a6c229f3d7dbf41a66cfc8d5832fe3fb6463/certifi-0.0.1.tar.gz#sha256=a6f0d79b2b63a43c71bc128929be3a17bdf5d0a5f74208d7ad9563b3bf8210e9" rel="internal" >certifi-0.0.1.tar.gz</a><br/>
</body></html>
'''


if __name__ == '__main__':
    server = HTTPServer(('', 8888), CacheMissServer)
    server.serve_forever()

Run it with
python2 cachemiss_server.py

Then run pip to see the error:

pip install -i http://localhost:8888 certifi -v

Created temporary directory: /tmp/pip-ephem-wheel-cache-jnh8j65n
Created temporary directory: /tmp/pip-req-tracker-7t1503qx
Created requirements tracker '/tmp/pip-req-tracker-7t1503qx'
Created temporary directory: /tmp/pip-install-z7wubh5n
Looking in indexes: http://localhost:8888
Collecting certifi
  1 location(s) to search for versions of certifi:
  * http://localhost:8888/certifi/
  Getting page http://localhost:8888/certifi/
  Found index url http://localhost:8888
  Starting new HTTP connection (1): localhost:8888
  http://localhost:8888 "GET /certifi/ HTTP/1.1" 200 None
  Analyzing links from page http://localhost:8888/certifi/
    Found link http://localhost:8888/packages/packages/55/bf/e1094f052b5932a452b12ac5a6c229f3d7dbf41a66cfc8d5832fe3fb6463/certifi-0.0.1.tar.gz#sha256=a6f0d79b2b63a43c71bc128929be3a17bdf5d0a5f74208d7ad9563b3bf8210e9 (from http://localhost:8888/certifi/), version: 0.0.1
  Given no hashes to check 1 links for project 'certifi': discarding no candidates
  Using version 0.0.1 (newest of versions: 0.0.1)
  Created temporary directory: /tmp/pip-unpack-c57gf32i
  Found index url http://localhost:8888
  Resetting dropped connection: localhost
  http://localhost:8888 "GET /packages/packages/55/bf/e1094f052b5932a452b12ac5a6c229f3d7dbf41a66cfc8d5832fe3fb6463/certifi-0.0.1.tar.gz HTTP/1.1" 304 0
  Downloading http://localhost:8888/packages/packages/55/bf/e1094f052b5932a452b12ac5a6c229f3d7dbf41a66cfc8d5832fe3fb6463/certifi-0.0.1.tar.gz
  Downloading from URL http://localhost:8888/packages/packages/55/bf/e1094f052b5932a452b12ac5a6c229f3d7dbf41a66cfc8d5832fe3fb6463/certifi-0.0.1.tar.gz#sha256=a6f0d79b2b63a43c71bc128929be3a17bdf5d0a5f74208d7ad9563b3bf8210e9 (from http://localhost:8888/certifi/)

Cleaning up...
Removed build tracker '/tmp/pip-req-tracker-7t1503qx'
ERROR: THESE PACKAGES DO NOT MATCH THE HASHES FROM THE REQUIREMENTS FILE. If you have updated the package versions, please update the hashes. Otherwise, examine the package contents carefully; someone may have tampered with them.
    certifi from http://localhost:8888/packages/packages/55/bf/e1094f052b5932a452b12ac5a6c229f3d7dbf41a66cfc8d5832fe3fb6463/certifi-0.0.1.tar.gz#sha256=a6f0d79b2b63a43c71bc128929be3a17bdf5d0a5f74208d7ad9563b3bf8210e9:
        Expected sha256 a6f0d79b2b63a43c71bc128929be3a17bdf5d0a5f74208d7ad9563b3bf8210e9
             Got        e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855

Exception information:
Traceback (most recent call last):
  File "/home/akovi/openstack/mistral/.tox/venv/lib/python3.6/site-packages/pip/_internal/cli/base_command.py", line 188, in main
    status = self.run(options, args)
  File "/home/akovi/openstack/mistral/.tox/venv/lib/python3.6/site-packages/pip/_internal/commands/install.py", line 345, in run
    resolver.resolve(requirement_set)
  File "/home/akovi/openstack/mistral/.tox/venv/lib/python3.6/site-packages/pip/_internal/legacy_resolve.py", line 203, in resolve
    raise hash_errors
pip._internal.exceptions.HashErrors: THESE PACKAGES DO NOT MATCH THE HASHES FROM THE REQUIREMENTS FILE. If you have updated the package versions, please update the hashes. Otherwise, examine the package contents carefully; someone may have tampered with them.
    certifi from http://localhost:8888/packages/packages/55/bf/e1094f052b5932a452b12ac5a6c229f3d7dbf41a66cfc8d5832fe3fb6463/certifi-0.0.1.tar.gz#sha256=a6f0d79b2b63a43c71bc128929be3a17bdf5d0a5f74208d7ad9563b3bf8210e9:
        Expected sha256 a6f0d79b2b63a43c71bc128929be3a17bdf5d0a5f74208d7ad9563b3bf8210e9
             Got        e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855

@ghost ghost added the S: needs triage Issues/PRs that need to be triaged label Sep 3, 2019
@chrahunt
Copy link
Member

chrahunt commented Sep 3, 2019

If you have a few minutes could you put the reproduction in terms of this helper script? Just for ease of checking when someone gets to this.

@andrask
Copy link
Author

andrask commented Sep 3, 2019

I thought it a little further and came to this conclusion:

If an ETag request is sent, and there is a cache miss, the result will be taken from the 304 response (which is empty)

I'll try to assemble a small example server.

@andrask
Copy link
Author

andrask commented Sep 3, 2019

@chrahunt thanks for the note. I updated the description.

@andrask andrask changed the title Content is read from 304 request Content is read from 304 requesponse Sep 3, 2019
@andrask andrask changed the title Content is read from 304 requesponse Content is read from 304 response Sep 3, 2019
@chrahunt
Copy link
Member

chrahunt commented Sep 3, 2019

It looks like the current repro (and maybe the actual issue) are hitting this condition - if we aren't able to get something from the cache while processing a 304 then it returns the response directly.

Further, the cache that we're using (cachecontrol.caches.file_cache.FileCache) does not guard reads, and removes files if they exist on write. So the following is definitely possible:

  1. pip 1 reads headers from cache and includes them in request
  2. pip 2 reads headers from cache and includes them in request
  3. pip 1 receives 304 response and goes to write cache file here, removing it...
  4. pip 2 receives 304 response and goes to read cache file, for comparison - but it's gone and returns the 304 response to pip
  5. pip 1 creates the cache file and writes the new headers and existing response body

This should be fixed by the proposed solution to remove usage of lockfile in the cache in #4766 in favor of safer filesystem operations.

@chrahunt chrahunt added C: cache Dealing with cache and files in it type: bug A confirmed bug or unintended behavior labels Sep 3, 2019
@ghost ghost removed the S: needs triage Issues/PRs that need to be triaged label Sep 3, 2019
@chrahunt chrahunt added the project: vendored dependency Related to a vendored dependency label Sep 3, 2019
@andrask
Copy link
Author

andrask commented Sep 4, 2019

If it improves the situation, I fully support your changes in #6879. The retry mechanism would also be beneficial.
However, I believe, cache miss is not an impossible situation despite all efforts. So it would be desirable to support falling back to downloading the files without etag. Even if this may cause delays in the completion of the full operation, it's the safest bet. E.g. in our case it would cause 1 or 2 packages to be excessively downloaded in a pipeline where dozens of pip sessions are executed in parallel downloading hundreds of packages. The overhead would be negligible but one error can cause complete failure of the pipeline run.

@chrahunt
Copy link
Member

chrahunt commented Sep 4, 2019

Under the new scheme, a 304 would only lead to error like this if the cache entry was explicitly removed by something external to pip (since pip itself would not do it, and we would only get a 304 if we had a cache entry in the first place), but I take your point. We should be robust to something like that.

One hint for anyone that wants to address this: we could check for the status code from the response here, but I don't think that there's a nice way to incorporate that result into our existing retry mechanism.

@chrahunt
Copy link
Member

@andrask, just merged the lockfile changes. If you are able to you may try with the latest master version and see if there's any improvement.

@chrahunt
Copy link
Member

The cache updates are now part of 19.3, so it should be easier to check.

@andrask
Copy link
Author

andrask commented Oct 28, 2019

@chrahunt Thank you very much! I cannot check if the error is gone. We worked around the issue by not using the http cache at all. Let's hope it got better.
Unfortunately, the example I provided in this ticket still fails as on 304 it is expected that the file is present in the cache.

@chrahunt
Copy link
Member

Thanks for checking! I think that is expected. While the server did help a lot to uncover the underlying issue, it is not well-behaved since normally a 304 should only be returned in response to a request which included cache information. The fix for that would be what was discussed above, making this area in general more robust with respect to response handling.

I think it's still worth checking in your original environment if you're able to - at least it would help identify if the cause of the original issue was the race condition described here.

@andrask
Copy link
Author

andrask commented Oct 29, 2019

I found a job that still uses the 'old mode' with http cache. The last three builds from Oct 25 to Oct 28 have installed the 19.3.x version and the issue seems to not have occurred. I guess it endorses that the new cache solution probably works much better.
I'll close this issue. Great work! Woohoo 🚂

@andrask andrask closed this as completed Oct 29, 2019
@lock lock bot added the auto-locked Outdated issues that have been locked by automation label Nov 28, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Nov 28, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
auto-locked Outdated issues that have been locked by automation C: cache Dealing with cache and files in it project: vendored dependency Related to a vendored dependency type: bug A confirmed bug or unintended behavior
Projects
None yet
Development

No branches or pull requests

2 participants