In download_file we use urllib.request, which raises an OverflowError in url_fd.read() when trying to read the whole file at once.
It can be reproduced with this script:
import urllib.request
x = urllib.request.urlopen('https://developer.download.nvidia.com/compute/cuda/11.0.2/local_installers/cuda_11.0.2_450.51.05_linux.run')
x.read()

Traceback (most recent call last):
  File "./foo.py", line 5, in <module>
    x.read()
  File "/usr/lib/python3.8/http/client.py", line 467, in read
    s = self._safe_read(self.length)
  File "/usr/lib/python3.8/http/client.py", line 608, in _safe_read
    data = self.fp.read(amt)
  File "/usr/lib/python3.8/socket.py", line 669, in readinto
    return self._sock.recv_into(b)
  File "/usr/lib/python3.8/ssl.py", line 1241, in recv_into
    return self.read(nbytes, buffer)
  File "/usr/lib/python3.8/ssl.py", line 1099, in read
    return self._sslobj.read(len, buffer)
OverflowError: signed integer is greater than maximum
As long as urllib has this bug, we need to either read the response in chunks ourselves or drop the naive read() and combine it with the subsequent write to file via shutil.copyfileobj:
import shutil

# x is the response object returned by urllib.request.urlopen() above
with open('/dev/shm/out.raw', 'wb') as fh:
    shutil.copyfileobj(x, fh)

As a bonus, we won't be reading the whole thing into memory, possibly exhausting it.
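For comparison, the other option (reading in chunks ourselves) could look roughly like the sketch below. The helper name copy_in_chunks, the 1 MiB chunk size, and the download_file signature are assumptions made for illustration and are not taken from the existing code; only url_fd mirrors the name used above.

```python
import urllib.request

CHUNK_SIZE = 1024 * 1024  # 1 MiB per read; an arbitrary value chosen for illustration


def copy_in_chunks(src, dst, chunk_size=CHUNK_SIZE):
    """Copy src to dst (both file-like objects) chunk_size bytes at a time.

    Returns the total number of bytes copied.
    """
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # an empty bytes object signals end of stream
            break
        dst.write(chunk)
        total += len(chunk)
    return total


def download_file(url, path):
    """Hypothetical stand-in for our download_file: stream url into path."""
    with urllib.request.urlopen(url) as url_fd, open(path, 'wb') as fh:
        return copy_in_chunks(url_fd, fh)
```

Each read() asks for only a small, bounded amount, so the oversized single read never happens and memory use stays around one chunk. shutil.copyfileobj does essentially the same loop for us, so it is probably the simpler fix unless we want the byte count or progress reporting.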