
downloading of large files fails with urllib.request with recent Python 3.x #3455

@zao

Description


In download_file we use urllib.request, and url_fd.read() raises an OverflowError when trying to read the whole file in a single call.

It can be reproduced with this script:

import urllib.request

x = urllib.request.urlopen('https://developer.download.nvidia.com/compute/cuda/11.0.2/local_installers/cuda_11.0.2_450.51.05_linux.run')
x.read()
Traceback (most recent call last):
  File "./foo.py", line 5, in <module>
    x.read()
  File "/usr/lib/python3.8/http/client.py", line 467, in read
    s = self._safe_read(self.length)
  File "/usr/lib/python3.8/http/client.py", line 608, in _safe_read
    data = self.fp.read(amt)
  File "/usr/lib/python3.8/socket.py", line 669, in readinto
    return self._sock.recv_into(b)
  File "/usr/lib/python3.8/ssl.py", line 1241, in recv_into
    return self.read(nbytes, buffer)
  File "/usr/lib/python3.8/ssl.py", line 1099, in read
    return self._sslobj.read(len, buffer)
OverflowError: signed integer is greater than maximum

Until this urllib bug is fixed, we need to either read the response in chunks ourselves, or skip the naive read and stream it straight into the output file via shutil.copyfileobj:

import shutil

with open('/dev/shm/out.raw', 'wb') as fh:
    shutil.copyfileobj(x, fh)

As a bonus, we avoid buffering the whole file in memory, which could otherwise exhaust it.
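For completeness, the other option mentioned above (reading in chunks ourselves) could look roughly like this. This is a minimal sketch, not the project's actual download_file implementation; the chunk size and the download_in_chunks name are illustrative, and the demo uses io.BytesIO as a stand-in for the HTTP response object:

```python
import io

# 16 MiB per read() call; each underlying recv stays far below the
# signed-int limit that the one-shot read() trips over in ssl.py.
CHUNK_SIZE = 16 * 1024 * 1024

def download_in_chunks(url_fd, out_path):
    """Copy a urlopen()-style file object to out_path in fixed-size chunks."""
    with open(out_path, 'wb') as fh:
        while True:
            chunk = url_fd.read(CHUNK_SIZE)
            if not chunk:  # b'' signals end of stream
                break
            fh.write(chunk)

# Demo with an in-memory stand-in for the HTTP response:
src = io.BytesIO(b'x' * (CHUNK_SIZE + 123))
download_in_chunks(src, '/tmp/out.raw')
```

In practice shutil.copyfileobj does exactly this loop internally (with a smaller default chunk size), which is why it sidesteps the OverflowError.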
