Potential Malfunctioning of GDriveReader for torchtext datasets (or files with larger sizes in general)

### 🐛 Describe the bug

Currently, none of the torchtext datasets hosted at a GDrive URL can be downloaded. The reason is that the confirm token is always `None`.

The logic to download large datasets is borrowed from the tensor2tensor library [here](https://github.com/tensorflow/tensor2tensor/blob/a8e50c0364071ca596bc2e4a617e1f4174b2941b/tensor2tensor/data_generators/generator_utils.py#L286). It is also proposed in various answers [here](https://stackoverflow.com/questions/38511444/python-download-files-from-google-drive-using-url).

The confirm token comes from `response.cookies.items()`. Unfortunately, I notice that for all the URLs we have in torchtext, this returns an empty list.

```python
import requests

session = requests.Session()
# URL of the DBpedia dataset
URL = "https://drive.google.com/uc?export=download&id=0Bz8a_Dbh9QhbQ2Vic1kxMmZZQ1k"
response = session.get(URL, stream=True)
print(response.cookies.items())  # returns an empty list
```
Eventually this leads to the following error, since the response above does not contain a `content-disposition` header:
`Internal error: headers don't contain content-disposition.`
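
For context, the cookie-based lookup described above can be sketched roughly as follows. This is a minimal illustration, not the actual GDriveReader code: the helper name `find_confirm_token` is hypothetical, and the `download_warning` cookie prefix is assumed from the linked tensor2tensor and Stack Overflow approaches.

```python
from typing import Iterable, Optional, Tuple


def find_confirm_token(cookie_items: Iterable[Tuple[str, str]]) -> Optional[str]:
    """Return the value of the first cookie whose name starts with 'download_warning'.

    Hypothetical helper; the cookie prefix is an assumption taken from the
    linked approaches, not something confirmed against the current GDrive behavior.
    """
    for name, value in cookie_items:
        if name.startswith("download_warning"):
            return value
    # No matching cookie: the caller has no token to send back.
    return None


# A cookie of the shape the older flow relied on yields a token.
print(find_confirm_token([("NID", "abc"), ("download_warning_123", "t0k3n")]))  # t0k3n
# An empty cookie list, as observed in the snippet above, yields None.
print(find_confirm_token([]))  # None
```

With `response.cookies.items()` empty, a lookup of this shape can only return `None`, which matches the behavior reported here.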

As an aside (not relevant to solving the current problem): in torchtext we previously reported this error as `Internal error: confirm_token was not found in Google drive link.`

I am not sure whether something has changed recently regarding the confirm token. Do we have any potential workarounds or fixes for this problem?

cc: @NivekT , @Nayef211 



### Versions

Latest from main
