Open
Description
Hello,
This may be down to my limited understanding of the package, but I cannot find an easy way to perform a glob directly on a URL.
I see that fsspec.get_fs_token_paths gives me the path expansion directly, for instance:
fsspec.get_fs_token_paths('gs://my-bucket/test/*.jsonl')
>>> (<gcsfs.core.GCSFileSystem at 0x7f65ec653af0>,
'90ad04da79e6e943b0f4d3dfba-------------33462f993faa9252',
['my-bucket/test/001.jsonl', 'my-bucket/test/002.jsonl'])
But the returned paths have the protocol stripped, so to get expanded URLs I need to reattach the protocol myself (by parsing the original URL) or detect whether the path is a local file. All of that is already done somewhere in get_fs_token_paths, so I was wondering if there is an elegant way to obtain ['gs://my-bucket/test/001.jsonl', 'gs://my-bucket/test/002.jsonl'] directly, ideally in a way that also handles absolute and local paths.
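For reference, the manual workaround I have in mind looks roughly like the sketch below. The helper names reattach_protocol and expand_urls are hypothetical (they are not part of fsspec), and the sketch assumes a plain, unchained URL of the form protocol://path; only fsspec.get_fs_token_paths is taken from the package itself.

```python
def reattach_protocol(pattern, stripped_paths):
    """Prefix each stripped path with the protocol of the original pattern.

    Hypothetical helper, not an fsspec API. Assumes a simple
    "protocol://path" URL; chained URLs ("a::b://...") are not handled.
    """
    if "://" not in pattern:
        # No protocol in the pattern: treat it as a local path, return as-is.
        return list(stripped_paths)
    protocol = pattern.split("://", 1)[0]
    return [f"{protocol}://{path}" for path in stripped_paths]


def expand_urls(pattern):
    """Glob a URL pattern and return full URLs (hypothetical wrapper)."""
    import fsspec  # imported lazily so reattach_protocol has no dependency

    # get_fs_token_paths expands the glob but strips the protocol.
    _fs, _token, paths = fsspec.get_fs_token_paths(pattern)
    return reattach_protocol(pattern, paths)
```

With the bucket from my example, expand_urls('gs://my-bucket/test/*.jsonl') should then give the gs:// URLs I am after. Recent fsspec versions also seem to have an unstrip_protocol method on filesystem objects that could replace the manual split, but I have not checked how it behaves for every backend.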
lhoestq