Add the dataset ID to the download modals for easier web <-> API transitions #7423

@sidneymbell

Description

In the download modals for Datasets and Collections, please include the dataset_id and a code snippet for downloading this dataset via the Census API.

Context

Use case: today I wanted to pre-filter the Tabula Sapiens dataset based on metadata found in .obs before downloading the count matrix. This is useful because I'm working on my local laptop and the count data is large-ish, whereas I only need a small fraction of it.

In theory, this should be easy because Census provides a very nice cellxgene_census.get_obs function, which can be run something like this: cellxgene_census.get_obs(census, "homo_sapiens", value_filter="dataset_id == 'foo'").
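For reference, here is a minimal sketch of that pre-filtering step. It assumes a recent `cellxgene_census` release, where `get_obs` takes an open Census handle, an organism name, and a `value_filter` string. The helper names `dataset_filter` and `fetch_obs` are made up for illustration, and `foo` stands in for a real dataset ID.

```python
def dataset_filter(dataset_id: str) -> str:
    """Build a value filter selecting one dataset.

    The ID is a string, so it has to be quoted inside the filter expression.
    """
    if "'" in dataset_id:
        raise ValueError("dataset_id must not contain single quotes")
    return f"dataset_id == '{dataset_id}'"


def fetch_obs(dataset_id: str, organism: str = "homo_sapiens", columns=None):
    """Fetch only the .obs metadata for one dataset (no count matrix)."""
    # Imported lazily so dataset_filter() works without the package installed.
    import cellxgene_census

    with cellxgene_census.open_soma() as census:
        return cellxgene_census.get_obs(
            census,
            organism,
            value_filter=dataset_filter(dataset_id),
            column_names=columns,  # None returns all obs columns
        )
```

Calling `fetch_obs("foo", columns=["cell_type", "tissue"])` would return a DataFrame of cell metadata that can be inspected before deciding which slice of the counts to pull.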

However, this dataset ID is impossible to find unless you query all dataset_id values in the Census and filter based on the collection_name. (H/T to @ebezzi for helping me figure out this workaround!)
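The workaround looks roughly like the sketch below. It assumes the Census exposes its dataset table at `census["census_info"]["datasets"]` with `collection_name`, `dataset_title`, and `dataset_id` columns, which matches the releases I have used. The function names are hypothetical.

```python
def load_dataset_records():
    """Read the Census dataset table and return it as a list of dicts."""
    # Imported lazily; this call needs network access to the Census.
    import cellxgene_census

    with cellxgene_census.open_soma() as census:
        datasets = census["census_info"]["datasets"].read().concat().to_pandas()
    return datasets.to_dict("records")


def find_dataset_ids(records, collection_query: str):
    """Return (dataset_id, dataset_title) pairs whose collection name
    contains collection_query, case-insensitively."""
    query = collection_query.lower()
    return [
        (row["dataset_id"], row["dataset_title"])
        for row in records
        if query in row["collection_name"].lower()
    ]
```

With this, `find_dataset_ids(load_dataset_records(), "tabula sapiens")` lists the candidate IDs, but it still means pulling the whole dataset table just to look up one value that the website already knows.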

Impact

I usually browse datasets online, and then download via notebook so I can be more precise in which slices of the data I actually need. Making this more seamless would save me a lot of headache trying to track down the data I want once I'm ready to download.

Alternatives you've considered

I really don't think we surface this dataset_id anywhere visible online. I even checked the dataset info box in Explorer. Maybe I'm just missing something? :)

Ideal behavior

In the modal, replace:
old:

Individual datasets and their versions may also be downloaded programmatically using the Discover API.

new:

To download this dataset via the Census API, use this Python snippet:
cellxgene_census.get_anndata(census, "homo_sapiens", obs_value_filter="dataset_id == 'foo'")
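To make the proposal concrete, here is a rough sketch of how the modal text could be generated from a dataset ID. It is only an illustration: `render_download_snippet` and `SNIPPET_TEMPLATE` are hypothetical names, the organism would need to come from the dataset's own metadata, and the generated code assumes the `open_soma` / `get_anndata` calls from recent `cellxgene_census` releases.

```python
# Template for the code shown to the user; placeholders are filled per dataset.
SNIPPET_TEMPLATE = '''import cellxgene_census

with cellxgene_census.open_soma() as census:
    adata = cellxgene_census.get_anndata(
        census,
        organism="{organism}",
        obs_value_filter="dataset_id == '{dataset_id}'",
    )
'''


def render_download_snippet(dataset_id: str, organism: str = "homo_sapiens") -> str:
    """Return a copy-pasteable Python snippet for one dataset."""
    if "'" in dataset_id or '"' in dataset_id:
        raise ValueError("dataset_id must not contain quote characters")
    return SNIPPET_TEMPLATE.format(organism=organism, dataset_id=dataset_id)


if __name__ == "__main__":
    # "foo" is a placeholder, not a real dataset ID.
    print(render_download_snippet("foo"))
```

Showing the full snippet, including the `open_soma` context manager, means a user can paste it into a notebook as-is and then tighten `obs_value_filter` to the slice they actually need.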

