This example uses Quilt to inject data packages into a Jupyter notebook.
Data packages are versioned, immutable snapshots of data. Data packages may contain data of any size. Here is an example of a data package: uciml/iris.
- Add quilt to requirements.txt.
- Specify data package dependencies in quilt.yml (docs). For example:
    packages:
      - vgauthier/DynamicPopEstimate   # get the latest version
      - danWebster/sgRNAs:a972d92      # get a specific hash (short hash)
      - akarve/sales:tag:latest        # get a specific tag
      - asah/snli:v:1.0                # get a specific version
- Include the following lines at the top of postBuild. (postBuild should be executable: chmod +x postBuild on UNIX, git update-index --chmod=+x postBuild for Windows.)
    #!/bin/bash
    quilt install

  If you are adopting the binder folder pattern for your repo2docker configuration files, and including quilt.yml, your postBuild file should look like this:

    #!/bin/bash
    quilt install @./binder/quilt.yml

Now you can access the package data in your Jupyter notebooks:
    In [1]: from quilt.data.akarve import sales
    In [2]: sales.transactions()
    Out[2]:
       Row ID  Order ID  Order Date  Order Priority  Order Quantity       Sales \
    0       1         3  2010-10-13             Low               6    261.5400
    1      49       293  2012-10-01            High              49  10123.0200
    2      50       293  2012-10-01            High              27    244.5700
    ...
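
The quilt.yml entries shown earlier each take one of four forms: owner/name alone, or owner/name followed by a short hash, a tag, or a version. As a rough illustration of how those forms differ, here is a minimal Python sketch that splits one entry into its parts. It is not Quilt's own parser: PackageSpec and parse_spec are hypothetical names introduced only for this example, and the sketch handles only the four forms that appear in the example above.

```python
from typing import NamedTuple


class PackageSpec(NamedTuple):
    """Hypothetical container for one quilt.yml package entry."""
    owner: str
    name: str
    kind: str   # "latest", "hash", "tag", or "version"
    value: str  # empty string when kind is "latest"


def parse_spec(line: str) -> PackageSpec:
    """Parse one entry such as '- akarve/sales:tag:latest  # comment'.

    Illustrative only: covers the four forms shown in the quilt.yml example.
    """
    # Drop a trailing "# comment" and a leading "- " list marker.
    text = line.split("#", 1)[0].strip()
    if text.startswith("- "):
        text = text[2:].strip()

    # Everything before the first ":" is owner/name; the rest is the qualifier.
    package, _, qualifier = text.partition(":")
    owner, slash, name = package.partition("/")
    if not slash or not owner or not name:
        raise ValueError(f"expected owner/name, got {package!r}")

    if not qualifier:
        return PackageSpec(owner, name, "latest", "")

    prefix, colon, rest = qualifier.partition(":")
    if not colon:
        # A bare qualifier, as in danWebster/sgRNAs:a972d92, is a hash.
        return PackageSpec(owner, name, "hash", qualifier)
    if prefix == "tag":
        return PackageSpec(owner, name, "tag", rest)
    if prefix == "v":
        return PackageSpec(owner, name, "version", rest)
    raise ValueError(f"unrecognized qualifier {qualifier!r}")


if __name__ == "__main__":
    for entry in [
        "- vgauthier/DynamicPopEstimate # get the latest version",
        "- danWebster/sgRNAs:a972d92 # get a specific hash (short hash)",
        "- akarve/sales:tag:latest # get a specific tag",
        "- asah/snli:v:1.0 # get a specific version",
    ]:
        print(parse_spec(entry))
```

A check like this can be handy for validating quilt.yml entries before a Binder build, but quilt install remains the authority on what is actually accepted.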