Commit 9e5c354

Reorganize docs (#413)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
1 parent e24e39c commit 9e5c354

46 files changed (+1303, -5058 lines)

.github/ISSUE_TEMPLATE/bug_report.md

Lines changed: 1 addition & 0 deletions
@@ -46,6 +46,7 @@ Paste the output of `intake_esm.show_versions()` here:
4646

4747
```python
4848
import intake_esm
49+
4950
intake_esm.show_versions()
5051
```
5152

.gitignore

Lines changed: 1 addition & 1 deletion
@@ -106,9 +106,9 @@ venv.bak/
106106
# mypy
107107
.mypy_cache/
108108

109-
110109
# Sphinx
111110
docs/_build
111+
_build/
112112
.vscode/
113113
notes/
114114
docs/source/collections/*

.pre-commit-config.yaml

Lines changed: 8 additions & 8 deletions
@@ -9,6 +9,14 @@ repos:
99
- id: check-yaml
1010
- id: double-quote-string-fixer
1111

12+
# - repo: https://github.com/mwouts/jupytext
13+
# rev: v1.13.3
14+
# hooks:
15+
# - id: jupytext
16+
# args: [--pipe, black, --warn-only]
17+
# additional_dependencies:
18+
# - black==21.12b0 # Matches hook
19+
1220
- repo: https://github.com/psf/black
1321
rev: 21.12b0
1422
hooks:
@@ -33,11 +41,3 @@ repos:
3341
rev: v2.5.1
3442
hooks:
3543
- id: prettier
36-
37-
- repo: https://github.com/nbQA-dev/nbQA
38-
rev: 1.2.2
39-
hooks:
40-
- id: nbqa-pyupgrade
41-
additional_dependencies: [pyupgrade==2.7.3]
42-
- id: nbqa-isort
43-
additional_dependencies: [isort==v5.9.2]

README.md

Lines changed: 1 addition & 1 deletion
@@ -43,7 +43,7 @@ providing necessary functionality for searching, discovering, data access/loadin
4343

4444
In [1]: import intake
4545

46-
In [2]: col_url = "https://storage.googleapis.com/cmip6/pangeo-cmip6.json"
46+
In [2]: col_url = "https://gist.githubusercontent.com/andersy005/7f416e57acd8319b20fc2b88d129d2b8/raw/987b4b336d1a8a4f9abec95c23eed3bd7c63c80e/pangeo-gcp-subset.json"
4747

4848
In [3]: col = intake.open_esm_datastore(col_url)
4949

ci/environment.yml

Lines changed: 1 addition & 1 deletion
@@ -19,7 +19,7 @@ dependencies:
1919
- pydantic
2020
- pytest
2121
- pytest-cov
22-
- pytest-icdiff
22+
# - pytest-icdiff
2323
- pytest-sugar
2424
- pytest-xdist
2525
- python=*=*cp*

docs/environment.yml

Lines changed: 2 additions & 2 deletions
@@ -11,13 +11,13 @@ dependencies:
1111
- myst-nb
1212
- pip
1313
- python-graphviz
14-
- python=3.8
14+
- python=3.9
1515
- s3fs
1616
- sphinx-copybutton
1717
- watermark
1818
- zarr
1919
- pip:
2020
- sphinxext-opengraph
21-
- sphinx-comments
21+
- autodoc_pydantic
2222
- -r ../requirements.txt
2323
- -e ..

docs/source/api.md

Lines changed: 0 additions & 11 deletions
This file was deleted.

docs/source/changelog.md

Lines changed: 0 additions & 3 deletions
This file was deleted.

docs/source/conf.py

Lines changed: 6 additions & 105 deletions
@@ -1,64 +1,39 @@
11
# -*- coding: utf-8 -*-
22

3-
# import inspect
43
import datetime
5-
import os
6-
import sys
74

85
import yaml
96

107
import intake_esm
118

12-
# If extensions (or modules to document with autodoc) are in another directory,
13-
# add these directories to sys.path here. If the directory is relative to the
14-
# documentation root, use os.path.abspath to make it absolute, like shown here.
15-
# sys.path.insert(0, os.path.abspath('.'))
16-
17-
cwd = os.getcwd()
18-
parent = os.path.dirname(cwd)
19-
sys.path.insert(0, parent)
20-
21-
22-
# -- General configuration -----------------------------------------------------
23-
24-
# If your documentation needs a minimal Sphinx version, state it here.
25-
# needs_sphinx = '1.0'
26-
27-
# Add any Sphinx extension module names here, as strings. They can be extensions
28-
# coming with Sphinx (named 'sphinx.ext.*') or your custom ones.
299
extensions = [
3010
'sphinx.ext.autodoc',
3111
'sphinx.ext.viewcode',
3212
'sphinx.ext.autosummary',
3313
'sphinx.ext.doctest',
3414
'sphinx.ext.intersphinx',
3515
'sphinx.ext.extlinks',
36-
# 'sphinx.ext.linkcode',
3716
'sphinx.ext.intersphinx',
38-
'IPython.sphinxext.ipython_console_highlighting',
39-
'IPython.sphinxext.ipython_directive',
4017
'sphinx.ext.napoleon',
4118
'myst_nb',
4219
'sphinxext.opengraph',
4320
'sphinx_copybutton',
44-
'sphinx_comments',
21+
'sphinxcontrib.autodoc_pydantic',
4522
]
4623

4724

4825
# MyST config
4926
myst_enable_extensions = ['amsmath', 'colon_fence', 'deflist', 'html_image']
50-
myst_url_schemes = ('http', 'https', 'mailto')
27+
myst_url_schemes = ['http', 'https', 'mailto']
5128

5229
# sphinx-copybutton configurations
5330
copybutton_prompt_text = r'>>> |\.\.\. |\$ |In \[\d*\]: | {2,5}\.\.\.: | {5,8}: '
5431
copybutton_prompt_is_regexp = True
5532

56-
comments_config = {
57-
'utterances': {'repo': 'intake/intake-esm', 'optional': 'config', 'label': '💬 comment'},
58-
'hypothesis': False,
59-
}
60-
33+
autodoc_pydantic_model_show_json = True
34+
autodoc_pydantic_model_show_config = False
6135

36+
jupyter_execute_notebooks = 'cache'
6237
execution_timeout = 600
6338

6439
extlinks = {
@@ -134,11 +109,6 @@
134109
html_theme_options = {}
135110

136111

137-
# The name of an image file (within the static path) to use as favicon of the
138-
# docs. This file should be a Windows icon file (.ico) being 16x16 or 32x32
139-
# pixels large.
140-
# html_favicon = None
141-
142112
# Add any paths that contain custom static files (such as style sheets) here,
143113
# relative to this directory. They are copied after the builtin static files,
144114
# so a file named "default.css" will overwrite the builtin "default.css".
@@ -147,14 +117,7 @@
147117
# Sometimes the savefig directory doesn't exist and needs to be created
148118
# https://github.com/ipython/ipython/issues/8733
149119
# becomes obsolete when we can pin ipython>=5.2; see ci/requirements/doc.yml
150-
ipython_savefig_dir = os.path.join(
151-
os.path.dirname(os.path.abspath(__file__)), '_build', 'html', '_static'
152-
)
153-
154-
savefig_dir = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'source', '_static')
155120

156-
os.makedirs(ipython_savefig_dir, exist_ok=True)
157-
os.makedirs(savefig_dir, exist_ok=True)
158121

159122
# If not '', a 'Last updated on:' timestamp is inserted at every page bottom,
160123
# using the given strftime format.
@@ -207,65 +170,10 @@
207170
'python': ('https://docs.python.org/3/', None),
208171
'xarray': ('http://xarray.pydata.org/en/stable/', None),
209172
'pandas': ('https://pandas.pydata.org/pandas-docs/stable/', None),
210-
'intake': ('https://intake.readthedocs.io/en/latest/', None),
173+
'intake': ('https://intake.readthedocs.io/en/stable/', None),
211174
}
212175

213176

214-
# based on numpy doc/source/conf.py
215-
216-
217-
# def linkcode_resolve(domain, info):
218-
# """
219-
# Determine the URL corresponding to Python object
220-
# """
221-
# if domain != 'py':
222-
# return None
223-
224-
# modname = info['module']
225-
# fullname = info['fullname']
226-
227-
# submod = sys.modules.get(modname)
228-
# if submod is None:
229-
# return None
230-
231-
# obj = submod
232-
# for part in fullname.split('.'):
233-
# try:
234-
# obj = getattr(obj, part)
235-
# except AttributeError:
236-
# return None
237-
238-
# try:
239-
# fn = inspect.getsourcefile(inspect.unwrap(obj))
240-
# except TypeError:
241-
# fn = None
242-
# if not fn:
243-
# return None
244-
245-
# try:
246-
# source, lineno = inspect.getsourcelines(obj)
247-
# except OSError:
248-
# lineno = None
249-
250-
# if lineno:
251-
# linespec = f'#L{lineno}-L{lineno + len(source) - 1}'
252-
# else:
253-
# linespec = ''
254-
255-
# fn = os.path.relpath(fn, start=os.path.dirname(intake_esm.__file__))
256-
257-
# if '+' in intake_esm.__version__:
258-
# return f'https://github.com/intake/intake-esm/blob/master/intake_esm/{fn}{linespec}'
259-
# else:
260-
# return (
261-
# f'https://github.com/intake/intake-esm/blob/'
262-
# f'v{intake_esm.__version__}/intake_esm/{fn}{linespec}'
263-
# )
264-
265-
266-
# https://www.ericholscher.com/blog/2016/jul/25/integrating-jinja-rst-sphinx/
267-
268-
269177
def rstjinja(app, docname, source):
270178
"""
271179
Render our pages as a jinja template for fancy templating goodness.
@@ -278,15 +186,8 @@ def rstjinja(app, docname, source):
278186
source[0] = rendered
279187

280188

281-
def html_page_context(app, pagename, templatename, context, doctree):
282-
# Disable edit button for docstring generated pages
283-
if 'generated' in pagename:
284-
context['theme_use_edit_page_button'] = False
285-
286-
287189
def setup(app):
288190
app.connect('source-read', rstjinja)
289-
app.connect('html-page-context', html_page_context)
290191

291192

292193
with open('catalogs.yaml') as f:
Lines changed: 123 additions & 0 deletions
@@ -0,0 +1,123 @@
1+
# ESM Collection Specification
2+
3+
```{note}
4+
This document mirrors the [ESM Collection Specification](https://github.com/NCAR/esm-collection-spec/blob/master/collection-spec/collection-spec.md) and is updated as the specification evolves.
5+
```
6+
7+
- [ESM Collection Specification](#esm-collection-specification)
8+
- [Overview](#overview)
9+
- [Collection Specification](#collection-specification)
10+
- [Catalog](#catalog)
11+
- [Assets (Data Files)](#assets-data-files)
12+
- [Catalog fields](#catalog-fields)
13+
- [Attribute Object](#attribute-object)
14+
- [Assets Object](#assets-object)
15+
- [Aggregation Control Object](#aggregation-control-object)
16+
- [Aggregation Object](#aggregation-object)
17+
18+
## Overview
19+
20+
This document explains the structure and content of an ESM Collection.
21+
A collection provides metadata about the catalog, telling us what we expect to find inside and how to open it.
22+
The collection is described in a single JSON file, inspired by the STAC spec.
23+
24+
The ESM Collection specification consists of three parts:
25+
26+
### Collection Specification
27+
28+
The _collection_ specification provides metadata about the catalog, telling us what we expect to find inside and how to open it.
29+
The descriptor is a single json file, inspired by the [STAC spec](https://github.com/radiantearth/stac-spec).
30+
31+
```json
32+
{
33+
"esmcat_version": "0.1.0",
34+
"id": "sample",
35+
"description": "This is a very basic sample ESM collection.",
36+
"catalog_file": "sample_catalog.csv",
37+
"attributes": [
38+
{
39+
"column_name": "activity_id",
40+
"vocabulary": "https://raw.githubusercontent.com/WCRP-CMIP/CMIP6_CVs/master/CMIP6_activity_id.json"
41+
},
42+
{
43+
"column_name": "source_id",
44+
"vocabulary": "https://raw.githubusercontent.com/WCRP-CMIP/CMIP6_CVs/master/CMIP6_source_id.json"
45+
}
46+
],
47+
"assets": {
48+
"column_name": "path",
49+
"format": "zarr"
50+
}
51+
}
52+
```
53+
54+
### Catalog
55+
56+
The collection points to a single catalog.
57+
A catalog is a CSV file.
58+
The meaning of the columns in the csv file is defined by the parent collection.
59+
60+
```
61+
activity_id,source_id,path
62+
CMIP,ACCESS-CM2,gs://pangeo-data/store1.zarr
63+
CMIP,GISS-E2-1-G,gs://pangeo-data/store1.zarr
64+
```
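
The relationship between the collection descriptor and its catalog can be sketched with the Python standard library alone. This is a minimal illustration, not how intake-esm itself loads a collection: the `read_catalog` helper is hypothetical, and the descriptor and CSV are held in memory instead of being read from `sample_catalog.csv`. It checks that every column the collection declares is present in the CSV header, then uses the assets object to find the path column.

```python
import csv
import io
import json

# In-memory stand-ins for the descriptor and the CSV it points to (assumption:
# a real collection would read these from files or URLs).
COLLECTION_JSON = """
{
  "esmcat_version": "0.1.0",
  "id": "sample",
  "description": "This is a very basic sample ESM collection.",
  "catalog_file": "sample_catalog.csv",
  "attributes": [{"column_name": "activity_id"}, {"column_name": "source_id"}],
  "assets": {"column_name": "path", "format": "zarr"}
}
"""

CATALOG_CSV = """activity_id,source_id,path
CMIP,ACCESS-CM2,gs://pangeo-data/store1.zarr
CMIP,GISS-E2-1-G,gs://pangeo-data/store1.zarr
"""


def read_catalog(collection_text, catalog_text):
    """Hypothetical helper: parse both parts and check the declared columns exist."""
    collection = json.loads(collection_text)
    reader = csv.DictReader(io.StringIO(catalog_text))
    rows = list(reader)
    # Columns the collection says the CSV must contain.
    declared = [attr["column_name"] for attr in collection["attributes"]]
    declared.append(collection["assets"]["column_name"])
    missing = [name for name in declared if name not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"catalog is missing declared columns: {missing}")
    return collection, rows


collection, rows = read_catalog(COLLECTION_JSON, CATALOG_CSV)
asset_column = collection["assets"]["column_name"]
for row in rows:
    print(row["source_id"], "->", row[asset_column])
```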
65+
66+
### Assets (Data Files)
67+
68+
The data assets can be either netCDF or Zarr.
69+
They should be either [URIs](https://en.wikipedia.org/wiki/Uniform_Resource_Identifier) or full filesystem paths.
70+
71+
## Catalog fields
72+
73+
| Element | Type | Description |
74+
| ------------------- | --------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
75+
| esmcat_version | string | **REQUIRED.** The ESM Catalog version the collection implements. |
76+
| id | string | **REQUIRED.** Identifier for the collection. |
77+
| title | string | A short descriptive one-line title for the collection. |
78+
| description | string | **REQUIRED.** Detailed multi-line description to fully explain the collection. [CommonMark 0.28](http://commonmark.org/) syntax MAY be used for rich text representation. |
79+
| catalog_file        | string                                                    | **REQUIRED.** Path to the CSV file with the catalog contents.                                                                                                             |
80+
| catalog_dict | array | If specified, it is mutually exclusive with `catalog_file`. An array of dictionaries that represents the data that would otherwise be in the csv. |
81+
| attributes | [[Attribute Object](#attribute-object)] | **REQUIRED.** A list of attribute columns in the data set. |
82+
| assets | [Assets Object](#assets-object) | **REQUIRED.** Description of how the assets (data files) are referenced in the CSV catalog file. |
83+
| aggregation_control | [Aggregation Control Object](#aggregation-control-object) | **OPTIONAL.** Description of how to support aggregation of multiple assets into a single xarray data set. |
84+
85+
### Attribute Object
86+
87+
An attribute object describes a column in the catalog CSV file.
88+
The column names can optionally be associated with a controlled vocabulary, such as the [CMIP6 CVs](https://github.com/WCRP-CMIP/CMIP6_CVs), which explain how to interpret the attribute values.
89+
90+
| Element | Type | Description |
91+
| ----------- | ------ | -------------------------------------------------------------------------------------- |
92+
| column_name | string | **REQUIRED.** The name of the attribute column. Must be in the header of the CSV file. |
93+
| vocabulary | string | Link to the controlled vocabulary for the attribute in the format of a URL. |
94+
95+
### Assets Object
96+
97+
An assets object describes the columns in the CSV file relevant for opening the actual data files.
98+
99+
| Element | Type | Description |
100+
| ------------------ | ------ | ---------------------------------------------------------------------------------------------------------------------------------- |
101+
| column_name | string | **REQUIRED.** The name of the column containing the path to the asset. Must be in the header of the CSV file. |
102+
| format | string | The data format. Valid values are `netcdf` and `zarr`. If specified, it means that all data in the catalog is the same type. |
103+
| format_column_name | string | The column name which contains the data format, allowing for variable data types in one catalog. Mutually exclusive with `format`. |
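
One way to read the `format` and `format_column_name` rules is sketched below. The `asset_format` function, the `data_format` column name, and the sample paths are hypothetical, and raising an error when neither field is present is an assumption, since the table does not mark either field as required.

```python
VALID_FORMATS = {"netcdf", "zarr"}


def asset_format(assets, row):
    """Hypothetical helper: return the data format for one catalog row."""
    has_fixed = "format" in assets
    has_column = "format_column_name" in assets
    if has_fixed and has_column:
        raise ValueError("`format` and `format_column_name` are mutually exclusive")
    if has_fixed:
        # One format applies to every asset in the catalog.
        fmt = assets["format"]
    elif has_column:
        # The format is looked up per row, so one catalog can mix data types.
        fmt = row[assets["format_column_name"]]
    else:
        # Assumption: treat a missing format declaration as an error.
        raise ValueError("assets object declares neither `format` nor `format_column_name`")
    if fmt not in VALID_FORMATS:
        raise ValueError(f"unsupported data format: {fmt!r}")
    return fmt


uniform = {"column_name": "path", "format": "zarr"}
mixed = {"column_name": "path", "format_column_name": "data_format"}
row = {"path": "/data/example/tas.nc", "data_format": "netcdf"}

print(asset_format(uniform, row))
print(asset_format(mixed, row))
```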
104+
105+
### Aggregation Control Object
106+
107+
An aggregation control object defines the information needed to aggregate multiple assets into a single xarray data set.
108+
109+
| Element | Type | Description |
110+
| -------------------- | ------------------------------------------- | --------------------------------------------------------------------------------------- |
111+
| variable_column_name | string | **REQUIRED.** Name of the attribute column in csv file that contains the variable name. |
112+
| groupby_attrs        | array                                       | Column names (attributes) that define data sets that can be aggregated.                 |
113+
| aggregations         | [[Aggregation Object](#aggregation-object)] | **OPTIONAL.** List of aggregations to apply to query results.                           |
114+
115+
### Aggregation Object
116+
117+
An aggregation object describes types of operations done during the aggregation of multiple assets into a single xarray data set.
118+
119+
| Element | Type | Description |
120+
| -------------- | ------ | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
121+
| type | string | **REQUIRED.** Type of aggregation operation to apply. Valid values include: `join_new`, `join_existing`, `union` |
122+
| attribute_name | string | Name of attribute (column) across which to aggregate. |
123+
| options        | object | **OPTIONAL.** Aggregation settings that are passed as keyword arguments to [`xarray.concat()`](https://xarray.pydata.org/en/stable/generated/xarray.concat.html) or [`xarray.merge()`](https://xarray.pydata.org/en/stable/generated/xarray.merge.html#xarray.merge). For `join_existing`, it must contain the name of the existing dimension to use (for example, `{'dim': 'time'}`). |
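
To make the aggregation fields concrete, the sketch below writes an aggregation control object as a Python dict and groups catalog rows by `groupby_attrs`. It only illustrates the grouping step: the `group_assets` helper, the `variable_id` and `time_range` columns, and the bucket paths are hypothetical, and the actual combination of each group (which the table says goes through `xarray.concat()` or `xarray.merge()`) is left out.

```python
from collections import defaultdict

# Hypothetical aggregation control object using the fields from the tables above.
aggregation_control = {
    "variable_column_name": "variable_id",
    "groupby_attrs": ["activity_id", "source_id"],
    "aggregations": [
        {"type": "union", "attribute_name": "variable_id"},
        {
            "type": "join_existing",
            "attribute_name": "time_range",
            "options": {"dim": "time"},
        },
    ],
}

# Hypothetical catalog rows; a real catalog would come from the CSV file.
rows = [
    {"activity_id": "CMIP", "source_id": "ACCESS-CM2", "variable_id": "tas",
     "path": "gs://example-bucket/access-cm2/tas.zarr"},
    {"activity_id": "CMIP", "source_id": "ACCESS-CM2", "variable_id": "pr",
     "path": "gs://example-bucket/access-cm2/pr.zarr"},
    {"activity_id": "CMIP", "source_id": "GISS-E2-1-G", "variable_id": "tas",
     "path": "gs://example-bucket/giss-e2-1-g/tas.zarr"},
]


def group_assets(rows, groupby_attrs, asset_column="path"):
    """Collect asset paths that share the same values for every groupby attribute."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[attr] for attr in groupby_attrs)
        groups[key].append(row[asset_column])
    return dict(groups)


groups = group_assets(rows, aggregation_control["groupby_attrs"])
for key, paths in groups.items():
    print(key, paths)
```

Each resulting group holds the assets that would be combined into one xarray data set; rows with different `groupby_attrs` values stay in separate groups.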
