- Show the different ways of reading from a single REST API with multiple endpoints using dltHub
- Materialize the extraction of these endpoints as Dagster multi-assets
- Utilize the new `dg` CLI to scaffold a project
- Utilize the new `duckdb` CLI to interact with data locally
- Install uv (>=0.6.7)
- Install the experimental preview of `dg` (==0.26.8)
- Install DuckDB (>=1.2.1)
- Install yarn (==1.22.22) for running Dagster's local documentation site
Note:
- To launch the DuckDB Local UI, run `duckdb -ui` in your terminal
- `dg` is in beta; therefore, for the best results, install the same version I pinned above and utilize the pinned dependencies in the `pyproject.toml`
- Clone the repo locally
- Run `uv sync`
- Run `source .venv/bin/activate`
dg docs serve
- Serve a local Dagster dg documentation site
dg list defs
- List asset definitions
dg check defs
- Check for validity of definitions
dg launch --assets <asset_key>
- Materialize an asset from the CLI
dg scaffold asset assets/dbt/dbt_assets.py
- Scaffold an example dbt asset definition within `defs/assets/dbt`
dg dev
- Run the Dagster webserver/daemon to interact with, view, and launch your Assets in the UI
- Drop-in replacement for `dagster dev`
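
As a rough illustration of how the commands above can be chained, the sketch below runs `dg check defs` first and only calls `dg launch --assets <asset_key>` if the check succeeds. The helper names (`dg_command`, `check_then_launch`) and the injectable `run` callable are hypothetical and exist only for this example; they are not part of `dg`.

```python
import subprocess
from typing import Callable, Sequence


def dg_command(*args: str) -> list[str]:
    """Build the argv for a dg invocation, e.g. dg_command("check", "defs")."""
    return ["dg", *args]


def _run(cmd: Sequence[str]) -> int:
    """Run a command as a subprocess and return its exit code."""
    return subprocess.run(list(cmd)).returncode


def check_then_launch(
    asset_key: str,
    run: Callable[[Sequence[str]], int] = _run,  # hypothetical hook, handy for dry runs
) -> bool:
    """Launch asset_key only if `dg check defs` exits with code 0."""
    if run(dg_command("check", "defs")) != 0:
        return False
    return run(dg_command("launch", "--assets", asset_key)) == 0
```

Calling `check_then_launch("<asset_key>")` from the activated virtual environment would run both commands in order. Passing a custom `run` callable makes it possible to inspect the commands without executing them.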
- Easy to scaffold and organize your Dagster project
- Python venv management with `dg` is integrated with `uv` out of the box
- Automatic definitions discovery
  - As soon as you create an asset definition, it will be recognized without manual import into a top-level `dg.Definitions` object
- CLI-first development that makes developing more streamlined and fun!
  - `list`, `check`, `launch`, etc.
- Component framework and YML integration for building low/medium code, declarative pipelines
  - Lowers the technical bar for contributors to a Dagster project
`defs/assets/dlt/` contains three example Python files:
- `1_poke_rest_api.py`
- `2_poke_rest_api.py`
- `3_poke_rest_api.py`

`defs/assets/dlt/README.md` contains documentation that outlines three integration patterns for materializing multi-assets from multiple endpoints of a single REST API source
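
To give a feel for what "multiple endpoints of a single REST API source" can look like in dlt, here is a minimal sketch of a declarative config for dlt's `rest_api_source`. The base URL, the endpoint names (`pokemon`, `berry`, `location`), and the helper functions are assumptions made for illustration; the actual example files in the repo may be organized differently.

```python
# Hypothetical config for dlt's REST API source.
# The base URL and endpoint names are assumptions (the public PokeAPI),
# not copied from the repo's example files.
POKE_API_CONFIG = {
    "client": {"base_url": "https://pokeapi.co/api/v2/"},
    "resource_defaults": {"endpoint": {"params": {"limit": 1000}}},
    "resources": ["pokemon", "berry", "location"],
}


def endpoint_names(config: dict) -> list[str]:
    """Return the resource (endpoint) names declared in the config."""
    names = []
    for resource in config["resources"]:
        # A resource may be a bare string or a dict with a "name" key.
        names.append(resource if isinstance(resource, str) else resource["name"])
    return names


def build_source():
    """Create the dlt source from the config; requires `dlt` to be installed."""
    from dlt.sources.rest_api import rest_api_source

    return rest_api_source(POKE_API_CONFIG)
```

Each entry under `resources` becomes one dlt resource, which is the natural unit to surface as one asset in a Dagster multi-asset.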
NOTE:
- Each pipeline creates a corresponding `rest_api_pokemon_<n>.duckdb` file.
- You can optionally write all the assets to a single `rest_api_pokemon.duckdb` file, but there are limitations to DuckDB's concurrency.
- Race conditions can lead to errors when Dagster/dlt try to materialize multiple assets to the same file at the same time.
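
The one-file-per-pipeline layout described in the note can be sketched as follows. The helper names and the `dataset_name` value are hypothetical; the sketch only assumes that each pipeline is pointed at its own DuckDB file through `dlt.destinations.duckdb`.

```python
from pathlib import Path


def duckdb_file(n: int, base_dir: str = ".") -> Path:
    """Path of the database file used by pipeline number n."""
    return Path(base_dir) / f"rest_api_pokemon_{n}.duckdb"


def build_pipeline(n: int):
    """Create a dlt pipeline that writes to its own DuckDB file.

    Requires dlt with the duckdb extra installed.
    """
    import dlt

    return dlt.pipeline(
        pipeline_name=f"rest_api_pokemon_{n}",
        destination=dlt.destinations.duckdb(str(duckdb_file(n))),
        dataset_name="pokemon_data",  # hypothetical dataset (schema) name
    )
```

Because every pipeline targets a different file, no two materializations compete for a write lock on the same database.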
To interact with these files as individual databases:
- Run `duckdb --ui`
- In the UI, hit the "+" icon next to "Attached databases"
- Add the path to the `.duckdb` file relative to your current working directory (i.e., `rest_api_pokemon_<n>.duckdb`)
- Run your queries!
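
The same attachment can also be done without the UI. The sketch below builds a DuckDB `ATTACH` statement for one of the files and, if the `duckdb` Python package is available, lists the tables it contains. The alias `pokemon_<n>` and both helper names are hypothetical.

```python
def attach_statement(n: int, read_only: bool = True) -> str:
    """Build a DuckDB ATTACH statement for rest_api_pokemon_<n>.duckdb."""
    options = " (READ_ONLY)" if read_only else ""
    return f"ATTACH 'rest_api_pokemon_{n}.duckdb' AS pokemon_{n}{options};"


def list_tables(n: int) -> list:
    """Attach the file and list tables across all attached databases.

    Requires the `duckdb` package and the database file in the
    current working directory.
    """
    import duckdb

    con = duckdb.connect()  # in-memory session
    con.execute(attach_statement(n))
    return con.execute("SHOW ALL TABLES").fetchall()
```

Attaching read-only is a cautious default here, since it avoids taking a write lock on a file that Dagster/dlt may still be writing to.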