This repository demonstrates how to modernize your data infrastructure by combining the power of dlt+, Apache Iceberg, and Lakekeeper.
Check out the YouTube talk for a quick walkthrough of this modernization: 📺 YouTube: Upgrade Your Infrastructure to Iceberg with dlt + Lakekeeper
This repository contains two configuration files:
- `dlt_warehouse.yml` – the original configuration before modernization, representing a traditional Snowflake-based warehouse setup.
- `dlt.yml` – the modernized configuration, using Apache Iceberg as the destination via Lakekeeper.
Refer to each file to see how to transition from a legacy data warehouse pipeline to a modern, open table format with Iceberg.
- Install uv:

  ```sh
  pip install uv
  ```

- Clone this repo and run:

  ```sh
  make dev
  ```

- Put your Lakekeeper token into `dlt_portable_data_lake_demo/.dlt/secrets.toml`:

  ```toml
  [destination.iceberg_lake.credentials]
  credential = "your-token"
  ```
- Download an archive with data:

  ```sh
  make download-gh
  ```

- Add the license:

  ```toml
  [runtime]
  license = "..."
  ```

  💡 The license is needed when using dlt+ features, sources, or destinations like Iceberg in this demo. Don’t have one yet? Join the waiting list to request it.

- Run the pipeline:

  ```sh
  uv run dlt pipeline loading_events run
  ```
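  Once the run completes, you can inspect it from Python. A minimal sketch, assuming you run it from the project root so dlt finds the local pipeline state (`loading_events` is the pipeline name from the command above):

  ```python
  import dlt

  # Attach to the pipeline state created by the CLI run above
  pipeline = dlt.attach(pipeline_name="loading_events")

  # Show which dataset the data landed in and what the last load did
  print(pipeline.dataset_name)
  print(pipeline.last_trace)
  ```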
- You can see the data in the Lakekeeper UI: https://you.hosted.lakekeeper.app/catalog
- (Optional) Run transformations. You need to specify credentials for a Snowflake warehouse or change the warehouse type to DuckDB:
  ```sh
  dlt transformation aggregate_issues run
  ```

  Example Snowflake credentials in `secrets.toml`:
  ```toml
  [destination.snowflake.credentials]
  database = "dlt_data"
  password = "<password>"
  username = "loader"
  host = "your-host"
  warehouse = "COMPUTE_WH"
  role = "DLT_LOADER_ROLE"
  ```

  ➡️ See the dlt Snowflake destination docs for more.
- To access your data, check the `access.ipynb` notebook.
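  If you prefer a script over the notebook, you can also read the tables with pyiceberg against the Lakekeeper REST catalog. A sketch with placeholder values: the warehouse, namespace, and table names below are assumptions, so take the real ones from the Lakekeeper UI and your `dlt.yml`:

  ```python
  from pyiceberg.catalog import load_catalog

  # Connect to the Lakekeeper Iceberg REST catalog; the URL, warehouse
  # name, and token below are placeholders for your own values
  catalog = load_catalog(
      "lakekeeper",
      **{
          "type": "rest",
          "uri": "https://you.hosted.lakekeeper.app/catalog",
          "warehouse": "your-warehouse",
          "token": "your-token",
      },
  )

  # List what dlt loaded, then read one table into pandas
  print(catalog.list_namespaces())
  table = catalog.load_table(("your_dataset", "your_table"))  # hypothetical names
  print(table.scan(limit=10).to_pandas())
  ```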