Releases: dlt-hub/dlt
1.20.0
Core Library
- feat: implement ConfigurationFileSelector by @ivasio in #3418
- Fix: reset config in PluggableRunContext.reload_providers by @ivasio in #3409
- add runtime CLI configs in WorkspaceRuntimeConfiguration by @ivasio in #3424
- implements run artifacts sync to a bucket using `filesystem` by @ivasio in #3339
- Fix: extensive .gitignore for dlt init by @anuunchin in #3437
- Fix: Invisible sections are receiving border and background color in dashboard by @anuunchin in #3439
- implements cancellation of normalize jobs by @rudolfix in #3444
- information on pending and partially loaded packages when pipeline fails by @rudolfix in #3444
- Fix race condition in LimitItem by @burnash in #3442
- Add offset/limit body_path fields to OffsetPaginatorConfig by @kinghuang in #3260
- [fix/3358] add pagination stopping to `JSONResponseCursorPaginator` by @segetsy in #3374
- (feat) small dashboard improvements by @rudolfix in #3450
Chores
Docs
New Contributors
- @kinghuang made their first contribution in #3260
- @segetsy made their first contribution in #3374
Full Changelog: 1.19.1...1.20.0
1.19.1
1.19.0
Core Library
- Feat: support return_type = arrow_stream for connectorx backend by @ivasio in #3218
- Feat: visual pipeline run section in dashboard by @anuunchin in #3250
- ingests parquet into mssql, mysql and sqlite via ADBC by @rudolfix in #3333
- fix/3165 Athena LakeFormation permissions are required even tho Lakeformation is not used by @alkaline-0 in #3271
- feat: `Schema.to_mermaid()` by @zilto in #3364
- fix/3190 Fixed the persistence issue of boundary timestamp after removing it #3367 by @alkaline-0 in #3378
- feat: `snowflake` clustering key modifications by @jorritsandbrink in #3365
- fixes athena refresh mode (iceberg data incorrectly dropped) by @rudolfix in #3313
- override local marimo theme for dashboard, fix to 'light' by @djudjuu in #3337
- (fix) 3346 fix trace loading: ignore trace if cannot be unpickled by @rudolfix in #3354
- Reinitialize packages after exit() is called by @JayJai04 in #3300
- feat: updated scaffolding template by @zilto in #3275
- fix: dashboard no longer crashes on broken home cell by @djudjuu in #3348
- (fix) use sparse checkout for dlt init dlthub by @rudolfix in #3356
- fix: minor typos and redundant variable by @tahamuzammil100 in #3314
- feat/3198 profile selection in Dashboard if enabled in workspace by @alkaline-0 in #3295
- fix: make `with_table_name` and other functions available through `dlt.pip… by @hello-world-bfree in #3318
- Redshift feature: Include STS session token in COPY CREDENTIALS. by @timH6502 in #3307
- Fix: The child table column remains in the schema as a partial column with seen-null-first=True by @anuunchin in #3131
- Fix: Uncalled source loses resource-level hints in pipeline.run() by @anuunchin in #3369
- (fix) does not overwrite local file context in destination factory by @rudolfix in #3398
- (fix) 3351 fixes default type var to allow running with old typing_extensions (ie. old Databricks clusters) by @rudolfix in #3373
- Fix: pipeline_drop.init() got an unexpected keyword argument 'no_pwd' by @anuunchin in #3386
- sets ducklake fingerprint to storage fingerprint by @rudolfix in #3388
- Fix: Section background colors and top margins in dashboard by @anuunchin in #3393
Docs
- Update incorrect LLM-native workflow link (404 error) by @zjacom in #3294
- Marimo Docs Page: Added Quotations to pip install ibis Command by @anair123 in #3304
- add init_replication description and required permissions by @molkazhani2001 in #3020
- fix notebook formatting by @ivasio in #3305
- updated the sql databases configuration docs by @dat-a-man in #3107
- Docs: fix footer in dark mode, add scaffoldings link by @martinibach in #3309
- Add example to SQL docs: updated docs on how to filter rows using `query_adapter_callback` by @dat-a-man in #3253
- Update deploy-with-dagster.md by @AstrakhantsevaAA in #3287
- Fix DocSearch v4 styles by @burnash in #3338
- Docs/release highlight 1.16-17 by @AstrakhantsevaAA in #3213
- Update weaviate destination docs and version by @VioletM in #3352
- (chore) fixes sqlglot from find by @rudolfix in #3357
- (docs) adds community destinations by @rudolfix in #3326
- Docs: Lifecycle of a dlt transformation (a sql query model of a transformation relation) by @anuunchin in #3329
- docs/snowflake native app architecture docs by @kaliole in #3359
- fix: Kestra links and docker image by @wrussell1999 in #3331
- docs: `data_quality` concept page by @zilto in #3341
Chores
- Enable CI run for the runtime branch by @sh-rp in #3317
- Chore: Update docs npm dependencies and clean up docs build tooling by @sh-rp in #3247
- fix flaky dashboard tests by @rudolfix in #3370
- chore: add proper optional typehint to `dlt/extract/hints.py` module by @luqmansen in #3332
New Contributors
- @zjacom made their first contribution in #3294
- @JayJai04 made their first contribution in #3300
- @anair123 made their first contribution in #3304
- @martinibach made their first contribution in #3309
- @hello-world-bfree made their first contribution in #3318
- @tahamuzammil100 made their first contribution in #3314
- @timH6502 made their first contribution in #3307
- @wrussell1999 made their first contribution in #3331
- @luqmansen made their first contribution in #3332
Full Changelog: 1.18.2...1.19.0
1.18.2
1.18.1
1.18.0
A few cool feats in this release that also need your attention if you use them:
- `databricks` destination now supports `iceberg` table format, `cluster` and `partition` hints. Note that previously they were ignored and that mixing `cluster` and `partition` is not allowed. In super rare cases you can get error messages that were not there previously.
- Signal handling went through a full overhaul (https://dlthub.com/docs/devel/running-in-production/running#allow-a-graceful-shutdown). This allows `dlt` to shut down load pools gracefully (previously we were raising exceptions). With a console attached, a double CTRL-C will raise immediately. We also support graceful shutdowns for pipelines that run in thread pools. This is a behavioral change compared to `1.17.0`.
- We now support named destinations with shorthands `dlt.pipeline(destination="warehouse")` and via `dlt.destination`, which now serves both as a decorator and a factory. Overall super helpful when you go from dev to production, but in rare cases you may get linter and runtime errors on custom destinations (`@dlt.destination`) that were not using kwargs to pass options to the decorator.
- `Relation` got a `to_ibis` method that works both on tables and queries and uses dlt as a backend for ibis so you can read data from them.
- Chores: we have new internal `_workspace` modules that hold all code for cli, dashboards, mcp (and other things that you typically do not use at runtime). There were changes in internal private interfaces. As a user you do not need to worry about it.
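The graceful-shutdown behavior described above (first signal asks running work to finish, a second one raises immediately) can be sketched with the standard library alone. This is an illustration of the general pattern, not `dlt`'s implementation; the names `GracefulShutdown` and `run_jobs` are hypothetical.

```python
import signal


class GracefulShutdown:
    """First signal requests a graceful stop; a second signal raises immediately."""

    def __init__(self) -> None:
        self.stop_requested = False

    def handle(self, signum, frame) -> None:
        if self.stop_requested:
            # Second signal: stop waiting and interrupt right away.
            raise KeyboardInterrupt
        # First signal: only record the request, let current work finish.
        self.stop_requested = True

    def install(self) -> None:
        # Must be called from the main thread (a CPython requirement).
        signal.signal(signal.SIGINT, self.handle)


def run_jobs(jobs, shutdown):
    """Run callables in order, checking the stop flag before each one."""
    completed = []
    for job in jobs:
        if shutdown.stop_requested:
            break  # skip remaining jobs, keep results of finished ones
        completed.append(job())
    return completed
```

A caller would create one `GracefulShutdown`, call `install()` once at startup, and pass the instance to the loop that schedules work, so that a single CTRL-C lets the current job finish before the loop exits.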
Core Library
- feat: unify `dlt.Relation` API and create bound Ibis tables by @zilto in #3179
- fix(dashboard): 3077 remove `pandas` deps by @zilto in #3157
- Add custom header generation with quote_none to ArrowToCsvWriter by @burnash in #3178
- [Databricks destination] Feat/2863 databricks table optimization by @bayees in #3137
- feat:3162 add resource name to incremental extract duplicate checks logging. by @and2reak in #3164
- Fix: make store_decimal_as_integer patching conditional on pyiceberg version by @ivasio in #3185
- QoL: accept destination name as shorthand form of destination by @anuunchin in #3122
- chore/moves cli to `_workspace` module by @rudolfix in #3215
- ignores native config values if config spec does not implement those by @rudolfix in #3233
- avoids passing naming conventions as modules by @rudolfix in #3229
- feat: extend `TTableReference` by @zilto in #3093
- Feat: dlt.destination as named destination factory by @anuunchin in #3204
- Feat: resource.add_metrics implementation by @anuunchin in #3240
- Feature: Introduce support of http based resources for fs source by @TheLazzziest in #3029
- Fix: add support for yield_map in rest resource by @ivasio in #3211
- Skip `cluster by` in bigquery on alter statements by @adrian-173 in #3239
- Re-enable python 3.14 common tests by @sh-rp in #3242
- graceful signal handler by @rudolfix in #3234
- replace `.table(..., table_type="ibis")` with `.to_ibis()` by @zilto in #3225
- Fix: Empty columns that were previously flattened into compound ones violate freeze contract by @anuunchin in #3226
- adds more signal options by @rudolfix in #3248
Chores
- adds workspace module by @rudolfix in #3171
- chore/moves cli to `_workspace` module by @rudolfix in #3215
- databricks: removes cluster on create table test and allows only partition by @rudolfix in #3191
- repo(pytest): migrate to `pyproject.toml` and reduce verbosity by @zilto in #3205
- Feat: workspace file selector, package builder by @anuunchin in #3207
- Feat/adds workspace configuration by @rudolfix in #3221
- chore/fixes pokemon table counts by @rudolfix in #3232
- Feat/3154 convert script preprocess docs to python and add destination capabilities section to destination pages by @alkaline-0 in #3188
- Fix: workspace package hash is dependent on file order by @anuunchin in #3251
Docs
- documents vault provider by @rudolfix in #3160
- Docs: dataset access doesn't work in tutorial by @anuunchin in #3197
- Update Databricks init script documentation by @AndreiBondarenko in #3202
- Databricks documentation typo fix by @Magicbeanbuyer in #3217
- Fixes docs on schema file naming convention by @willi-mueller in #3244
New Contributors
- @and2reak made their first contribution in #3164
- @ivasio made their first contribution in #3185
- @Magicbeanbuyer made their first contribution in #3217
- @TheLazzziest made their first contribution in #3029
- @adrian-173 made their first contribution in #3239
Full Changelog: 1.17.1...1.18.0
1.17.1
This patch release mostly addresses bugs and inconsistencies found in the new `ducklake` destination. The most significant change is the rename of `catalog_name` to `ducklake_name` in the `destination.ducklake` configuration in #3153
Core Library
- fix(ducklake): 3140 disambiguate config key and default values by @zilto in #3153
- fix: explicitly replace `postgresql` by `postgres` in `ATTACH` (ducklake) by @zilto in #3148
- fix: 3139 pass SQLAlchemy credentials to f-string (ducklake) by @zilto in #3150
- rest_api: remove unused exceptions.py by @burnash in #3143
- fix incorrect export by @chulkilee in #3120
- Fix/3123 close sqlalchemy cursor by @rudolfix in #3136
- fix: 3145 add `read_parquet(use_arrow: bool)` by @zilto in #3149
- restclient: add support for data parameter in RESTClient and rest_api by @burnash in #3134
- Feat: Custom metrics in the incremental transform by @anuunchin in #3117
Chores
- Remove obsolete instruction from `CONTRIBUTING.md` by @jorritsandbrink in #3135
- Moves test of newest libraries on dashboard e2e tests to mac by @sh-rp in #3142
- fixes `test_track_anon_event` test by @rudolfix in #3152
Docs
- docs/prefect integration by @djudjuu in #3037
- docs: rest_api: fix rest_api_source parameters by @burnash in #3138
- docs: snowflake: document stage url path matching by @burnash in #3108
New Contributors
- @chulkilee made their first contribution in #3120
Full Changelog: 1.17.0...1.17.1
1.17.0
Core Library
- dashboard: fixes file opener on WSL by @rudolfix in #3076
- (bugfix) persist incremental initial value by @rudolfix in #3075
- (QoL) sets explicit timeouts on trackers by @rudolfix in #3074
- restclient: misc Paginators improvements by @burnash in #2924
- Improved pipeline attach command and Dashboard launcher extensions by @sh-rp in #3060
- Fix parameter reference in IncrementalCursorPathHasValueNone exceptio… by @rik-adegeest in #3070
- fix: convert local file path to posix before PUT to Databricks destination by @AndreiBondarenko in #3086
- Fix/67 (relational normalizer) ignore `None` if child table exists by @sh-rp in #3048
- Fix/3047 prevent same naming for staging and final datasets by @alkaline-0 in #3096
- fix: fixed error in import of `BaseOperator` in airflow_helper.py (#2601) by @ianedmundson1 in #3043
- makes root key propagation more selective, fallbacks for 2nd degree nesting #2737
- allows to limit by row count #2737
- enables ordering of results in `filesystem` via `Incremental` `sort_order` #2737
- cli: updated error in the dlt pipeline show command by @burnash in #3095
- Fix: `sql_client.raise_database_error` creates circular `__cause__` dependency by @anuunchin in #3111
- Feat: allowing custom metrics to be added to dlt resources and transform steps by @anuunchin in #3078
- feat: `ducklake` destination (all buckets and catalog combinations supported) by @zilto in #3015
Chores
- repo(ci): disable docker container autorestart by @zilto in #3083
- Don't echo pypi token to console on library publish by @sh-rp in #3089
- Improve pipeline dashboard test coverage by @sh-rp in #3091
- Run common and dashboard tests also with newest available allowed packages for all deps by @sh-rp in #3100
- Docs Cloudflare worker deployment by @sh-rp in #3105
- Updates CONTRIBUTING.md and README.md to remove outdated information and add more info by @sh-rp in #3101
- Docs docusaurus / cloudflare fixes by @sh-rp in #3114
Verified Sources
- handling of `json` and `timestamp` without timezone in `pg_replication` @anuunchin dlt-hub/verified-sources#657
- google sheet fixes @anuunchin dlt-hub/verified-sources#655
Docs
- explains various backfilling options for `sql_database` and `filesystem` with examples and additional tests by @rudolfix in #2737
- `ducklake` destination documentation by @rudolfix in #3015
New Contributors
- @rik-adegeest made their first contribution in #3070
- @AndreiBondarenko made their first contribution in #3086
- @alkaline-0 made their first contribution in #3096
- @ianedmundson1 made their first contribution in #3043
Full Changelog: 1.16.0...1.17.0
1.16.0
Notable changes in this release
- Improved timestamp handling, please carefully read https://dlthub.com/docs/general-usage/schema#handling-of-timestamp-and-time-zones. For some edge cases, timestamps will be handled slightly differently than before.
- Simplified interfaces for `dlt.Relation` and `dlt.Dataset`
- Our education courses are now part of the documentation and can be launched in google colab directly from there: https://dlthub.com/docs/tutorial/education
- The beta version of our pipeline dashboard now replaces the streamlit app. Run `dlt dashboard` to see it. Learn more about how it works here: https://dlthub.com/docs/general-usage/dashboard
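For readers unsure whether the timestamp change affects them, the distinction between naive and tz-aware values can be checked with the standard library. The sketch below only illustrates that distinction; how `dlt` treats each kind is defined in the linked documentation, and the `assume_tz` default here is an assumption of this example, not `dlt` behavior.

```python
from datetime import datetime, timedelta, timezone


def is_tz_aware(ts: datetime) -> bool:
    """Return True if the datetime carries a usable UTC offset."""
    return ts.tzinfo is not None and ts.utcoffset() is not None


def to_utc(ts: datetime, assume_tz: timezone = timezone.utc) -> datetime:
    """Convert to UTC; naive values are interpreted in `assume_tz` (hypothetical default)."""
    if not is_tz_aware(ts):
        ts = ts.replace(tzinfo=assume_tz)
    return ts.astimezone(timezone.utc)


# Same wall-clock time, once without and once with an explicit +02:00 offset.
naive = datetime(2025, 1, 1, 12, 0, 0)
aware = datetime(2025, 1, 1, 12, 0, 0, tzinfo=timezone(timedelta(hours=2)))
```

Running `is_tz_aware` over a sample of source values is a quick way to find out which columns mix both kinds before upgrading.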
Core Library
- fully support naive and tz-aware timestamp/time data types by @rudolfix in #2570
- Dashboard updates and fixes by @sh-rp in #3055
- Fix: Max table nesting is ignored for the first run when import schema path is specified by @anuunchin in #2992
- fix: avoid private interfaces; explicit compiler mapping by @zilto in #2966
- Refactor transformations by @sh-rp in #2970
- Dashboard Improvements by @sh-rp in #2965
- fix: top level relation by @zilto in #2983
- fix: `MissingDependencyException` should inherit `ImportError` by @zilto in #2977
- Add remaining paramiko connect params to SFTP filesystem by @AyushPatel101 in #2823
- Feat: dataset access telemetry by @anuunchin in #3056
- feat: `dlt.Schema.to_dot()` graphviz export by @zilto in #2959
- fix: avoid setting "None" string for aws session token by @tpulmano in #2978
- fix: `dlt.Pipeline.__repr__` by @zilto in #3022
- pip install marimo -> dlt[workspace] by @djudjuu in #3035
- fix: improve type hints for dataset and relation by @zilto in #2997
- Small dashboard fixes by @sh-rp in #3036
- feat: dlt widgets for marimo by @zilto in #3021
- feat(dataset): simplify public interface for `dlt.Dataset` and `dlt.Relation` by @zilto in #3059
Docs
- basic runner docs by @djudjuu in #2886
- Docs: Page description fix in llm native workflow by @anuunchin in #3033
- docs: add missing page to index by @zilto in #2994
- Update advanced-course.md by @AstrakhantsevaAA in #3032
- Feat: Moving educational content to oss by @anuunchin in #2996
- Jinso o fix/cors playground by @Jinso-o in #2986
- Docs: llm workflow docs updated by @anuunchin in #3001
- fix grammar on timezone specific docs sections by @sh-rp in #3044
- Jinso o fix/cors playground by @Jinso-o in #2995
- Docs: Education notebooks formatted and linted by @anuunchin in #3017
- Release notes 1.15 by @AstrakhantsevaAA in #3038
- Updated resource docs on info for materializing empty tables by @dat-a-man in #2973
- Docs: Forcing root key propagation section improved by @anuunchin in #3063
Tests
- re-enable python 3.10 common tests by @sh-rp in #2979
- repo: use `ruff check` for linting by @zilto in #2967
- Use license command for testing dlt+ installation by @sh-rp in #3026
- add up to date check for uv lockfile as first lint step by @sh-rp in #3052
Misc
New Contributors
Full Changelog: 1.15.0...1.16.0
1.15.0
Breaking changes
This version will add .gz extensions to files that are compressed. That includes filesystem destinations, internal working directory and staging locations used to feed other destinations. A few practical hints:
- Existing `filesystem` destination will continue storing files without `gz` extension and they are not affected by the change (existing datasets will retain their behavior where this extension is not added for backwards compatibility)
- Compressed files uploaded to staging destinations will now have the `.gz` extension, also if `dlt` is configured to keep data in stage
- This does not apply to `parquet` files.
- More information can be found in the filesystem destination docs: https://dlthub.com/docs/dlt-ecosystem/destinations/filesystem#file-compression
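Because existing datasets keep compressed files without the extension while new ones add `.gz`, downstream code that reads these files directly may meet both variants. A minimal sketch of a reader that does not rely on the file name, assuming text (for example JSONL) content; `open_load_file` is a hypothetical helper, not part of `dlt`:

```python
import gzip
from pathlib import Path
from typing import IO

# Every gzip stream starts with these two bytes (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"


def open_load_file(path: Path) -> IO[str]:
    """Open a text file, decompressing it if it starts with the gzip magic bytes."""
    with path.open("rb") as raw:
        is_gzip = raw.read(2) == GZIP_MAGIC
    if is_gzip:
        # Works for both `name.jsonl` (legacy, compressed) and `name.jsonl.gz`.
        return gzip.open(path, "rt", encoding="utf-8")
    return path.open("rt", encoding="utf-8")
```

Sniffing the content instead of the extension keeps the reader working across datasets created before and after the upgrade.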
Core Library
- [Databricks destination] Adding comment and tags for table and columns and applying primary and foreign key constraints in Unity Catalog by @bayees in #2674
- feat - add crlf support for csv exports by @7amza79 in #2783
- feat: add `has_more` boolean flag logic to RESTClient OffsetPaginator by @michaelconan in #2817
- rest_api: fix: make ProcessingSteps filter and map fields optional by @burnash in #2913
- Enable and test python 3.14 support by @sh-rp in #2789
- removes init files from dlt tables in filesystem by @rudolfix in #2868
- restclient: json param range paginator by @Giackgamba in #2917
- fix sync destination warning logging call by @sh-rp in #2927
- fix: missing `__repr__` for `@dlt.transformation` by @zilto in #2940
- fix: restclient: handle null data in response by @burnash in #2936
- Fix: saving compressed load files with .gz extension by @anuunchin in #2835
- fix: prevent DuplicateSchema error when using public schema in Redshift by @franloza in #2953
- feat: `Schema.to_dbml()`, auto export schemas in `dbml` format by @zilto in #2929
- QoL: improve DataValidationError output: use identifying columns if present by @djudjuu in #2915
- callback collector by @djudjuu in #2922
- skips inferring incomplete column when already incomplete by @rudolfix in #2935
- 2946 sqlalchemy destination fixes (full support for mssql, partial for trino) by @rudolfix in #2951
- adds precision to _dlt_load_id and _dlt_id columns by @rudolfix in #2951
- adds json field support for mssql by @rudolfix in #2951
- fixes clickhouse temporary table engine not propagate to nodes (failed merges fix) by @rudolfix in #2951
- fixes BIGQUERY numeric creation (when scale was set to 0) by @rudolfix in #2951
- fix: replace `arrow2` with `arrow` backend for `connectorx`, enables newest `connectorx` versions by @zilto in #2933
- AI Command: extended with IDEs (rules for all major IDEs are supported) by @anuunchin in #2937
- duckdb bumped to 1.3.2, iceberg scanners updated by @rudolfix in #2958
- Feat: Allow control over `streamed_exec` in delta merge upsert by @anuunchin in #2961
- fix failing top level module imports on projects in dirs that start with a dot by @sh-rp in #2963
Docs
- Fix docusaurus / netlify trailing slash issue by @sh-rp in #2878
- make docs snippets tests use local secrets by @sh-rp in #2903
- 2784: update dlt+ pojects.md sources example by @kaliole in #2861
- Docs: Some additions to the filesystem gdrive source by @anuunchin in #2912
- Release highlights 1.12.3-1.14.1 by @AstrakhantsevaAA in #2939
- Updating custom configurations with @configspec decorator by @dat-a-man in #2826
- docs: rest_api: add tip for escaping curly braces by @burnash in #2925
- link to add_map and add_yield_map usage example by @molkazhani2001 in #2916
- adjust cursor docs to new flow by @adrianbr in #2885
- Optimize add_limit docs title for the search by @VioletM in #2949
- Docs/dlt plus project docs rest api restructuring by @kaliole in #2911
- Updated merge loading docs on scd2 strategy handling nested structures by @dat-a-man in #2944
New Contributors
- @bayees made their first contribution in #2674
- @7amza79 made their first contribution in #2783
- @michaelconan made their first contribution in #2817
- @Giackgamba made their first contribution in #2917
Full Changelog: 1.14.1...1.15.0