@rudolfix
Collaborator

Description

master merge for 0.5.2 release

willi-mueller and others added 30 commits July 8, 2024 14:51
…ructions (#1573)

Removed extra spaces around the equals sign in LABEL instructions in Dockerfile and Dockerfile.airflow to ensure proper syntax and avoid potential build errors. This change ensures that labels are correctly assigned.

Co-authored-by: Avinash Niroula <[email protected]>
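
To illustrate the LABEL fix described in the commit above, here is a minimal sketch of a checker that collapses whitespace around the equals sign in a single-pair `LABEL` line. The helper name `normalize_label_line` and the label key in the usage line are hypothetical and not part of the repository. The sketch only handles the first `key = value` pair on a line and expects the line without a trailing newline.

```python
import re

# Hypothetical helper, not taken from the repository.
# Captures: the "LABEL " prefix, the key, and everything after the equals sign.
_LABEL_PAIR = re.compile(r"^(\s*LABEL\s+)([^\s=]+)\s*=\s*(.*)$")


def normalize_label_line(line: str) -> str:
    """Return the line with whitespace around '=' removed if it is a LABEL line."""
    match = _LABEL_PAIR.match(line)
    if match is None:
        # Not a LABEL instruction with a key=value pair: leave it untouched.
        return line
    prefix, key, value = match.groups()
    return f"{prefix}{key}={value}"
```

For example, `normalize_label_line('LABEL org.example.vendor = "acme"')` yields the same instruction with `org.example.vendor="acme"`, while lines that are not `LABEL` instructions pass through unchanged.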
* Rename qdrant_client module to not clash with qdrant_client library

* Qdrant get state returns the latest committed state

* drops discover schema when extract package is created

---------

Co-authored-by: Marcin Rudolf <[email protected]>
* add upsert merge strategy

* handle destination upsert support

* refactor row id type handling

* black format

* add supported_merge_strategies destination capability

* fix child row id type handling

* add condition to exclude destinations that handle merge, but do not implement any of the defined merge strategies

* repair improper merge conflict resolution

* improve merge strategy validation

* add default merge strategy to dummy destination capabilities

* re-add merge strategies capability

* re-add upsert schema verification

* black format

* change SchemaException to ValueError

* test upsert merge key warning log

* remove obsolete property

* remove obsolete import

* use get_qualified_table_names utility function consistently

* remove obsolete import

* move test because it needs postgres credentials

* correct supported merge strategies

* move dremio supported merge strategies

* remove hardcoded row id column references

* move write disposition hint validation

* replace hardcoded row id column names with constant

* add comment

* refactor schema verification to prevent duplicate warnings

* Revert "refactor schema verification to prevent duplicate warnings"

This reverts commit 17f2115.

* comment out test that fails on GitHub CI
* fixes current.pipeline referencing pipeline module

* removes potentially circular dependencies from common to pipeline namespaces
* feat: add config dataset_name_prefix

* fix: remove dataset_name_prefix from RunConfiguration

* chore: replace dataset_name_prefix for dataset_name_layout

* refactor: change dataset_name_layout type and replace ValueError for ConfigurationValueError

* refactor: move dataset_name_layout check to on_resolved method
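
As a rough illustration of what a `dataset_name_layout` check might look like, the sketch below validates a layout string and applies it to a dataset name. It assumes the layout carries a single `%s` placeholder for the dataset name; that placeholder convention, the function name `apply_dataset_name_layout`, and the local `ConfigurationValueError` stand-in class are assumptions made for this example, not the code from these commits.

```python
from typing import Optional


class ConfigurationValueError(ValueError):
    """Local stand-in for the error type named in the commit list."""


def apply_dataset_name_layout(dataset_name: str, layout: Optional[str]) -> str:
    """Apply an optional layout such as 'prefix_%s' to a dataset name."""
    if layout is None:
        # No layout configured: the dataset name is used as-is.
        return dataset_name
    if "%s" not in layout:
        # Assumed rule: a layout without the placeholder is a configuration error.
        raise ConfigurationValueError(
            f"dataset_name_layout {layout!r} must contain a '%s' placeholder"
        )
    # str.replace avoids surprises from other '%' characters in the layout.
    return layout.replace("%s", dataset_name)
```

Raising a configuration-specific error at resolution time, rather than a bare `ValueError` at use time, lets a bad layout fail early and be reported alongside other configuration problems.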
* allows to generate COPY sql for snowflake as str

* passes tags to sql_client and implements query tagging for snowflake

* fixes copy sql gen

* adds pipeline_name to query tags
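
A minimal sketch of how query tagging could work for Snowflake, which supports setting a session-level tag with `ALTER SESSION SET QUERY_TAG`. The function name `build_query_tag_statement` and the template format are hypothetical and only show the idea of rendering a tag that includes `pipeline_name`; this is not the implementation from these commits.

```python
def build_query_tag_statement(tag_template: str, **tag_values: str) -> str:
    """Render a tag template and wrap it in an ALTER SESSION statement."""
    # Fill placeholders such as {pipeline_name} with the supplied values.
    tag = tag_template.format(**tag_values)
    # Double single quotes so the tag stays a valid SQL string literal.
    # Backslashes are not handled here and would need extra care.
    escaped = tag.replace("'", "''")
    return f"ALTER SESSION SET QUERY_TAG = '{escaped}'"
```

Calling `build_query_tag_statement("pipeline={pipeline_name}", pipeline_name="my_pipeline")` produces a statement that a SQL client could execute once per session before running load queries.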
* Default to  for both on-prem and cloud

Signed-off-by: Marcel Coetzee <[email protected]>

* Add documentation for new engine family types

Signed-off-by: Marcel Coetzee <[email protected]>

* Typo

Signed-off-by: Marcel Coetzee <[email protected]>

* Minor doc changes

Signed-off-by: Marcel Coetzee <[email protected]>

* Fix local clickhouse deployment timestamp parsing issue with simple configuration setting

Signed-off-by: Marcel Coetzee <[email protected]>

* Extend support for local deployment time types

Signed-off-by: Marcel Coetzee <[email protected]>

* Adapt test to check whether CH OSS or cloud

Signed-off-by: Marcel Coetzee <[email protected]>

* Defend against CH OSS unsupported dbapi datetime parsing

Signed-off-by: Marcel Coetzee <[email protected]>

* Fix typo

Signed-off-by: Marcel Coetzee <[email protected]>

* Add ClickHouse to local destination tests

Signed-off-by: Marcel Coetzee <[email protected]>

* Update ClickHouse test workflow and remove engine types

Signed-off-by: Marcel Coetzee <[email protected]>

* Use Python 3.10.x for ClickHouse destination tests

Signed-off-by: Marcel Coetzee <[email protected]>

* Add ClickHouse MergeTree support and refactor code

Signed-off-by: Marcel Coetzee <[email protected]>

* Update ClickHouse Docker setup and test workflow

Signed-off-by: Marcel Coetzee <[email protected]>

* Refactor ClickHouse tests to cover both OSS and Cloud versions

Signed-off-by: Marcel Coetzee <[email protected]>

* Disable SSL for ClickHouse OSS tests

Signed-off-by: Marcel Coetzee <[email protected]>

* Use state instead of sentinel tables

Signed-off-by: Marcel Coetzee <[email protected]>

* Remove mention of sentinel table for ClickHouse datasets

Signed-off-by: Marcel Coetzee <[email protected]>

* Refactor ClickHouse deployment type detection

Signed-off-by: Marcel Coetzee <[email protected]>

* Add conditional execution for ClickHouse OSS tests

Signed-off-by: Marcel Coetzee <[email protected]>

* Update ClickHouse compose file path and move to tests directory

Signed-off-by: Marcel Coetzee <[email protected]>

* Update ClickHouse docker-compose file path in test workflow

Signed-off-by: Marcel Coetzee <[email protected]>

* Cast client to ClickHouseSqlClient in get_deployment_type call

Signed-off-by: Marcel Coetzee <[email protected]>

* Revert "Remove mention of sentinel table for ClickHouse datasets"

This reverts commit 409487c.

* Revert "Use state instead of sentinel tables"

This reverts commit e1ac1ce.

* Add tests for ClickHouse table engine configuration and adapter overrides

Signed-off-by: Marcel Coetzee <[email protected]>

* Add configurable default table engine type for ClickHouse

Signed-off-by: Marcel Coetzee <[email protected]>

* Docs

Signed-off-by: Marcel Coetzee <[email protected]>

* Fix comments

Signed-off-by: Marcel Coetzee <[email protected]>

* Add ClickHouse typing module for improved type handling

Signed-off-by: Marcel Coetzee <[email protected]>

* Move ClickHouse configuration options from credentials to client configuration

Signed-off-by: Marcel Coetzee <[email protected]>

* Move table_engine_type from credentials to client configuration

Signed-off-by: Marcel Coetzee <[email protected]>

* Docs

Signed-off-by: Marcel Coetzee <[email protected]>

---------

Signed-off-by: Marcel Coetzee <[email protected]>
* allows bigquery to manage schema inference, table creation and migration

* allows to specify resources with bigquery managed schemas in bigquery_adapter

* adds tests

* updates bigquery and arrow docs

* fixes test deps

* fixes typo
* fixes missing call to tag session

* moves query_tag from credentials to config

* fixes test deps

* makes query_tag optional so sql_client can be instantiated like base
* handle credentials for s3 compatible storage

* fix delta table test

* add missing filesystem driver in skip message
…estination (#1600)

* add upsert merge strategy

* handle destination upsert support

* refactor row id type handling

* black format

* add supported_merge_strategies destination capability

* fix child row id type handling

* add condition to exclude destinations that handle merge, but do not implement any of the defined merge strategies

* repair improper merge conflict resolution

* improve merge strategy validation

* add default merge strategy to dummy destination capabilities

* re-add merge strategies capability

* re-add upsert schema verification

* black format

* change SchemaException to ValueError

* test upsert merge key warning log

* remove obsolete property

* remove obsolete import

* use get_qualified_table_names utility function consistently

* remove obsolete import

* move test because it needs postgres credentials

* correct supported merge strategies

* move dremio supported merge strategies

* remove hardcoded row id column references

* move write disposition hint validation

* replace hardcoded row id column names with constant

* add comment

* refactor schema verification to prevent duplicate warnings

* Revert "refactor schema verification to prevent duplicate warnings"

This reverts commit 17f2115.

* comment out test that fails on GitHub CI

* add update operation to child table upsert logic

* move test utility to utils module

* add supported_merge_strategies to generic_capabilities method

* ensure upsert and scd2 merge strategies always use deterministic row hash as row id for child tables

* rewrite temporary exclusion logic for non standard merge strategies

* add table_format and bucket_subset to destination test configuration utilities

* add basic support for delta table merge write disposition (rogue implementation)

* handle case where supported_merge_strategies is not defined

* fix bucket_subset filtering

* add basic docs for upsert merge strategy

* fix merge strategy verification

* handle none and empty list cases in merge strategy verification

* move delta upsert merge strategy limitations to filesystem doc page
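
The commit list above mentions using a deterministic row hash as the row id for child tables under the upsert and scd2 strategies. The sketch below shows one way such an id could be derived, so that reloading the same nested item yields the same id and an upsert can match existing child rows. The function name `child_row_id`, the choice of SHA-256, the key layout, and the 20-character truncation are all assumptions for illustration, not the library's actual scheme.

```python
import hashlib


def child_row_id(parent_row_id: str, child_table: str, list_index: int) -> str:
    """Derive a stable id for a child row from its position under the parent."""
    # The same parent, table, and list position always give the same key.
    key = f"{parent_row_id}_{child_table}_{list_index}"
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    # Truncate for readability; the length is an arbitrary choice here.
    return digest[:20]
```

A random id would change on every load, so the destination could never recognize a child row as already present. Deriving the id from the parent id and position is what makes repeated loads idempotent.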
* fixes mypy errors during development

* explicitly ignore error codes to cover both CI as well as local dev env
jorritsandbrink and others added 18 commits July 24, 2024 11:24
* make bucket_subset specific to filesystem destination

* use destination test config setup for delta table tests
* Add clarification for add_limit

* Update docs/website/docs/general-usage/source.md

* Add clarification for add_limit docstrings

---------

Co-authored-by: Alena Astrakhantseva <[email protected]>
* uses pydantic aliases when dumping models

* ignores pickle error when saving trace

* emits python objects instead of pydantic models after validation

* keeps incremental and validation to the end of the pipe by setting right affinity

* fixes fork transform

* adds docs

* fixes sql_database doc typo
* initial decoupling of config generation from toml writer

* keeps pure Python object in docs config provider, adds yaml and json support to vault providers, refactors set_value in former TomlBaseProvider

* adds a method to register config providers to config accessor

* adds example for yaml loader custom config provider

* implements config provider with user supplied loader function

* typos and small fixes

* adds reference to example in docs

* slightly improve docs

* update one snippet

---------

Co-authored-by: dave <[email protected]>
* add venv hint on first page and update installation page

* Update docs/website/docs/reference/installation.md

Co-authored-by: Anton Burnashev <[email protected]>

* Update docs/website/docs/intro.md

Co-authored-by: Anton Burnashev <[email protected]>

* Update docs/website/docs/reference/installation.md

Co-authored-by: Anton Burnashev <[email protected]>

* Update docs/website/docs/reference/installation.md

Co-authored-by: Anton Burnashev <[email protected]>

* Update docs/website/docs/reference/installation.md

Co-authored-by: Anton Burnashev <[email protected]>

* fix link url on intro

* try to fix link on intro again

---------

Co-authored-by: Anton Burnashev <[email protected]>
* enable upsert merge strategy for athena, bigquery, databricks and mssql

* add new upsert destinations to docs
…rs (#1645)

* prevent accidental wrapping of sources in resources when using adapters

* fix typo

* fix qdrant zendesk example

* another fix to the qdrant example

* rename utils function
add support for source with single resource
add tests

* add logger warning when setting default name for resource

* only use selected resources in get_resource_for_adapter

* switch to value error
docs/514 rest_api: docs on pluggable paginators
* Update book a call link description

* tiny grammar fix

---------

Co-authored-by: rahuljo <[email protected]>
@rudolfix added the `ci full` label (Use to trigger CI on a PR for full load tests) Jul 31, 2024
netlify bot commented Jul 31, 2024

Deploy Preview for dlt-hub-docs canceled.

| Name | Link |
|---|---|
| 🔨 Latest commit | 8cddfcf |
| 🔍 Latest deploy log | https://app.netlify.com/sites/dlt-hub-docs/deploys/66ad2185b6ee570008751603 |

willi-mueller and others added 5 commits July 31, 2024 22:02
…mation-function-for-incremental-cursor

docs: documents new `convert` parameter in rest_api source incremental config
…stination (#1617)

* make null column optional in arrow table test case

* handle empty source for delta table format

* upgrade pyarrow and comment lancedb

* use RecordBatchReader for filesystem delta table writes

* Revert "upgrade pyarrow and comment lancedb"

This reverts commit 5bbfba4.

* mark tests that need pyarrow version 17

* assert pyarrow version for delta table format on filesystem destination

* autoskip tests if pyarrow dependency is not satisfied

* fix destination config issue

* add github workflow for needspyarrow17 tests

* fix pyarrow17 github workflow

* fix local filesystem bucket url

* fix typo
)

* adds docs on handling NULL values at incremental cursor path

* uses CAUTION box

Co-authored-by: VioletM <[email protected]>

* incorporates code review

* Update docs/website/docs/general-usage/incremental-loading.md

* Update docs/website/docs/general-usage/incremental-loading.md

* removed duplicate line

---------

Co-authored-by: VioletM <[email protected]>
Co-authored-by: Anton Burnashev <[email protected]>
* switch to docker compose subcommand

* fix compose deployments
@rudolfix merged commit e00baa0 into master Aug 2, 2024