[Bug] hash_collissions in dbt snapshot #154
Comments
Hi @espenhoh, I understand the issue because we have seen these in the past. And yes, we decided not to update

{% macro oracle__snapshot_hash_arguments(args) -%}
    STANDARD_HASH({%- for arg in args -%}
        coalesce(cast({{ arg }} as varchar(4000) ), '')
        {% if not loop.last %} || '|' || {% endif %}
    {%- endfor -%}, 'SHA256')
{%- endmacro %}

For existing snapshots, there is an option to migrate. dbt snapshot uses a MERGE INTO statement:

MERGE INTO target t

So, before using the new hash function, you need to UPDATE the column:

UPDATE <SNAPSHOT_TABLE>
SET dbt_scd_id = STANDARD_HASH(<args>, 'SHA256')
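As a rough illustration of what the macro above computes, here is a minimal Python sketch of the same idea: replace NULLs with empty strings, join the values with a `|` separator, and take a SHA-256 digest. The function name `snapshot_hash` is hypothetical, and the digest is not guaranteed to match Oracle's `STANDARD_HASH` byte for byte, because Oracle's `cast(... as varchar(4000))` depends on the database character set and NLS settings.

```python
import hashlib


def snapshot_hash(args):
    """Hypothetical Python analogue of oracle__snapshot_hash_arguments.

    Assumption: values are stringified with str() and encoded as UTF-8,
    which may differ from how Oracle casts numbers and dates to varchar.
    """
    # coalesce(cast(arg as varchar), '') -> empty string for NULL/None
    parts = ["" if a is None else str(a) for a in args]
    # '|' between arguments keeps ("ab", "c") distinct from ("a", "bc")
    joined = "|".join(parts)
    # STANDARD_HASH returns RAW bytes; hex is used here only for display
    return hashlib.sha256(joined.encode("utf-8")).hexdigest().upper()


example = snapshot_hash([42, None, "2024-01-01"])
```

The separator matters: without it, different argument lists could concatenate to the same string and produce the same id even with a strong hash.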
For anybody who is trying to migrate existing dbt-oracle snapshots with the old
With dbt-oracle 1.9.1 in place run the following operation on your snapshot table with
This way is found in the code comments here:
The Oracle error indicating that you have to migrate is an ORA-01790 like:
Hope that helps anybody with the same problem.
Is there an existing issue for this?
Current Behavior
The snapshot is implemented with the ora_hash() function.
On a large data set this produces many collisions and therefore duplicate ids, which cause errors in the merge statement at data volumes of millions of rows.
dbt snapshot fails once a hash collision appears, and the snapshot table becomes impossible to update.
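To put a number on why collisions are expected at this scale, here is a minimal sketch using the birthday-bound approximation. It assumes that ora_hash() with its default bucket count yields a 32-bit value and that hashes are uniformly distributed; the function names and the row count of 5 million are illustrative, not taken from the report.

```python
import math


def collision_probability(n_rows: int, hash_bits: int) -> float:
    """Approximate chance of at least one collision among n_rows hashes.

    Birthday bound: p ~= 1 - exp(-n(n-1) / (2 * 2**bits)).
    """
    buckets = 2.0 ** hash_bits
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for tiny x
    return -math.expm1(-n_rows * (n_rows - 1) / (2.0 * buckets))


def expected_colliding_pairs(n_rows: int, hash_bits: int) -> float:
    """Expected number of row pairs that share the same hash value."""
    return n_rows * (n_rows - 1) / (2.0 * 2.0 ** hash_bits)


rows = 5_000_000  # illustrative volume, "a few million" rows
p_32 = collision_probability(rows, 32)    # assumed ora_hash() width
p_256 = collision_probability(rows, 256)  # SHA-256 width
pairs_32 = expected_colliding_pairs(rows, 32)
```

Under these assumptions a 32-bit hash is practically certain to collide at a few million rows, with thousands of colliding pairs expected, while the probability for a 256-bit digest is negligible.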
Expected Behavior
No collisions.
Steps To Reproduce
Run the dbt snapshot command on data with a few million updates.
The first dbt snapshot command succeeds because there are only inserts, but duplicates in dbt_scd_id exist, causing trouble on subsequent runs.
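One way to confirm that a snapshot table is affected is to look for dbt_scd_id values that occur more than once. The sketch below runs such a query against an in-memory SQLite table so that it is self-contained. The table name `my_snapshot`, its columns other than dbt_scd_id, and the sample rows are hypothetical; the same GROUP BY / HAVING query should carry over to Oracle with the real snapshot table name substituted.

```python
import sqlite3

# Plain GROUP BY / HAVING query; my_snapshot is a placeholder table name
DUPLICATE_IDS_SQL = """
    SELECT dbt_scd_id, COUNT(*) AS n
    FROM my_snapshot
    GROUP BY dbt_scd_id
    HAVING COUNT(*) > 1
"""


def find_duplicate_scd_ids(conn):
    """Return (dbt_scd_id, count) for every id that occurs more than once."""
    return conn.execute(DUPLICATE_IDS_SQL).fetchall()


# Tiny in-memory stand-in for a snapshot table with one colliding id
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_snapshot (dbt_scd_id TEXT, id INTEGER)")
conn.executemany(
    "INSERT INTO my_snapshot VALUES (?, ?)",
    [("h1", 1), ("h2", 2), ("h1", 3)],
)
duplicates = find_duplicate_scd_ids(conn)
```

An empty result means no duplicate ids exist yet; any returned row identifies an id shared by more than one snapshot record.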
Relevant log output using --debug flag enabled
Environment
What Oracle database version are you using dbt with?
19c
Additional Context
No response