Skip to content

patch/json-bigquery-compatibility #103

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Jul 8, 2025

Conversation

fivetran-joemarkiewicz
Copy link
Contributor

@fivetran-joemarkiewicz fivetran-joemarkiewicz commented Jul 2, 2025

PR Overview

Package version introduced in this PR: v0.18.1

This PR addresses the following Issue/Feature(s): Shopify Issue #105

Summary of changes:

Introduces JSON BigQuery datatype support for the impacted models/columns.

Submission Checklist

  • Alignment meeting with the reviewer (if needed)
    • Timeline and validation requirements discussed
  • Provide validation details:
    • Validation Steps: Check for unintentional effects (e.g., add/run consistency & integrity tests)
      • Confirmed datatype JSON datatypes for BQ are converted back to string and there is no change for non JSON datatypes.
    • Testing Instructions: Confirm the change addresses the issue(s)
      • BK tests updated to test JSON and non JSON versions.
      • Also confirmed downstream transform package works as expected per the passing BK in this branch that will not be merged.
    • Focus Areas: Complex logic or queries that need extra attention
      • Confirm the datatype is converting properly, but also not converting when it doesn't need to.

Changelog

  • Draft changelog for PR
  • Final changelog for release review

@fivetran-joemarkiewicz fivetran-joemarkiewicz self-assigned this Jul 2, 2025
@fivetran-joemarkiewicz fivetran-joemarkiewicz marked this pull request as ready for review July 2, 2025 20:06
Copy link
Contributor

@fivetran-catfritz fivetran-catfritz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Functionally everything is looking good. However before I approve, curious what you think about my suggestions.

Comment on lines 34 to 37
if [ "$db" = "bigquery" ]; then
dbt run --vars '{shopify_collection_identifier: shopify_collection_data_bq_json, shopify_order_identifier: shopify_order_bq_json_data, shopify_transaction_identifier: shopify_transaction_bq_json_data}' --target "$db" --full-refresh
dbt test --target "$db"
fi
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could remove this if we use the +alias in integration_tests/dbt_project.yml (I made a suggestion in that file). Could be easier for maintenance long term.

Suggested change
if [ "$db" = "bigquery" ]; then
dbt run --vars '{shopify_collection_identifier: shopify_collection_data_bq_json, shopify_order_identifier: shopify_order_bq_json_data, shopify_transaction_identifier: shopify_transaction_bq_json_data}' --target "$db" --full-refresh
dbt test --target "$db"
fi

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above comment for why I don't think we should take the alias approach in this case. The alias is great when we need to switch for a given destination. However, in this case we do want to test both files and the alias won't allow us to achieve that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh I see. Makes sense in that case!

Copy link
Contributor

@fivetran-catfritz fivetran-catfritz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the response and updates! All lgtm!

Comment on lines 34 to 37
if [ "$db" = "bigquery" ]; then
dbt run --vars '{shopify_collection_identifier: shopify_collection_data_bq_json, shopify_order_identifier: shopify_order_bq_json_data, shopify_transaction_identifier: shopify_transaction_bq_json_data}' --target "$db" --full-refresh
dbt test --target "$db"
fi
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh I see. Makes sense in that case!

CHANGELOG.md Outdated
- `stg_shopify__order`: `total_shipping_price_set` column
- `stg_shopify__transaction`: `receipt` column
- `stg_shopify__collection`: `rules` column
- Introduced new the new `json_to_string()` macro to be used in the BigQuery JSON field support.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- Introduced new the new `json_to_string()` macro to be used in the BigQuery JSON field support.
- Introduced the new `json_to_string()` macro to be used in the BigQuery JSON field support.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adjusted in latest commit


[PR #103](https://github.com/fivetran/dbt_shopify_source/pull/103) includes the following updates:

## Under the Hood
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since we're adding support for a new data type, isn't that technically a schema/breaking change?

Copy link
Contributor Author

@fivetran-joemarkiewicz fivetran-joemarkiewicz Jul 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will not be a schema breaking change since this will not break/change the outcome of any existing workflows. This will only ensure support for the JSON datatype if it's used by the connection. This converts JSON to string, so the intention of this update is that no schema change will actually occur.

- `stg_shopify__collection`: `rules` column
- Introduced new the new `json_to_string()` macro to be used in the BigQuery JSON field support.
- Included json versions to the integration tests to ensure json data type compatibility.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we also updated the pull request template, should we call that out.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure we can. I don't think it's all that necessary, but wouldn't hurt to include.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

Copy link
Contributor

@fivetran-avinash fivetran-avinash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@fivetran-jamie fivetran-jamie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved with a question in run_models.sh (I don't think action is needed there) and a suggestion in the changelog that shouldn't be blocking

CHANGELOG.md Outdated
- `stg_shopify__order`: `total_shipping_price_set` column
- `stg_shopify__transaction`: `receipt` column
- `stg_shopify__collection`: `rules` column
- Introduced the new `json_to_string()` macro to be used in the BigQuery JSON field support.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name of the macro sorta already implies this, but I wonder if we should more explicitly call out that we are converting the JSON fields to strings, in case people are expecting to see JSONs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, updated in the latest commit

@fivetran-joemarkiewicz fivetran-joemarkiewicz merged commit 9352c6c into main Jul 8, 2025
9 checks passed
@fivetran-joemarkiewicz fivetran-joemarkiewicz deleted the patch/json-bigquery-compatibility branch July 8, 2025 15:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants