Skip to content

feat: Deterministic _cq_id #712

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 23 commits into from
Mar 1, 2023
Merged

Conversation

bbernays
Copy link
Contributor

@bbernays bbernays commented Feb 24, 2023

Summary

When making cq_id deterministic we cannot use any of the existing resolver types because we cannot guarantee that the value has been fetched by that time so in this PR I call a new function after the resource has been fully resolved in resolveResource


Use the following steps to ensure your PR is ready to be reviewed

  • Read the contribution guidelines 🧑‍🎓
  • Run go fmt to format your code 🖊
  • Lint your changes via golangci-lint run 🚨 (install golangci-lint here)
  • Update or add tests 🧪
  • Ensure the status checks below are successful ✅

@github-actions
Copy link

github-actions bot commented Feb 24, 2023

⏱️ Benchmark results

Comparing with b2b8de9

  • DefaultConcurrencyDFS-2 resources/s: 11,617 ⬆️ 1.64% increase vs. b2b8de9
  • DefaultConcurrencyRoundRobin-2 resources/s: 12,485 ⬆️ 9.61% increase vs. b2b8de9
  • Glob-2 ns/op: 297.9 ⬆️ 35.35% increase vs. b2b8de9
  • TablesWithChildrenDFS-2 resources/s: 23,872 ⬇️ 26.73% decrease vs. b2b8de9
  • TablesWithChildrenRoundRobin-2 resources/s: 23,426 ⬇️ 28.16% decrease vs. b2b8de9
  • TablesWithRateLimitingDFS-2 resources/s: 28.23 ⬇️ 0.74% decrease vs. b2b8de9
  • TablesWithRateLimitingRoundRobin-2 resources/s: 836.5 ⬆️ 1.97% increase vs. b2b8de9
  • BufferedScanner-2 ns/op: 12.41 ⬆️ 24.26% increase vs. b2b8de9
  • LogReader-2 ns/op: 41.16 ⬆️ 25.44% increase vs. b2b8de9

@bbernays bbernays changed the title Working Deterministic _cq_id feat: Deterministic _cq_id Feb 24, 2023
@github-actions github-actions bot added the feat label Feb 24, 2023
@bbernays bbernays marked this pull request as ready for review February 24, 2023 20:24
Copy link
Contributor

@candiduslynx candiduslynx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will require a follow-up PR to overwrite _cq_id and _cq_parent_id where they've been skipped.

@codecov-commenter
Copy link

codecov-commenter commented Feb 27, 2023

Codecov Report

Patch coverage: 47.05% and project coverage change: +1.69 🎉

Comparison is base (78530f8) 47.17% compared to head (52444d1) 48.87%.

📣 This organization is not using Codecov’s GitHub App Integration. We recommend you install it so Codecov can continue to function properly for your repositories. Learn more

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #712      +/-   ##
==========================================
+ Coverage   47.17%   48.87%   +1.69%     
==========================================
  Files          70       70              
  Lines        6849     6877      +28     
==========================================
+ Hits         3231     3361     +130     
+ Misses       3167     3054     -113     
- Partials      451      462      +11     
Impacted Files Coverage Δ
plugins/source/scheduler_dfs.go 65.47% <0.00%> (-5.04%) ⬇️
schema/meta.go 7.14% <ø> (-2.86%) ⬇️
specs/source.go 57.14% <ø> (ø)
schema/resource.go 39.28% <72.72%> (+16.70%) ⬆️
schema/convert.go 35.29% <0.00%> (ø)
schema/json.go 45.16% <0.00%> (+1.61%) ⬆️
schema/float8.go 24.85% <0.00%> (+1.69%) ⬆️
schema/int8.go 25.30% <0.00%> (+1.80%) ⬆️
schema/inet.go 50.00% <0.00%> (+2.45%) ⬆️
schema/int8_array.go 26.77% <0.00%> (+2.51%) ⬆️
... and 14 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@yevgenypats
Copy link
Contributor

Looks good. Can we make it an opt-int option given we don't know the impact of this on compute at scale/real-world scenarios?

Copy link
Contributor

@candiduslynx candiduslynx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm OK with the changes. However, the problems with deterministic CQ ID for append mode in PG, MS SQL & other destinations should be addressed before this could be merged, hence requesting changes just to block the PR merge.

@bbernays
Copy link
Contributor Author

bbernays commented Mar 1, 2023

@candiduslynx- I don't think we need to fix those issues prior to releasing this because the default behavior hasn't changed. Users have to opt in to change the behavior.

Copy link
Contributor

@yevgenypats yevgenypats left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I missed it in prior reviews but we shouldn't use here hashstructure library. We already have a type system in place so no need for reflection. Either adding Hash function for each of them using string and hashing this with io.writer and hasher otherwise we don't really control the hashing mechanism also hashstructure will be much slower.

Copy link
Contributor

@yevgenypats yevgenypats left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

Few non-blocking comments:

  1. bytes.Buffer (instead of append)
  2. the current slack thread - do we want to have random uuid for things without pk in any case.

@kodiakhq kodiakhq bot merged commit 2e7ad2c into cloudquery:main Mar 1, 2023
kodiakhq bot pushed a commit that referenced this pull request Mar 2, 2023
🤖 I have created a release *beep* *boop*
---


## [1.41.0](v1.40.0...v1.41.0) (2023-03-02)


### Features

* Deterministic _cq_id ([#712](#712)) ([2e7ad2c](2e7ad2c))
* **multiplex:** Detect duplicated clients ([#723](#723)) ([dfb039d](dfb039d))


### Bug Fixes

* Cleanup code ([#710](#710)) ([963f03c](963f03c))
* **deps:** Update golang.org/x/exp digest to c95f2b4 ([#718](#718)) ([de52c10](de52c10))
* **deps:** Update google.golang.org/genproto digest to 9b19f0b ([#719](#719)) ([ecfddea](ecfddea))
* **deps:** Update module github.com/rivo/uniseg to v0.4.4 ([#720](#720)) ([0da69b6](0da69b6))
* **deps:** Update module github.com/stretchr/testify to v1.8.2 ([#721](#721)) ([19c0742](19c0742))
* **pk:** Skip filter for no PK ([#709](#709)) ([d0c2e26](d0c2e26))
* **types-json:** Disable HTML escaping during JSON marshalling ([#714](#714)) ([2f6f1d8](2f6f1d8))
* **types-timestamp:** Ensure timestamp is UTC ([#716](#716)) ([bb33629](bb33629))

---
This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants