Feat: Custom metrics in the incremental transform #3117
rudolfix
left a comment
some changes needed
tests/extract/test_incremental.py (outdated)

    # 6. run with two new items as a single batch
    load_id = _run_with_items([{"id": 6, "value": "6.1"}, {"id": 6, "value": "6.2"}], True)
    _assert_custom_metrics(load_id, 5, 2, 3, 0, 2)
@rudolfix my intuition here was to get 0 as the initial_unique_hashes_count, but since it's retrieved from the state, the current value makes sense. However, if boundary deduplication is off and the unique hashes count becomes 0, shouldn't this also reset the hash count in state? 👀
Yes, it should; see my comment above.
rudolfix
left a comment
I changed the TypeVar for the metrics:
- added a default type so we stay backward compatible if any user derives from `ItemTransform`
- added a type bound to `Mapping`

Please take a look at the review comments. Also, the last two test cases look suspicious, or I do not understand the input arguments.
rudolfix
left a comment
LGTM!
This PR adds the following three custom metrics to the incremental transform:
- `unfiltered_items_count`
- `unfiltered_batches_count`
- `unique_hashes_count`

Specifically, in the `__call__` dunder of the `Incremental` class. An appropriate test was added.
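
To illustrate how counters like these might be maintained inside a `__call__` method, here is a minimal sketch. It is not dlt's implementation: the `IncrementalSketch` class, its `boundary_deduplication` flag, the hashing helper, and the exact counting semantics (items and batches counted before any filtering, hash count reported as 0 when deduplication is off) are all assumptions made for illustration.

```python
import hashlib
import json
from typing import Any, Dict, List, Optional, Set


class IncrementalSketch:
    """Hypothetical stand-in for an incremental transform (NOT dlt's Incremental)."""

    def __init__(self, cursor_path: str, boundary_deduplication: bool = True) -> None:
        self.cursor_path = cursor_path
        self.boundary_deduplication = boundary_deduplication
        self.last_value: Optional[Any] = None
        # Hashes of items seen at the current boundary (last) cursor value.
        self.unique_hashes: Set[str] = set()
        self.custom_metrics: Dict[str, int] = {
            "unfiltered_items_count": 0,
            "unfiltered_batches_count": 0,
            "unique_hashes_count": 0,
        }

    @staticmethod
    def _hash(item: Dict[str, Any]) -> str:
        # Assumed hashing scheme: stable JSON dump of the whole item.
        return hashlib.sha1(json.dumps(item, sort_keys=True).encode()).hexdigest()

    def __call__(self, batch: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
        # Count everything that arrives, before any filtering is applied.
        self.custom_metrics["unfiltered_batches_count"] += 1
        self.custom_metrics["unfiltered_items_count"] += len(batch)

        kept: List[Dict[str, Any]] = []
        for item in batch:
            value = item[self.cursor_path]
            if self.last_value is not None and value < self.last_value:
                continue  # older than the cursor: filtered out
            if self.last_value is None or value > self.last_value:
                # Cursor advanced: hashes from the old boundary no longer apply.
                self.last_value = value
                self.unique_hashes = set()
            if self.boundary_deduplication:
                item_hash = self._hash(item)
                if item_hash in self.unique_hashes:
                    continue  # duplicate at the boundary value
                self.unique_hashes.add(item_hash)
            kept.append(item)

        # With deduplication off, no hashes are tracked, so this stays 0.
        self.custom_metrics["unique_hashes_count"] = len(self.unique_hashes)
        return kept
```

Under these assumptions, feeding the two-item batch from the test above (`{"id": 6, "value": "6.1"}` and `{"id": 6, "value": "6.2"}`) as a single batch counts one batch and two items, and tracks two hashes at the boundary value; with deduplication disabled the hash count stays at zero, which is the reset behaviour discussed in the review thread.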