Skip to content

Fix expression column aggregate calculation with group_by after remove()#2311

Merged
texodus merged 1 commit intomasterfrom
fix-remove-expression-bug
Jul 20, 2023
Merged

Fix expression column aggregate calculation with group_by after remove()#2311
texodus merged 1 commit intomasterfrom
fix-remove-expression-bug

Conversation

@texodus
Copy link
Member

@texodus texodus commented Jul 20, 2023

Fixes a bug which causes expression columns to show null or 0 results when a remove() had taken place before.

When a View with expressions is created from a Table that has had remove() applied to it, some aggregates (like sum) are calculated incorrectly, as in the example below. Interestingly, this issue only occurs when the remove() occurs before the view(). If all remove() calls occur after the view is created, a slightly different code path correctly calculates the expressions.

Ultimately this issue was caused by a few under-tested code paths, in which he wrong mapping was chosen deep in a giant block of data processing code., given a choice between the "original" and remove()-corrected indices when matching primary keys to column's row index. The repro (below), simple as it is, was instrumental in identifying and fixing it.

schema = {"a": str, "b": float, "c": str}
data = [
    {
        "a": "A",
        "b": 46412.3804275,
        "c": "X"
    },
    {
        "a": "B",
        "b": 2317615.875,
        "c": "X"
    },
]

table = Table(schema, index="key")
table.update(data)
table.remove(["A"])
view = table.view(
    group_by=["c"],
    columns=["b", "alias"],
    expressions=['//alias\n"b"'],
)

# This has the wrong value for the `alias` column
view.to_records()

# This however would've worked, `.remove()` after `.view()`
table.remove(["A"])
view.to_records()

Fixes #1710 (???)

@texodus texodus force-pushed the fix-remove-expression-bug branch from 203c92b to 799691c Compare July 20, 2023 04:19
@texodus texodus merged commit 8e1ab71 into master Jul 20, 2023
@texodus texodus deleted the fix-remove-expression-bug branch July 20, 2023 05:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Concrete, reproducible bugs

Development

Successfully merging this pull request may close these issues.

[BUG in Expression] Null or incorrect expression field in JSON output after delete operation

1 participant