Return booleans from expression comparisons, allow for vectors to be defined in expressions #1548
    REGISTER_SCALAR_CONSTANTS()

    // Custom functions from computed_functions.cpp
    REGISTER_COMPUTE_FUNCTIONS(vocab)
Do these need to be macros? They impair readability here.
At minimum, they should take `sym_table` as an argument instead of capturing the name from the enclosing scope as they do now.
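To illustrate the concern, here is a minimal sketch contrasting a name-capturing macro with a function that takes the table as a parameter. `SymbolTable`, `add_constant`, `REGISTER_EXAMPLE_CONSTANTS` and `register_example_constants` are hypothetical stand-ins invented for this example; they are not the types or macros used in this PR.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the expression symbol table; not the real type.
struct SymbolTable {
    std::map<std::string, double> constants;

    void add_constant(const std::string& name, double value) {
        constants[name] = value;
    }
};

// Macro form: expands to code that refers to a variable named `sym_table`,
// so it only compiles where a variable with exactly that name is in scope.
#define REGISTER_EXAMPLE_CONSTANTS() \
    sym_table.add_constant("example_zero", 0.0); \
    sym_table.add_constant("example_one", 1.0);

// Function form: the dependency is an explicit parameter, visible at the
// call site and type-checked like any other argument.
inline void register_example_constants(SymbolTable& table) {
    table.add_constant("example_zero", 0.0);
    table.add_constant("example_one", 1.0);
}

// Fills one table through each form and reports whether the results match.
inline bool both_forms_agree() {
    SymbolTable sym_table;  // the macro silently requires this exact name
    REGISTER_EXAMPLE_CONSTANTS()

    SymbolTable other;      // the function accepts any table
    register_example_constants(other);

    return sym_table.constants == other.constants;
}
```

Both forms register the same entries; the difference is that renaming the local variable breaks the macro call, while the function call keeps working with any table passed to it.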
…alar comparator, fix vectors for exprtk
Looks good! Thanks for the PR! I've added a step to the pre-processor to convert |
This PR rewrites a good amount of our ExprTk integration so that comparisons such as `==`, `<`, `and`, `or`, etc. return boolean columns instead of floats. Originally, all comparisons returned floats because ExprTk treated booleans as the numbers 0 and 1, and passed them into the `t_tscalar` constructor as ints and not booleans. Because scalars of different types cannot be compared, functions had to return float values in order for conditionals to work.

In this branch, I've added explicit specializations for more of ExprTk's processing code so that operators and conditional evaluators always return boolean scalars. In combination with the UI tweaks in #1547, expressions can now easily be used as filters on the dataset.
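The type mismatch described above can be sketched with a small tagged scalar. This is only an illustration under stated assumptions: `Scalar`, `Type`, `equals` and the two `less_than_*` helpers are hypothetical names, and the real `t_tscalar` supports many more types than the two shown here.

```cpp
// Hypothetical two-type tagged scalar; not the real t_tscalar.
enum class Type { Bool, Float64 };

struct Scalar {
    Type type;
    double payload;  // stores 0.0 or 1.0 when type is Bool in this sketch
};

inline Scalar make_bool(bool b) { return Scalar{Type::Bool, b ? 1.0 : 0.0}; }
inline Scalar make_float(double d) { return Scalar{Type::Float64, d}; }

// Type-sensitive equality: scalars of different types never compare equal.
inline bool equals(const Scalar& a, const Scalar& b) {
    return a.type == b.type && a.payload == b.payload;
}

// Old behaviour as described: the comparison result is a float 0 or 1.
inline Scalar less_than_as_float(double x, double y) {
    return make_float(x < y ? 1.0 : 0.0);
}

// New behaviour as described: the comparison result is a boolean scalar.
inline Scalar less_than_as_bool(double x, double y) {
    return make_bool(x < y);
}
```

In this sketch, `less_than_as_float(1, 2)` does not equal `make_bool(true)` even though both carry the payload 1, whereas `less_than_as_bool(1, 2)` does, which is why boolean-typed results can be compared directly against other booleans.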
This also works well for defining ranges using `inrange`, such as a date range.

Users can also define vectors inside expressions and use them, or return scalars from them, at will.
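As a rough illustration of a boolean range check over dates, the sketch below assumes inclusive bounds and the argument order (low, value, high). `Date` and `in_range` are hypothetical names made up for this example and are not part of Perspective or ExprTk.

```cpp
#include <tuple>

// Hypothetical calendar date; not a Perspective or ExprTk type.
struct Date {
    int year;
    int month;
    int day;
};

// Orders dates lexicographically by (year, month, day).
inline bool operator<=(const Date& a, const Date& b) {
    return std::tie(a.year, a.month, a.day) <= std::tie(b.year, b.month, b.day);
}

// Inclusive range check, assuming the argument order (low, value, high).
// Returns a real bool, so the result can feed a filter directly.
template <typename T>
bool in_range(const T& low, const T& value, const T& high) {
    return low <= value && value <= high;
}
```

Because the check returns a boolean rather than a 0/1 float, a row-level filter can use the result as-is without comparing it against a number.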
Vectors enable a large number of features, including functions such as `find` and `split`, which need to return more than one value. A `find(string, regex, output_vector)` function, for example, will store its output of `start_idx, end_idx` in `output_vector`, and the user can then create a substring from those indices using `substr(output[0], output[1])`.

The values `True` and `False` have been added, replacing the values `true` and `false` (without capital letters), which resolved to the numbers 1 and 0. `True` and `False`, meanwhile, resolve to `true` and `false` boolean scalars, which means they can be used in comparisons against other booleans, whereas the old values will now result in a syntax error.

Finally, a `boolean` function has been added to cast a scalar or column of any type into a boolean column, returning True if a value is set (including "falsy" values such as 0 and ""), and False for nulls.

Numerous tests have been added, as always.
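The null-aware semantics of the `boolean` cast described above can be sketched as follows. This is a minimal illustration that models a nullable cell with `std::optional` (C++17); `to_boolean` and `to_boolean_column` are hypothetical names, not the functions added in this PR.

```cpp
#include <optional>
#include <string>
#include <vector>

// True when a value is present, whatever that value is (0 and "" included);
// false only when the cell is null, modelled here as std::nullopt.
template <typename T>
bool to_boolean(const std::optional<T>& value) {
    return value.has_value();
}

// Applies the same cast to every cell of a column.
template <typename T>
std::vector<bool> to_boolean_column(const std::vector<std::optional<T>>& column) {
    std::vector<bool> out;
    out.reserve(column.size());
    for (const auto& cell : column) {
        out.push_back(to_boolean(cell));
    }
    return out;
}
```

Note that this differs from a truthiness test: a cell holding 0 or an empty string still maps to true, and only a missing value maps to false.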