Skip to content

Filter and groupby transforms in main bundle #978

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 26 commits into from
Oct 6, 2016
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
Show all changes
26 commits
Select commit Hold shift + click to select a range
6309837
fix linting in lib/filter.js
etpinard Sep 26, 2016
96bcbfc
lib: add groupby module + register it in main index
etpinard Sep 26, 2016
06c805d
lint: rm useless comments in transform modules
etpinard Sep 26, 2016
f6eb2f4
lint: rm uselss Plotly.register calls in test suites
etpinard Sep 26, 2016
0158f45
test: move user-defined transform test to transform_multi
etpinard Sep 26, 2016
64dc72a
transforms: make findArrayAttributes a lib function
etpinard Sep 28, 2016
1238236
transforms: add 'calcTransform' handler in doCalcdata
etpinard Sep 28, 2016
0e1e9a7
transforms: filter take II
etpinard Sep 28, 2016
f722c5d
transforms: fill in groupby descriptions
etpinard Sep 28, 2016
037fa5c
transforms: perf in groupby
etpinard Sep 28, 2016
01f8595
lint: some line spacing in plots.js
etpinard Sep 28, 2016
8a29a89
test: one big nasty commit untangling / adding transforms tests
etpinard Sep 28, 2016
ad5089e
test: make sure all '#graph' nodes are purged after toimage tests
etpinard Sep 29, 2016
7a98f15
test: fixup plotschema case for new filter attributes
etpinard Sep 29, 2016
c8a13b8
test: add case where axis type is set by user
etpinard Sep 29, 2016
f4b9dcc
transforms: replace 'active' attribute by 'enabled'
etpinard Sep 30, 2016
104123e
transforms: replace 'strictinterval' with MORE operation values
etpinard Sep 30, 2016
ba9b933
transforms: improve attribute descriptions
etpinard Sep 30, 2016
7fc7dfa
fix a few typos
etpinard Sep 30, 2016
7cfa4ff
transforms: fix logic for outside intervals
etpinard Sep 30, 2016
3e74c80
api: relax transform module requirements
etpinard Oct 3, 2016
2a7738e
api: improve register transform logging and tests
etpinard Oct 3, 2016
0aa4958
test: update transform case descriptions
etpinard Oct 3, 2016
ad77818
transforms: allow 'filtersrc' to be an arbitrary attr string
etpinard Oct 3, 2016
4cd9edf
transforms: add support for date 3D z-axes
etpinard Oct 3, 2016
f5c64d2
doc: add comment about transform in main index file
etpinard Oct 5, 2016
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
201 changes: 92 additions & 109 deletions src/transforms/filter.js
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,10 @@
var Lib = require('../lib');
var axisIds = require('../plots/cartesian/axis_ids');

var INEQUALITY_OPS = ['=', '<', '>=', '>', '<='];
var INTERVAL_OPS = ['[]', '()', '[)', '(]', '][', ')(', '](', ')['];
var SET_OPS = ['{}', '}{'];

exports.moduleType = 'transform';

exports.name = 'filter';
Expand All @@ -33,10 +37,31 @@ exports.attributes = {
},
operation: {
valType: 'enumerated',
values: ['=', '<', '>', 'within', 'notwithin', 'in', 'notin'],
values: [].concat(INEQUALITY_OPS).concat(INTERVAL_OPS).concat(SET_OPS),
dflt: '=',
description: [
'Sets the filter operation.'
'Sets the filter operation.',

'*=* filters items equal to `value`',

'*<* filters items less than `value`',
'*<=* filters items less than or equal to `value`',

'*>* filters items greater than `value`',
'*>=* filters items greater than or equal to `value`',

'*[]* filters items inside `value[0]` to value[1]` including both bounds`',
'*()* filters items inside `value[0]` to value[1]` excluding both bounds`',
'*[)* filters items inside `value[0]` to value[1]` including `value[0]` but excluding `value[1]',
'*(]* filters items inside `value[0]` to value[1]` including both bounds`',

'*][* filters items outside `value[0]` to value[1]` and not equal to both bounds`',
'*)(* filters items outside `value[0]` to value[1]`',
'*](* filters items outside `value[0]` to value[1]` and not equal to `value[0]`',
'*)[* filters items outside `value[0]` to value[1]` and not equal to `value[1]`',

'*{}* filters items present in a set of values',
'*}{* filters items not present in a set of values'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the verb "filters" is a bit ambiguous - could be taken to mean "filters out" which is the opposite of what you mean. How about "keeps" or "shows"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point - done in 7cfa4ff

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes I also like "keeps" or "retains"

].join(' ')
},
value: {
Expand All @@ -51,19 +76,6 @@ exports.attributes = {
'is the lower bound and the second item is the upper bound.',
'When `operation`, is set to *in*, *notin* '
].join(' ')
},
strictinterval: {
valType: 'boolean',
dflt: true,
arrayOk: true,
description: [
'Determines whether or not the filter operation includes data item value,',
'equal to *value*.',
'Has only an effect for `operation` *>*, *<*, *within* and *notwithin*',
'When `operation` is set to *within* and *notwithin*,',
'`strictinterval` is expected to be a 2-item array where the first (second)',
'item determines strictness for the lower (second) bound.'
].join(' ')
}
};

Expand All @@ -77,14 +89,9 @@ exports.supplyDefaults = function(transformIn) {
var enabled = coerce('enabled');

if(enabled) {
var operation = coerce('operation');

coerce('operation');
coerce('value');
coerce('filtersrc');

if(['=', 'in', 'notin'].indexOf(operation) === -1) {
coerce('strictinterval');
}
}

return transformOut;
Expand Down Expand Up @@ -154,33 +161,23 @@ function getDataToCoordFunc(gd, filtersrc) {
function getFilterFunc(opts, d2c) {
var operation = opts.operation,
value = opts.value,
hasArrayValue = Array.isArray(value),
strict = opts.strictinterval,
hasArrayStrict = Array.isArray(strict);
hasArrayValue = Array.isArray(value);

function isOperationIn(array) {
return array.indexOf(operation) !== -1;
}

var coercedValue, coercedStrict;
var coercedValue;

if(isOperationIn(['=', '<', '>'])) {
if(isOperationIn(INEQUALITY_OPS)) {
coercedValue = hasArrayValue ? d2c(value[0]) : d2c(value);

if(isOperationIn(['<', '>'])) {
coercedStrict = hasArrayStrict ? strict[0] : strict;
}
}
else if(isOperationIn(['within', 'notwithin'])) {
else if(isOperationIn(INTERVAL_OPS)) {
coercedValue = hasArrayValue ?
[d2c(value[0]), d2c(value[1])] :
[d2c(value), d2c(value)];

coercedStrict = hasArrayStrict ?
[strict[0], strict[1]] :
[strict, strict];
}
else if(isOperationIn(['in', 'notin'])) {
else if(isOperationIn(SET_OPS)) {
coercedValue = hasArrayValue ? value.map(d2c) : [d2c(value)];
}

Expand All @@ -190,85 +187,71 @@ function getFilterFunc(opts, d2c) {
return function(v) { return d2c(v) === coercedValue; };

case '<':
if(coercedStrict) {
return function(v) { return d2c(v) < coercedValue; };
}
else {
return function(v) { return d2c(v) <= coercedValue; };
}
return function(v) { return d2c(v) < coercedValue; };

case '<=':
return function(v) { return d2c(v) <= coercedValue; };

case '>':
if(coercedStrict) {
return function(v) { return d2c(v) > coercedValue; };
}
else {
return function(v) { return d2c(v) >= coercedValue; };
}

case 'within':

if(coercedStrict[0] && coercedStrict[1]) {
return function(v) {
var cv = d2c(v);
return cv > coercedValue[0] && cv < coercedValue[1];
};
}
else if(coercedStrict[0] && !coercedStrict[1]) {
return function(v) {
var cv = d2c(v);
return cv > coercedValue[0] && cv <= coercedValue[1];
};
}
else if(!coercedStrict[0] && coercedStrict[1]) {
return function(v) {
var cv = d2c(v);
return cv >= coercedValue[0] && cv < coercedValue[1];
};
}
else if(!coercedStrict[0] && !coercedStrict[1]) {
return function(v) {
var cv = d2c(v);
return cv >= coercedValue[0] && cv <= coercedValue[1];
};
}

break;

case 'notwithin':

if(coercedStrict[0] && coercedStrict[1]) {
return function(v) {
var cv = d2c(v);
return cv < coercedValue[0] || cv > coercedValue[1];
};
}
else if(coercedStrict[0] && !coercedStrict[1]) {
return function(v) {
var cv = d2c(v);
return cv < coercedValue[0] || cv >= coercedValue[1];
};
}
else if(!coercedStrict[0] && coercedStrict[1]) {
return function(v) {
var cv = d2c(v);
return cv <= coercedValue[0] || cv > coercedValue[1];
};
}
else if(!coercedStrict[0] && !coercedStrict[1]) {
return function(v) {
var cv = d2c(v);
return cv <= coercedValue[0] || cv >= coercedValue[1];
};
}

break;

case 'in':
return function(v) { return d2c(v) > coercedValue; };

case '>=':
return function(v) { return d2c(v) >= coercedValue; };

case '[]':
return function(v) {
var cv = d2c(v);
return cv >= coercedValue[0] && cv <= coercedValue[1];
};

case '()':
return function(v) {
var cv = d2c(v);
return cv > coercedValue[0] && cv < coercedValue[1];
};

case '[)':
return function(v) {
var cv = d2c(v);
return cv >= coercedValue[0] && cv < coercedValue[1];
};

case '(]':
return function(v) {
var cv = d2c(v);
return cv > coercedValue[0] && cv <= coercedValue[1];
};

case '][':
return function(v) {
var cv = d2c(v);
return cv < coercedValue[0] || cv > coercedValue[1];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this isn't what I thought ][ would mean... shouldn't it be cv <= coercedValue[0] || cv >= coercedValue[1]? ie ] always means <= whether it's on the left side or the right side.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, I guess that's easier to remember than ][ meaning not in the closed interval.

Copy link
Contributor

@monfera monfera Sep 30, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

out of context, the ][ notation would be ambiguous but one of its meanings is the complement interval, and current code is consistent with that meaning (since there's also the () symbols, it's actually disambiguated)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed in 7cfa4ff

Copy link
Contributor

@monfera monfera Sep 30, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw ][ means 'not in the open interval' and code is correctly implemented that way

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Some authors use ]a, b[ to denote the complement of the interval (a, b); namely, the set of all real numbers that are either less than or equal to a, or greater than or equal to b." - wiki

I think it makes intuitive sense also, in that the retained part, i.e. the part outside the interval, has closed bounds

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@etpinard has already changed it per my suggestion (thanks!) so not to beat a dead horse but I find it much more persuasive to think of these as a combination of two independent operators (] always means <=) than to try to relate them to the corresponding "within" forms. Likewise, think about histograms: if you have a bin with an interval like [), then to meet it exactly the next bin must start with [, hence [ is the complement of )

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexcjohnson I'm not requesting anything one way or another, it just looked like @etpinard's original code was in line with the standard math notation which is why I mentioned this on my initial review. I may be mistaken or we may not want to follow it, I have less experience than you figuring out what's expected by users.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@monfera no problem - also somehow I didn't see your previous comment including "]a, b[" before writing mine - which I think ends up being exactly what we ended up with ⭐

Copy link
Contributor

@monfera monfera Sep 30, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexcjohnson just to clarify myself, when I wrote 'complement' it's meant in the set operations sense, for example, the complement of [8, 12] is the union of (-∞, 8) and (12, ∞) which, at least according to my comment you quote, from the wiki, can be denoted as )8, 12( though again, whichever notation you guys settle will work out just fine.

};

case ')(':
return function(v) {
var cv = d2c(v);
return cv <= coercedValue[0] || cv >= coercedValue[1];
};

case '](':
return function(v) {
var cv = d2c(v);
return cv < coercedValue[0] || cv >= coercedValue[1];
};

case ')[':
return function(v) {
var cv = d2c(v);
return cv <= coercedValue[0] || cv > coercedValue[1];
};

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice run of all combinations, and the notation looks good and math

case '{}':
return function(v) {
return coercedValue.indexOf(d2c(v)) !== -1;
};

case 'notin':
case '}{':
return function(v) {
return coercedValue.indexOf(d2c(v)) === -1;
};
Expand Down
Loading