Missing data in scattergl plot when I go just above 10000 points (regression) #2334
cc @dfcreative |
That is a very good edge case for snap-points-2d, thank you @mlaily. In 1.30.0 we used gl-vis fancy-scattergl, which has no point snapping enabled, so points were rendered to the plot directly, although interactions were slow. Now at 1e5 points the quadtree algorithm for point clustering gets triggered, which works well for randomish distributions but not for linearly aligned points.
|
Haha, you are welcome! :D If I understand correctly, fixing the general case seems hard, but I think I would be happy with an option to disable clustering altogether. That said, maybe I'm not using the optimal representation for my data. Would you have any advice, by any chance? I think some kind of timeline plot with a sliding window would be better, but if I recall correctly, I could not find a way to do that in plotly with acceptable performance. Do you think there might be a more appropriate plot type? Would you have any advice on improving the performance of the current scattergl plot? The full data set (75000 points) is getting a bit slow. The text property (I truncated it in my examples) has a lot of repetitive data, but I don't know how to avoid it... Sorry if this is out of place for this issue. Feel free to say so and I will try to find a better place. |
@mlaily can you please show a codepen with the example where it gets slow? Unfortunately I cannot give qualified advice on picking the right plot type for your data; I'd recommend reading Edward Tufte's books for that. Or just play around with different plot types, which is a win for both us and you :) |
Here is a version with all the data, using v1.30.1 of plotly: https://0.x2a.yt/other/private/plotly-test/all.html The initial delay is quite long, but I guess that's to be expected. After that, the performance is good with the latest Firefox, but very bad with the latest Chrome. (My dataset is too large for a codepen.) EDIT: you know what? Forget about it, I'm a dumbass. I disabled hardware acceleration in Chrome a while ago and forgot to turn it back on -_-. |
This would be ideal. |
I don't know when you will be able to work on this, but in the meantime, a workaround is to look for |
This issue still seems to be occurring: https://codepen.io/anon/pen/zmQzOO This should plot a point every millisecond. If you zoom in, there are a few gaps (although you can still hover over the ghost points). |
The new cutoff - for data of this shape anyway, in Chrome on my Mac - seems to be 75564 points: at >=75564 we get some gaps, below that there are no gaps. That's a number I haven't seen before 🤔 That cutoff holds for date or numeric data, and for any plot size, but interestingly if I change So the issue isn't quite the same, but the symptoms are similar enough that I'll reopen |
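For anyone who wants to reproduce this without the codepen, here is a rough sketch of the kind of data described above: one scattergl point per millisecond, with the point count set to the cutoff reported in this comment. The start time, the sine-wave `y` values, the `buildTrace` helper, and the `graph` element id are illustrative assumptions and are not taken from the original codepen.

```javascript
// Hypothetical repro sketch: one point per millisecond on a scattergl trace.
// POINT_COUNT is the cutoff reported in this thread; everything else
// (start time, y values, element id) is an assumption for illustration.
const POINT_COUNT = 75564;
const START_MS = Date.UTC(2018, 0, 1); // arbitrary start time (assumption)

function buildTrace(pointCount, startMs) {
  const x = new Array(pointCount);
  const y = new Array(pointCount);
  for (let i = 0; i < pointCount; i++) {
    x[i] = startMs + i;        // epoch milliseconds, exactly 1 ms apart
    y[i] = Math.sin(i / 1000); // placeholder signal
  }
  return { type: 'scattergl', mode: 'markers', x: x, y: y };
}

const trace = buildTrace(POINT_COUNT, START_MS);

// Only draw when plotly.js is loaded and a <div id="graph"> exists (both assumed).
if (typeof Plotly !== 'undefined' && typeof document !== 'undefined') {
  // Numeric x values are read as epoch milliseconds on a date axis.
  Plotly.newPlot('graph', [trace], { xaxis: { type: 'date' } });
}
```

Lowering `POINT_COUNT` by one should, per the report above, make the gaps disappear; that is worth checking rather than assuming, since the cutoff was only observed on one machine.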
Another example from the reports in #3413: |
The problem is most likely in https://github.com/dy/point-cluster |
I'd suggest changing |
Thanks for the hint @dy !! Using https://codepen.io/alexcjohnson/pen/MPdLjV from #2334 (comment): with with as expected, but with probably far worse panning perf when getting closer to 1e6 pts |
@etpinard not necessarily - maxDepth handles edge cases with multiple points at the same coordinate. Making that number ‘127’ should be sufficient too. In fact I rarely saw more than 20 levels for real data, for 1e6 random points we had around 13 levels. With maxDepth=10 it is possible that the artifact is at the beginning of the dataset. Anyways that’s def a bug. |
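To make the depth discussion a bit more concrete, here is a toy sketch. It is not point-cluster's code, and `depthToSeparate` is a hypothetical helper: it finds the shallowest level of a uniform quadtree over the unit square at which every point lands in its own cell. It illustrates why evenly spaced points on a line need roughly log2(N) levels, and why duplicate coordinates can only be stopped by a depth cap.

```javascript
// Illustrative only: NOT the point-cluster implementation.
// Returns the shallowest quadtree level at which every point in the unit
// square falls into its own cell, or maxDepth if that never happens.
function depthToSeparate(points, maxDepth) {
  for (let depth = 0; depth <= maxDepth; depth++) {
    const cellsPerSide = 2 ** depth;
    const seen = new Set();
    let collision = false;
    for (const [x, y] of points) {
      // Clamp so that a coordinate of exactly 1 stays in the last cell.
      const cx = Math.min(cellsPerSide - 1, Math.floor(x * cellsPerSide));
      const cy = Math.min(cellsPerSide - 1, Math.floor(y * cellsPerSide));
      const key = cx + ':' + cy;
      if (seen.has(key)) { collision = true; break; }
      seen.add(key);
    }
    if (!collision) return depth;
  }
  return maxDepth; // points still share a cell at the depth cap
}

// 1024 evenly spaced points on a horizontal line.
const linePoints = Array.from({ length: 1024 }, (_, i) => [i / 1024, 0.5]);
// Two points at the same coordinate can never be separated.
const duplicatePoints = [[0.3, 0.3], [0.3, 0.3]];

console.log(depthToSeparate(linePoints, 127));
console.log(depthToSeparate(duplicatePoints, 127));
```

In this toy model the line needs 10 levels for 1024 points, while the duplicates run all the way to the cap, which is the edge case maxDepth is described as guarding against.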
WIP branch with https://github.com/plotly/plotly.js/compare/scattergl-lower-maxdepth some image tests are failing: more investigation will be needed. |
PR #3578 (set to be released in 1.45.0) fixes the problems reported in:
That solution probably isn't the end of this story. I suspect some graphs with more than 1e5 pts may have "missing" pts due to incorrect clustering, so I'll leave this issue open. |
FWIW, I can confirm that this update has resolved the issues for me reported in #2334, using dash==0.39.0, which uses plotly 1.45.0 |
I can confirm that there are still issues around the 100k-point threshold. See my issue #3405 for details. If you create a trace with more than 100k points and then use Plotly.react to change it to have fewer than 100k points, many of the points that should be rendered in the second trace will not be rendered. The threshold used to be 10k, but now appears to be 100k. |
Minimal codepen to illustrate the bug. |
For what it's worth, here's the code I've been using as my mitigation:
Basically detects the situation and then calls newPlot instead of react when appropriate. The issue disappears entirely if you always call newPlot but I wanted to take advantage of the extra performance of react for most cases. |
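The original snippet did not survive in this copy of the thread, so what follows is only a sketch of the approach described, not the commenter's code. It assumes the problem case is a scattergl trace going from above 100k points to at or below it, and the names `CLUSTER_THRESHOLD`, `maxScatterglPoints`, `needsFullRedraw`, and `updatePlot` are hypothetical. `Plotly.newPlot` and `Plotly.react` are the real plotly.js entry points; the Plotly object is passed in as a parameter so the helper can be exercised without a browser.

```javascript
// Assumed cutoff, taken from the 100k figure reported in this thread.
const CLUSTER_THRESHOLD = 1e5;

// Largest point count among the scattergl traces in a data array.
function maxScatterglPoints(data) {
  let max = 0;
  for (const trace of data || []) {
    if (trace.type === 'scattergl' && trace.x) {
      max = Math.max(max, trace.x.length);
    }
  }
  return max;
}

// True when the previous figure was above the cutoff and the next one is not.
function needsFullRedraw(prevData, nextData, threshold = CLUSTER_THRESHOLD) {
  return maxScatterglPoints(prevData) > threshold &&
         maxScatterglPoints(nextData) <= threshold;
}

// `plotly` is passed in (for example the global Plotly) to keep this testable.
function updatePlot(plotly, gd, prevData, nextData, layout) {
  if (needsFullRedraw(prevData, nextData)) {
    // Slow path: rebuild the plot from scratch to avoid the stale state.
    return plotly.newPlot(gd, nextData, layout);
  }
  // Fast path: let react diff the figure as usual.
  return plotly.react(gd, nextData, layout);
}
```

A caller would keep a reference to the last data array it drew and pass it as `prevData`, for example `updatePlot(Plotly, gd, lastData, newData, layout)`. Treating exactly 100k points as the "small" side is a conservative guess; at worst it triggers an unnecessary `newPlot`.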
Possible fix illustrated in codepen |
Candidate fix demo |
Hi,
I just updated plotly.js from 1.30 to the latest 1.33, and my plot is broken.
I’m rendering about 75000 points with a scattergl plot, and after the update, it looks like most of the points are not rendered at all. What is odd though, is that when I hover over the plot, I can still see popups showing when my mouse is over places where points are supposed to be.
After trying different things, I noticed everything is rendering fine up to 10000 data points. If I go just above, say 10001, my plot is broken and most of the points are not rendered!
Everything worked fine with plotly.js 1.30.
I made two versions of my code, one with 10000 points, and the other with 10001. See for yourself: https://0.x2a.yt/other/private/plotly-test/10000.html
https://0.x2a.yt/other/private/plotly-test/10001.html
(This bug report is a copy/paste from the following forum topic: https://community.plot.ly/t/regression-missing-data-in-scattergl-plot-when-i-go-juste-above-10000-points/8152)