Add HoloViews based plotting API #129

Closed
wants to merge 16 commits into from

Conversation

philippjfr
Contributor

Adds handling for bar plots on Series and Seriess objects. Also fixes Seriess.plot.line, Seriess.plot.scatter and Seriess.plot.area, ensuring the index is reset and therefore made visible to HoloViews.
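The index reset matters because a pandas index is not an ordinary column. As a minimal sketch in plain pandas (not the code from this PR; the names `key` and `value` are illustrative assumptions), `reset_index` turns the index into a regular column that a column-oriented plotting library can address by name:

```python
import pandas as pd

# A Series whose x-values live in the index rather than in a column.
s = pd.Series(
    [3, 1, 2],
    index=pd.Index(["a", "b", "c"], name="key"),  # illustrative index name
    name="value",                                  # illustrative series name
)

# reset_index() moves the index into a normal column, yielding a DataFrame
# with one column for the former index and one for the data.
flat = s.reset_index()
columns = list(flat.columns)
```

After the reset, both the former index (`key`) and the data (`value`) are plain columns, so either one can be used as a plot dimension.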

@mrocklin mrocklin changed the base branch from plot-holoviews to master November 24, 2017 18:07
@mrocklin
Collaborator

FYI I've changed the base of this PR to master

@philippjfr
Contributor Author

Okay, sounds good if you're happy with this living on master already. I should have some time over the weekend to go through this more thoroughly and make sure all the options that are currently exposed are hooked up correctly and make a list of other options that might be nice to expose.

@codecov-io

codecov-io commented Nov 24, 2017

Codecov Report

Merging #129 into master will decrease coverage by 5.92%.
The diff coverage is 57.58%.

@@            Coverage Diff             @@
##           master     #129      +/-   ##
==========================================
- Coverage   92.49%   86.56%   -5.93%     
==========================================
  Files          13       14       +1     
  Lines        1465     1593     +128     
==========================================
+ Hits         1355     1379      +24     
- Misses        110      214     +104
Impacted Files                   Coverage Δ
streamz/dataframe/core.py        92.13% <ø> (-0.97%) ⬇️
streamz/dataframe/holoviews.py   57.58% <57.58%> (ø)
streamz/utils_test.py            81.33% <0%> (+1.33%) ⬆️
streamz/dataframe/__init__.py    100% <0%> (+22.22%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9052bcc...ced72b2.

@philippjfr
Contributor Author

Happy to work on some tests over the weekend.

@mrocklin
Collaborator

Things are going well. I'm running into some issues in the Jupyter Notebook when my Stream is on a different IOLoop than the Jupyter Notebook's. This commonly arises whenever we are using a Dask Stream, such as can be created in the following example.

https://gist.github.com/d4fffaecfe73d40061ba634e631bc433

@philippjfr
Contributor Author

philippjfr commented Nov 26, 2017

Looks like it's being passed a Future. I tried handling that by resolving it with Future.result() but didn't have any luck. I'd have to read up more to understand what's going on; do you have any immediate ideas?

@mrocklin
Collaborator

Ah, I've added a gather call in a recent commit. This call converts to a local stream. On normal streams it is a safe no-op.
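To illustrate why such a step can be a safe no-op on local data, here is a minimal sketch in plain Python. It is an assumption-laden stand-in, not the streamz or Dask implementation: it uses `concurrent.futures` from the standard library instead of a Dask cluster, it works on single values rather than on streams, and `gather_value` is a hypothetical helper name.

```python
from concurrent.futures import Future, ThreadPoolExecutor


def gather_value(x):
    """Hypothetical helper: resolve futures, pass everything else through."""
    if isinstance(x, Future):
        # Blocks until the background work finishes, then returns its result.
        return x.result()
    # Already a concrete local value, so there is nothing to do.
    return x


with ThreadPoolExecutor(max_workers=1) as pool:
    fut = pool.submit(sum, [1, 2, 3])  # work done off the calling thread
    resolved = gather_value(fut)       # future -> concrete value

plain = gather_value(6)                # local value -> returned unchanged
```

Downstream code, such as a plotting callback, then only ever sees concrete values, regardless of where the stream was computed.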

@mrocklin
Collaborator

@philippjfr checking on your time availability here. Are you available to work on tests here? "No" is a fine answer, I'm just planning things.

@philippjfr
Contributor Author

I probably can't devote a lot of time to it, but I should be able to at least write some tests for the core functionality over the weekend.

@mrocklin
Collaborator

mrocklin commented Nov 30, 2017 via email

@mrocklin
Collaborator

I apologize for letting this linger for so long. Thank you @philippjfr for your recent efforts. One last thought is that we should maybe add a small docpage or section to the dataframes page?

@mrocklin
Collaborator

I'm curious if anyone else has time to review this as well. Maybe @CJ-Wright ?

Member

@CJ-Wright CJ-Wright left a comment

Seems reasonable.
I agree with @mrocklin it would be good to add some webdocs.
Is there any plan to extend this to base streamz?
Edit: update docs after reading @mrocklin's comment.

@@ -0,0 +1,672 @@
from __future__ import absolute_import
Member

Top level docs?

@philippjfr
Contributor Author

That definitely sounds like a good idea, which leaves the question of how to build the page. Since these are streaming plots and empty to begin with, it doesn't make much sense to include interactive bokeh plots. Should I record some gifs or just include static images? I obviously also don't want to bloat the repository too much.

@mrocklin
Collaborator

mrocklin commented Dec 28, 2017 via email

@philippjfr philippjfr changed the title Handle Series and Seriess bar plots Add HoloViews based plotting API Jan 7, 2018
@philippjfr
Contributor Author

I've added a draft of some fairly detailed documentation, which you can view here. It does include a fair number of images, which take up about 1.8 MB. That would quadruple the size of the repo, so alternatively I could add a script which generates the images. It would complicate the doc building process a bit and take a little bit of time though. Let me know what you'd prefer.

@CJ-Wright
Member

Personally I'd prefer the doc building option, but I don't know how Read the Docs will handle that. However, my preference is not particularly strong.

@jbednar
Contributor

jbednar commented Jan 8, 2018

Looks great to me! I'd also love to see HoloViews+Bokeh support like this in pandas itself, as discussed for pandas-dev/pandas#14130.

In addition, options can be passed directly to HoloViews providing greater control over the plots. The options can be provided as dictionaries via the plot_opts and style_opts keyword arguments. You can also apply options using the HoloViews API (for more information see the HoloViews User

Can you add a brief example of doing that? It's not obvious to me where those keyword arguments would go.

@philippjfr
Contributor Author

I don't know how Read the Docs will handle that.

That's a good point, I might be able to do it with the IPython Sphinx Directive. That's what pandas seems to do.


The plotting interface on streamz DataFrame and Series objects
attempts to mirror the pandas plotting API, but instead of plotting
with matplotlib_ uses HoloViews_ to generate dynamically streaming
Member

Do we need a pronoun? "with matplotlib it uses HoloViews".

Contributor Author

Yes, reads better that way.

@philippjfr
Contributor Author

Can you add a brief example of doing that? It's not obvious to me where those keyword arguments would go.

Sure I'll add an example.

@philippjfr
Contributor Author

Sorry I've left this sitting for so long, been completely swamped with other stuff. The one good thing about that is that I've now designed a very similar plotting API for the intake project which is about to be open-sourced (intake/intake#36). The API is a bit more comprehensive and robust so I'll likely be porting some of those ideas back here. I'll try to finish this off after the HoloViews 1.10 release next week.

@philippjfr
Contributor Author

Quick update: as I mentioned above, I've been working on a similar project as part of intake. Given that this is a common need, I've decided that the plotting API should live in a separate repository (https://github.com/pyviz/hvplot). This has the benefit that the code doesn't get duplicated across a bunch of libraries, which stops the API from diverging, and that there will be one shared API, with the goal that it can be used by pandas, dask, streamz, xarray and geopandas.

@mrocklin
Collaborator

mrocklin commented Mar 21, 2018 via email

@philippjfr
Contributor Author

Probably very similarly to the way it works now, you'd have a .plot property on the DataFrame/Series objects passing itself to the hvplot object, which offers the varying plotting methods.

@mrocklin
Collaborator

mrocklin commented Mar 21, 2018 via email

@philippjfr
Contributor Author

philippjfr commented Mar 21, 2018

Is this something that you anticipate having the time to do or is this
something that others here would have to take on? (happy either way, just
curious)

Would be happy to do it, basically I'm hoping to get an initial release out and then start shopping the API around with the various projects and help integrate it if desired. I expect to natively support the following datatypes:

  • pandas: DataFrame, Series
  • streamz: DataFrame(s), Series(s)
  • dask: DataFrame, Series, Array
  • xarray: Dataset, DataArray
  • geopandas: DataFrame
  • intake: DataSource

Optional but essentially supported for free by HoloViews:

  • iris: Cube

@mrocklin
Collaborator

mrocklin commented Mar 21, 2018 via email

@CJ-Wright
Member

Would it be possible to support Streamz classes too?

@philippjfr
Contributor Author

Would it be possible to support Streamz classes too?

Working on that right now, I want to port the material I've written in this PR to the HvPlot site.

@CJ-Wright
Member

Sorry I wrote the wrong thing, I meant the Streams classes themselves (not just the Pandas/Series driven classes)

@philippjfr
Contributor Author

Sorry I wrote the wrong thing, I meant the Streams classes themselves (not just the Pandas/Series driven classes)

Ah good question, I haven't played around with these much yet but at least in theory I see no reason why we couldn't support them as long as they emit one of the aforementioned data types.

@CJ-Wright
Member

Do you mean that the Stream class would need to emit pandas or series data?

@martindurant
Member

@philippjfr, should this issue be closed now, with the emergence of "pyviz"? At the very least, I am assuming that any work on the streamz side will now be very different than this PR, so probably best to start over.

@philippjfr
Contributor Author

Yes, I'll close. Hoping to have an initial release of holoplot out next week and will open a new PR here.

@philippjfr philippjfr closed this Apr 3, 2018
@martindurant
Member

"holoplot" - that will do :) I did not see the name change before.

@philippjfr
Contributor Author

It's the best we've got for now, not too late to find something better though :-)

@mrocklin
Collaborator

mrocklin commented Apr 3, 2018 via email

@philippjfr
Contributor Author

If you're not averse to installing from master you can see how to do it here: https://pyviz.github.io/holoplot/user_guide/Streaming_Plots.html

Currently holoplot simply provides a hook to monkey-patch streamz objects with the plot methods. I can probably also get an early conda dev release out by then if needed.
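The monkey-patching approach can be sketched roughly as follows. This is only an illustration of the general technique, not holoplot's actual hook: `StubPlotter`, `StreamingFrame` and `patch` are hypothetical names, and the stub returns a tuple where a real accessor would return a plot.

```python
class StubPlotter:
    """Hypothetical stand-in for a plotting accessor; not the holoplot class."""

    def __init__(self, data):
        self.data = data

    def line(self):
        # A real accessor would build a plot; here we only echo the data.
        return ("line", self.data.values)


class StreamingFrame:
    """Hypothetical data container that ships without a plot attribute."""

    def __init__(self, values):
        self.values = values


def patch(cls, plotter=StubPlotter):
    """Attach a read-only `plot` property to `cls` after the fact."""
    cls.plot = property(lambda self: plotter(self))


patch(StreamingFrame)
result = StreamingFrame([1, 2, 3]).plot.line()
```

Patching at the class level means every existing and future instance gains the `plot` accessor, without the container library having to depend on the plotting library.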

@martindurant
Member

Is it still structured, that you can call holoplot.plot(stream_thing) without the patching?

@philippjfr
Contributor Author

Is it still structured, that you can call holoplot.plot(stream_thing) without the patching?

Yes, in fact that's precisely how libraries that want to integrate it properly should use it. They'd define a plot property on the data containers like this:

@property
def plot(self):
    from holoplot import HoloPlot
    return HoloPlot(self)
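
A self-contained version of this pattern, runnable without holoplot installed, might look like the sketch below. `StubPlot` and `Container` are hypothetical stand-ins: the stub only records what was requested, and the signature of its `scatter` method is an assumption made for illustration, not the HoloPlot API.

```python
class StubPlot:
    """Hypothetical stand-in for HoloPlot; it only records the request."""

    def __init__(self, data):
        self.data = data

    def scatter(self, x, y):
        # A real plotting object would return a plot; this returns a spec.
        return {"kind": "scatter", "x": x, "y": y, "n": len(self.data.rows)}


class Container:
    """Hypothetical data container exposing a pandas-style .plot accessor."""

    def __init__(self, rows):
        self.rows = rows

    @property
    def plot(self):
        # In a real integration the plotting library would be imported here,
        # inside the property, so it is only needed when plotting is used.
        return StubPlot(self)


spec = Container([(0, 1), (1, 3)]).plot.scatter(x="t", y="value")
```

Because `plot` is a property, each access hands the current container to a fresh plotting object, which is what makes calls like `container.plot.scatter(...)` read the same way as the pandas plotting API.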
