TimeSeries chart too restrictive on required DataFrame structure? #1190

Closed
michaelaye opened this issue Sep 11, 2014 · 5 comments

Comments

@michaelaye

The new TimeSeries chart in 0.6.0 requires a multi-level index in the columns of the pd.DataFrame.
I believe most users would have the time axis data only once in the main index of the pd.DataFrame, e.g. when analyzing a bunch of measurements that were taken with several sensors, but at the same time. It feels like you guys implemented the more advanced version of supporting several different time data axes first without implementing the simple case of having just one time column? Or am I missing something?
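To make the two layouts concrete, here is a minimal pandas sketch. The sensor names and values are made up, and the two-level column layout is only one plausible illustration of "a multi-level index in the columns"; the exact structure expected by the 0.6.0 TimeSeries chart is not shown in this thread.

```python
import pandas as pd

# Simple case: one shared time axis in the index, one column per sensor.
index = pd.date_range("2014-09-11", periods=3, freq="D")
simple = pd.DataFrame(
    {"sensor_a": [1.0, 2.0, 3.0], "sensor_b": [10.0, 20.0, 30.0]},
    index=index,
)

# Illustrative multi-level layout (an assumption, not Bokeh's documented
# format): every sensor carries its own ("time", "value") pair of columns.
pieces = {
    name: pd.DataFrame({"time": simple.index, "value": simple[name].to_numpy()})
    for name in simple.columns
}
multi = pd.concat(pieces, axis=1)  # dict keys become the top column level

print(simple.columns.nlevels)
print(multi.columns.nlevels)
```

In the simple frame the time data is stored once; in the multi-level sketch it is repeated for every sensor, which only pays off when the sensors were sampled at different times.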

@damianavila
Contributor

You're not missing anything. bokeh.charts has been under heavy refactoring and development, and one debt we still have is supporting multiple kinds of input, from the most basic to the more complex. Our plan is to work on this in the coming weeks/months, so I would like to use this issue to discuss which inputs you think are the most common and valuable for the different types of charts.
Can you do that for me? It will help me a lot when I implement the "input machinery", which should understand reasonable inputs from the data analyst's perspective.

@michaelaye
Author

Sure. Just a warning: I work a lot with pandas, so my viewpoint may be biased that way.
Let's start with TimeSeries, as that is the only chart from bokeh I have played with so far.
In my view the most basic pd.DataFrame structure to support is:

Index   Col1      Col2      ...
t1      data1_1   data2_1   ...
t2      data1_2   data2_2   ...

with the time data only in the Index of the dataframe (accessible via df.index.to_series() as already used in several Bokeh examples) and one or more columns of a DataFrame.
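As a rough sketch of how a chart could consume this layout, the helper below pairs the shared time axis with each column. `columns_as_xy` is a hypothetical name used for illustration, not a Bokeh function; only `df.index.to_series()` comes from the description above.

```python
import pandas as pd


def columns_as_xy(df):
    """Return {column name: (x, y)} using the index as the shared x axis.

    Hypothetical helper for illustration; not part of Bokeh.
    """
    x = df.index.to_series()  # time values taken from the index
    return {name: (x, df[name]) for name in df.columns}


df = pd.DataFrame(
    {"Col1": [0.1, 0.2], "Col2": [1.1, 1.2]},
    index=pd.to_datetime(["2014-09-11 00:00", "2014-09-11 01:00"]),
)

for name, (x, y) in columns_as_xy(df).items():
    print(name, len(x), len(y))
```

Every column gets the same `x` values, so the time data only has to live in the index once.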
Other points of the interface:

  1. To make it easier for Bokeh for now, one could demand that the input data is a DataFrame (a pd.Series is easily converted to one though with pd.DataFrame(series)).
  2. If the user has a dataframe with lots of columns, but only wants to plot 3 of them, she can do:
    TimeSeries(df[[col1, col2, col3]]), so it's easy for the user to control and Bokeh maybe should just try to plot all columns of the arriving dataframe?
  3. Possibly the user should be able to decide between line and scatter as an option?
  4. It would also be nice if the line colors were auto-advanced, as currently happens with the MPL-based DataFrame plotting.
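Points 1, 2 and 4 could be sketched roughly as follows. `as_frame`, `assign_colors` and `PALETTE` are hypothetical names with arbitrary example colors, not Bokeh API; the sketch only shows the input normalization and color cycling, not the plotting itself.

```python
from itertools import cycle

import pandas as pd

# Arbitrary example palette (an assumption, not Bokeh's default colors).
PALETTE = ["#1f77b4", "#ff7f0e", "#2ca02c"]


def as_frame(data):
    """Accept a Series or DataFrame and always hand back a DataFrame."""
    if isinstance(data, pd.Series):
        return pd.DataFrame(data)  # the Series name becomes the column name
    if isinstance(data, pd.DataFrame):
        return data
    raise TypeError("expected a pandas Series or DataFrame")


def assign_colors(df, palette=PALETTE):
    """Give every column a color, wrapping around when the palette runs out."""
    return dict(zip(df.columns, cycle(palette)))


index = pd.date_range("2014-09-11", periods=2, freq="h")
df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6], "d": [7, 8]}, index=index)

subset = df[["a", "b", "d"]]  # the user picks the columns to plot
colors = assign_colors(as_frame(subset))
single = as_frame(pd.Series([1, 2], index=index, name="temp"))
```

Leaving column selection to the caller keeps the chart interface small: the chart plots whatever columns arrive.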

@michaelaye
Author

I already started a discussion about a better Bokeh interface some weeks ago over at pandas. Some people there mentioned a general pandas API that several plotting engines could hook into. Maybe it's worth contacting the pandas folks about that? pandas-dev/pandas#6962

@birdsarah
Member

Discussion is now more than 3 months old. Can re-open if renewed interest.

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 30, 2024