Skip to content

Change Panel.shift to use NDFrame.shift #6605

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 1, 2014

Conversation

dalejung
Copy link
Contributor

closes #4867

This brings Panel.shift in line with DataFrame.shift. The shifted data retains the same dimensions/indexes and doesn't drop the empty rows. So I have two concerns.

  1. This is an API change and could conceivably break existing code.
  2. The perf regresses quite a bit.

Here is the vbench.

-------------------------------------------------------------------------------
Test name                                    | head[ms] | base[ms] |  ratio   |
-------------------------------------------------------------------------------
panel_shift                                  | 578.2867 |   0.0773 | 7478.4491 |
-------------------------------------------------------------------------------
Test name                                    | head[ms] | base[ms] |  ratio   |
-------------------------------------------------------------------------------

The previous implementation did not create a copy and thus had a constant time. The general shift scales linearly.

I think the current implementation just defers work done in later alignments, but I can only speak to my own workflow.

@jreback
Copy link
Contributor

jreback commented Mar 11, 2014

hmm. I bet the rolling stuff that makes a 2-d much faster makes this much slower.

I think you mnay have to roll this in a different way somehow.

@dalejung
Copy link
Contributor Author

Hm. You talking about #6404? I'm going to setup a test battery just to get an idea of what the timings are. I think that the shift isn't that much worse than a straight copy. The regression is going to look disgusting compared to grabbing a view :P

@jreback
Copy link
Contributor

jreback commented Mar 22, 2014

#6404 was merged in, pls rebase and see about the perf tests and what if anything can do about them

@jreback jreback added this to the 0.14.0 milestone Mar 22, 2014
@dalejung
Copy link
Contributor Author

dalejung commented Apr 1, 2014

With #6747 the panel shift speed is on par with pre-quick-shift.

http://nbviewer.ipython.org/gist/dalejung/9731798/panel%20shift.ipynb
I monkey patched the old shift code onto the Panel to compare.

@dalejung
Copy link
Contributor Author

dalejung commented Apr 1, 2014

@jreback ping on green.

@jreback
Copy link
Contributor

jreback commented Apr 1, 2014

gr8 can you add a release note (referencing the issue)

and squash downto 1 commit

ping me whengreen

PRF: Adding Panels.shift test

DOC: Panel.shift no longer drops periods.
@dalejung
Copy link
Contributor Author

dalejung commented Apr 1, 2014

@jreback green

@jreback jreback merged commit 81b67c4 into pandas-dev:master Apr 1, 2014
@jreback
Copy link
Contributor

jreback commented Apr 6, 2014

I thiought this was fixed?

panel_shift                                  | 263.2827 |   0.0746 | 3528.0756 |
-------------------------------------------------------------------------------
Test name                                    | head[ms] | base[ms] |  ratio   |
-------------------------------------------------------------------------------

Ratio < 1.0 means the target commit is faster then the baseline.
Seed used: 1234

Target [e84efe5] : Merge pull request #6825 from cpcloud/replace-dict-vbenches

BENCH: add vbench for issue 6697
Base   [d10a658] : RLS: set released to True. v0.13.1

@jreback
Copy link
Contributor

jreback commented Apr 6, 2014

#6826

@dalejung
Copy link
Contributor Author

dalejung commented Apr 6, 2014

That particularly perf hit won't be fixable if Panel uses the general shift. The 0.13.1 Panel.shift just creates a view, which is near instant and constant time.

The quick shift fixes #6747 fixes speed for shift generally which included the Panel general shift case.

So the timeline:

  1. Panel.shift takes views on data. 0ms
  2. General shift PR. Perf hit. 500ms
  3. quick shift PR (np.roll) Quick Frame Shift Implementation #6404. Panel.shift/F-ordered shift degrade 1.2s
  4. perf fixes for quick shift Doc fixes #6746 Panel.shift back down to 500ms

Sorry, I think the fix from 3->4 was conflated with fixing the perf hit from 1->2.

Let me know if I need to back out the PR.

@jreback
Copy link
Contributor

jreback commented Apr 6, 2014

ok

either back it out completely or
incorporate into the same shift logic in core/internals

so it's at least in 1 place (even if special cased)

@dalejung
Copy link
Contributor Author

dalejung commented Apr 6, 2014

moved discussion to #6826

@dalejung dalejung mentioned this pull request Apr 26, 2014
dalejung added a commit to dalejung/pandas that referenced this pull request Apr 27, 2014
…#6826

TST: Make sure Panel.shift retains dtypes

DOC: removed previous doc entries for pandas-dev#6605

Re-add note about dropping shifted periods

DOC: added note about bug fix

don't pass on freq
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API Design Internals Related to non-user accessible pandas implementation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Generalize shift infrastructure.
2 participants