Skip to content

ffill & bfill methods #1696

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed

ffill & bfill methods #1696

wants to merge 1 commit into from

Conversation

max-sixty
Copy link
Collaborator

  • Closes ENH: Forward & back fill methods #1651
  • Tests added / passed
  • Passes git diff upstream/master **/*py | flake8 --diff
  • Fully documented, including whats-new.rst for all changes and api.rst for new API

No docs / docstrings. No backfill.

But otherwise is this a reasonable layout?
Do we prefer fillna with options (back / forward / replacement), or ffill, bfill, fillna?

@max-sixty max-sixty changed the title Initial go at ffill ffill & bfill methods Nov 7, 2017
@shoyer
Copy link
Member

shoyer commented Nov 8, 2017

Do we prefer fillna with options (back / forward / replacement), or ffill, bfill, fillna?

I like separate methods. It keeps things logically well separated.


if limit is None:
# bottleneck raises an error if you pass `None`
data.values = bn.push(data.values, axis=axis_num)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Try apply_ufunc + transpose() afterwards instead? That would work on arbitrary xarray objects. It's awkward to need to restore the original dimension order, but it adds almost no overhead given NumPy's memory model. @jhamman is doing something like that in #1640.

I think you could even use dask='parallelized' here, to allow for doing multiple forward fills at the same time.

Parallelizing across the pushed axis with dask would be harder (and certainly isn't necessary for a first pass), because you would need something hierarchical that can push data between chunks. But I think you could still make this work with multiple passes to get forward fill-values on the boundaries.

@jhamman
Copy link
Member

jhamman commented Nov 8, 2017

@MaximilianR - I have rough implementation of both of these on my interpolate branch:

https://github.com/jhamman/xarray/blob/56f37144f2dc827b61aa61f94c69277e6a79c262/xarray/core/missing.py#L206-L234

some slight modifications gives this:

def _bfill(arr, n=None, axis=-1):
    '''inverse of ffill'''
    import bottleneck as bn

    arr = np.flip(arr, axis=axis)

    # fill
    arr = bn.push(arr, axis=axis, n=n)

    # reverse back to original
    return np.flip(arr, axis=axis)

def ffill(self, dim=None, limit=None):
    ''' '''
    import bottleneck as bn

    return apply_ufunc(bn.push, self,
                       dask='forbidden',
                       keep_attrs=True,
                       kwargs=dict(n=limit, axis=self.get_axis_num(dim))).transpose(*arr.dims)

def bfill(self, dim=None, limit=None):
    ''' '''
    return apply_ufunc(_bfill, self,
                       dask='forbidden',
                       keep_attrs=True,
                       kwargs=dict(n=limit, axis=self.get_axis_num(dim))).transpose(*arr.dims)

As @shoyer says, it may be possible to change to dask='parallelized' but I have tried it yet.

Feel free to take what you want from this snippet, I am moving pretty slowly on my interpolation branch but it would be great if you can get this in as a stand alone feature.

@jhamman
Copy link
Member

jhamman commented Nov 11, 2017

@MaximilianR - quick update. My implementation of ffill/bfill in #1640 is a bit cleaner now and seems to be working. It probably makes sense for me to just finish off this feature in that PR. Maybe you could give it a review when you get a chance?

@max-sixty
Copy link
Collaborator Author

@jhamman Perfect! I had a read through yours and it looks great, excited to get this merged.
I'll close this

@max-sixty max-sixty closed this Nov 12, 2017
@max-sixty max-sixty deleted the fill branch November 12, 2017 00:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants