ffill & bfill methods #1696

max-sixty · 2017-11-07T21:32:19Z

Closes ENH: Forward & back fill methods #1651
Tests added / passed
Passes git diff upstream/master **/*py | flake8 --diff
Fully documented, including whats-new.rst for all changes and api.rst for new API

No docs / docstrings. No backfill.

But otherwise is this a reasonable layout?
Do we prefer fillna with options (back / forward / replacement), or ffill, bfill, fillna?

shoyer · 2017-11-08T01:38:48Z

Do we prefer fillna with options (back / forward / replacement), or ffill, bfill, fillna?

I like separate methods. It keeps things logically well separated.

shoyer · 2017-11-08T02:07:56Z

xarray/core/ops.py

+
+    if limit is None:
+        # bottleneck raises an error if you pass `None`
+        data.values = bn.push(data.values, axis=axis_num)


Try apply_ufunc + transpose() afterwards instead? That would work on arbitrary xarray objects. It's awkward to need to restore the original dimension order, but it adds almost no overhead given NumPy's memory model. @jhamman is doing something like that in #1640.

I think you could even use dask='parallelized' here, to allow for doing multiple forward fills at the same time.

Parallelizing across the pushed axis with dask would be harder (and certainly isn't necessary for a first pass), because you would need something hierarchical that can push data between chunks. But I think you could still make this work with multiple passes to get forward fill-values on the boundaries.

jhamman · 2017-11-08T06:01:57Z

@MaximilianR - I have rough implementation of both of these on my interpolate branch:

https://github.com/jhamman/xarray/blob/56f37144f2dc827b61aa61f94c69277e6a79c262/xarray/core/missing.py#L206-L234

some slight modifications gives this:

def _bfill(arr, n=None, axis=-1):
    '''inverse of ffill'''
    import bottleneck as bn

    arr = np.flip(arr, axis=axis)

    # fill
    arr = bn.push(arr, axis=axis, n=n)

    # reverse back to original
    return np.flip(arr, axis=axis)

def ffill(self, dim=None, limit=None):
    ''' '''
    import bottleneck as bn

    return apply_ufunc(bn.push, self,
                       dask='forbidden',
                       keep_attrs=True,
                       kwargs=dict(n=limit, axis=self.get_axis_num(dim))).transpose(*arr.dims)

def bfill(self, dim=None, limit=None):
    ''' '''
    return apply_ufunc(_bfill, self,
                       dask='forbidden',
                       keep_attrs=True,
                       kwargs=dict(n=limit, axis=self.get_axis_num(dim))).transpose(*arr.dims)

As @shoyer says, it may be possible to change to dask='parallelized' but I have tried it yet.

Feel free to take what you want from this snippet, I am moving pretty slowly on my interpolation branch but it would be great if you can get this in as a stand alone feature.

jhamman · 2017-11-11T08:30:35Z

@MaximilianR - quick update. My implementation of ffill/bfill in #1640 is a bit cleaner now and seems to be working. It probably makes sense for me to just finish off this feature in that PR. Maybe you could give it a review when you get a chance?

max-sixty · 2017-11-12T00:14:26Z

@jhamman Perfect! I had a read through yours and it looks great, excited to get this merged.
I'll close this

Initial go at ffill

3c9359c

max-sixty changed the title ~~Initial go at ffill~~ ffill & bfill methods Nov 7, 2017

shoyer reviewed Nov 8, 2017

View reviewed changes

max-sixty closed this Nov 12, 2017

max-sixty deleted the fill branch November 12, 2017 00:14

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

ffill & bfill methods #1696

ffill & bfill methods #1696

Uh oh!

max-sixty commented Nov 7, 2017

Uh oh!

shoyer commented Nov 8, 2017

Uh oh!

shoyer Nov 8, 2017

Uh oh!

jhamman commented Nov 8, 2017

Uh oh!

jhamman commented Nov 11, 2017

Uh oh!

max-sixty commented Nov 12, 2017

Uh oh!

Uh oh!

Uh oh!

ffill & bfill methods #1696

ffill & bfill methods #1696

Uh oh!

Conversation

max-sixty commented Nov 7, 2017

Uh oh!

shoyer commented Nov 8, 2017

Uh oh!

shoyer Nov 8, 2017

Choose a reason for hiding this comment

Uh oh!

jhamman commented Nov 8, 2017

Uh oh!

jhamman commented Nov 11, 2017

Uh oh!

max-sixty commented Nov 12, 2017

Uh oh!

Uh oh!