-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
ffill & bfill methods #1696
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ffill & bfill methods #1696
Conversation
I like separate methods. It keeps things logically well separated. |
|
||
if limit is None: | ||
# bottleneck raises an error if you pass `None` | ||
data.values = bn.push(data.values, axis=axis_num) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Try apply_ufunc
+ transpose()
afterwards instead? That would work on arbitrary xarray objects. It's awkward to need to restore the original dimension order, but it adds almost no overhead given NumPy's memory model. @jhamman is doing something like that in #1640.
I think you could even use dask='parallelized'
here, to allow for doing multiple forward fills at the same time.
Parallelizing across the pushed axis with dask would be harder (and certainly isn't necessary for a first pass), because you would need something hierarchical that can push data between chunks. But I think you could still make this work with multiple passes to get forward fill-values on the boundaries.
@MaximilianR - I have rough implementation of both of these on my interpolate branch: some slight modifications gives this: def _bfill(arr, n=None, axis=-1):
'''inverse of ffill'''
import bottleneck as bn
arr = np.flip(arr, axis=axis)
# fill
arr = bn.push(arr, axis=axis, n=n)
# reverse back to original
return np.flip(arr, axis=axis)
def ffill(self, dim=None, limit=None):
''' '''
import bottleneck as bn
return apply_ufunc(bn.push, self,
dask='forbidden',
keep_attrs=True,
kwargs=dict(n=limit, axis=self.get_axis_num(dim))).transpose(*arr.dims)
def bfill(self, dim=None, limit=None):
''' '''
return apply_ufunc(_bfill, self,
dask='forbidden',
keep_attrs=True,
kwargs=dict(n=limit, axis=self.get_axis_num(dim))).transpose(*arr.dims) As @shoyer says, it may be possible to change to Feel free to take what you want from this snippet, I am moving pretty slowly on my interpolation branch but it would be great if you can get this in as a stand alone feature. |
@MaximilianR - quick update. My implementation of ffill/bfill in #1640 is a bit cleaner now and seems to be working. It probably makes sense for me to just finish off this feature in that PR. Maybe you could give it a review when you get a chance? |
@jhamman Perfect! I had a read through yours and it looks great, excited to get this merged. |
git diff upstream/master **/*py | flake8 --diff
whats-new.rst
for all changes andapi.rst
for new APINo docs / docstrings. No backfill.
But otherwise is this a reasonable layout?
Do we prefer
fillna
with options (back / forward / replacement), orffill
,bfill
,fillna
?