Faster calculation of moving averages in Koalas.  

I am trying to calculate an exponential moving average per group on a Koalas DataFrame. I am able to achieve this as shown below:

```
import numpy as np  # needed for the np.float64 return annotation
import pandas as pd
import databricks.koalas as ks
from databricks.koalas import pandas_wraps

df = ks.DataFrame({'cust_id': ['a', 'a', 'a', 'b', 'b'],
                   'sales': [100, 200, 300, 400, 500]})

def fun(col1) -> ks.Series[np.float64]:
    # Per-group exponentially weighted mean; the lambda runs as plain pandas code.
    return col1.apply(lambda x: x.ewm(alpha=0.5, adjust=False).mean())

df['moving_average'] = fun(df.groupby('cust_id').sales)
df.head()
```
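
For reference, here is a minimal pure-Python sketch of what I understand `ewm(alpha=0.5, adjust=False).mean()` to compute within each group: the first value is kept as is, and each later value is `(1 - alpha) * previous_average + alpha * current_value`. The helper name `grouped_ewm` is hypothetical and only illustrates the expected numbers on the sample data; it is not a Koalas API.

```python
def grouped_ewm(keys, values, alpha=0.5):
    """Per-group EWM with adjust=False: y0 = x0, y_t = (1 - alpha) * y_{t-1} + alpha * x_t."""
    last = {}   # most recent average seen for each group key
    out = []
    for key, x in zip(keys, values):
        if key in last:
            y = (1 - alpha) * last[key] + alpha * x
        else:
            y = x  # first observation of the group seeds the average
        last[key] = y
        out.append(float(y))
    return out

cust_id = ['a', 'a', 'a', 'b', 'b']
sales = [100, 200, 300, 400, 500]
print(grouped_ewm(cust_id, sales))  # [100.0, 150.0, 225.0, 400.0, 450.0]
```

Because each value depends on the previous one within the same group, the rows of a group have to be processed in order, which is presumably why the work is done group by group.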

However, when I run the above on a dataset with 30M records, it takes 4 hours to complete. Is there any way to speed this up?

Faster calculation of moving averages in Koalas. #1213

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Faster calculation of moving averages in Koalas. #1213

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions