Using groupby with custom index

Hello,

I have 6 hourly data (ERA Interim) for around 10 years. I want to calculate the annual 6 hourly climatology, i.e,  366*4 values, with each value corresponding to a 6 hourly interval. I am chunking the data along longitude.
I'm using xarray 0.9.1 with Python 3.6 (Anaconda).

For a daily climatology on this data, I do the usual:
```python
mean = data.groupby('time.dayofyear').mean(dim='time').compute()
```
For the 6 hourly version, I am trying the following:
```python
test = (data['time.hour']/24 + data['time.dayofyear'])
test.name = 'dayHourly'
new_test = data.groupby(test).mean(dim='time').compute()
```
The first one (daily climatology) takes around 15 minutes for my data, whereas the second one ran for almost 30 minutes after which I gave up and killed the process.

Is there some obvious reason why the first is much faster than the second? ```data``` in both cases is the 6 hourly dataset. And is there an alternative way of expressing this computation which would make it faster?

TIA,
Joy

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Using groupby with custom index #1308

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

Using groupby with custom index #1308

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions