apply_ufunc(dask='parallelized') output_dtypes for datasets #1699

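When a Dataset has variables with different dtypes, there is no way to tell apply_ufunc that the same function will produce a different dtype for each variable. In the example below, ``output_dtypes=[float]`` makes the lazy result report float64 for both variables, but computing it shows that ``a`` is actually int64: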

```
>>> import xarray
>>> ds1 = xarray.Dataset(data_vars={'a': ('x', [1, 2]), 'b': ('x', [3.0, 4.5])}).chunk()
>>> # A single dtype is declared for the whole output, so it is applied to every variable
>>> ds2 = xarray.apply_ufunc(lambda x: x + 1, ds1, dask='parallelized', output_dtypes=[float])
>>> ds2  # lazy result: 'a' is reported as float64
<xarray.Dataset>
Dimensions:  (x: 2)
Dimensions without coordinates: x
Data variables:
    a        (x) float64 dask.array<shape=(2,), chunksize=(2,)>
    b        (x) float64 dask.array<shape=(2,), chunksize=(2,)>

>>> ds2.compute()  # computed result: 'a' is actually int64
<xarray.Dataset>
Dimensions:  (x: 2)
Dimensions without coordinates: x
Data variables:
    a        (x) int64 2 3
    b        (x) float64 4.0 5.5
```

### Proposed solution
When the output is a Dataset, apply_ufunc could accept either ``output_dtypes=[t]`` (when every output variable has the same dtype) or ``output_dtypes=[{var1: t1, var2: t2, ...}]`` (one dtype per output variable). In the example above, the call would use ``output_dtypes=[{'a': int, 'b': float}]``.
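
Until something like this exists, the per-variable behaviour can be approximated by calling apply_ufunc on each data variable separately. The following is only a rough sketch of that idea, not part of xarray: the helper name `apply_ufunc_per_variable` and its `output_dtypes` mapping argument are hypothetical, and it assumes a function with a single Dataset input and a single output.

```python
import xarray


def apply_ufunc_per_variable(func, ds, output_dtypes):
    """Hypothetical helper (not xarray API): apply ``func`` to each data
    variable of ``ds``, declaring a separate output dtype per variable.

    ``output_dtypes`` maps variable names to dtypes, mirroring the dict
    form suggested in the proposal.
    """
    results = {}
    for name, da in ds.data_vars.items():
        # Each DataArray gets its own apply_ufunc call, so the declared
        # dtype can differ from one variable to the next.
        results[name] = xarray.apply_ufunc(
            func,
            da,
            dask='parallelized',
            output_dtypes=[output_dtypes[name]],
        )
    return xarray.Dataset(results)


ds1 = xarray.Dataset(data_vars={'a': ('x', [1, 2]), 'b': ('x', [3.0, 4.5])}).chunk()
ds2 = apply_ufunc_per_variable(lambda x: x + 1, ds1, {'a': int, 'b': float})
```

With this workaround the dtype reported before computing should match the dtype after computing for each variable, at the cost of one apply_ufunc call per variable.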
