Description
import xarray as xr

# Combine the per-timestep LIS output files along the time dimension
ncdat = xr.open_mfdataset(nclist, concat_dim='time')

# lat/lon are stored as 2-D grids with a spurious time dimension; keep one slice
ncdat['lat'] = ncdat['lat'].isel(time=0).drop('time')
ncdat['lon'] = ncdat['lon'].isel(time=0).drop('time')
ncdat = ncdat.rename({'north_south': 'lat', 'east_west': 'lon'})

# The grid is regular, so one row/column of each 2-D grid gives 1-D coordinates
lat_coords = ncdat.lat[:, 0]  # extract latitudes
lon_coords = ncdat.lon[0, :]  # extract longitudes
ncdat = ncdat.drop(['lat', 'lon'])
ncdat = ncdat.assign_coords(lat=lat_coords, lon=lon_coords, time=ncdat.coords['time'])
ncdat = ncdat.sortby('time')
ncdat.to_netcdf('testing.nc')
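The coordinate-promotion step above can be reproduced on a small synthetic dataset (sizes, values, and variable names are made up for illustration; `drop_vars` is the newer spelling of `drop`):

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the LIS output grid: 2-D lat/lon variables on
# ('north_south', 'east_west') dimensions, sizes shrunk for illustration.
ny, nx, nt = 4, 5, 3
ds = xr.Dataset(
    {"Streamflow_tavg": (("time", "north_south", "east_west"),
                         np.random.rand(nt, ny, nx).astype("float32"))},
    coords={"time": np.arange(nt)},
)
ds["lat"] = (("north_south", "east_west"),
             np.tile(np.linspace(-4.25, -3.95, ny)[:, None], (1, nx)))
ds["lon"] = (("north_south", "east_west"),
             np.tile(np.linspace(29.05, 29.45, nx)[None, :], (ny, 1)))

# The grid is regular, so one row/column of each 2-D grid gives the
# 1-D coordinate vectors; drop the 2-D variables, rename the dimensions,
# and attach the vectors as proper coordinates.
lat_coords = ds["lat"][:, 0].values
lon_coords = ds["lon"][0, :].values
ds = ds.drop_vars(["lat", "lon"])
ds = ds.rename({"north_south": "lat", "east_west": "lon"})
ds = ds.assign_coords(lat=lat_coords, lon=lon_coords)
```

After this, `ds` has 1-D `lat`/`lon` index coordinates, matching the repr shown below.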
Problem description
After some processing, I am left with this xarray dataset `ncdat`, which I want to export to a netCDF file:
<xarray.Dataset>
Dimensions: (lat: 59, lon: 75, time: 500)
Coordinates:
* time (time) datetime64[ns] 2007-01-22 ... 2008-06-04
* lat (lat) float32 -4.25 -4.15 ... 1.4500003 1.5500002
* lon (lon) float32 29.049988 29.149994 ... 36.450012
Data variables:
Streamflow_tavg (time, lat, lon) float32 dask.array<shape=(500, 59, 75), chunksize=(1, 59, 75)>
RiverDepth_tavg (time, lat, lon) float32 dask.array<shape=(500, 59, 75), chunksize=(1, 59, 75)>
RiverFlowVelocity_tavg (time, lat, lon) float32 dask.array<shape=(500, 59, 75), chunksize=(1, 59, 75)>
FloodedFrac_tavg (time, lat, lon) float32 dask.array<shape=(500, 59, 75), chunksize=(1, 59, 75)>
SurfElev_tavg (time, lat, lon) float32 dask.array<shape=(500, 59, 75), chunksize=(1, 59, 75)>
SWS_tavg (time, lat, lon) float32 dask.array<shape=(500, 59, 75), chunksize=(1, 59, 75)>
Attributes:
missing_value: -9999.0
NUM_SOIL_LAYERS: 1
SOIL_LAYER_THICKNESSES: 1.0
title: LIS land surface model output
institution: NASA GSFC
source: model_not_specified
history: created on date: 2019-04-19T09:11:12.992
references: Kumar_etal_EMS_2006, Peters-Lidard_etal_ISSE_2007
conventions: CF-1.6
comment: website: http://lis.gsfc.nasa.gov/
MAP_PROJECTION: EQUIDISTANT CYLINDRICAL
SOUTH_WEST_CORNER_LAT: -4.25
SOUTH_WEST_CORNER_LON: 29.05
DX: 0.1
DY: 0.1
The problem is that the export takes an inordinately long time: almost 10 minutes for this particular file, which is only 35 MB. How can I speed this up? Is there anything wrong with the structure of `ncdat`?
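The repr above shows chunksize=(1, 59, 75), i.e. 500 single-timestep dask chunks, so one suspect is the overhead of writing that many tiny chunks. A minimal sketch of rechunking along time before export (synthetic data with shrunken sizes; assumes dask and a netCDF backend are installed):

```python
import numpy as np
import xarray as xr

# Small stand-in dataset chunked one step per time slice, mimicking the
# chunksize=(1, 59, 75) pattern in the repr above (sizes shrunk for speed).
ds = xr.Dataset(
    {"Streamflow_tavg": (("time", "lat", "lon"),
                         np.random.rand(50, 6, 8).astype("float32"))}
).chunk({"time": 1})
print(ds["Streamflow_tavg"].data.numblocks)  # -> (50, 1, 1): 50 tiny chunks

# Rechunking along time (or simply calling .load() for a dataset this small)
# reduces the number of write operations to_netcdf must perform.
ds = ds.chunk({"time": 50})
print(ds["Streamflow_tavg"].data.numblocks)  # -> (1, 1, 1): one chunk
ds.to_netcdf("testing.nc")
```

I don't know whether the per-chunk overhead fully explains the 10 minutes, but rechunking (or loading into memory first) is cheap to try on a 35 MB dataset.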
Expected Output
A netCDF file
Output of xr.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.7.3 | packaged by conda-forge | (default, Mar 27 2019, 23:01:00)
[GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 3.0.101-0.47.105-default
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.2
xarray: 0.12.1
pandas: 0.24.2
numpy: 1.16.2
scipy: 1.2.1
netCDF4: 1.5.0.1
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.0.3.4
nc_time_axis: None
PseudonetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.2.1
dask: 1.2.0
distributed: 1.27.0
matplotlib: 3.0.3
cartopy: 0.17.0
seaborn: 0.9.0
setuptools: 41.0.0
pip: 19.0.3
conda: None
pytest: None
IPython: 7.4.0
sphinx: None