Xarray: unlimited_dims generates 0-length dimensions named as letters of unlimited dimension

Created on 15 May 2018 · 5 Comments · Source: pydata/xarray

I'm not sure I understand how the unlimited_dims option to to_netcdf() is supposed to work. Consider the following:

import pandas as pd
import xarray as xr

ds = xr.Dataset()
ds['time'] = xr.DataArray(pd.date_range('2000-01-01', '2000-01-10'), dims='time')
ds.to_netcdf('timedim.cdf', unlimited_dims='time')

This results in a file that looks like this:

$ ncdump timedim.cdf
netcdf timedim {
dimensions:
    t = UNLIMITED ; // (0 currently)
    i = UNLIMITED ; // (0 currently)
    m = UNLIMITED ; // (0 currently)
    e = UNLIMITED ; // (0 currently)
    time = UNLIMITED ; // (10 currently)
variables:
    int64 time(time) ;
        time:units = "days since 2000-01-01 00:00:00" ;
        time:calendar = "proleptic_gregorian" ;
data:

 time = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 ;
}

Note the dimensions named t, i, m, e all with zero length. The time dimension (which is the only one that should exist) is properly set to UNLIMITED but we shouldn't have the four extra dimensions. What's going on here? The same behavior occurs when setting via ds.encoding['unlimited_dims'] = 'time'. Everything is as expected without the unlimited_dims option (but the time dimension is not UNLIMITED, of course).

I thought it could be related to the variable and dimension having the same name, but this also happens when they are different.

Expected Output

There shouldn't be any extra 0-length dimensions; time should be the only dimension in the file.

Output of xr.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.3.final.0
python-bits: 64
OS: Darwin
OS-release: 16.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

xarray: 0.10.3
pandas: 0.22.0
numpy: 1.14.3
scipy: 1.0.0
netCDF4: 1.3.1
h5netcdf: 0.5.0
h5py: 2.7.1
Nio: None
zarr: 2.2.0
bottleneck: 1.2.1
cyordereddict: None
dask: 0.16.1
distributed: 1.20.2
matplotlib: 2.2.2
cartopy: 0.16.0
seaborn: None
setuptools: 36.5.0.post20170921
pip: 9.0.1
conda: 4.5.3
pytest: None
IPython: 6.3.1
sphinx: 1.7.1

bug

All 5 comments

What if you do unlimited_dims=['time']? It might be expecting a list and then incorrectly parsing the string as a sequence.
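To illustrate that guess: a Python string is itself a sequence of one-character strings, so any code that loops over its argument sees 't', 'i', 'm', 'e' instead of 'time'. The sketch below does not use xarray's actual writer code; collect_dim_names is a hypothetical stand-in for any function that iterates over the names it is given.

```python
# Hypothetical stand-in for code that treats its argument as a
# sequence of dimension names (not xarray's real implementation).
def collect_dim_names(unlimited_dims):
    return [name for name in unlimited_dims]

# A bare string is iterated character by character.
as_string = collect_dim_names('time')
# A one-element list yields the full name.
as_list = collect_dim_names(['time'])

print(as_string)  # ['t', 'i', 'm', 'e']
print(as_list)    # ['time']
```

The four single-letter names match the extra dimensions in the ncdump output above.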

Yep that does it, thanks! :thumbsup:

I guess I could have read the "sequence of str" description in the docs more closely. Maybe it would make sense to accept a single string in addition to a sequence of strings?

I still think this is a bug. I really don't know the best way to check that an object is a sequence other than a string, but it must be solved elsewhere in xarray.

We usually write something like:

if isinstance(unlimited_dims, str):  # basestring on Python 2
    unlimited_dims = [unlimited_dims]

(This does come up quite commonly, but the work-around is short enough that we haven't written a utility function for it.)
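If one did want a utility function, a minimal sketch might look like the following. The name as_dim_list and the handling of None are assumptions made for illustration; nothing in this thread says xarray has such a helper.

```python
def as_dim_list(dims):
    """Hypothetical helper: normalize one name or many names to a list.

    Accepts None, a single string, or any iterable of strings.
    """
    if dims is None:
        return []
    # Check for str first, because a str is also an iterable.
    if isinstance(dims, str):
        return [dims]
    return list(dims)


single = as_dim_list('time')
several = as_dim_list(('time', 'x'))
nothing = as_dim_list(None)
```

With a helper like this, unlimited_dims='time' and unlimited_dims=['time'] would be treated the same way before the names reach the backend.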
