Skip to content

Incorrect y-coordinates for non-georeferenced data from open_rasterio #1686

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
maaleske opened this issue Nov 3, 2017 · 8 comments
Closed

Comments

@maaleske
Copy link
Contributor

maaleske commented Nov 3, 2017

For a non-georeferenced dataset, open_rasterio() results in y-coordinates from the raster height to two times the height, instead of the expected 0 .. height range:

import numpy as np
import rasterio
from xarray import open_rasterio

tmp_file = 'no_transform.tif'
nx, ny, nz = 4, 3, 3 
data = np.arange(nx*ny*nz,
                 dtype=rasterio.float32).reshape(nz, ny, nx)
with rasterio.open(
              tmp_file, 'w', 
              driver='GTiff', height=ny, width=nx, count=nz,
              dtype=rasterio.float32) as s:
        s.write(data)

open_rasterio(tmp_file)
<xarray.DataArray (band: 3, y: 3, x: 4)>
array( ... )
Coordinates:
  * band     (band) int64 1 2 3
  * y        (y) float64 3.5 4.5 5.5
  * x        (x) float64 0.5 1.5 2.5 3.5
Attributes:
     res:        (1.0, -1.0)
     is_tiled:   0
     transform:  (0.0, 1.0, 0.0, 0.0, 0.0, 1.0)

Passing transform=from_origin(0,3,1,1) (from rasterio.transforms) to rasterio.open() in the above code seems to give the expected result with y running down from 2.5:

<xarray.DataArray (band: 3, y: 3, x: 4)>
array( ... )
Coordinates:
  * band     (band) int64 1 2 3
  * y        (y) float64 2.5 1.5 0.5
  * x        (x) float64 0.5 1.5 2.5 3.5
Attributes:
    res:        (1.0, 1.0)
    is_tiled:   0
    transform:  (0.0, 1.0, 0.0, 3.0, 0.0, -1.0)

I'm not sure whether there's something amiss in the xarray coordinate calculations or the rasterio default transform logic. Looking at the code, I feel like there's some mismatch between the rasterio res property logic and the xarray coordinate generation (namely, the sign of res[1]):

https://github.com/mapbox/rasterio/blob/fcd361c49cca9c4ad32a2ac1f5d66b967f4b6cd4/rasterio/_base.pyx#L611-L619

# Get geo coords
nx, ny = riods.width, riods.height
dx, dy = riods.res[0], -riods.res[1]
x0 = riods.bounds.right if dx < 0 else riods.bounds.left
y0 = riods.bounds.top if dy < 0 else riods.bounds.bottom
coords['y'] = np.linspace(start=y0 + dy/2, num=ny,
stop=(y0 + (ny - 1) * dy) + dy/2)
coords['x'] = np.linspace(start=x0 + dx/2, num=nx,
stop=(x0 + (nx - 1) * dx) + dx/2)

I'm not quite sure if the xarray code is quite correct for georefenced data, since it doesn't utilize the given coordinate transform. I don't' have files to test it with though.

@fmaussion
Copy link
Member

fmaussion commented Nov 3, 2017

I'm not quite sure if the xarray code is quite correct for georefenced data

We don't test for correct georeferencing in the xarray codebase but I did a bunch of tests locally and it seemed to work well (at least I don't expect such a logical flaw as data being completely upside down or shifted by 2). e.g. http://xarray.pydata.org/en/latest/auto_gallery/plot_rasterio.html#sphx-glr-auto-gallery-plot-rasterio-py

For your example something seems to go wrong though. What version of rasterio are you using?

@maaleske
Copy link
Contributor Author

maaleske commented Nov 3, 2017

@fmaussion I've tested with both rasterio 0.36.0 and the current master branch (1.0a11), both seem to have the same behaviour.

@fmaussion
Copy link
Member

OK, I will investigate but this might have to wait for next week. Also, I'll do some more georeferencing tests offline (we can easily check with external tools if what we are doing is correct or not)

@maaleske
Copy link
Contributor Author

maaleske commented Nov 3, 2017

@fmaussion Thanks!
I get the feeling there's probably something odd with rasterio here. I played around a bit and found that rasterio doesn't seem to care about the signs in the transform if the given upper left is (0,0) . Giving any of from_origin(0, 0, i, j) for i, j = +-1 as the transform gives a warning about using the default geotransform and doesn't save it to the file.

@fmaussion
Copy link
Member

I'm afraid I don't know rasterio well enough to be able to take a decision here. However the fact that rasterio throws a warning when using your code above is not a good sign, and I don't think we have to take action in xarray (yet).

I am however also not super happy with this bit of code here:

dx, dy = riods.res[0], -riods.res[1]
x0 = riods.bounds.right if dx < 0 else riods.bounds.left
y0 = riods.bounds.top if dy < 0 else riods.bounds.bottom
coords['y'] = np.linspace(start=y0 + dy/2, num=ny,
stop=(y0 + (ny - 1) * dy) + dy/2)
coords['x'] = np.linspace(start=x0 + dx/2, num=nx,
stop=(x0 + (nx - 1) * dx) + dx/2)

It works for all the cases I've encountered until now, but it feels like this logic should be available in rasterio already.

@fmaussion
Copy link
Member

It looks like we could use rasterio's Affine object for this: https://stackoverflow.com/questions/27861197/how-to-i-get-the-coordinates-of-a-cell-in-a-geotif

Will have a look at it

@maaleske
Copy link
Contributor Author

Just looking at the rasterio quickstart, it seems that it'd be easy to generate the coordinates using the Affine instance (though that is only available in Rasterio > 1.0a, with 0.36.0 you just have the tuple containing the multipliers). I think that might solve the problem though, since I would expect the sign of resand the transform to be synchronized to the same convention on the rasterio side, which might be breaking with the current code.

@fmaussion
Copy link
Member

Testing against all files used in the rasterio test base yields following results:

import numpy as np
import rasterio
import xarray as xr
from affine import Affine
from glob import glob
import os

# Test all files from rasterio's test base
rdir = '/home/mowglie/Downloads/rasterio-master/tests/data/*.tif'
ok = []
for path in glob(rdir):
    ds = xr.open_rasterio(path)
    rio = rasterio.open(path, mode='r')

    # Get geo coords
    T0 = rio.transform
    T1 = T0 * Affine.translation(0.5, 0.5)
    rc2xy = lambda r, c: (c, r) * T1

    xy0r = rc2xy(0, 0)
    xy0x = (ds['x'].values[0], ds['y'].values[0])
    xy1r = rc2xy(rio.height-1, rio.width-1)
    xy1x = (ds['x'].values[-1], ds['y'].values[-1])

    t1 = np.allclose(xy0r, xy0x)
    t2 = np.allclose(xy1r, xy1x)
    if t1 and t2:
        print(os.path.basename(path) + ': OK')
    else:
        print(os.path.basename(path) + ': NOT OK', xy0r, xy0x, xy1r, xy1x)

Out:

shade.tif: OK
rgb_lzw.tif: OK
rgb_deflate.tif: OK
rgb4.tif: OK
RGBA.byte.tif: OK
world.rgb.tif: OK
world.byte.tif: OK
alpha.tif: OK
rgb1.tif: OK
RGB.byte.tif: OK
float_nan.tif: NOT OK (50.0, 50.0) (50.0, 250.0) (250.0, 150.0) (250.0, 350.0)
float.tif: NOT OK (50.0, 50.0) (50.0, 250.0) (250.0, 150.0) (250.0, 350.0)
rgb3.tif: OK
rgb2.tif: OK

The two files which are not working also miss a proper crs and transform, but still we should be consistent with what rasterio is doing :-(

At least I think that the majority of "standard" files are working properly with our current code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants