Can we set a shapefile as a mask for each netcdf file and run xray methods for values within the shapefile region?
for example if I want to create a timeseries of monthly mean temperature for 'mystate' from a netcdf file that contains data for the whole country:
filepath = r"DATA/temp/_/_temp.nc"
shapefile = r"DATA/mystate.shp"
ds=xray.open_mfdataset(filepath)
ds_variable=ds['temp']
monthlymean=ds_variable.resample('1MS', dim='time', how='mean')
meanmonthlyofmystate=monthlymean.groupby('time').mean() #add somewhere here the shapefile
meanmonthlyofmystate.to_pandas().plot()
xray itself doesn't have any way to handle shapefiles. You'll might look at other packages such as shapely or geopy or geopandas to generate a raster mask that can be used by xray to select portions of your netCDF file.
rasterio and geopandas can be combined with xray to make converting shapefiles into raster masks pretty easy. Here's a quick demo:
import geopandas
from rasterio import features
from affine import Affine
def transform_from_latlon(lat, lon):
lat = np.asarray(lat)
lon = np.asarray(lon)
trans = Affine.translation(lon[0], lat[0])
scale = Affine.scale(lon[1] - lon[0], lat[1] - lat[0])
return trans * scale
def rasterize(shapes, coords, fill=np.nan, **kwargs):
"""Rasterize a list of (geometry, fill_value) tuples onto the given
xray coordinates. This only works for 1d latitude and longitude
arrays.
"""
transform = transform_from_latlon(coords['latitude'], coords['longitude'])
out_shape = (len(coords['latitude']), len(coords['longitude']))
raster = features.rasterize(shapes, out_shape=out_shape,
fill=fill, transform=transform,
dtype=float, **kwargs)
return xray.DataArray(raster, coords=coords, dims=('latitude', 'longitude'))
# this shapefile is from natural earth data
# http://www.naturalearthdata.com/downloads/10m-cultural-vectors/10m-admin-1-states-provinces/
states = geopandas.read_file('/Users/shoyer/Downloads/ne_10m_admin_1_states_provinces_lakes')
us_states = states.query("admin == 'United States of America'").reset_index(drop=True)
state_ids = {k: i for i, k in enumerate(us_states.woe_name)}
shapes = [(shape, n) for n, shape in enumerate(us_states.geometry)]
ds = xray.Dataset(coords={'longitude': np.linspace(-125, -65, num=5000),
'latitude': np.linspace(50, 25, num=3000)})
ds['states'] = rasterize(shapes, ds.coords)
# example of applying a mask
ds.states.where(ds.states == state_ids['California']).plot()

Once you have the rasterized geometries, you can use them as arrays to do arithmetic: https://github.com/xray/xray/issues/503
When we figure out how to represent coordinate reference systems properly in xray we might add in a direct wrapper for some of these rasterio functions.
This example is very helpful. Thank you.
I think the where() method you refer to will be very useful - When will the
version 0.5.3 be released?
Thanks
Sarah
On 31 July 2015 at 06:00, Stephan Hoyer [email protected] wrote:
rasterio and geopandas can be combined with xray to make converting
shapefiles into raster masks pretty easy. Here's a quick demo:
``python
import geopandas
from rasterio import features
from affine import Affinedef transform_from_latlon(lat, lon):
lat = np.asarray(lat)
lon = np.asarray(lon)
trans = Affine.translation(lon[0], lat[0])
scale = Affine.scale(lon[1] - lon[0], lat[1] - lat[0])
return trans * scaledef rasterize(shapes, coords, fill=np.nan, *
_kwargs): """Rasterize a list of (geometry, fill_value) tuples onto the
given xray coordinates. This only works for 1d latitude and longitude
arrays. """ transform = transform_from_latlon(coords['latitude'],
coords['longitude']) out_shape = (len(coords['latitude']),
len(coords['longitude'])) raster = features.rasterize(shapes,
out_shape=out_shape, fill=fill, transform=transform, *_kwargs)
return xray.DataArray(raster, coords=coords, dims=('latitude',
'longitude'))states =
geopandas.read_file('/Users/shoyer/Downloads/ne_10m_admin_1_states_provinces_lakes')
geometries = states.query("admin == 'United States of America'").geometry
shapes = [(shape, n) for n, shape in enumerate(geometries)]ds = xray.Dataset(coords={'longitude': np.linspace(-125, -65, num=2000),
'latitude': np.linspace(50, 25, num=1000)})
ds['states'] = rasterize(shapes, ds.coords)
plotting requires the dev version of xrayds.states.plot()
Once you have the rasterized geometries, you can use them as arrays to do arithmetic: https://github.com/xray/xray/issues/503
When we figure out how to represent coordinate reference systems properly in xray we might add in a direct wrapper for some of these rasterio functions.
—
Reply to this email directly or view it on GitHub
https://github.com/xray/xray/issues/501#issuecomment-126461466.
Dr Sarah Harris
Research Fellow
School of Earth, Atmosphere and Environment
Room 227, 9 Rainforest Walk
Monash University
Ph: 03 9902 4243
Email: sarah.[email protected]
We'll probably release v0.6 in several weeks, once we have the key
functionality we want for new the plotting module.
On Wed, Aug 5, 2015 at 7:24 PM, slharris [email protected] wrote:
This example is very helpful. Thank you.
I think the where() method you refer to will be very useful - When will the
version 0.5.3 be released?Thanks
SarahOn 31 July 2015 at 06:00, Stephan Hoyer [email protected] wrote:
rasterio and geopandas can be combined with xray to make converting
shapefiles into raster masks pretty easy. Here's a quick demo:
``python
import geopandas
from rasterio import features
from affine import Affinedef transform_from_latlon(lat, lon):
lat = np.asarray(lat)
lon = np.asarray(lon)
trans = Affine.translation(lon[0], lat[0])
scale = Affine.scale(lon[1] - lon[0], lat[1] - lat[0])
return trans * scaledef rasterize(shapes, coords, fill=np.nan, *
_kwargs): """Rasterize a list of (geometry, fill_value) tuples onto the
given xray coordinates. This only works for 1d latitude and longitude
arrays. """ transform = transform_from_latlon(coords['latitude'],
coords['longitude']) out_shape = (len(coords['latitude']),
len(coords['longitude'])) raster = features.rasterize(shapes,
out_shape=out_shape, fill=fill, transform=transform, *_kwargs)
return xray.DataArray(raster, coords=coords, dims=('latitude',
'longitude'))states =
geopandas.read_file('/Users/shoyer/Downloads/ne_10m_admin_1_states_provinces_lakes')
geometries = states.query("admin == 'United States of America'").geometry
shapes = [(shape, n) for n, shape in enumerate(geometries)]ds = xray.Dataset(coords={'longitude': np.linspace(-125, -65, num=2000),
'latitude': np.linspace(50, 25, num=1000)})
ds['states'] = rasterize(shapes, ds.coords)
plotting requires the dev version of xrayds.states.plot()
Once you have the rasterized geometries, you can use them as arrays to
do arithmetic: https://github.com/xray/xray/issues/503When we figure out how to represent coordinate reference systems
properly in xray we might add in a direct wrapper for some of these
rasterio functions.—
Reply to this email directly or view it on GitHub
https://github.com/xray/xray/issues/501#issuecomment-126461466.
Dr Sarah Harris
Research Fellow
School of Earth, Atmosphere and Environment
Room 227, 9 Rainforest Walk
Monash University
Ph: 03 9902 4243
Email: sarah.[email protected]—
Reply to this email directly or view it on GitHub
https://github.com/xray/xray/issues/501#issuecomment-128216372.
Now that I am using xray version 0.6.0 I cannot find any examples that use where() to mask out values from one xray dataset using a rasterized shapefile that has been turned into an xray dataset.
Referring to my original post in this thread can I resample a timeseries, find the mean, groupby by time and plot using only the values that fall within one state?
Any feedback will be greatly appreciated
@slharris I just updated the earlier example to show how to use where for masking.
Thank you, but can we use the mask and apply it to another xray dataset - so you only take the values from one dataset that fall in the region of the mask)? I have tried below (but this doesn't work).
Thanks
ds.states.where(ds.states == state_ids['California']).plot()
dstemp=xray.open_mfdataset(filepath)
ds_variable=dstemp['temp']
monthlymean=ds_variable.resample('1MS', dim='time', how='mean')
meanmonthlycaliforniatemp=ds.states.where(ds.states==state_ids['California']).monthlymean.groupby('time').mean()
meanmonthlycaliforniatemp.to_pandas().plot()
@slharris I think you need to modify your second to last line like so:
monthlymean_california = monthlymean.where(ds.states == state_ids['California']).groupby('time').mean()
Thank you for fixing the last line, I can get your example to run with my own data without any errors being raised however the output does not appear to have any masked applied!
I can use where() to plot a selected area but when it comes to being used with another dataset it doesn't appear to be working for me. Is there an example that shows something like temperature data over time for different states, extracted using the method above?
This method will be so useful for me - if I can get it to work!
Any feedback will be greatly appreciated.
Hmm. My code does seem to have a bug when given a dataset with more than 2 dimensions. I'll see if I can fix this up and add it to the examples page on the docs.
On Mon, Sep 28, 2015 at 4:20 PM, slharris [email protected]
wrote:
Thank you for fixing the last line, I can get your example to run with my own data without any errors being raised however the output does not appear to have any masked applied!
I can use where() to plot a selected area but when it comes to being used with another dataset it doesn't appear to be working for me. Is there an example that shows something like temperature data over time for different states, extracted using the method above?
This method will be so useful for me - if I can get it to work!Any feedback will be greatly appreciated.
Reply to this email directly or view it on GitHub:
https://github.com/xray/xray/issues/501#issuecomment-143899899
I am trying not to be annoying but is there any chance you were able to fix this bug? I checked on the examples page but could not find anything.
thanks
Here's a notebook that includes a working example: https://gist.github.com/shoyer/0eb96fa8ab683ef078eb
To get the facetted plot to work, you'll need to be running the development version of xray.
I'm going to close this in favor of salem. @fmaussion feel free to chime in if you don't think that is appropriate.
@shoyer , the website can't opem. Can you share the code or new website? Moreover, is there any new way to solve the nc file extracting?
If you refer the shapefile to mask procedure, I know about two ways to do it :
@liyaojun have you tried the website http://www.naturalearthdata.com/downloads/10m-cultural-vectors/10m-admin-1-states-provinces/ ? It still works for me.
I am doing a project where we need to segment a geospatial data. The data set comprises the Fire Weather Index (FWI) -- a fire hazard classification system -- for the whole world from the 1980's to 2018. We want to segment data according to the NUTS (or GADM) in order to be able to predict and investigate possible patterns over time.
We have to our disposal:
Our methodology was the following:
A couple of issues were identified (as shown in the picture below):

As can be seen there seems to exist an offset between the segmented data and the actual map areas. Moreover, there are some islands that are not even captured by the segmentation.
The only modification to the original code was the introduction of the argument all_touched=True to the features.rasterize method.
Does anyone ever got a similar error?
Most helpful comment
rasterioandgeopandascan be combined with xray to make converting shapefiles into raster masks pretty easy. Here's a quick demo:Once you have the rasterized geometries, you can use them as arrays to do arithmetic: https://github.com/xray/xray/issues/503
When we figure out how to represent coordinate reference systems properly in xray we might add in a direct wrapper for some of these rasterio functions.