Geopandas: Issues with MultiIndexed columns

Created on 18 Jun 2018 · 3 Comments · Source: geopandas/geopandas

When using a GeoDataFrame with pandas.MultiIndex columns, some operations are broken:

  • the geometry column's type changes from geopandas.geoseries.GeoSeries to pandas.core.series.Series after applying a pandas.MultiIndex to the columns of a GeoDataFrame
In [135]: db.data.head()
Out[135]: 
           A_src     B_src     C_src     D_src    E_src  \
1+e  #                                                    
1.01 0  123334.0   81637.6  262758.0  105860.0  24666.9   
     1  126234.0   97052.0  261232.0   91970.2  25246.8   
     2  118364.0  121797.0  268455.0   94535.2  23672.8   
     3  123334.0  120658.0  278210.0   69905.6  24666.9   
     4  126052.0  127851.0  283827.0   61846.1  25210.4   

                                                 geometry  
1+e  #                                                     
1.01 0  LINESTRING (1234.8809442278 5504.880930993706,...  
     1  LINESTRING (1234.8809442278 5504.880930993706,...  
     2  LINESTRING (1234.8809442278 5504.880930993706,...  
     3  LINESTRING (1234.8809442278 5504.880930993706,...  
     4  LINESTRING (1234.8809442278 5504.880930993706,...  

In [136]: type(db.data.geometry)
Out[136]: geopandas.geoseries.GeoSeries

In [137]: db.data.columns = pd.MultiIndex.from_tuples([('Source', 'A'), ('Source', 'B'), ('Source', 'C'), ('Source', 'D'), ('Source', 'E'), ('Paths', 'Geometry')])

In [138]: db.data.head()
Out[138]: 
          Source                                         \
               A         B         C         D        E   
1+e  #                                                    
1.01 0  123334.0   81637.6  262758.0  105860.0  24666.9   
     1  126234.0   97052.0  261232.0   91970.2  25246.8   
     2  118364.0  121797.0  268455.0   94535.2  23672.8   
     3  123334.0  120658.0  278210.0   69905.6  24666.9   
     4  126052.0  127851.0  283827.0   61846.1  25210.4   

                                                    Paths  
                                                 Geometry  
1+e  #                                                     
1.01 0  LINESTRING (1234.8809442278 5504.880930993706,...  
     1  LINESTRING (1234.8809442278 5504.880930993706,...  
     2  LINESTRING (1234.8809442278 5504.880930993706,...  
     3  LINESTRING (1234.8809442278 5504.880930993706,...  
     4  LINESTRING (1234.8809442278 5504.880930993706,...  

In [141]: db.data.set_geometry(('Paths', 'Geometry'), inplace=True)

In [142]: type(db.data.geometry)
Out[142]: pandas.core.series.Series
  • this leads to various exceptions when trying to operate on such a frame
db.data.to_crs({'init', 'epsg:4326'})
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-144-ccff62b7af80> in <module>()
----> 1 db.data.to_crs({'init', 'epsg:4326'})

/Users/jan/anaconda3/envs/cartopy/lib/python2.7/site-packages/geopandas/geodataframe.pyc in to_crs(self, crs, epsg, inplace)
    384         else:
    385             df = self.copy()
--> 386         geom = df.geometry.to_crs(crs=crs, epsg=epsg)
    387         df.geometry = geom
    388         df.crs = geom.crs

/Users/jan/anaconda3/envs/cartopy/lib/python2.7/site-packages/pandas/core/generic.pyc in __getattr__(self, name)
   4370             if self._info_axis._can_hold_identifiers_and_holds_name(name):
   4371                 return self[name]
-> 4372             return object.__getattribute__(self, name)
   4373 
   4374     def __setattr__(self, name, value):

AttributeError: 'Series' object has no attribute 'to_crs'

In [145]: # but this is still working 

In [146]: db.data.geometry.head()
Out[146]: 
1+e   #
1.01  0    LINESTRING (1234.8809442278 5504.880930993706,...
      1    LINESTRING (1234.8809442278 5504.880930993706,...
      2    LINESTRING (1234.8809442278 5504.880930993706,...
      3    LINESTRING (1234.8809442278 5504.880930993706,...
      4    LINESTRING (1234.8809442278 5504.880930993706,...
Name: (Paths, Geometry), dtype: object

In [147]: # also this is still correct

In [148]: type(db.data)
Out[148]: geopandas.geodataframe.GeoDataFrame
  • same issue if manually creating a GeoSeries and setting it as geometry of an existing GeoDataFrame
In [150]: import geopandas

In [151]: geos = geopandas.GeoSeries(data=db.data.geometry, name=('Paths', 'Geometry'))

In [152]: geos.head()
Out[152]: 
1+e   #
1.01  0    LINESTRING (1234.8809442278 5504.880930993706,...
      1    LINESTRING (1234.8809442278 5504.880930993706,...
      2    LINESTRING (1234.8809442278 5504.880930993706,...
      3    LINESTRING (1234.8809442278 5504.880930993706,...
      4    LINESTRING (1234.8809442278 5504.880930993706,...
Name: (Paths, Geometry), dtype: object

In [153]: db.data.geometry = geos

In [154]: type(db.data.geometry)
Out[154]: pandas.core.series.Series

In [155]: type(geos)
Out[155]: geopandas.geoseries.GeoSeries
  • creating a new GeoDataFrame with a dedicated geometry series named as the desired column (e.g. ('Paths', 'Geometry')) raises no error but sets the column name to the wrong value (geometry)
In [166]: geos.name
Out[166]: ('Paths', 'Geometry')

In [167]: type(geos)
Out[167]: geopandas.geoseries.GeoSeries

In [168]: df = geopandas.GeoDataFrame(data=db.data['Source'], geometry=geos)

In [169]: df.head()
Out[169]: 
               A         B         C         D        E  \
1+e  #                                                    
1.01 0  123334.0   81637.6  262758.0  105860.0  24666.9   
     1  126234.0   97052.0  261232.0   91970.2  25246.8   
     2  118364.0  121797.0  268455.0   94535.2  23672.8   
     3  123334.0  120658.0  278210.0   69905.6  24666.9   
     4  126052.0  127851.0  283827.0   61846.1  25210.4   

                                                 geometry  
1+e  #                                                     
1.01 0  LINESTRING (1234.8809442278 5504.880930993706,...  
     1  LINESTRING (1234.8809442278 5504.880930993706,...  
     2  LINESTRING (1234.8809442278 5504.880930993706,...  
     3  LINESTRING (1234.8809442278 5504.880930993706,...  
     4  LINESTRING (1234.8809442278 5504.880930993706,...  

In [170]: df.geometry.head()
Out[170]: 
1+e   #
1.01  0    LINESTRING (1234.8809442278 5504.880930993706,...
      1    LINESTRING (1234.8809442278 5504.880930993706,...
      2    LINESTRING (1234.8809442278 5504.880930993706,...
      3    LINESTRING (1234.8809442278 5504.880930993706,...
      4    LINESTRING (1234.8809442278 5504.880930993706,...
Name: geometry, dtype: object

In [171]: type(df.geometry)
Out[171]: geopandas.geoseries.GeoSeries
  • geopandas.io.file.to_file() raises different errors when called through the to_file() wrapper of a GeoDataFrame with MultiIndex columns (with or without a correctly typed geometry column)
In [172]: df.to_file('working.shp')

In [173]: df.columns = pd.MultiIndex.from_tuples([('Source', 'A'), ('Source', 'B'), ('Source', 'C'), ('Source', 'D'), ('Source', 'E'), ('Paths', 'Geometry')])

In [174]: df.to_file('surelynotworking.shp')
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-174-a0643f06bb42> in <module>()
----> 1 df.to_file('working.shp')

/Users/jan/anaconda3/envs/cartopy/lib/python2.7/site-packages/geopandas/geodataframe.pyc in to_file(self, filename, driver, schema, **kwargs)
    363         """
    364         from geopandas.io.file import to_file
--> 365         to_file(self, filename, driver, schema, **kwargs)
    366 
    367     def to_crs(self, crs=None, epsg=None, inplace=False):

/Users/jan/anaconda3/envs/cartopy/lib/python2.7/site-packages/geopandas/io/file.pyc in to_file(df, filename, driver, schema, **kwargs)
     58     """
     59     if schema is None:
---> 60         schema = infer_schema(df)
     61     filename = os.path.abspath(os.path.expanduser(filename))
     62     with fiona.drivers():

/Users/jan/anaconda3/envs/cartopy/lib/python2.7/site-packages/geopandas/io/file.pyc in infer_schema(df)
     86     ])
     87 
---> 88     geom_type = _common_geom_type(df)
     89     if not geom_type:
     90         raise ValueError("Geometry column cannot contain mutiple "

/Users/jan/anaconda3/envs/cartopy/lib/python2.7/site-packages/geopandas/io/file.pyc in _common_geom_type(df)
    100     # Some (most?) providers expect a single geometry type:
    101     # Point, LineString, or Polygon
--> 102     geom_types = df.geometry.geom_type.unique()
    103 
    104     from os.path import commonprefix   # To find longest common prefix

/Users/jan/anaconda3/envs/cartopy/lib/python2.7/site-packages/pandas/core/generic.pyc in __getattr__(self, name)
   4370             if self._info_axis._can_hold_identifiers_and_holds_name(name):
   4371                 return self[name]
-> 4372             return object.__getattribute__(self, name)
   4373 
   4374     def __setattr__(self, name, value):

/Users/jan/anaconda3/envs/cartopy/lib/python2.7/site-packages/geopandas/geodataframe.pyc in _get_geometry(self)
     67         if self._geometry_column_name not in self:
     68             raise AttributeError("No geometry data set yet (expected in"
---> 69                                  " column '%s'." % self._geometry_column_name)
     70         return self[self._geometry_column_name]
     71 

AttributeError: No geometry data set yet (expected in column 'geometry'.

In [176]: df.set_geometry(('Paths', 'Geometry'), inplace=True)

In [177]: df.to_file('notworking.shp')
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-177-a0643f06bb42> in <module>()
----> 1 df.to_file('working.shp')

/Users/jan/anaconda3/envs/cartopy/lib/python2.7/site-packages/geopandas/geodataframe.pyc in to_file(self, filename, driver, schema, **kwargs)
    363         """
    364         from geopandas.io.file import to_file
--> 365         to_file(self, filename, driver, schema, **kwargs)
    366 
    367     def to_crs(self, crs=None, epsg=None, inplace=False):

/Users/jan/anaconda3/envs/cartopy/lib/python2.7/site-packages/geopandas/io/file.pyc in to_file(df, filename, driver, schema, **kwargs)
     58     """
     59     if schema is None:
---> 60         schema = infer_schema(df)
     61     filename = os.path.abspath(os.path.expanduser(filename))
     62     with fiona.drivers():

/Users/jan/anaconda3/envs/cartopy/lib/python2.7/site-packages/geopandas/io/file.pyc in infer_schema(df)
     86     ])
     87 
---> 88     geom_type = _common_geom_type(df)
     89     if not geom_type:
     90         raise ValueError("Geometry column cannot contain mutiple "

/Users/jan/anaconda3/envs/cartopy/lib/python2.7/site-packages/geopandas/io/file.pyc in _common_geom_type(df)
    100     # Some (most?) providers expect a single geometry type:
    101     # Point, LineString, or Polygon
--> 102     geom_types = df.geometry.geom_type.unique()
    103 
    104     from os.path import commonprefix   # To find longest common prefix

/Users/jan/anaconda3/envs/cartopy/lib/python2.7/site-packages/pandas/core/generic.pyc in __getattr__(self, name)
   4370             if self._info_axis._can_hold_identifiers_and_holds_name(name):
   4371                 return self[name]
-> 4372             return object.__getattribute__(self, name)
   4373 
   4374     def __setattr__(self, name, value):

AttributeError: 'Series' object has no attribute 'geom_type'

In [178]: df = geopandas.GeoDataFrame(data=db.data, geometry=geos)

In [179]: df.head()
Out[179]: 
          Source                                         \
               A         B         C         D        E   
1+e  #                                                    
1.01 0  123334.0   81637.6  262758.0  105860.0  24666.9   
     1  126234.0   97052.0  261232.0   91970.2  25246.8   
     2  118364.0  121797.0  268455.0   94535.2  23672.8   
     3  123334.0  120658.0  278210.0   69905.6  24666.9   
     4  126052.0  127851.0  283827.0   61846.1  25210.4   

                                                    Paths  \
                                                 Geometry   
1+e  #                                                      
1.01 0  LINESTRING (1234.8809442278 5504.880930993706,...   
     1  LINESTRING (1234.8809442278 5504.880930993706,...   
     2  LINESTRING (1234.8809442278 5504.880930993706,...   
     3  LINESTRING (1234.8809442278 5504.880930993706,...   
     4  LINESTRING (1234.8809442278 5504.880930993706,...   

                                                 geometry  

1+e  #                                                     
1.01 0  LINESTRING (1234.8809442278 5504.880930993706,...  
     1  LINESTRING (1234.8809442278 5504.880930993706,...  
     2  LINESTRING (1234.8809442278 5504.880930993706,...  
     3  LINESTRING (1234.8809442278 5504.880930993706,...  
     4  LINESTRING (1234.8809442278 5504.880930993706,...  

In [180]: df.to_file('notworking.shp')
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-180-dcf3f0b1527a> in <module>()
----> 1 df.to_file('notworking.shp')

/Users/jan/anaconda3/envs/cartopy/lib/python2.7/site-packages/geopandas/geodataframe.pyc in to_file(self, filename, driver, schema, **kwargs)
    363         """
    364         from geopandas.io.file import to_file
--> 365         to_file(self, filename, driver, schema, **kwargs)
    366 
    367     def to_crs(self, crs=None, epsg=None, inplace=False):

/Users/jan/anaconda3/envs/cartopy/lib/python2.7/site-packages/geopandas/io/file.pyc in to_file(df, filename, driver, schema, **kwargs)
     62     with fiona.drivers():
     63         with fiona.open(filename, 'w', driver=driver, crs=df.crs,
---> 64                         schema=schema, **kwargs) as colxn:
     65             for feature in df.iterfeatures():
     66                 colxn.write(feature)

/Users/jan/anaconda3/envs/cartopy/lib/python2.7/site-packages/fiona/__init__.pyc in open(path, mode, driver, schema, crs, encoding, layer, vfs, enabled_drivers, crs_wkt)
    173         c = Collection(path, mode, crs=crs, driver=driver, schema=this_schema,
    174                        encoding=encoding, layer=layer, vsi=vsi, archive=archive,
--> 175                        enabled_drivers=enabled_drivers, crs_wkt=crs_wkt)
    176     else:
    177         raise ValueError(

/Users/jan/anaconda3/envs/cartopy/lib/python2.7/site-packages/fiona/collection.pyc in __init__(self, path, mode, driver, schema, crs, encoding, layer, vsi, archive, enabled_drivers, crs_wkt, **kwargs)
    154             elif self.mode in ('a', 'w'):
    155                 self.session = WritingSession()
--> 156                 self.session.start(self, **kwargs)
    157         except IOError:
    158             self.session = None

fiona/ogrext.pyx in fiona.ogrext.WritingSession.start()

AttributeError: 'tuple' object has no attribute 'encode'
  • Consequently, plotting a GeoDataFrame whose multi-indexed geometry column has been downgraded to a plain Series also doesn't work
In [182]: db.data.plot()
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-182-4b3c47a7be6c> in <module>()
----> 1 db.data.plot()

/Users/jan/anaconda3/envs/cartopy/lib/python2.7/site-packages/geopandas/geodataframe.pyc in plot(self, *args, **kwargs)
    467     def plot(self, *args, **kwargs):
    468 
--> 469         return plot_dataframe(self, *args, **kwargs)
    470 
    471     plot.__doc__ = plot_dataframe.__doc__

/Users/jan/anaconda3/envs/cartopy/lib/python2.7/site-packages/geopandas/plotting.pyc in plot_dataframe(df, column, cmap, color, ax, categorical, legend, scheme, k, vmin, vmax, figsize, **style_kwds)
    395     if column is None:
    396         return plot_series(df.geometry, cmap=cmap, color=color, ax=ax,
--> 397                            figsize=figsize, **style_kwds)
    398 
    399     if df[column].dtype is np.dtype('O'):

/Users/jan/anaconda3/envs/cartopy/lib/python2.7/site-packages/geopandas/plotting.pyc in plot_series(s, cmap, color, ax, figsize, **style_kwds)
    270         style_kwds['vmax'] = style_kwds.get('vmax', values.max())
    271 
--> 272     geom_types = s.geometry.type
    273     poly_idx = np.asarray((geom_types == 'Polygon')
    274                           | (geom_types == 'MultiPolygon'))

/Users/jan/anaconda3/envs/cartopy/lib/python2.7/site-packages/pandas/core/generic.pyc in __getattr__(self, name)
   4370             if self._info_axis._can_hold_identifiers_and_holds_name(name):
   4371                 return self[name]
-> 4372             return object.__getattribute__(self, name)
   4373 
   4374     def __setattr__(self, name, value):

AttributeError: 'Series' object has no attribute 'geometry'

In [183]: db.data.head()
Out[183]: 
          Source                                         \
               A         B         C         D        E   
1+e  #                                                    
1.01 0  123334.0   81637.6  262758.0  105860.0  24666.9   
     1  126234.0   97052.0  261232.0   91970.2  25246.8   
     2  118364.0  121797.0  268455.0   94535.2  23672.8   
     3  123334.0  120658.0  278210.0   69905.6  24666.9   
     4  126052.0  127851.0  283827.0   61846.1  25210.4   

                                                    Paths  \
                                                 Geometry   
1+e  #                                                      
1.01 0  LINESTRING (1234.8809442278 5504.880930993706,...   
     1  LINESTRING (1234.8809442278 5504.880930993706,...   
     2  LINESTRING (1234.8809442278 5504.880930993706,...   
     3  LINESTRING (1234.8809442278 5504.880930993706,...   
     4  LINESTRING (1234.8809442278 5504.880930993706,...   

                                                 geometry  

1+e  #                                                     
1.01 0  LINESTRING (1234.8809442278 5504.880930993706,...  
     1  LINESTRING (1234.8809442278 5504.880930993706,...  
     2  LINESTRING (1234.8809442278 5504.880930993706,...  
     3  LINESTRING (1234.8809442278 5504.880930993706,...  
     4  LINESTRING (1234.8809442278 5504.880930993706,...  

In sum, it seems that pandas.MultiIndex columns are only "half-supported", and that the geometry column has to be named 'geometry' in such a case for at least the supported functionality to work.
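One possible workaround, not confirmed anywhere in this thread, is to flatten the column MultiIndex back to plain string names before calling anything that expects them, keeping the geometry column named 'geometry'. The sketch below uses only pandas, with WKT strings as placeholders for real geometries; the helper name flatten_columns and its parameters are hypothetical.

```python
import pandas as pd


def flatten_columns(frame, geometry_key, sep="_"):
    """Return a copy of `frame` with single-level string column names.

    The column labelled `geometry_key` is renamed to 'geometry'; every
    other tuple label is joined with `sep`. (Hypothetical helper.)
    """
    flat = frame.copy()
    flat.columns = [
        "geometry" if col == geometry_key else sep.join(str(part) for part in col)
        for col in frame.columns
    ]
    return flat


# Toy frame with the same column layout as in the report; the WKT strings
# are placeholders, not real geometry objects.
wide = pd.DataFrame(
    [[1.0, 2.0, "LINESTRING (0 0, 1 1)"], [3.0, 4.0, "LINESTRING (1 1, 2 0)"]],
    columns=pd.MultiIndex.from_tuples(
        [("Source", "A"), ("Source", "B"), ("Paths", "Geometry")]
    ),
)

flat = flatten_columns(wide, geometry_key=("Paths", "Geometry"))
print(list(flat.columns))
```

With a real GeoDataFrame, the flattened frame would presumably still need set_geometry('geometry') afterwards; whether that restores a GeoSeries may depend on the installed geopandas version.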

All 3 comments

@J4nJ4nsen Thanks for the report!
That indeed seems buggy; I suppose not many people have used geopandas with multi-indexed columns yet.

Investigations and PRs with fixes are certainly welcome.

It would also already be helpful to convert the above code examples into reproducible (copy-pastable) examples (e.g. the original db.data is not available; I suppose that manually creating a small toy geodataframe will show the problem just as well). That will make it easier for somebody to run the code and inspect the problems, and the example can later be used to add test cases.
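A minimal sketch of what such a toy example could look like is given here. The data are made up to stand in for db.data, and the helper describe_geometry is hypothetical. Because the behaviour may differ between geopandas versions, the script records either the type of the geometry column or the exception raised, rather than assuming one outcome.

```python
import pandas as pd
import geopandas
from shapely.geometry import LineString

# Made-up toy data standing in for the unavailable `db.data`.
toy = geopandas.GeoDataFrame(
    {"A_src": [1.0, 2.0], "B_src": [3.0, 4.0]},
    geometry=[LineString([(0, 0), (1, 1)]), LineString([(1, 1), (2, 0)])],
)
type_before = type(toy.geometry).__name__

# Same relabelling as in the report: 'geometry' becomes ('Paths', 'Geometry').
toy.columns = pd.MultiIndex.from_tuples(
    [("Source", "A"), ("Source", "B"), ("Paths", "Geometry")]
)


def describe_geometry(frame):
    """Return the type name of frame.geometry, or the exception it raises."""
    try:
        return type(frame.geometry).__name__
    except Exception as exc:
        return "{}: {}".format(type(exc).__name__, exc)


after_relabel = describe_geometry(toy)

# Point the frame at the renamed column, as done in the report.
try:
    toy = toy.set_geometry(("Paths", "Geometry"))
    after_set_geometry = describe_geometry(toy)
except Exception as exc:
    after_set_geometry = "{}: {}".format(type(exc).__name__, exc)

print("before relabelling:", type_before)
print("after relabelling: ", after_relabel)
print("after set_geometry:", after_set_geometry)
```

Comparing the three printed lines against the transcript above should show whether the installed version still downgrades the geometry column to a plain Series.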

I've run into these issues too when using long-form space-time data (e.g. with @knaaptime's data), though in multi-column (not index) ways (#722)

A simple way to mimic this is:

import geopandas
import pandas

source = geopandas.read_file(geopandas.datasets.get_path('nybb'))

long = pandas.concat((source.assign(year=2010), source.assign(year=2020)))

wide = long.pivot(columns='year', values=['geometry','BoroName'])

Now you get some strange behavior:

type(wide) # is GeoDataFrame
wide.plot() # errors on AttributeError in block manager
wide.geometry # errors on AttributeError in GeoSeries about `_name`
wide[('geometry',2010)].plot() # fails, since no sub-series is still a GeoSeries
wide.set_geometry(('geometry',2010), inplace=True) # works, but doesn't solve:
wide.plot() # because it's still a series, not GeoSeries. 

When I put this down into multi-index (single column) I don't get too many issues, though:

long.index = pandas.MultiIndex.from_tuples(list(zip(long.BoroName, long.year)))  # list() in case the pandas version does not accept an iterator
long.plot() # plots all geometries
long.to_crs(epsg=5070) # completes, appears to be correct

Not sure what the appropriate resolution to these various things is in general, though, e.g. in terms of what should happen in long.to_file() or wide.to_file().
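Since the long (row MultiIndex) layout seems to behave better than the wide one, one option is to move the 'year' level from the columns back into the row index before plotting or writing. This is only a sketch using plain pandas: the WKT strings and borough names are placeholder data, and the name long_again is hypothetical.

```python
import pandas as pd

# Column layout comparable to the pivoted `wide` frame above; the WKT
# strings are placeholders for real geometries.
columns = pd.MultiIndex.from_product(
    [["geometry", "BoroName"], [2010, 2020]], names=[None, "year"]
)
wide = pd.DataFrame(
    [
        ["POINT (0 0)", "POINT (0 0)", "Bronx", "Bronx"],
        ["POINT (1 1)", "POINT (1 1)", "Queens", "Queens"],
    ],
    columns=columns,
)

# Move the 'year' level from the columns into the row index, leaving
# single-level columns 'geometry' and 'BoroName'.
long_again = wide.stack("year")

print(long_again.index.names)
print(sorted(long_again.columns))
```

With a real GeoDataFrame, the stacked result may come back as a plain DataFrame and would then need re-wrapping, e.g. with geopandas.GeoDataFrame(long_again, geometry='geometry'); that step is an assumption and is not tested here.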

Hi, also running into this error.
