Set-up: To give some idea of the set-up, I have a 2-column pandas DataFrame, called german_cars_price. The first column of this dataframe is labeled state and denotes the relevant German federal state. The second column is labeled price and refers to the average used car sales price in that German federal state from a sample of ads on eBay.
I also have a GeoJSON file, data/federal_state_borders.geojson, which gives the boundaries of the German federal states. I imported it into Python using geopandas to get a GeoDataFrame called federal_state_borders. The column in this GeoDataFrame identifying the German federal states is called VARNAME_1. The file is a modified version of the GeoJSON found here (my translations of the names into English are different).
For the full set-up of a minimal working example producing this problem, see here: https://github.com/krinsman/minimal_working_example
If nothing else, hopefully GitHub's display of the workbook can give someone a better idea of the structure of the relevant data, if that is necessary to understand the question.
The environment.yml and the Makefile that creates the conda virtual environment from it are included for completeness, but they are unlikely to be necessary in most cases.
Problem: Trying to create a choropleth map using this data leads to the error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
This seems to be an issue involving how the fill_color argument is interpreted.
Here is the minimal working example of the code:
import pandas as pd
import numpy as np
import folium
import geopandas as gpd
import os
german_cars_price=pd.read_csv('data/german_cars_price.csv',index_col=0)
federal_state_borders = gpd.read_file(os.path.join('data', 'federal_state_borders.geojson'))
m = folium.Map(location=[52.5194, 13.4067], tiles='Mapbox Bright', zoom_start=5)
m.choropleth(geo_data=federal_state_borders, key_on=federal_state_borders.VARNAME_1, data=german_cars_price, columns=['state', 'price'], fill_color='YlGn')
It seems fairly likely of course that I am doing something incorrectly, but it is unclear to me based on the documentation.
Caveat lector: this is probably my mistake as an end-user rather than a bug of the software.
I have not found the problem documented anywhere, and the example in the documentation is too sparse to deduce a solution: it does not specify the required formats of the data, or show the structure of the arguments being passed to choropleth() in that call. In particular, what feature is supposed to refer to, in either the example or in folium.py, is beyond me.
Hi,
I replicated your problem, thanks for the good example. It seems you didn't use the key_on parameter correctly. You provided it with federal_state_borders.VARNAME_1, which is a Pandas Series object, while it expects a string. Your example works if you use key_on='feature.properties.VARNAME_1'. Hope that helps!
Not sure if we should change something about Folium. It's not convenient to test whether key_on fits geo_data, since the latter is initiated after the style_function is made. Maybe throw an exception if key_on is not a string or doesn't start with 'feature', as the docstring demands?
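The check proposed above could look roughly like the following. This is only a sketch: validate_key_on is a hypothetical helper name, not an existing Folium function, and the choice of TypeError versus ValueError is an assumption.

```python
def validate_key_on(key_on):
    """Hypothetical helper (not part of Folium): check the key_on argument early."""
    # A pandas Series, list, or any other non-string object is rejected here,
    # before it can trigger an obscure error further down the call chain.
    if not isinstance(key_on, str):
        raise TypeError(
            "key_on must be a string such as 'feature.properties.VARNAME_1', "
            f"got {type(key_on).__name__}"
        )
    # The docstring asks for a path that starts with 'feature'.
    if not key_on.startswith("feature"):
        raise ValueError(f"key_on must start with 'feature', got {key_on!r}")
    return key_on


# A valid path passes through unchanged.
checked = validate_key_on("feature.properties.VARNAME_1")
```

With a check like this, passing federal_state_borders.VARNAME_1 would fail immediately with a message naming the expected form, rather than with the "truth value of a Series is ambiguous" error.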
That makes sense to me. It's also like other pandas-based visualization packages, like Seaborn or Plotnine, which take column-name arguments as strings.
My confusion was mostly that it was unclear to me which DataFrame's column was supposed to be keyed on (geo_data or data?) and also what 'feature' or 'properties' or 'feature.properties' refers to, since the DataFrames don't have any attributes with any of those names.
But honestly this probably would make as much sense as a change/addition to the documentation as it would to TypeError/ValueError checks in choropleth().
Also it was my fault that I didn't think to check the docstring where all of this is explained, so again in hindsight I'm not sure if there's really an issue at all.
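For anyone with the same confusion: key_on refers to geo_data, not data. In GeoJSON, each region is a "feature" whose non-geometry columns live in a "properties" mapping, so 'feature.properties.VARNAME_1' is a dotted path into each feature. The values found there are then matched against the first entry of columns ('state' here). The sketch below uses a tiny hand-written FeatureCollection with illustrative names and a hypothetical resolve_key helper. It shows how such a path can be resolved and is not Folium's actual implementation.

```python
# A minimal GeoJSON-like structure; the names and null geometries are
# illustrative placeholders, not the contents of the real file.
geojson = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"VARNAME_1": "Bavaria"}, "geometry": None},
        {"type": "Feature", "properties": {"VARNAME_1": "Berlin"}, "geometry": None},
    ],
}


def resolve_key(feature, key_on):
    """Hypothetical helper: follow a dotted path such as 'feature.properties.X'."""
    value = feature
    # The leading 'feature' stands for the feature itself, so skip it.
    for part in key_on.split(".")[1:]:
        value = value[part]
    return value


names = [resolve_key(f, "feature.properties.VARNAME_1") for f in geojson["features"]]
```

Here names holds one state name per feature, which is what gets compared with the 'state' column of german_cars_price.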
Thanks for commenting on this, I updated the PR.
Done in #797