I'm consistently getting an "A value is trying to be set on a copy of a slice..." error on the following bit of code, and can't figure out why:
i['gender'] = i.gender.replace({'male':'m', 'female':'f'})
The full error:
/Users/Nick/anaconda/lib/python3.4/site-packages/spyderlib/widgets/externalshell/start_ipython_kernel.py:16: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
"""Sympy configuration"""
It seems to depend on something about the i DataFrame, but that DataFrame is way too big to include here (143 rows, 43 columns).
When I try to recreate this with a small example dataset, I don't get the error any more:
df = pd.DataFrame({'col1':['a','b','c'], 'col2':[0,1,2]})
df['col1'] = df.col1.replace({'b':'z', 'c':'z'})
Does anyone have any idea why this is happening?
OK, I guess I see why this is happening -- i.gender is generating a copy of that series and passing it to replace().
Perhaps this is infeasible, but is there any chance of refining this warning so it only comes up when it's likely to affect something? It pops up so often that it's not particularly informative of actual problems...
@nickeubank this warning is showing that you are doing something which might not work if, for example, you have different dtypes or perform operations in a different order.
If you are getting the warning, you should heed it.
It rarely has false positives these days.
@jreback OK, that's good to know, especially since it suggests there's something I don't understand. Would you mind letting me know what's incorrect/dangerous with the following?
i['gender'] = i.gender.replace({'male':'m', 'female':'f'})
this statement is fine
you are doing a selection before this
show your code up to this point
dfs = dict()
dfs['exec_2'] = pd.read_csv('state/StateExecutivesOnlyALLOffices_20150626.csv', encoding='mac_roman')
target = 'exec_2'
i = dfs[target]
# Thin the sample
execs = i.title.str.contains('(mayor|mayer|executive)')
i = i[execs]
i['level'] = 'local'
i['type'] = 'exec'
i['fname'] = i.firstname
i['lname'] = i.lastname
i['legalname'] = i.addressee
i['gender'] = i.gender.replace({'male':'m', 'female':'f'})
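The pattern in the code above (select rows, then assign columns on the result) can be reduced to a small sketch. The data here is a hypothetical stand-in, not the original CSV, and whether a warning is actually reported depends on the pandas version: older releases are expected to report SettingWithCopyWarning, while versions running with Copy-on-Write may report nothing.

```python
import warnings

import pandas as pd

# Hypothetical stand-in data: column names mirror the thread, values are made up.
df = pd.DataFrame({
    'title': ['mayor', 'clerk', 'county executive'],
    'gender': ['male', 'female', 'female'],
})

# Boolean selection, as in the "Thin the sample" step: subset is derived from df.
mask = df.title.str.contains('mayor|executive')
subset = df[mask]

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    # Assigning a column on the derived frame is the step pandas flags.
    subset['gender'] = subset.gender.replace({'male': 'm', 'female': 'f'})

# Names of any warnings raised by the assignment (may be empty on newer pandas).
print([w.category.__name__ for w in caught])
```

This also suggests why the three-row example earlier in the thread stayed quiet: it assigns directly on a freshly built DataFrame, with no selection step in between.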
so after "Thin the sample"
you need to put a .copy() (i.e. i = i[execs].copy()) to make it not a view
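A minimal sketch of that fix, using made-up stand-in data rather than the original CSV (the frame contents and the names df, mask, and subset are assumptions for illustration):

```python
import pandas as pd

# Hypothetical stand-in data: column names mirror the thread, values are made up.
df = pd.DataFrame({
    'title': ['mayor', 'clerk', 'county executive'],
    'gender': ['male', 'female', 'female'],
})

mask = df.title.str.contains('mayor|executive')

# The explicit .copy() makes subset an independent DataFrame, so later
# column assignments are unambiguous and do not touch df.
subset = df[mask].copy()

subset['level'] = 'local'
subset['gender'] = subset.gender.replace({'male': 'm', 'female': 'f'})

print(subset)
print(df)  # still has the original 'gender' values and no 'level' column
```

The copy costs some extra memory, but it states plainly that the filtered frame is meant to be modified on its own.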
Ah, ok. Thanks!