When I call rows.sort_index(inplace = True), I get:
SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
Using rows = rows.sort_index() instead works around the warning.
pandas version 0.17.0, installed with pip.
Can you share an example? I wasn't able to reproduce:
In [4]: df = pd.DataFrame({'A': [1, 2, 3], 'B': ['a', 'b', 'c']}, index=[1, 0, 2])
In [5]: df
Out[5]:
   A  B
1  1  a
0  2  b
2  3  c
In [6]: df.sort_index(inplace=True) # no warning
I'm guessing your rows is a slice from another DataFrame?
rows is created by filtering a different DataFrame. Here is a more complete example. sort_index(inplace = True) works for rows1 without a warning, but produces a warning for rows2.
import pandas as pd
initial_rows = pd.DataFrame([{'a': 1, 'b': 'x'}, {'a': 2, 'b': 'c'}])
rows = initial_rows[initial_rows['b'] == 'x']  # filter the source frame; rows is not defined before this line
rows.sort_index(inplace = True)
@phaethon this of course won't work: you cannot sort inplace on something ELSE which is filtered. Furthermore, this is not a good pattern. Never use inplace.
In my code I actually update index after filtering with a function of one of the columns:
initial_rows = pd.DataFrame([{'a': 1, 'b': 'x'}, {'a': 2, 'b': 'c'}])
rows = initial_rows[initial_rows['b'] == 'x']
rows.index = function_of_a(rows['a'])
rows.sort_index(inplace = True)
What would be the suggested usage pattern in this case?
rows = rows.sort_index() will work. Using inplace is tricky since you (the user) _have_ to be sure it's not a view. It's not always clear when reading the code whether this is the case. And using inplace=True usually won't have any performance benefits, so it's just easier to always reassign: rows = rows.sort_index().
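For reference, a minimal sketch of the reassignment pattern applied to the filter-then-reindex case from this thread. function_of_a is not shown anywhere above, so the snippet substitutes a hypothetical stand-in that simply reuses column 'a' as the index. The sample data is also made up for illustration. The explicit .copy() is an assumption about what the user wants (an independent frame); it is one common way to tell pandas that the filtered result is not meant to write back to initial_rows.

```python
import pandas as pd

# Illustrative data only; not taken from the original report.
initial_rows = pd.DataFrame(
    [{'a': 3, 'b': 'x'}, {'a': 2, 'b': 'c'}, {'a': 1, 'b': 'x'}]
)

# Filter the source frame and take an explicit copy, so later changes
# to rows are unambiguously changes to a separate object.
rows = initial_rows[initial_rows['b'] == 'x'].copy()

# Hypothetical stand-in for function_of_a: use the values of 'a' as the index.
rows.index = rows['a'].tolist()

# Reassign the sorted result instead of passing inplace=True.
rows = rows.sort_index()

print(rows.index.tolist())          # index of the filtered, sorted frame
print(initial_rows.index.tolist())  # the source frame keeps its own index
```

Reassigning keeps the code correct whether or not rows started life as a slice, which is the point made above about not having to reason about views.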
Actually, rows.sort_index(inplace = True) works too. It just displays an annoying warning.
If the inplace option were not mentioned in the documentation, I would have used rows = rows.sort_index(), but I was expecting some performance benefit. Adding a comment about this to the documentation would certainly help. Or removing the inplace option, if it brings no benefit.
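As a quick check of the "works, too" observation, the sketch below compares the two spellings on the small frame from earlier in the thread. It uses an explicit copy rather than a filtered slice, so it says nothing about when the warning fires. It only illustrates that both calls produce the same sorted data and that the inplace form returns None.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': ['a', 'b', 'c']}, index=[1, 0, 2])

# Reassignment form: sort_index returns a new, sorted DataFrame.
reassigned = df.sort_index()

# In-place form on an independent copy: the copy is sorted, nothing is returned.
in_place = df.copy()
result = in_place.sort_index(inplace=True)

print(result is None)
print(in_place.equals(reassigned))
```

Since the outcome is the same either way, the choice comes down to readability and to avoiding the view-versus-copy question raised above.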