[x] I have searched the [pandas] tag on StackOverflow for similar questions.
[x] I have asked my usage related question on StackOverflow.
import pandas as pd

def test_func(row):
    row['c'] = str(row['a']) + str(row['b'])
    row['d'] = row['a'] + 1
    return row

df = pd.DataFrame({'a': [1, 2, 3], 'b': ['i', 'j', 'k']})
df.apply(test_func, axis=1)
Run on pandas 1.1.0, the code above returns:
   a  b   c  d
0  1  i  1i  2
1  1  i  1i  2
2  1  i  1i  2
While in pandas 1.0.5 it returns:
   a  b   c  d
0  1  i  1i  2
1  2  j  2j  3
2  3  k  3k  4
Using python 3.8.3 and IPython 7.16.1.
:question: What is the right way of getting the v1.0.5 behavior in v1.1.0?
I did see this release note but honestly can't figure out if this is an intended/unintended side effect of it: https://pandas.pydata.org/pandas-docs/stable/whatsnew/v1.1.0.html#apply-and-applymap-on-dataframe-evaluates-first-row-column-only-once
thanks
As a general rule, you should not mutate a container while iterating over it. Copying the row before modifying it avoids the problem:
def test_func(row):
    row = row.copy()
    row['c'] = str(row['a']) + str(row['b'])
    row['d'] = row['a'] + 1
    return row
gives
   a  b   c  d
0  1  i  1i  2
1  2  j  2j  3
2  3  k  3k  4
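If you would rather not touch the incoming row at all, another option is to return only the new values as a fresh Series and join them back onto the frame. This is a minimal sketch, not something taken from the release notes; the helper name `new_columns` is made up for illustration.

```python
import pandas as pd

def new_columns(row):
    # Hypothetical helper: builds a fresh Series and never writes to the
    # row object that apply passes in.
    return pd.Series({'c': str(row['a']) + str(row['b']), 'd': row['a'] + 1})

df = pd.DataFrame({'a': [1, 2, 3], 'b': ['i', 'j', 'k']})

# apply(axis=1) collects the returned Series into a DataFrame with columns
# c and d; join lines them up with the original rows by index.
result = df.join(df.apply(new_columns, axis=1))
print(result)
```

Because the function only reads from `row`, it should not matter whether apply hands it a copy or a reused object.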
Of course, the vectorized version of this will be much faster:
````
%%timeit
df['c'] = df['a'].astype(str) + df['b']
df['d'] = df['a'] + 1
````
gives 564 µs ± 5.97 µs per loop whereas your version is 5.34 ms ± 16.9 µs per loop.
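For a version you can paste outside of IPython, here is a sketch of the same vectorized computation using `DataFrame.assign`, which returns a new frame instead of adding columns to `df` in place. The variable name `out` is just for illustration.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': ['i', 'j', 'k']})

# assign returns a new DataFrame, so df itself keeps only columns a and b.
out = df.assign(
    c=df['a'].astype(str) + df['b'],  # element-wise string concatenation
    d=df['a'] + 1,                    # element-wise integer addition
)
print(out)
```

Whether you assign in place or use `assign` is mostly a matter of taste; `assign` just makes it easier to rerun the snippet without the columns piling up on `df`.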
Thanks @manihamidi for the report. Same issue as #35462 so closing as duplicate.