import pandas as pd

path1 = '/some.xls'
df1 = pd.read_excel(path1)
columns_values_map = {
    'positive': {
        '正面': 1,
        '中立': 1,
        '负面': 0
    }
}
df1.replace(columns_values_map)
This raises: TypeError: Cannot compare types 'ndarray(dtype=int64)' and 'unicode'
df1['positive'] actually contains only the values 0 and 1, but I don't think replace should raise an exception here.
Please show a copy-pastable example; in other words, construct df1 here.
It's simple:
import numpy as np
import pandas as pd

columns_values_map = {
    'positive': {
        '正面': 1,
        '中立': 1,
        '负面': 0
    }
}
df1 = pd.DataFrame({'positive': np.ones(10)})
df1.replace(columns_values_map)
# TypeError: Cannot compare types 'ndarray(dtype=int64)' and 'unicode'
df2 = pd.DataFrame({'positive': ['正面', '负面']})
df2.replace(columns_values_map)
# this works
I am using pandas to combine several Excel files that share some columns but encode the values differently.
For now I have to use something like:
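for col, v_map in self.columns_values_map.items():
    # Keep only the keys that actually occur among the column's values
    # (`k in series` would test the index labels, not the values).
    present = set(df[col].unique())
    cat_map = {k: v for k, v in v_map.items() if k in present}
    if cat_map:
        # Leave values that have no mapping unchanged instead of raising KeyError.
        df[col] = df[col].map(lambda x: cat_map.get(x, x))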
This looks correct to me. You are trying to replace integers with string-likes, none of which match. Are you objecting to the error message?
FYI
In [35]: df1['positive'].map(columns_values_map['positive'])
Out[35]:
0 NaN
1 NaN
2 NaN
3 NaN
4 NaN
5 NaN
6 NaN
7 NaN
8 NaN
9 NaN
Name: positive, dtype: float64
Though for the reverse case we let this pass:
In [40]: df = DataFrame({'A': [1., 2.], 'B': ['foo', 'bar']})
In [41]: df.replace({'A':{20:1}})
Out[41]:
     A    B
0  1.0  foo
1  2.0  bar
@chris-b1 @jorisvandenbossche @TomAugspurger
comments?
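A possible workaround in the meantime, sketched under the assumption that an element-wise dict lookup is acceptable: calling Series.map with a function looks each value up individually, so the column's array is never compared against the string keys, and falling back to the original value avoids the NaN shown for the plain dict map above. The helper name map_with_fallback is hypothetical, not a pandas API.

```python
import numpy as np
import pandas as pd


def map_with_fallback(series, mapping):
    """Hypothetical helper: replace values found in `mapping`, keep the rest."""
    # dict.get(x, x) returns the original value when no mapping exists for it.
    return series.map(lambda x: mapping.get(x, x))


mapping = {"正面": 1, "中立": 1, "负面": 0}

numeric = pd.Series(np.ones(3), name="positive")
labels = pd.Series(["正面", "负面"], name="positive")

numeric_result = map_with_fallback(numeric, mapping)  # no key matches, values kept
labels_result = map_with_fallback(labels, mapping)    # labels become integers
```

This is slower than a vectorized replace on large columns, since the lookup runs once per element in Python.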
For consistency, and since replace is a general purpose find / replace method, it'd be nice if this didn't raise a TypeError.
If you are using a Jupyter notebook, try re-running the cells above. I had the same problem, and that sorted it out.
I'm having the same problem. I think a flag like the one that to_numeric has would work well here.
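To make the suggestion concrete, here is a rough sketch of what such a flag might look like as a user-side wrapper. It is loosely modeled on the errors parameter of pd.to_numeric. The function name replace_with_errors and its accepted values ('raise', 'ignore') are assumptions for illustration only; DataFrame.replace itself has no such parameter.

```python
import numpy as np
import pandas as pd


def replace_with_errors(df, to_replace, errors="raise"):
    """Hypothetical wrapper: DataFrame.replace with a to_numeric-style `errors` flag."""
    if errors not in ("raise", "ignore"):
        raise ValueError("errors must be 'raise' or 'ignore'")
    try:
        return df.replace(to_replace)
    except TypeError:
        if errors == "ignore":
            # Nothing comparable to replace: hand back an unchanged copy.
            return df.copy()
        raise


columns_values_map = {"positive": {"正面": 1, "中立": 1, "负面": 0}}
df1 = pd.DataFrame({"positive": np.ones(10)})

result = replace_with_errors(df1, columns_values_map, errors="ignore")
```

With errors="ignore" the frame comes back unchanged whether replace raises the TypeError or simply finds nothing to replace, so the caller sees the same result on either pandas behavior.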
The code sample in https://github.com/pandas-dev/pandas/issues/16784#issuecomment-311563562 doesn't raise on master:
>>> pd.__version__
'1.2.0.dev0+261.g9fea06cec'
>>>
>>> columns_values_map = {"positive": {"正面": 1, "中立": 1, "负面": 0}}
>>> df1 = pd.DataFrame({"positive": np.ones(10)})
>>> df1.replace(columns_values_map)
positive
0 1.0
1 1.0
2 1.0
3 1.0
4 1.0
5 1.0
6 1.0
7 1.0
8 1.0
9 1.0
>>>
maybe fixed by #36093? cc @jbrockmendel