import numpy as np
import pandas as pd

def test_unary():
    df = pd.DataFrame({'x': np.array([0.11, 0], dtype=np.float32)})
    res = df.eval('(x > 0.1) | (x < -0.1)')
    assert np.array_equal(res, np.array([True, False])), res
This is related to #11235.
On Python 3.6 with pandas 0.20.1, this raises an error. The traceback ends with:
File ".../envs/py3/lib/python3.6/site-packages/pandas/core/computation/expr.py", line 370, in _maybe_downcast_constants
name = self.env.add_tmp(np.float32(right.value))
AttributeError: 'UnaryOp' object has no attribute 'value'
In that case, right is -(0.1), which is a UnaryOp rather than a constant.
commit: None
python: 3.6.1.final.0
python-bits: 64
OS: Linux
OS-release: 4.8.0-49-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.20.1
pytest: None
pip: 9.0.1
setuptools: 27.2.0
Cython: 0.25.2
numpy: 1.12.1
scipy: 0.19.0
xarray: None
IPython: 6.0.0
sphinx: None
patsy: None
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.0.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
pandas_gbq: None
pandas_datareader: None
Another example:
>>> df = pd.DataFrame({'x':[1,2,3,4,5]})
>>> df.eval('x.shift(-1)')
I am looking at this as part of the PyCon2017 sprints.
Not really a fix, but if you need a workaround, just use float64.
Worked for me.
Using float64 does not work for me, and in any case it does not address the fact that the value attribute is being looked up on a UnaryOp.
I left the sprints early, but I looked into this and realised I don't understand the behaviour of the pandas Op class well enough.
The problem is that UnaryOp returns True for isscalar, which on first inspection seems a little strange. Any descendant of Op (e.g. BinaryOp) also returns True for isscalar in similar circumstances. This is because of the following in the Op class:
@property
def isscalar(self):
    return all(operand.isscalar for operand in self.operands)
This seems like incorrect behaviour to me. If I make isscalar simply return False, the problem here is fixed, but I have little idea of the far-reaching consequences of such a change. I searched the core codebase for all references to isscalar, and it seems that it is only called in this method and one other, so perhaps there is little problem.
Does anyone have any thoughts on this?
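To make the behaviour described above concrete, here is a minimal toy sketch. The Term and Op classes below are simplified stand-ins written for illustration, not the actual pandas classes; only the isscalar property is copied from the snippet quoted above.

```python
class Term:
    """Toy leaf node (hypothetical stand-in, not the pandas class)."""
    def __init__(self, value, isscalar):
        self.value = value
        self.isscalar = isscalar


class Op:
    """Toy operator node using the isscalar property quoted above."""
    def __init__(self, op, operands):
        self.op = op
        self.operands = operands

    @property
    def isscalar(self):
        return all(operand.isscalar for operand in self.operands)


constant = Term(0.1, isscalar=True)
negated = Op('-', [constant])  # models the expression -(0.1)

print(negated.isscalar)           # True: every operand is a scalar
print(hasattr(negated, 'value'))  # False: an Op carries no value attribute
```

In this toy model, any code that sees isscalar == True and then reads .value would fail with an AttributeError on the Op node, which matches the shape of the traceback reported above.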
I've run the test suite with isscalar set to False in the Op class, and it doesn't seem to break anything. In my opinion, someone confused the notion of a scalar in this context with the notion of a scalar in terms of numpy arrays somewhere along the way. I think only objects of type Term and its descendants should return True for isscalar.
Any thoughts?
A smaller version of the original test case is:
def test_unary():
    df = pd.DataFrame({'x': np.array([0], dtype=np.float32)})
    res = df.eval('x < -0.1')
    assert np.array_equal(res, np.array([False])), res
Note that it's not just a problem with np.float32, it also fails with string data (which is my original use case that motivated #16833):
def test_unary():
    df = pd.DataFrame({'x': ["one", "two"]})
    df.eval('x.shift(-1)')
Agreed. It is not just np.float32 that is causing the trouble.
I think that my suggested fix is the correct way forward: I have run the full test suite and seen no problems, and it matches how the design should notionally work. I believe someone confused this with the numpy notion of isscalar. An expression shouldn't be considered a "scalar" just because it returns a scalar value rather than an array or list. Here, isscalar should test whether the node is actually a scalar, as opposed to an op or an expression that could be broken down further.
Hi,
I am wondering if this is resolved? I'm running into a similar issue using pandas df.query() with negative numbers.
Thank you!
@ksw9 I'll submit a fix for this. That way at least a moderator will have to respond.
Great, thank you!
Would it be possible to update this thread if this has been fixed? Thanks again!
@james-nichols there might be a problem with your approach, though. It seems your change would completely skip the section of code that downcasts the unary term to float32, which is what produces a Series of dtype float32. With your change, the result would be of dtype float64.
With the silly fix I suggested in #19697 (self.value = operand.value), the return type would be float32, which seems to be what was intended, but the results are wrong (the negative sign is ignored).
Neither seems to solve #16833, though. Setting isscalar to False just pushes the error further down the line. Adding self.value = operand.value gets the code further along, but it then errors out with TypeError: 'Series' objects are mutable, thus they cannot be hashed
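As a rough illustration of why copying the operand's value loses the sign, here is a toy sketch. The Constant and UnaryOp classes and the evaluate method are hypothetical simplifications for this example, not the pandas implementation.

```python
class Constant:
    """Toy leaf node holding a literal (hypothetical, not the pandas class)."""
    def __init__(self, value):
        self.value = value

    def evaluate(self):
        return self.value


class UnaryOp:
    """Toy unary node that copies its operand's value, as in the attempted fix."""
    def __init__(self, op, operand):
        self.op = op
        self.operand = operand
        self.value = operand.value  # the shortcut under discussion

    def evaluate(self):
        inner = self.operand.evaluate()
        return -inner if self.op == '-' else inner


neg = UnaryOp('-', Constant(0.1))
print(neg.evaluate())  # -0.1, the value the expression should have
print(neg.value)       # 0.1, what a reader of .value would see instead
```

Any code that treats neg.value as the value of the whole node would therefore compare against 0.1 instead of -0.1 in this model.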
I ran into this recently and would like to help with a patch. As best I can tell, the problem is that _maybe_downcast_constants tries to downcast not only constants but also UnaryOp nodes, which isn't possible, since UnaryOp instances don't have a value attribute like constants/scalars do.
I am new to the pandas code, and the expressions code is a bit tricky, but I think we could catch the AttributeError in _maybe_downcast_constants or explicitly check in each case that left or right has the attribute value.
In short, the problem is that an operation like df.eval('x < -.1') fails when x has dtype np.float32, because the right side of the comparison is seen as a UnaryOp node instead of as a np.float32 constant, and visit_BinOp then passes it to _maybe_downcast_constants. OTOH, df.eval('x < @y') works when y = -.1, because pandas doesn't have to parse it. I think a small change might fix this, but I could be overlooking something bigger and would appreciate feedback.
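The explicit attribute check suggested above could look roughly like the sketch below. Everything here is a toy model: Constant, UnaryOp, to_float32 and maybe_downcast are hypothetical names invented for this example, and the float32 rounding uses the standard library rather than numpy so that the snippet runs on its own.

```python
import struct


def to_float32(x):
    """Round a Python float to the nearest float32 value (stdlib only)."""
    return struct.unpack('f', struct.pack('f', x))[0]


class Constant:
    """Toy scalar node with a value attribute (hypothetical)."""
    isscalar = True

    def __init__(self, value):
        self.value = value


class UnaryOp:
    """Toy unary node: reports isscalar like its operand but has no value."""
    def __init__(self, op, operand):
        self.op = op
        self.operand = operand

    @property
    def isscalar(self):
        return self.operand.isscalar


def maybe_downcast(node):
    """Downcast only nodes that are scalar AND actually carry a value."""
    if node.isscalar and hasattr(node, 'value'):
        return Constant(to_float32(node.value))
    return node  # leave ops untouched instead of raising AttributeError


const = maybe_downcast(Constant(0.1))
neg = UnaryOp('-', Constant(0.1))
same = maybe_downcast(neg)

print(const.value)  # 0.1 rounded to float32 precision
print(same is neg)  # True: the UnaryOp is passed through unchanged
```

Whether skipping the downcast for UnaryOp nodes gives the intended result dtype in pandas itself is a separate question, as discussed earlier in the thread.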
I just wanted to mention that this issue still remains in 0.24.1. I just ran into it.
best way to fix is to submit a PR
there are 2800 other issues