Black: Math expressions and readability

Created on 20 Apr 2018  路  20Comments  路  Source: psf/black

It is not a bug, just a feedback. I start to use black for a project with math formula and I think black sometimes makes the code less readable.

Two simple examples:

Equations not fitting on one line

if True:
    if True:
        PK1_fft = np.real(
            + vx_fft.conj() * fx_fft
            + vy_fft.conj() * fy_fft
            + vz_fft.conj() * fz_fft
        )

is transformed by black -l 82 into

if True:
    if True:
        PK1_fft = np.real(
            +vx_fft.conj()
            * fx_fft
            + vy_fft.conj()
            * fy_fft
            + vz_fft.conj()
            * fz_fft
        )

The result is clearly less readable. It would be nice that black keeps the code as it is in such cases.

Spaces around pow operator

result = 2 * x**2 + 3 * x**(2/3)

is transformed by black into

result = 2 * x ** 2 + 3 * x ** (2 / 3)

which is less readable.

It seems to me it would make sense to allow "no space around **" because of the high precedence of the exponentiation (https://docs.python.org/3/reference/expressions.html#operator-precedence).

enhancement

Most helpful comment

People who participated in this discussion may be interested in issue https://github.com/psf/black/issues/1252, which unfortunately also got closed early. There, I argue that Black would be much more useful, if it was strict, but not obsessed with foolish consistency. I would welcome your comments.

All 20 comments

The first issue is solvable. The other issue isn't as this is incompatible with PEP 8.

(also: the first + is a unary operator so by PEP 8 it shouldn't have a space between it and the value; that will also not change; just skip the initial +)

From PEP 8 (https://www.python.org/dev/peps/pep-0008/#other-recommendations):

"If operators with different priorities are used, consider adding whitespace around the operators with the lowest priority(ies). Use your own judgment; however, never use more than one space, and always have the same amount of whitespace on both sides of a binary operator.

Yes:

i = i + 1
submitted += 1
x = x*2 - 1
hypot2 = x*x + y*y
c = (a+b) * (a-b)

No:

i=i+1
submitted +=1
x = x * 2 - 1
hypot2 = x * x + y * y
c = (a + b) * (a - b)

"

I think black should keep result = 2 * x**2 + 3 * x**(2/3) as it is in order to follow PEP 8.

@paugier, sorry, you're absolutely right! This is going to be a bit tricky to implement but I see the value of it.

Hm, I find the operator hugging recommendation in PEP 8 a bit hard to accept since operands might not be trivial or short. While I agree that this is clearly better than what Black is doing currently:

result = 2 * x**2 + 3 * x**(2/3)

...this is less readable than what Black is currently doing (even with syntax highlighting):

np.real(vx_fft.conj()*fx_fft + vy_fft.conj()*fy_fft + vz_fft.conj()*fz_fft)

Counter-example for hugging: and and or also have different priorities (and should be hugged) but they can't be for syntactical reasons. Nobody seems to mind that and if it's really unclear, we can always put additional parentheses that make it obvious to the reader.

So, operator hugging I think we will leave as is. But we'll be implementing proper line splitting for sure!

It gets better. Now that I look at it, this example has another additional complication:

result = 2 * x**2 + 3 * x**(2/3)

The division on this line doesn't get any whitespace because its priority is affected by parentheses and the fact that the parenthesized operation is itself an operand of a high-priority operator.

This looks like a very hard rule to properly formalize. It would for sure require sniffing whether operands are "trivial" and follow the priority graph.

Summing up, it doesn't look like we can implement this properly anytime soon. I agree the current auto-formatting isn't perfect.

Splitting by priorities is way easier to do properly and we'll do that.

One way to handle the operator hugging would be to preserve the user's choice of hugging-or-not, if it's consistent with precedence. Or maybe only for **. I don't know if it's a good way, but it's an option :-). (I guess it might lead to unfortunate results when you have someone who just leaves out all spaces and then lets black clean it up? I feel like I've reviewed PRs by those people.)

If operators with different priorities are used, consider adding whitespace around the operators with the lowest priority(ies).

I use Python for math, and this issue is blocking me from using Black. For example, here's a Black's diff of some code I might write:

-    y = 4*x**2 + 2*x + 1
+    y = 4 * x ** 2 + 2 * x + 1

I would prefer to keep the code I wrote. I find that formatting more readable and more suitable for my use.

I guess #fmt: off and #fmt: on are good enough. Most of my code isn't straight math.

Is this issue addressed? The commit that closed it seems to be about multiline expressions.

As far as operator hugging goes: would a conservative approach be reasonable? I imagine the operator hugging is relatively uncommon, so covering the most clear-cut cases and keeping the spaces otherwise would do most of the trick.

It seems to me that this issue is not fully addressed and that there is still a problem with the operator **. It has a so high precedence that adding spaces around it nearly always decrease the readability for math expressions involving ** with some other operators with lower precedence.

See https://github.com/python/black/issues/538#issuecomment-425352526

result = 2 * x**2 + 3 * x**(2/3)

would make sense, but already

result = 2 * x**2 + 3 * x**(2 / 3)

would be much better than

result = 2 * x ** 2 + 3 * x ** (2 / 3)

This example is actually too simple but for complicated math expressions, removing the spaces around ** really improves readability.

@ambv could you open this issue again?

I commented about the non-obvious nature of operator hugging above and in a few other issues. Nothing changed in this regard. The only thing I could probably do, would be to always hug operands to ** but that goes against the letter of PEP 8.

@ambv, I don't think that goes against PEP8, here I read:
Yes:

i = i + 1
submitted += 1
x = x*2 - 1
hypot2 = x*x + y*y
c = (a+b) * (a-b)

No:

i=i+1
submitted +=1
x = x * 2 - 1
hypot2 = x * x + y * y
c = (a + b) * (a - b)

"If operators of different priorities are used"

And there is no way for black to ever know whether there are other mathematical operators in the expression?

If it does know, then one could just always hug ** whenever there are other operators in the same statement or assignment.

I really like black and have switched to it exclusively. I don't like having to think about the formatting. It just does it. But I have a problem with the way it w

I do a lot of work with equations and would like to provide an example. I do a lot of prototyping in Jupyter for other developers to base there work off. It is more important for my code to look as close to the latex as possible to help minimize errors and make it easier to read - from a mathematics perspective. Not only for the developers but the other non-coder experts.

I was taught that there is no spacing around multiplication and division. It follows that there is no spacing around exponentiation because it is a shorthand way to express multiplication. According to the AMA (https://www.amamanualofstyle.com/view/10.1093/jama/9780195176339.001.0001/med-9780195176339-div1-230), there is indeed spacing between the explicit operators. I don't think that is a problem for relatively simple equations. More complex equations that is different.

Is there a way to add an option to Black that could be "Math" mode that would allow this type of formatting to be preserved? I suppose it could be a problem when applied to a module that contained a mixture of mathematical formalism and regular code, but I think that would be a trade-off for people that would use it.

As an example, here is a method that I wrote to calculate the material properties of rocks (NOTE: the variable names fall in-line with the field, and those are acceptable names and can easily be found in the literature)

Before Black:

def nu_k_model(rho=None, nu=None, K=None):
    """
    A function that will calculate all of the elastic properties of the rock given:
    - The density         (rho)
    - Poisson's Ratio     (nu)
    - Bulk Modulus        (K)


    Parameters
    ----------
    rho         - number - Rock density       (kg/m^3)
    nu          - number - Poisson's Ratio    (unitless) 
    K           - number - Bulk Modulus       (Pa)


    Returns
    -------
    A tupple containing the following, in that order:

    vp     - number - P-wave velocity    (m/s) 
    vs     - number - S-wave velocity    (m/s)
    gamma  - number - Velocity Ratio     (unitless)
    lambda - number - 1st Lam茅 Parameter (Pa) <- cannot use lambda as a name (reserved) using lame instead
    K      - number - Bulk Modulus       (Pa)
    E      - number - Young's Modulus    (Pa) 
    mu     - number - Shear Modulus      (Pa)
    nu     - number - Poisson's Ratio    (unitless) 
    M      - number - P-wave Modulus     (Pa)

    """    

    vp = np.sqrt((3*K*(1 - nu))/(rho*(1 + nu)))

    vs = np.sqrt(-1*(3*K*(2*nu - 1))/(2*rho*(nu + 1)))

    gamma = np.sqrt((2*nu - 2)/(2*nu - 1))

    lame = (3*K*nu)/(1 + nu)

    E = 3*K*(1 - 2*nu)

    mu = (3*K*(1 - 2*nu))/(2*(1 + nu))

    M = (3*K*(1 - nu))/(1 + nu)

    return vp, vs, gamma, lame, K, E, mu, nu, M 

Here it is after black:

def nu_k_model(rho=None, nu=None, K=None):

    vp = np.sqrt((3 * K * (1 - nu)) / (rho * (1 + nu)))

    vs = np.sqrt(-1 * (3 * K * (2 * nu - 1)) / (2 * rho * (nu + 1)))

    gamma = np.sqrt((2 * nu - 2) / (2 * nu - 1))

    lame = (3 * K * nu) / (1 + nu)

    E = 3 * K * (1 - 2 * nu)

    mu = (3 * K * (1 - 2 * nu)) / (2 * (1 + nu))

    M = (3 * K * (1 - nu)) / (1 + nu)

    return vp, vs, gamma, lame, K, E, mu, nu, M

Here is the latex for some of the equations. If you render them you'll notice very little space between the multiplication:

$V_{P} = \sqrt{\frac{3K(1 - \nu)}{\rho(1 + \nu)}}$   
$V_{S} = \sqrt{-\frac{3K(2\nu - 1)}{2\rho (\nu + 1)}}$
$\mu = \frac{3K(1-2\nu)}{2(1+\nu)}$

What I am trying to do is make sure that this:

vp = np.sqrt((3K(1 - nu))/(rho*(1 + nu)))

Looks as close to this as possible:
$V_{P} = \sqrt{\frac{3K(1 - \nu)}{\rho(1 + \nu)}}$

I understand the difficulties @ambv and is also true that PEP8 points this out in Other Recommendations , however the lack of readability of Black formatting when it comes to math formulas is quite annoying. Can you point out the part of the code that is handling this, maybe someone out of the many people interested in this matter will find a nice solution to it.

People who participated in this discussion may be interested in issue https://github.com/psf/black/issues/1252, which unfortunately also got closed early. There, I argue that Black would be much more useful, if it was strict, but not obsessed with foolish consistency. I would welcome your comments.

For anyone who's still interested in this, yapf added a knob in 0.26.0 (2019-02-08) which makes it remove spaces around arithmetic operators with higher precedence in each expression (ARITHMETIC_PRECEDENCE_INDICATION).

Was this page helpful?
0 / 5 - 0 ratings