Pandas: ENH: add option to suppress scientific notation (for small values?)

Created on 18 Feb 2016 · 5 Comments · Source: pandas-dev/pandas

I fairly frequently run into situations where I don't want small numbers displayed in scientific notation, for example:

In [3]: pd.set_option('display.precision', 2)

In [4]: pd.DataFrame(np.random.randn(5, 5)).corr()
Out[4]: 
      0     1         2         3     4
0  1.00 -0.57  2.15e-02 -3.48e-02 -0.64
1 -0.57  1.00  2.59e-01 -5.56e-01  0.51
2  0.02  0.26  1.00e+00  2.91e-03 -0.06
3 -0.03 -0.56  2.91e-03  1.00e+00  0.36
4 -0.64  0.51 -6.21e-02  3.63e-01  1.00

or

In [16]: pd.Series(np.random.poisson(size=1000)).value_counts(normalize=True)
Out[16]: 
0    3.80e-01
1    3.63e-01
2    1.75e-01
3    5.70e-02
4    1.80e-02
5    5.00e-03
7    1.00e-03
6    1.00e-03
dtype: float64

Scientific notation isn't helpful when you are trying to make quick comparisons across elements and the values are known to lie in a -1 to 1 or 0 to 1 range.

I propose adding some sort of display flag to suppress scientific notation on small numbers, and just report zeros in these cases instead. Alternatively we could also suppress it on large numbers, but I am not sure how helpful that is. I usually only find myself going up against it on small numbers, in exactly the use cases (correlations, proportions) above.

Labels: API Design, Clean, Output-Formatting

All 5 comments

(and I volunteer to work on this if others are okay with the idea)

http://pandas.pydata.org/pandas-docs/stable/options.html#number-formatting

There are already 4 related options to do things like this:
display.precision, display.chop_threshold, display.float_format, and pd.set_eng_float_format(accuracy=3, use_eng_prefix=True).

So what I think we need is some consolidation and maybe some docs.

Would love for you to have a look to see how this can be done better.
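
For reference, here is a minimal sketch of how one of the options listed above, display.float_format, can force fixed-point output. The sample frame and the two-decimal format string are illustrative assumptions and are not taken from the issue; the option is applied through pd.option_context so it does not leak into the rest of the session.

```python
import pandas as pd

# Illustrative data (an assumption): a mix of "normal" and very small values,
# similar in spirit to a correlation matrix.
df = pd.DataFrame(
    {
        "a": [1.0, 2.15e-02, 2.91e-03],
        "b": [-0.57, 1.0, 3.0e-05],
    }
)

# display.float_format takes a callable that turns one float into a string.
# "{:.2f}".format always produces fixed-point text, so values too small for
# two decimals come out as 0.00 instead of switching to scientific notation.
with pd.option_context("display.float_format", "{:.2f}".format):
    fixed = df.to_string()

print(fixed)
```

Note that this rounds for display only; the underlying values are unchanged, and a small negative value would be shown as -0.00.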

Hmm, embarrassing that I hadn't seen chop_threshold before: I've made changes to display.precision and edited its docs, and yet never noticed it. That sounds like what I want, though I can still get it to behave poorly:

In [25]: pd.set_option('display.precision', 2)
In [26]: pd.set_option('chop_threshold', 0.01)  # maybe this should be 0.005, not sure of order of operations, but I get issues either way
...
In [30]: pd.DataFrame(np.random.randn(5, 5)).corr()
Out[30]: 
      0         1     2     3         4
0  1.00 -3.14e-01  0.07 -0.28  1.42e-01
1 -0.31  1.00e+00 -0.82 -0.35  0.00e+00
2  0.07 -8.19e-01  1.00  0.54 -4.71e-01
3 -0.28 -3.50e-01  0.54  1.00  1.21e-01
4  0.14  0.00e+00 -0.47  0.12  1.00e+00

Thanks for pointing me to it though. I'll play around with this for a while and see if there's some clean-up that can be done. I would love it if I could change display.precision while working on some data and have the chop_threshold update to match, rather than having to keep them in sync by hand.
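
As a rough sketch of what "keeping them in sync" could look like in user code today: the helper below is hypothetical (set_precision_with_chop is not a pandas function), and the choice of half a unit in the last displayed digit as the threshold is an assumption, following the 0.005 guess in the snippet above. It only sets the two existing options together; it does not by itself guarantee that the mixed output shown above goes away.

```python
import pandas as pd


def set_precision_with_chop(precision: int) -> None:
    """Hypothetical helper: set display.precision and derive chop_threshold from it.

    The threshold is half a unit in the last displayed decimal place
    (e.g. 0.005 for precision=2), which is an assumption about what
    "matching" should mean, not documented pandas behaviour.
    """
    pd.set_option("display.precision", precision)
    pd.set_option("display.chop_threshold", 0.5 * 10 ** -precision)


# Example: two decimals, chop anything whose magnitude is below 0.005.
set_precision_with_chop(2)
print(pd.get_option("display.precision"), pd.get_option("display.chop_threshold"))
```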
