# Your code here
>>>import pandas as pd
>>>a=pd.Series([0,1,2,3,4])
>>> a.pct_change()
0 NaN
1 inf
2 1.000000
3 0.500000
4 0.333333
dtype: float64
This is fraction change and not percent change.
Dictionary definition of percent is
"One part in every hundred."
Either change the documentation to state that it is fractional change and not percent change with the formula (for backward compatibility) or rename it (to get rid of the obvious error).
a.pct_change()*100
Val
2017-12-31 NaN
2018-12-31 100.000000 %
2019-12-31 50.000000 %
2020-12-31 33.333333 %
2021-12-31 25.000000 %
pd.show_versions()
commit: None
python: 3.6.5.final.0
python-bits: 64
OS: Darwin
OS-release: 17.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.22.0
pytest: 3.5.0
pip: 10.0.0
setuptools: 39.0.1
Cython: None
numpy: 1.14.2
scipy: 1.0.1
pyarrow: None
xarray: None
IPython: 6.3.1
sphinx: None
patsy: None
dateutil: 2.7.2
pytz: 2018.4
blosc: None
bottleneck: None
tables: 3.4.2
numexpr: 2.6.4
feather: None
matplotlib: 2.2.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 1.0.1
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: 0.6.0
[
How would you re-phrase the first sentence? Do the examples not make it clear?
Hi TomAugspurger,
Are you asking about re-phrasing the title or the first sentence of the Problem Description in my post?
Which examples are you referring to?
The first sentence & examples in the docstring:
http://pandas-docs.github.io/pandas-docs-travis/generated/pandas.Series.pct_change.html
On Thu, Apr 19, 2018 at 9:06 PM, sunilshah notifications@github.com wrote:
Hi TomAugspurger,
Are you asking about re-phrasing the title or the first sentence of the
Problem Description in my post?Which examples are you referring to?
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
https://github.com/pandas-dev/pandas/issues/20752#issuecomment-382943291,
or mute the thread
https://github.com/notifications/unsubscribe-auth/ABQHIt3woWkn7RnlSOmmMPcZbkk89FOIks5tqUKjgaJpZM4Tcu9g
.
I would change "Percent Change" to "Fractional Change" for every occurrence.
The examples do imply fractional changes. When I looked at
https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.pct_change.html
there were no examples.
I haven't heard the term "Fractional Change" before, is it standard?
The docs I linked to are the developer docs, with a recently improved
docstring. The ones you saw were (understandably) for the currently
released version.
Are the improved docs sufficiently clear?
On Thu, Apr 19, 2018 at 9:15 PM, sunilshah notifications@github.com wrote:
I would change "Percent Change" to "Fractional Change" for every
occurrence.The examples do imply fractional changes. When I looked at
https://pandas.pydata.org/pandas-docs/stable/generated/
pandas.DataFrame.pct_change.htmlthere were no examples.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
https://github.com/pandas-dev/pandas/issues/20752#issuecomment-382945266,
or mute the thread
https://github.com/notifications/unsubscribe-auth/ABQHIkHqnUs8TbQ4_uBDqpF9wwpi5cCHks5tqUS-gaJpZM4Tcu9g
.
The dictionary meaning of fractional (adjective) is precisely that: pertaining to or being a fraction.
http://www.dictionary.com/browse/fractional
https://www.merriam-webster.com/dictionary/fractional
It could also be called relative change. Percent change implies 100 times the fractional or relative change. Therefore the name and the description are misleading.
The improved docs are better than the original because of the examples. The user still has to go through the numbers in the examples to compute and figure out that the author really meant fractions and not percentage, even though it mentions percent eight times. IMHO it should be corrected.
this is the first time anyone has brought this up, even though this method has existed since the beginning of pandas. this is splitting hairs about the semantic meaning. This is a won't fix.
The only reason to consider a change (to the docs), is to help new users learn faster. That was my motivation to spend my time filing the request.
Thanks. I think we're OK with the current docs for now. We have this issue
as a record in case someone else is confused, and we can change it then.
On Fri, Apr 20, 2018 at 10:27 AM, sunilshah notifications@github.com
wrote:
The only reason to consider a change (to the docs), is to help new users a
faster ramp in using pandas. That was my motivation to spend my time filing
the request.—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
https://github.com/pandas-dev/pandas/issues/20752#issuecomment-383132730,
or mute the thread
https://github.com/notifications/unsubscribe-auth/ABQHIjb3wxCRbOUbjX9YcjpyKkOYkI3Mks5tqf5pgaJpZM4Tcu9g
.
For what it's worth, I was calculating percent differences in a class today and had a student use this function. They neglected to multiply by 100 because they didn't check the docs (and I was surprised myself at the behavior after reading them). I used it as a good teachable moment for why you should _always_ check the docs, but I don't think that's a particularly good reason for keeping it that way. I volunteer to make the change if you're open to it, but I have to agree with sunilshah here that the standard definition of percent change is out of 100.
Yup, the description should specify that it gives fractional output instead of the expected percentage change output.
We just came across this too, when a student asked why the values weren't percents. I can't say why people have not complained, but it's very confusing to see that the method is called pct_change
and there is "percentage change" through the docstring, when in fact it is fractional change. In particular you have:
Percentage change between the current and a prior element.
Computes the percentage change from the immediately previous row by
default. This is useful in comparing the percentage of change in a time
series of elements.
This is just incorrect. It should read:
Fractional change between the current and a prior element.
*Be careful* - despite the name of this method, it calculates fractional
change and not percentage change.
Computes the fractional change from the immediately previous row by
default. This is useful in comparing the fraction of change in a time
series of elements.
If you need the percentage change, multiply these values by 100.
All the other instances of "percent" should be "fraction".
I agree that this is extremely confusing. At least the docs should be changed. The current Doc says thus, which is very much misleading. Percentage is scaled to 100.
`Percentage change between the current and a prior element.
Computes the percentage change from the immediately previous row by default. This is useful in comparing the percentage of change in a time series of elements.`
I also find this highly misleading as this function simply does not involve any percentage. I end up not using it as it makes my code less understandable.
Another name which may be more common and intuitive than fractional change would be relative change.
@charron - of course I completely agree, but I'm afraid we didn't get anywhere when I tried bringing this up on the list - starting here. The argument against doing anything about this appears to be, that when some presumably large proportion of Pandas users see a 'pct_change' of 1.0, from the values [1, 2], they instinctively think of this as being a proportion (rather than a percent), and they regard the actual number - 1.0 - as representing 100%, just not displayed as such. I think. But you're probably best reading it yourself, I doubt I'm doing the argument justice.
@matthew-brett thank you for linking the extended discussion in the mailing-list, I was not aware of it. Maybe this is a matter of background as one poster mentioned. I am only surprised that this issue did not get more traction, which unfortunately lends support to the argument against doing anything.
It seems obvious to me that there's at least a significant proportion of users who find this behavior somewhere between surprising and clearly incorrect. As someone who uses and teaches Pandas a lot, has contributed some commits and the original wheel build machinery, and wants it to succeed, it's sad for me that this one doesn't have a better solution than "It's fine as it is".
Maybe this should go into a separate issue, but I think in addition to the name of pct_change
, the formula on my interpretation is not actually correct for negative values:
In [2]: pd.Series([-2, -1]).pct_change()
Out[2]:
0 NaN
1 -0.5
dtype: float64
I would expect a change from -2 to -1 to be a 50% increase, not a decrease. But perhaps there is a scenario where the negative value is what is desired?
Most helpful comment
We just came across this too, when a student asked why the values weren't percents. I can't say why people have not complained, but it's very confusing to see that the method is called
pct_change
and there is "percentage change" through the docstring, when in fact it is fractional change. In particular you have:This is just incorrect. It should read:
All the other instances of "percent" should be "fraction".