Would come in handy for a number of different applications, from basic statistics, to understanding one's data, to estimating machine learning algorithmic load.
dupe of #2749
FYI, you can easily make your own and patch it in. Just put this in your startup code / application:
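from pandas import Series

def describe(self):
    """Summary statistics for a Series."""
    stats = [('nobs',  len(self.index)),      # total rows, NaN included
             ('valid', self.count()),         # non-NaN rows only
             ('mean',  self.mean()),
             ('min',   self.min()),
             ('max',   self.max()),
             ('std',   self.std()),
             ('10%',   self.quantile(0.10)),
             ('25%',   self.quantile(0.25)),
             ('50%',   self.median()),
             ('75%',   self.quantile(0.75)),
             ('90%',   self.quantile(0.90)),
             ('skew',  self.skew()),
             ('kurt',  self.kurt())]
    # Pass the keys as the index so the rows keep the order listed above.
    s = Series(dict(stats), index=[k for k, v in stats])
    # Snap values that are numerically indistinguishable from zero to 0.0.
    s[s.abs() < 0.000001] = 0.0
    return s

# Monkey-patch: this replaces the built-in Series.describe.
Series.describe = describe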
median out-of-the-box would really be a great thing to have!
Not just for Series but also for GroupBy's.
Note that median is already in describe, as in 50%
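A quick way to check this is to compare the `50%` row of `describe()` against `median()` directly. The snippet below is only an illustrative sketch: the sample values and the `key`/`value` column names are made up, and the groupby line simply shows that a per-group median is available through `.median()`.

```python
import pandas as pd

# Made-up sample data, with one outlier so that mean and median differ.
s = pd.Series([1.0, 2.0, 3.0, 4.0, 100.0])

summary = s.describe()
median_from_describe = summary["50%"]  # the 50% quantile row of describe()
median_direct = s.median()

# Hypothetical grouping column, to show the per-group median as well.
df = pd.DataFrame({"key": ["x", "x", "y", "y", "y"], "value": s})
group_medians = df.groupby("key")["value"].median()

print(median_from_describe, median_direct)
print(group_medians)
```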
@jorisvandenbossche indeed! Sorry, my bad. :baby:
Hijacking this issue to propose a variant (and steal more from dplyr)
Why not have a DataFrame.agg that is identical to DataFrame.groupby.agg, but with a "single group"? This also goes along with the new .resample.agg, yay for synergy. Basically it takes a function/str or list of functions/strs or a dict of column names to functions/str and aggregates accordingly.
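To make the "single group" idea concrete, here is a rough sketch of the semantics using the existing groupby machinery: every row is mapped to one constant key, and `.agg` is then called with a string, a list, and a dict. The frame, the column names `a` and `b`, and the `"all"` key are invented for illustration; this is not a proposed implementation.

```python
import pandas as pd

# Made-up example frame.
df = pd.DataFrame({"a": [1.0, 2.0, 3.0, 4.0],
                   "b": [10.0, 20.0, 30.0, 40.0]})

# Map every index label to the same key, so all rows land in one group.
single = df.groupby(lambda _label: "all")

# One function name: applied to every column.
by_name = single.agg("mean")

# A list of names: each column gets one result per function
# (the result columns form a (column, function) MultiIndex).
by_list = single.agg(["min", "max"])

# A dict: a different function for each named column.
by_dict = single.agg({"a": "sum", "b": "median"})

print(by_name)
print(by_list)
print(by_dict)
```

Each result is a one-row frame indexed by the constant key, which is roughly what a frame-level `agg` would return without the detour through `groupby`.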
Related to https://github.com/pydata/pandas/issues/1623