Pandas: Add order_by parameter to pandas.Series.value_counts()

Created on 25 Jul 2013 · 4 Comments · Source: pandas-dev/pandas

If possible, it would be really useful to have an extra parameter that overrides the default behaviour of pandas.Series.value_counts() so that the result can be sorted by another series. We are looking to push the result out to a bar plot, and it makes sense to have the categories (x-axis) displayed in a particular order.

Thanks!

All 4 comments

Can you give an example of what you want?

Does order/sort_index not do what you want?

Here is an example of the code we are trying to run:

import pandas
import matplotlib.pyplot as plt
import numpy as np

dat = pandas.Series([1,1,1,2,2,3,3,3,3,4,4,4,5,5,5,5,6])
fig = plt.figure(figsize=(6,6))
freq= dat.value_counts()
width = 1.0
plt.ylabel('Frequency')
plt.xlabel('Items')
ax = plt.axes()
ax.set_xticks(np.arange(len(freq)) + (width / 2))
ax.set_xticks(freq)
freq.plot(kind='bar')
plt.show()

You will see that the items appear on the x-axis in the order 5,3,4,1,2,6, as opposed to 1,2,3,4,5,6.

Maybe it's possible to somehow incorporate the collections module code into the pandas value_counts() routine; the values() and keys() of a collections.Counter are exactly what is needed for the y and x axes respectively.

import collections
a = [1,1,4,4,2,3,4,1,5,1,4,2,2,4,2,2,3,1,2,2,3,3,4,5,5]
a.sort()
print(a)
counter = collections.Counter(a)
print(counter)  # Counter({2: 7, 4: 6, 1: 5, 3: 4, 5: 3})
print(list(counter.keys()))  # [1, 2, 3, 4, 5] (insertion order, so it relies on the sort above)
print(list(counter.values()))  # [5, 7, 4, 6, 3]
print(counter.most_common(3))  # [(2, 7), (4, 6), (1, 5)]
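
As a rough sketch of the same idea without sorting the input list first: a Counter keeps keys in first-seen order, so sorting its items afterwards gives the x-axis labels and bar heights in key order. The names `labels` and `heights` are made up here for illustration.

```python
import collections

a = [1, 1, 4, 4, 2, 3, 4, 1, 5, 1, 4, 2, 2, 4, 2, 2, 3, 1, 2, 2, 3, 3, 4, 5, 5]

# Count without pre-sorting the input; keys come out in first-seen order.
counter = collections.Counter(a)

# Sort the (key, count) pairs by key to get a stable x-axis order.
pairs = sorted(counter.items())
labels = [key for key, _ in pairs]       # x-axis categories (illustrative name)
heights = [count for _, count in pairs]  # bar heights (illustrative name)

print(labels)   # [1, 2, 3, 4, 5]
print(heights)  # [5, 7, 4, 6, 3]
```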

Why can't you just sort the results yourself after getting the value counts? That's going to be exactly the same as an internal function...

E.g.:

dat = pandas.Series([1,1,1,2,2,3,3,3,3,4,4,4,5,5,5,5,6])
fig = plt.figure(figsize=(6,6))
freq= dat.value_counts().sort_index()

Bam, now it's sorted 1-6
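
For the "sort by another series" part of the request, a minimal sketch along the same lines: sort_index() covers natural label order, and reindex() can impose any explicit order on the counts. The list `custom_order` is a hypothetical ordering chosen only for illustration, and plotting is left out so the snippet runs without matplotlib.

```python
import pandas as pd

dat = pd.Series([1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 4, 4, 5, 5, 5, 5, 6])

# value_counts() sorts by frequency (descending) by default.
freq = dat.value_counts()

# Natural label order: sort the result by its index.
by_label = freq.sort_index()

# Arbitrary order: reindex against an explicit list of labels.
# custom_order is a hypothetical example; fill_value=0 covers labels
# that never occur in the data.
custom_order = [6, 5, 4, 3, 2, 1]
by_custom = freq.reindex(custom_order, fill_value=0)

print(by_label)
print(by_custom)
```

Either result can then be passed to .plot(kind='bar') as in the original example, and the bars should follow the order of the index.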

Yes, that works. My apologies - I was told by the programmer that they had tried it and it did not work.
