If possible, it would be really useful to have an extra parameter that overrides the default behaviour of the pandas.Series.value_counts() function and allows sorting by another series. We are looking to push the result out to a bar plot, and it makes sense to have the categories (x-axis) displayed in a particular order.
Thanks!
Can you give an example of what you want? Does order/sort_index not do what you want?
Here is an example of the code we are trying to run:
import pandas
import matplotlib.pyplot as plt

dat = pandas.Series([1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 4, 4, 5, 5, 5, 5, 6])
# value_counts() orders the result by count, not by value
freq = dat.value_counts()
fig, ax = plt.subplots(figsize=(6, 6))
# the bar plot places one labelled tick per category, so no manual set_xticks is needed
freq.plot(kind='bar', ax=ax)
ax.set_xlabel('Items')
ax.set_ylabel('Frequency')
plt.show()
You will see that items appear on the x-axis in order 5,3,4,1,2,6 - as opposed to 1,2,3,4,5,6.
Maybe it's possible to somehow incorporate the collections module code into the pandas value_counts() routine; the values() and keys() of a collections.Counter are exactly what is needed for the y and x axes respectively.
import collections
a = [1,1,4,4,2,3,4,1,5,1,4,2,2,4,2,2,3,1,2,2,3,3,4,5,5]
a.sort()
print(a)
counter = collections.Counter(a)
print(counter) # Counter({2: 7, 4: 6, 1: 5, 3: 4, 5: 3})
print(list(counter.keys())) # [1, 2, 3, 4, 5]
print(list(counter.values())) # [5, 7, 4, 6, 3]
print(counter.most_common(3)) # [(2, 7), (4, 6), (1, 5)]
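The keys() order above only comes out sorted because the list was sorted before counting. A minimal sketch, assuming Python 3, of getting the same key-ordered x and y values without relying on that (the names pairs, keys and counts are just for illustration):

```python
import collections

a = [1, 1, 4, 4, 2, 3, 4, 1, 5, 1, 4, 2, 2, 4, 2, 2, 3, 1, 2, 2, 3, 3, 4, 5, 5]
counter = collections.Counter(a)

# Sort the (value, count) pairs by value rather than by count.
pairs = sorted(counter.items())
keys = [k for k, _ in pairs]    # x-axis categories
counts = [c for _, c in pairs]  # y-axis frequencies

print(keys)    # [1, 2, 3, 4, 5]
print(counts)  # [5, 7, 4, 6, 3]
```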
Why can't you just sort the results yourself after getting the value counts? That's going to be exactly the same as an internal function...
e.g.
dat = pandas.Series([1,1,1,2,2,3,3,3,3,4,4,4,5,5,5,5,6])
fig = plt.figure(figsize=(6,6))
freq= dat.value_counts().sort_index()
Bam, now it's sorted 1-6
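If the order you want is not simply ascending, the same idea should extend to an arbitrary ordering by reindexing the counts. A rough sketch, where custom_order is a made-up list standing in for whatever "other series" defines the x-axis order:

```python
import pandas

dat = pandas.Series([1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 4, 4, 5, 5, 5, 5, 6])
freq = dat.value_counts()

# Ascending order of the categories (the index of the counts).
by_value = freq.sort_index()

# Hypothetical custom ordering; labels missing from freq would come back as NaN.
custom_order = [6, 5, 4, 3, 2, 1]
by_custom = freq.reindex(custom_order)

print(by_value.tolist())   # [3, 2, 4, 3, 4, 1]
print(by_custom.tolist())  # [1, 4, 3, 4, 2, 3]
```

Either result can then be passed to .plot(kind='bar') and the bars follow the index order.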
Yes, that works. My apologies - I was told by the programmer that they had tried it and it did not work.