Julia: function to choose element at random from an array

Created on 11 May 2013 · 16 Comments · Source: JuliaLang/julia

I think a useful utility function would be one that chooses a random element from an array. Python has this in random.choice. Here is what a Julia implementation might look like:

function choice(a::Array)
    n = length(a)
    idx = mod(rand(Uint),n)+1
    return a[idx]
end

An option would be to have an additional argument for drawing some number of samples. This would be sampling with replacement from an array with uniform probability.
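For readers on a current Julia version, the following is a minimal sketch of what such an extra argument looks like with the built-in rand, which accepts a collection and an optional count. The array contents and variable names are made up for illustration.

```julia
# Illustrative data; any array would do.
a = ["x", "y", "z"]

single = rand(a)      # one element, chosen uniformly
many = rand(a, 5)     # five elements, drawn with replacement

println(single)
println(many)
```

Because the draws are with replacement, many can contain repeats and can be longer than a itself.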

Another useful function would be sampling without replacement.

All 16 comments

Using idx = rand(1:n) would be a better solution, since reducing a random integer with mod can't guarantee a uniform distribution over the indices, IIRC.
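To illustrate the concern, here is a small sketch that enumerates an 8-bit range (0 to 255) instead of a full machine word, so every possible value can be counted exactly. The choice of n = 6 and the 8-bit range are assumptions made purely for the demonstration; the same effect applies at larger widths, only far more weakly.

```julia
# Count how often each index is hit when every 8-bit value
# is reduced with mod(x, n) + 1.
n = 6
counts = zeros(Int, n)
for x in 0:255
    counts[mod(x, n) + 1] += 1
end

println(counts)   # [43, 43, 43, 43, 42, 42]
```

Since 256 is not a multiple of 6, the first four indices are hit one extra time each, so they are slightly more likely than the last two.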

Sampling with replacement already exists in the Stats package as the randsample function. Adding sampling without replacement is an ongoing issue.

Yes I think a good way to do this is just a[rand(1:end)].

I think you mean a[1:rand(1:end)]

I've seen a handful of @JeffBezanson coding slip ups. This is not one of them ;-)

a[1:rand(1:end)] will produce a random-length prefix of a, rather than a random sample from a.
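A short sketch of the difference between the two expressions, using a made-up array:

```julia
a = [10, 20, 30, 40]

x = a[rand(1:end)]      # a single element of a
p = a[1:rand(1:end)]    # the first k elements of a, for a random k

println(x)
println(p)
```

Here x is always one of the four values, while p is always a leading slice such as [10] or [10, 20, 30].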

Since this issue is very old, I just wanted to update @johnmyleswhite's comment for 2014: sampling is in StatsBase.jl and the sample function does some very clever sampling.

How does StatsBase.sample work with a DataFrame? I know I can convert it to a multi-dimensional array first, but I'd like to keep my original structure/headers.

Just ask for a subset of rows generated by sampling from 1:size(df, 1).

-- John
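A rough sketch of that suggestion follows. To keep it self-contained, a plain matrix stands in for the DataFrame, and the built-in rand is used in place of StatsBase.sample; both substitutions are assumptions for illustration. The idea carries over on the assumption that the table is indexed by rows in the same way, as in df[rows, :].

```julia
# A matrix stands in for the table here; each row is one record.
m = [1 10; 2 20; 3 30; 4 40]

# Sample row indices rather than the rows themselves
# (two indices, drawn with replacement).
rows = rand(1:size(m, 1), 2)

# Index with the sampled row numbers, keeping all columns.
subset = m[rows, :]

println(rows)
println(subset)
```

Sampling indices leaves the original structure untouched; only the final indexing step depends on the container type.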

@SamChill This does not work for non-indexable collections, like a Set. What's an efficient way to get a random element out of a Set?

rand(a::Array) works after https://github.com/JuliaLang/julia/pull/9049.

An efficient implementation for Set (and Dict?) seems non-trivial, and a PR might be accepted. In the meantime you can use rand(collect(set)).
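A minimal sketch of that workaround, with a made-up set:

```julia
s = Set(["a", "b", "c"])

# collect copies the set into an array, which rand can index into.
x = rand(collect(s))

println(x)
```

The copy costs time and memory proportional to the size of the set on every call, which is why it is offered as a stopgap rather than an efficient solution.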

choose(xs, n) = xs[randperm(end)][1:n]

@undwad That seems inefficient, particularly when n is small compared to length(xs).
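One possible way around the full permutation, sketched here as an assumption rather than as anything proposed in the thread, is rejection sampling on indices: keep drawing random indices and skip any that were already used. The helper name choose_few is hypothetical. This only pays off when n is small compared to length(xs); as n approaches the length, repeated draws become frequent.

```julia
# Hypothetical helper: draw n distinct elements from xs by rejection
# sampling on indices. Intended for n much smaller than length(xs).
function choose_few(xs::Vector, n::Integer)
    0 <= n <= length(xs) || throw(ArgumentError("n must be in 0:length(xs)"))
    seen = Set{Int}()        # indices already drawn
    out = similar(xs, 0)     # empty vector with the same element type
    while length(out) < n
        i = rand(1:length(xs))
        if !(i in seen)      # skip indices drawn before
            push!(seen, i)
            push!(out, xs[i])
        end
    end
    return out
end

picked = choose_few(collect(1:100), 5)
println(picked)
```

The work done scales with n rather than with length(xs), at the cost of some wasted draws when an index repeats.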

Please don't necropost on old, resolved issues.

Yes I think a good way to do this is just a[rand(1:end)].
@JeffBezanson
Thanks, it worked.
