Go: proposal: x/exp/rand: add NewLockedSource

Created on 21 Jun 2018  路  19Comments  路  Source: golang/go

This related issue makes a case for math/rand exporting some implementation of Source that is concurrency safe.

Pursuant to that, this change exports LockedSource from golang.org/x/exp/rand. That seems like a nice start, however, with LockedSource's attributes still being unexported and with there being no constructor-like function for instantiating a new LockedSource, there remains no practical way to initialize variables of that type.

I want to propose the addition of a new package-level function NewLockedSource(seed uint64) Source in golang.org/x/exp/rand.

I'm happy to PR this myself if there's some consensus that this is a reasonable change.

Proposal Proposal-Hold

Most helpful comment

The NewPassword function shows this. It acquires the same mutex 32 times per call.

Good point. And that's where your Pool option is better.

I do, however, stand by my assertion that there being no out-of-the-box option that is both concurrency safe _and_ guaranteed not to be re-seeded by someone else is going to catch novice gophers unaware. I've been writing Go code for four years and when I first became aware of this, I was like, "Wut??"

Another way to go is to remove rand.Seed() (in Go 2). It should take care of your reseed concern. I don't think it can be used reliably anyway with multiple users of the global source.

猬嗭笍So much 鉂わ笍 for that idea. As long as the seed automatically is initialized with something reasonable (like Unix time nanoseconds or something) that could work. The only downside I could see to that is it might sometimes be desirable to use a hard-coded seed to ensure determinism when using rand across multiple runs of a test suite. But that is probably an outlying use case and not a sufficient impetus for keeping a dangerous package level function like rand.Seed() around.

All 19 comments

CC @robpike

This seems reasonable, but I'd like to hold off changing exp/rand's API too much until the various proposals about the future of the math/rand package are resolved.

This seems reasonable, but I'd like to hold off changing exp/rand's API too much until the various proposals about the future of the math/rand package are resolved.

@robpike, I'd like to start contributing. Can you help me understand the above? It seems to me that exp/rand like many other exp/... packages is an experiment that occurs in parallel to sorting out the future of some existing package. I guess what I don't understand is why it wasn't premature for you to create exp/rand while the future of math/rand gets hammered out, but it's premature to add one simple function to it.

EDIT: I'm worried the above sounds much more confrontational than intended. I'm genuinely trying to understand the process here so I can help.

x/exp/rand was created as a possible replacement for math/rand for Go 2. As such, x/exp/rand is intended to provide the same API as math/rand. Changing the API of x/exp/rand complicates the decision of whether to use it as a replacement for math/rand.

@ianlancetaylor that's a great explanation and it makes a lot of sense. Thank you.

I'll point out, however, that there are already changes to the API in exp/rand, such as LockedSource now being exported when it previously was not. Exporting it, however, has not resulted in any meaningful improvement because you still cannot initialize one. All I am proposing here is to correct a mistake. I'm not sure how that would complicate the future direction of exp/rand or math/rand.

Makes sense, but my point about holding off still holds. The interface and some internals for math/rand have also changed lately, and I would like to bring exp/rand into sync before moving on with exp/rand. I'll plan to do that soon.

Thanks @robpike. Is there anything I can do then to help you sync these packages and solidify their future direction?

It is part of the point of exp/rand to not have to lock the source, by making it cheap enough to have lots of them and high enough quality not to worry about correlations between sources. Putting a NewLockedSource API in the package goes against the idea for the package's model.

@robpike I've understood that that was a goal. I see how that makes it feasible to give a very large number of goroutines their _own_ Source, thus eliminating any contention. However, I think there is a very broad range of patterns not supported by that.

Consider, for instance, a discrete function such as the one below. This is already concurrency safe (but is at risk of some other code changing the seed) and is utilized concurrently by many, many goroutines.

package generator

const (
        lowerAlphaChars  = "abcdefghijklmnopqrstuvwxyz"
    upperAlphaChars = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    numberChars       = "0123456789"
    passwordLength  = 16
    passwordChars    = lowerAlphaChars + upperAlphaChars + numberChars
)

// NewPassword generates a strong, random password
func NewPassword() string {
    b := make([]byte, passwordLength)
    // Passwords need to include at least one character from each of the three
    // groups. To ensure that, we'll fill each of the first three []byte elements
    // with a random character from a specific group.
    b[0] = lowerAlphaChars[rand.Intn(len(lowerAlphaChars))]
    b[1] = upperAlphaChars[rand.Intn(len(upperAlphaChars))]
    b[2] = numberChars[rand.Intn(len(numberChars))]
    // The remainder of the characters can be completely random and drawn from
    // all three character groups.
    for i := 3; i < passwordLength; i++ {
        b[i] = passwordChars[rand.Intn(len(passwordChars))]
    }
    // For good measure, shuffle the elements of the entire []byte so that
    // the 0 character isn't predicatably lowercase, etc...
    for i := range b {
        j := rand.Intn(len(b))
        b[i], b[j] = b[j], b[i]
    }
    return string(b)
}

Re-writing this to utilize the model you're proposing in exp/rand gives me two unpalatable choices for moving past the risk of some other code altering the seed.

  1. I allocate a new Source every time this function is called-- which seems wasteful.
  2. I pollute the function signature by requiring every caller to pass in the Source belonging to their goroutine.

Neither of these is a good option and afaict, they're the only two options that the model you're proposing would give me.

@krancour You can use a sync.Pool of sources. It should also have much less contention than a single shared locked source.

@thaustad I could. It might be slightly more efficient than writing my own wrapper around a Source and a sync.Mutex (which is what I actually do in these scenarios), but it's not any more elegant...

That is to say, let's bring this back to my chief complaint about math/rand that I expressed in #21393. The standard library puts us between a rock and a hard place, forcing us to choose:

  1. Concurrency safety (by using math/rand package level functions), but risk the shared Source being stupidly re-seeded by other code with a value like 1.
  2. A seeded Source that you know for sure hasn't been / cannot be stupidly re-seeded by other code (i.e. _not_ math/rand package level Source), but is _not_ concurrency safe.

My question is: _Who doesn't want both of those?_

So whether using a Mutex or a Pool, we are forced to write additional code to accomplish something that a novice Gopher might very reasonably assume the standard library accounted for already.

This isn't about me not wanting to write a few extra lines of code. _This is about eliminating surprising and dangerous behavior from the standard library._

I see. At least you have a third option.

I can't say I have had much use for both features. I usually need more than one random number and then I prefer a single lock for all of them. The NewPassword function shows this. It acquires the same mutex 32 times per call. With other concurrent goroutines calling the same function, that lock is pretty busy.

It might be suprising, but I think it encourages better code. Novice Gophers has to understand how to access shared objects early on anyway.

Another way to go is to remove rand.Seed() (in Go 2). It should take care of your reseed concern. I don't think it can be used reliably anyway with multiple users of the global source.

The NewPassword function shows this. It acquires the same mutex 32 times per call.

Good point. And that's where your Pool option is better.

I do, however, stand by my assertion that there being no out-of-the-box option that is both concurrency safe _and_ guaranteed not to be re-seeded by someone else is going to catch novice gophers unaware. I've been writing Go code for four years and when I first became aware of this, I was like, "Wut??"

Another way to go is to remove rand.Seed() (in Go 2). It should take care of your reseed concern. I don't think it can be used reliably anyway with multiple users of the global source.

猬嗭笍So much 鉂わ笍 for that idea. As long as the seed automatically is initialized with something reasonable (like Unix time nanoseconds or something) that could work. The only downside I could see to that is it might sometimes be desirable to use a hard-coded seed to ensure determinism when using rand across multiple runs of a test suite. But that is probably an outlying use case and not a sufficient impetus for keeping a dangerous package level function like rand.Seed() around.

Since Rob said to hold off, marking this proposal-hold.

The only downside I could see to that is it might sometimes be desirable to use a hard-coded seed to ensure determinism when using rand across multiple runs of a test suite. But that is probably an outlying use case and not a sufficient impetus for keeping a dangerous package level function like rand.Seed() around.

Determinism for randomised analyses is a crucial attribute of any language that is used for these kinds of things. Go is used for these kinds of things, so removing the capacity to specify a seed is a non-starter.

@kortschak I don't think the suggestion I was commenting on (the one you quoted) would have ruled out specifying any seed you want for a given source (e.g. at the time it is created). I think the suggestion had more to do with eliminating the avenue for potentially _re-seeding_ a shared, package-level source, which leaves anyone who uses it vulnerable to the possibility that someone has reseeded using a constant value.

I don't see how you can have one without the other. The PRNG rand.Source interface Seed method is called on the concrete type at construction, so you need to have that method. How would you differentiate providing a seed at package initialisation without allowing it later or requiring that the seed be set manually?

If the shared package-level source were automatically seeded with a sensible (and not static) value, which could not be altered after the fact with a package-level function call, you could still do what you want to do (provide your own seed) by _not_ using the package-level source and initializing you own source instead-- which is what you should _already_ be doing now anyway if you care about never having the rug pulled out from under you.

Yeah, you're right.

Was this page helpful?
0 / 5 - 0 ratings