Yii2: generateRandomInt() and generateRandomString()

Created on 15 Jan 2016  路  31Comments  路  Source: yiisoft/yii2

I have half-finished implementations of two new features for yii\base\Security

Before finishing them I want to ask: is there interest in these?

generateRandomInt() is useful when you want a secure random integer in arbitrary range. For example, you might prefer to use a secure source when generating a captcha. But it's not so simple to get random integers from random bytes. The implementation is quite well tested but manually, not using PHPUnit tests. This is what needs work.

The idea for adding an $alphabet parameter to generateRandomString() came from looking at the same captcha code. The method name generateRandomString() sounds like it ought to do this when it instead does something weirdly specific. This isn't such a big deal because most people can manage to make a random string _given a random integer generator_. I haven't tested this at all yet.

security under discussion enhancement

All 31 comments

Seems to be useful
@yiisoft/core-developers

@tom--

What could be wrong with the $range is INT?
Refactored code https://gist.github.com/githubjeka/43a04484a0c271409ba0#file-security-php-L515-L562

P.S.
And what about

If none of the aforementioned sources are available, then an Exception will be thrown.

from random_int(). Is should to ?

try 
return random_int($min, $max);
catch 
// yii2 implementation`

Yes. Both are useful.

I'm also thinking about introducing UUID helper since in order to generate these RNG is used.

@githubjeka random_int() throws exceptions. If the function is present then I don't want to interfere with them.

At line 555 of your gist, $value is uniformly distributed random number in the range PHP_INT_MIN thru PHP_INT_MAX. If, for example, $min and $max are -10 and 10 respectively then it will usually take very many tries to get a value in range. So we discard all except the bottom 5 bits of $value so it is uniformly distributed in the range 0..31 and keep generating $values until we get one in the range 0..21 which we then adjust down to the range -10..10.

In the worst cases (e.g. 0..512 making $range 513), this algorithm will have approx. 50% chance of failure on each try.

@samdark Good idea.

Every time I research UUID I get confused about standards. (The old joke about standards applies.) So I had the idea that if we support only one standard at first (preferably the most popular one) then people will tell us which others are important in issues.

Does anyone know the most popular standard for UUID?

@tom-- i've seen this lib https://github.com/ramsey/uuid
Probably you can take a look the code and find something intresting

@unclead Yikes! That's a good example of old joke: The great thing about standards is there are so many to choose from.

If https://github.com/ramsey/uuid is decent then I'm inclined to write a wiki on how to use this wheel with generateRandomKey() instead of reinventing (well, copy-pasting) one in our framework.

Iiuc, you can do in PHP 7 (not tested!)

$uuidFactory = new \Ramsey\Uuid\UuidFactory();
$uuidFactory->setRandomGenerator(
    new class implements \Ramsey\Uuid\Generator\RandomGeneratorInterface {
        public function generate($length) {
            return \Yii::$app->security->generateRandomKey($length);
        }
    }
);
\Ramsey\Uuid\Uuid::setFactory($uuidFactory);

$uuid = \Ramsey\Uuid\Uuid::uuid4();

In PHP 5 you must declare a class implementing RandomGeneratorInterface.

@samdark It was your idea. What do you prefer?

+1 to wiki "rightway ramsey/uuid + yii2."

ramsey library is good and is flexible but it allows unsafe choices for RNG which our securtity component doesn't. If we aren't going to have UUID generation helper in the core, I'm OK with wiki article. If we're going to implement it, it would be much slimmer than ramsey one.

Something like https://gist.github.com/dahnielson/508447 but with our integer method used instead of mt_rand.

The Andrew Moore/Anders Dahnielson version is clear and easy to understand. Looking at that I propose a generateRandomUUID() method. The name- and time-based UUID versions don't belong in Security component anyway.

In Yii 2 style, is it generateRandomUUID() or generateRandomUuid()?

There's a bunch of libraries out there with quality UUID implementations. Seems much better to leave these things to extra libs instead of core

@bpicolo any references besides ramsey one? If we aren't going to have it at least there should be something to recommend.

@tom-- I don't think any of UUIDs belong to security component. I was thinking more about a searate helper or component rather than additional methods in security component.

@samdark I'm a bit unclear on the issue with ramsey actually. (That is what I would recommend).

UUIDs aren't intended for cryptographic uses, so security doesn't seem like the right place for them in general.

Edit: Saw your update, I agree. Security is a weird place for them : )

@bpicolo The issue with ramsey is that people have been reporting collisions at an alarming rate, probably because on some platforms the random sources are poor. Furthermore, they've been unable to figure out why for months, probably because the random getter code is so complex.

In contrast, making v4 UUIDs is very simple in Yii 2.

I think there is value in providing this as an alternative

  1. It is simple enough for anyone to thoroughly review the code and decide if they are confident in it, and even to debug it if something goes wrong.
  2. It is hardwired to a random getter that we are confident is good.

@samdark If you want to add name- and time-based UUID generators, this method would belong together with them.

On the other hand, I think we have the most to offer with the random UUIDs so if that's all we do, it's ok in Security beside all the other generateRandomXyz() methods.

Reproduceable v3 and v5 could be useful as well but I don't see these used as often. Even if these aren't implemented, I think it's a good idea to separate UUID from Security. It's indeed using Security but isn't part of it.

@bpicolo regarding collisions, @tom-- is talking about https://github.com/ramsey/uuid/issues/80

OK. I'll work on ints and strings. The random uuid generator with unit test is ready when you have a place for it.

I've never needed such an UUID, most of the times a random number or an alphanumeric code does the job for identifying something. Look at YouTube, they use just a length of 11 with 64 chars. So this gives us 64^11=73'786'976'294'838'206'464 which are 73 quintillion possible IDs and that is quite enough for most cases.

And of course they must be unique, so if I use such IDs it is highly recommended to check the DB if there is such an ID stored.

I use a simple class for this with a function which needs just the length and a string with chars which should be used. For integers it is the length and if there should not be a zero at the beginning of the number.

@thehereward extra DB query to get unique ID could complicate things a lot in many cases.

@thehereward I too have never needed these UUIDs.

But 1. In many distributed systems, e.g. data-centers around the world, you must avoid the uniqueness test. 2. Given that we have 64-bit processors, do we use 1 word or 2? Hence 128-bit UUIDs. 3. Some people also need interoperability, hence the need for standards, hence RFC 4122.

And think ahead, like the the y2k problem makers didn't. How many records will exist in y2038? How many tweets, for example, will there be? Now you want to insert into _that_ table without a uniqueness test and still with _practically_ no chance of collision. It just seems like saving one 64-bit word isn't the right place to be spending your optimization efforts.

YouTube's URLs are not necessarily random and may be unrelated to engineering of distributed systems. Those could be created later once things are reconciled.

I think DBs should work like this. Probably some of the no-squealers do.

@tom--, how is your progress on this (and the tests?)

Hi @dynasource

I haven't done anything with it :( And I don't think I'll be able to get to it soon :'(

But it should be easy to adapt code from this gist to do these things.

The readme for the statistical tests shows use of dieharder. Since then I have come to prefer PractRand.

Thanks for your quick reply and your references!
We will put this issue on hold for the time being, no worries.

@tom-- thanks for good research. Let's wait for more people to request it.

For UUID why reinvent the wheel? Leverage what has already been to a high quality: https://github.com/ramsey/uuid

As for the randomInt: \mt_rand($min, $max)

Only my opinion here but seem counter productive to expend time and energy on logic that is already available via the SPL or well maintained packages.

@davidjeddy it makes sense not to blindly rely on any libraries when it comes to security matters. See comments above about the library you're referencing.

@davidjeddy Because ramsey/uuid was at that time experiencing collisions, which for a unique id is suboptimal.

When I tried to understand ramsey/uuid I discovered that it was too complicated for my skill to comprehend. I found that five lines of code accomplish the same thing. From my point if view, this is not reinventing the same wheel.

generateRandomString($length, $type='alphanumeric') must be helpful.
What if i don't want special characters in my string?

Was this page helpful?
0 / 5 - 0 ratings