Chapel: Provide UUID library

Created on 3 Feb 2018 · 10 Comments · Source: chapel-lang/chapel

UUIDs are a great way to identify objects without having to maintain serial ids. Once again, Python has a pretty nifty library for UUIDs I use often.

Labels: Libraries / Modules, good first issue, Feature Request, user issue

All 10 comments

I'm not familiar with how UUIDs are generated in practice and was curious as to whether there are unique challenges with implementing a UUID library in a parallel/distributed setting. I.e., to generate good UUIDs is there a need to coordinate between tasks or locales (as there might be to generate good random numbers, for example), or is the generation of UUIDs sufficiently parallel / isolated that no such coordination would be needed?

I assume there's some challenge in guaranteeing unique UUIDs when generating them in parallel, but that should be relatively easy to handle in Chapel, especially if it depends directly on parallel-safe RNGs.
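
To illustrate why little coordination is needed for purely random identifiers, here is a minimal sketch in Python rather than Chapel. It assumes every task draws directly from the operating system's entropy source via `os.urandom`; the helper name `make_ids` and the batch sizes are made up for illustration. With per-task seeded pseudo-random generators the picture changes, because two tasks that share a seed would produce identical streams.

```python
import os
from concurrent.futures import ThreadPoolExecutor

def make_ids(count):
    """Generate `count` 128-bit random identifiers as hex strings.

    Each call reads from the OS entropy source; no state is shared
    between tasks, so no locking or seed management is needed here.
    """
    return [os.urandom(16).hex() for _ in range(count)]

# Eight tasks, each producing 1000 identifiers independently.
with ThreadPoolExecutor(max_workers=8) as pool:
    batches = list(pool.map(make_ids, [1000] * 8))

all_ids = [i for batch in batches for i in batch]
print(len(all_ids), len(set(all_ids)))
```

The identifiers here are raw 128-bit values, not yet formatted as RFC 4122 UUIDs.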

This is a (soft) dependency in an application @ajoshpratt is considering porting to Chapel.

There is always a chance of collision. Under conditions where you have a single entropy source, it's vanishingly small. I do imagine it's greater when you have multiple entropy sources at play, although it seems unlikely that each RNG generating the input data would happen to have the same seed (not sure what the probability of that is).
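
For a rough sense of "vanishingly small", the standard birthday-bound approximation can be evaluated directly. This is a sketch in Python, assuming an ideal entropy source and the 122 random bits of a version 4 UUID (128 bits minus 4 version bits and 2 variant bits); it says nothing about the separate case of two generators sharing a seed. The function name `collision_probability` is only illustrative.

```python
import math

# A version 4 UUID has 122 random bits (4 version bits and 2 variant bits are fixed).
RANDOM_BITS = 122

def collision_probability(n, bits=RANDOM_BITS):
    """Birthday-bound approximation: p ~ 1 - exp(-n(n-1) / 2^(bits+1))."""
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(-n * (n - 1) / 2.0 ** (bits + 1))

for n in (10**6, 10**9, 10**12):
    print(f"{n:.0e} UUIDs -> p ~ {collision_probability(n):.3e}")
```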

Not ready to show any code, but just a little progress; the basic idea is you pull in 128 bits of random data, set some bits according to the standard, then convert to hexadecimal and add hyphens as necessary.
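
The steps described above can be sketched as follows. This is Python, not the Chapel code under development, and the function names `uuid4_from_bytes` and `uuid4` are hypothetical. The bit positions follow RFC 4122: the high nibble of octet 6 holds the version, and the top two bits of octet 8 hold the variant. The last lines cross-check the result against Python's own `uuid` module.

```python
import os
import uuid

def uuid4_from_bytes(raw):
    """Format 16 random bytes as an RFC 4122 version 4 UUID string."""
    if len(raw) != 16:
        raise ValueError("need exactly 16 bytes")
    b = bytearray(raw)
    b[6] = (b[6] & 0x0F) | 0x40  # high nibble of octet 6 = version 4
    b[8] = (b[8] & 0x3F) | 0x80  # top two bits of octet 8 = variant 10
    h = b.hex()
    # Insert hyphens in the canonical 8-4-4-4-12 layout.
    return f"{h[0:8]}-{h[8:12]}-{h[12:16]}-{h[16:20]}-{h[20:32]}"

def uuid4():
    """Draw 16 bytes from the OS entropy source and format them."""
    return uuid4_from_bytes(os.urandom(16))

# Cross-check against the standard library using the same input bytes.
sample = os.urandom(16)
assert uuid4_from_bytes(sample) == str(uuid.UUID(bytes=sample, version=4))
print(uuid4())
```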

It's a little less straightforward to pull 128 random bits in Chapel, but it's easy enough to pull 16 bytes from /dev/urandom and then convert them to hexadecimal. I've written some code that does this and I believe it's correct. To test, I generated a 64-bit integer, manually converted it into binary, manually calculated the eight 8-bit integers it can be represented as, and then did the bitwise operations to convert them to hexadecimal. My answer agrees with what Python produces, so now would be the time to test against a 128-bit UUID and see if I can spit it out.
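
A check of that kind might look like the following sketch, written in Python for illustration. The helper names `int_to_bytes_be` and `bytes_to_hex` and the sample value are assumptions, not taken from the actual code. It splits a 64-bit integer into eight 8-bit values, converts each to two hex digits with shifts and masks, and compares the result with Python's built-in formatting.

```python
HEX_DIGITS = "0123456789abcdef"

def int_to_bytes_be(value, width=8):
    """Split an unsigned integer into `width` 8-bit values, most significant first."""
    return [(value >> (8 * (width - 1 - i))) & 0xFF for i in range(width)]

def bytes_to_hex(octets):
    """Convert each byte to two hex digits using shifts and masks only."""
    return "".join(HEX_DIGITS[b >> 4] + HEX_DIGITS[b & 0x0F] for b in octets)

value = 0x0123456789ABCDEF
octets = int_to_bytes_be(value)

# The manual conversion should agree with the built-in hex formatting.
assert bytes_to_hex(octets) == format(value, "016x")
print(octets, bytes_to_hex(octets))
```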

Seems to work fine. I'm not sure whether we have to worry about endianness in Chapel, but the algorithm relies on setting the most significant bits of certain fields, so a compliant implementation might need to check for that.
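
To make the endianness concern concrete, here is a small Python sketch (not Chapel, and not the code from this thread). It applies the same version mask to a 128-bit integer loaded from the same 16 bytes in two byte orders. The input `raw` is a fixed stand-in for random data, and `set_version4` is a hypothetical helper. Masking individual octets of a byte array, rather than one wide integer, avoids the question entirely.

```python
raw = bytes(range(16))  # stand-in for 16 random bytes; octet i has value i

def set_version4(n):
    """Set bits 76..79 of a 128-bit integer to 0100 (version 4)."""
    return (n & ~(0xF << 76)) | (0x4 << 76)

# Octet 0 treated as most significant: bits 76..79 are the high nibble of octet 6.
good = set_version4(int.from_bytes(raw, "big")).to_bytes(16, "big")

# Same mask after a little-endian load: the nibble lands in octet 9 instead.
bad = set_version4(int.from_bytes(raw, "little")).to_bytes(16, "little")

print(good.hex())
print(bad.hex())
```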

Proof of concept here:

https://gist.github.com/ajoshpratt/393673ec1525a0729706498d2a4a9a19

It's a bit rough and has a few extra pieces (such as custom RNGs, which aren't used here; I'm just streaming directly from /dev/urandom) and I'm sure there are problems. I'm also indexing _some_ arrays with 0, and others with 1. That's fixable.

(there's also inconsistent case, some constants and variables that could be handled differently, blah blah blah).

It looks like the Python implementation uses MAC addresses (https://docs.python.org/3.6/library/uuid.html#uuid.getnode) as part of the UUID. Is that something a Chapel version should do as well?

Oh, I see in https://tools.ietf.org/html/rfc4122.html that the node field might be random data:

For UUID version 1, the node field consists of an IEEE 802 MAC
address, usually the host address. For systems with multiple IEEE
802 addresses, any available one can be used. The lowest addressed
octet (octet number 10) contains the global/local bit and the
unicast/multicast bit, and is the first octet of the address
transmitted on an 802.3 LAN.

For systems with no IEEE address, a randomly or pseudo-randomly
generated value may be used; see Section 4.5. The multicast bit must
be set in such addresses, in order that they will never conflict with
addresses obtained from network cards.

For UUID version 3 or 5, the node field is a 48-bit value constructed
from a name as described in Section 4.3.

For UUID version 4, the node field is a randomly or pseudo-randomly
generated 48-bit value as described in Section 4.4.
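
As a small illustration of the "no IEEE address" case quoted above, a random node value with the multicast bit set could be produced like this. The sketch is in Python; the name `random_node` is hypothetical, and it assumes the 48-bit node is handled as an integer whose most significant octet is the first octet of the address.

```python
import secrets

def random_node():
    """Return a random 48-bit node ID with the multicast bit set."""
    node = secrets.randbits(48)
    # The multicast bit is the least significant bit of the first octet,
    # which is bit 40 of the 48-bit value when octet 0 is most significant.
    return node | (1 << 40)

node = random_node()
print(f"{node:012x}")
```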

Do you know if you are aiming to support a particular UUID version?

For my purposes, I really just need a random UUID, so 4 is sufficient (and also seems to be the easiest to generate). I implemented it as a UUID4 procedure so that we could theoretically build the others in.

I'm honestly not sure what the use case for the earlier versions is anymore.

Going to close this now that a UUID package is available via mason (https://github.com/chapel-lang/mason-registry/pull/17).

UUID-specific feature requests / bug reports can be opened here: https://github.com/ajoshpratt/chplUUID
