Runtime: Proposal: Create method Guid.NewSequentialGuid()

Created on 11 Jul 2017  路  20Comments  路  Source: dotnet/runtime

Make a new method to create a sequential Guid.

api-needs-work area-System.Runtime

Most helpful comment

I believe "sequential GUIDs" are designed to solve a very specific problem: you want to use GUIDs as a clustered primary key in a database. As such, I think this method might belong in code dealing with databases, but not in the general-purpose Guid type. Especially if the implementation does not use any of the standard GUID versions and does not guarantee that the generated GUIDs are globally unique, as I believe the one you showed does.

All 20 comments

What is a "sequential Guid"? Do you mean one that uses the machine's MAC address rather than being entirely random?

Note too that "sequential ids" of any sort have all kinds of problems, mostly dealing with recovery/concurrency scenarios. What sort of problem are you trying to solve here?

@stephentoub I use Guid (Prefix) + Bytes from DateTime (Ticks) to create a new guid.

@Clockwork-Muse currently I use to minimize index fragmentation when using Guids at database table PKs.

My class:

```c#
public static class IdentityGenerator
{
///


/// Create new sequential Guid
///

/// Guid
public static Guid NewSequentialGuid()
{
byte[] uid = Guid.NewGuid().ToByteArray();
byte[] binDate = BitConverter.GetBytes(DateTime.UtcNow.Ticks);

        byte[] secuentialGuid = new byte[uid.Length];

        secuentialGuid[0] = uid[0];
        secuentialGuid[1] = uid[1];
        secuentialGuid[2] = uid[2];
        secuentialGuid[3] = uid[3];
        secuentialGuid[4] = uid[4];
        secuentialGuid[5] = uid[5];
        secuentialGuid[6] = uid[6];
        // set the first part of the 8th byte to '1100' so     
        // later we'll be able to validate it was generated by us   

        secuentialGuid[7] = (byte)(0xc0 | (0xf & uid[7]));

        // the last 8 bytes are sequential,    
        // it minimizes index fragmentation   
        // to a degree as long as there are not a large    
        // number of Sequential-Guids generated per millisecond
        secuentialGuid[9] = binDate[0];
        secuentialGuid[8] = binDate[1];
        secuentialGuid[15] = binDate[2];
        secuentialGuid[14] = binDate[3];
        secuentialGuid[13] = binDate[4];
        secuentialGuid[12] = binDate[5];
        secuentialGuid[11] = binDate[6];
        secuentialGuid[10] = binDate[7];

        return new Guid(secuentialGuid);
    }
}

```

So, I call IdentityGenerator.NewSequentialGuid() to create a new Guid using "sequential" logic... Works fine, but I'm thinking about "NewSequentialGuid" as static member at "root" Guid struct...

EDIT: @karelz fixed code block formatting - ```c# instead of just ` (which is for inline code)

How do you guarantee uniqueness? You have to assume multiple threads, multiple processes, and multiple machines all running this logic at the same time.

Side note: SQL Server at least can generate unique, sequential ids for you. They even mention using them for index issues.
This wouldn't help with your "we generated it" prefix, though. And I don't know what RDBMS vendor you're using, either.

How do you guarantee uniqueness?

MAC addresses and having the sequence system-wide is how the Windows UuidCreateSequential function does it (and how UuidCreate used to do it before the leaking of MAC addresses when not specifically requested was deemed a security risk).

I believe "sequential GUIDs" are designed to solve a very specific problem: you want to use GUIDs as a clustered primary key in a database. As such, I think this method might belong in code dealing with databases, but not in the general-purpose Guid type. Especially if the implementation does not use any of the standard GUID versions and does not guarantee that the generated GUIDs are globally unique, as I believe the one you showed does.

I agree with @svick it's too especial use and not general-purpose... Tks!

Another reason is performance, as (in Windows anyway) the sequential algorithm is faster, especially for large batches.

How I could validate the generation of unique sequential Guids? I made some tests and everything works good... @jnm2

@balivo Even across processes and machines?

@balivo -
Validate them how? For uniqueness? For sequential-ness? Both would require you to keep a (presumably large) database of the ones you've seen, timestamped with the finest granularity you can manage (which may not be enough if your code is fast enough).

Note that attempting to validate the individual values is not only problematic, but doomed to failure. It's like trying to validate that alloc will give you a specific memory address.

@Clockwork-Muse uniqueness validation... The sequential-ness its ok for me...
I use this logic for last 9 years (approximately) and no conflicts between created Guids (I make some validations to verify this question, just for "duble"-safety)... I have a logic for validation of questions above (multi-thread and multi-process testing), I kept the applications running 30 days and no conflicts...

And I just use NewSequentialGuid at specific scope. In mobile apps, I create a new Order (SFA, example) and make an API request to save at database.

So @jnm2 ask about uniqueness, I don't have a sufficient powerful environment to make 100% test before Universe rest in peace (LoL)... I can upload my validation logic and we can make the test happens! ;)

@balivo I think for global uniqueness, you need theory, not tests. According to Wikipedia:

the number of random version 4 UUIDs which need to be generated in order to have a 50% probability of at least one collision is 2.71 quintillion

Version 4 GUIDs have 122 random bits. If I use the same formula, but with 60 random bits, which is what your IdentityGenerator has, then it says that to get 50 % probability of collision, you need 1.26 billion GUIDs. But to collide, all those GUIDs would have to be generated with the same value for DateTime.UtcNow.Ticks. Pessimistically, the period during which that's true can be 15 ms (I believe that's the value for older OSes).

That sounds good, until you realize this would mean 50 % probability of collision during every 15 ms interval. To get 50 % probability of collision during a larger interval, say, 10 years, the probability of collision during each interval would have to be much smaller. If I get my math right, it means the probability of collision for each 15 ms interval can be only 2.20 脳 10-12. For that probability, you can have only 8720 GUIDs in each 15 ms interval, or 581 GUIDs per millisecond.

TLDR: If my calculations are correct, if you generate more than about 500 sequential GUIDs per millisecond, world-wide, then over a 10 year period, you will get a 50 % chance of at least one collision. That might be more than okay for your specific use-case, but I don't think it can be called "globally unique".

@svick tks...

DateTime.UtcNow have a 10ms precision...
https://msdn.microsoft.com/en-us/library/system.datetime.utcnow(v=vs.110).aspx

DateTime.Now is 15ms precision...
https://msdn.microsoft.com/en-us/library/system.datetime.now(v=vs.110).aspx

Sounds good at specific scope... But not for general-purpose.

Tks!

@balivo - Not on Windows on corefx, it's not (I'm unsure where Linux has gone).

@svick

Version 4 GUIDs have 122 random bits.

I would have assumed from the name that the proposal was to have Version 1 GUIDs, so if I'd wanted them in Windows-only code I'd use:

C# [DllImport("rpcrt4.dll", SetLastError=true)] static extern int UuidCreateSequential(out Guid guid);

I understand that linux's uuid program lets one select version 1, and perhaps getting a batch-per-call could be used to reduce overheads, but I've never dealt with UUIDs in Linux myself except when a database was doing the generation work for me, so I'm not sure.

Sequential guids compatible with Sql server implementation:
http://developmenttips.blogspot.com/2008/03/generate-sequential-guids-for-sql.html

```
public class SqlServerSequentialGuidGenerator
{
// With UIDs As(
//Select ID = 3, UID = cast('01000000-0000-0000-0000-000000000000' as uniqueidentifier)
//Union Select ID = 2, UID = cast('00010000-0000-0000-0000-000000000000' as uniqueidentifier)
//Union Select ID = 1, UID = cast('00000100-0000-0000-0000-000000000000' as uniqueidentifier)
//Union Select ID = 0, UID = cast('00000001-0000-0000-0000-000000000000' as uniqueidentifier)
//Union Select ID = 5, UID = cast('00000000-0100-0000-0000-000000000000' as uniqueidentifier)
//Union Select ID = 4, UID = cast('00000000-0001-0000-0000-000000000000' as uniqueidentifier)
//Union Select ID = 7, UID = cast('00000000-0000-0100-0000-000000000000' as uniqueidentifier)
//Union Select ID = 6, UID = cast('00000000-0000-0001-0000-000000000000' as uniqueidentifier)
//Union Select ID = 8, UID = cast('00000000-0000-0000-0100-000000000000' as uniqueidentifier)
//Union Select ID = 9, UID = cast('00000000-0000-0000-0001-000000000000' as uniqueidentifier)
//Union Select ID = 10, UID = cast('00000000-0000-0000-0000-010000000000' as uniqueidentifier)
//Union Select ID = 11, UID = cast('00000000-0000-0000-0000-000100000000' as uniqueidentifier)
//Union Select ID = 12, UID = cast('00000000-0000-0000-0000-000001000000' as uniqueidentifier)
//Union Select ID = 13, UID = cast('00000000-0000-0000-0000-000000010000' as uniqueidentifier)
//Union Select ID = 14, UID = cast('00000000-0000-0000-0000-000000000100' as uniqueidentifier)
//Union Select ID = 15, UID = cast('00000000-0000-0000-0000-000000000001' as uniqueidentifier)
//)
//Select* From UIDs Order By UID
private static readonly int[] SqlOrderMap = { 3, 2, 1, 0, 5, 4, 7, 6, 9, 8, 15, 14, 13, 12, 11, 10 };

    private readonly byte[] _currentGuidArray;
    private Guid _currentGuid;

    public Guid CurrentGuid
    {
        get { return _currentGuid; }
    }

    public SqlServerSequentialGuidGenerator()
        :this(Guid.NewGuid())
    {
    }

    public SqlServerSequentialGuidGenerator(Guid initialGuid)
    {
        _currentGuid = initialGuid;
        _currentGuidArray = initialGuid.ToByteArray();
    }

    public  Guid Generate()
    {
        byte[] guidArray = _currentGuidArray;
        var orderMap = SqlOrderMap;
        for (int mapIndex = 0; mapIndex < 16; mapIndex++)
        {
            int bytesIndex = orderMap[mapIndex];
            guidArray[bytesIndex]++;
            if (guidArray[bytesIndex] != 0)
            {
                break; // No need to increment more significant bytes
            }
        }
        _currentGuid =  new Guid(guidArray);
        return _currentGuid;
    }
}

```

Um, if you're trying to use sequential GUIDs on SQL Server, I'd probably just ask the server to create them. Which would solve the uniqueness/concurrency problem (ie, running the given code from multiple machines is going to result in collisions, which might be costly).

Note that generating sequential ids is useful to avoid index fragmentation, and not to provide some ordering of data. Row ids should be considered the same as memory addresses in a regular program, and the actual value considered undefined. Never rely on the id for ordering information, especially if the client is/might be allowed to generate it. Always use an explicit, appropriate value column instead (ie, a created_at timestamp); the row id can be used for tiebreakers, if necessary.

Also, we most likely can't use that code, because the author hasn't released the copyright/licensed it to us.

Thank you for the suggestion, but we don't plan to do anything to expose this. Such functionality could easily be implemented however is desired in a library outside of Guid itself.

Was this page helpful?
0 / 5 - 0 ratings