Efcore: EF Core 3 concurrency check issue when replacing RowVersion array instead of updating it

Created on 22 Oct 2019  路  28Comments  路  Source: dotnet/efcore

Concurrency check doesn't work when using RowVersion (SQL Server's timestamp) and one creates a new instance of byte array, instead of updating existing.
Happens when using SQL Server store, but not with InMemory. See code below.

Steps to reproduce

``` C#
// entity
public class Party
{
public int Id { get; set; }
public string FirstName { get; set; }
public byte[] RowVersion { get; set; }
}

// mapping excerpt
modelBuilder.Entity(entity =>
{
entity.ToTable("party");
entity.Property(e => e.Id).HasColumnName("id");
entity.Property(e => e.FirstName)
.HasColumnName("first_name")
.HasMaxLength(255);
entity.Property(e => e.RowVersion) // maps to timestamp SQL server type
.IsRequired()
.HasColumnName("row_version")
.IsRowVersion();
}

// this code doesn't raise an exception at all
var party = db.Party.First();
party.RowVersion = new byte[] { 0, 1, 2, 3, 4, 5, 6, 7 };
db.Update(party);
db.SaveChanges(); // <- error, no exception is thrown

// this code correctly raises a concurrency exception
var party = db.Party.First();
party.RowVersion[0] = 99;
db.Update(party);
db.SaveChanges(); // throws dbconcurrencyexception
```

Further technical details

EF Core version: 3.0.0
Database provider: Microsoft.EntityFrameworkCore.SqlServer (works correctly with InMemory)
Target framework: .NET Core 3.0
Operating system: Windows 10

customer-reported type-enhancement

Most helpful comment

Problem with the current behaviour that it does not work with detached or mapped Entities over an API.

extremly simplified

put(ClientModel model) {
  var entity = getFromEntityDbContext();
  map(model, entity);
  SaveChanges()
}

ClientModel and Entity has a [TimeStamp]. But if we update the TimeStampe (RowVersion) of Entity with the value from the client it is ignored. So the concurrency does not work.

So we are forced to do a manual concurrency check on the RowVersion, instead of letting EF Core to do the Work. This is not the behaviour we expected.

All 28 comments

Notes for triage:

  • The query uses the original value for concurrency detection. This works fine when the row version is not explicitly modified. That is, when the original value remains the value that was queried from the database.
  • Explicitly updating the current value of the row version property should be a no-op for concurrency if it behaves like other concurrency tokens. This is the case for the first example, where no exception is thrown.
  • However, when the byte array current value is mutated, then the original value is also getting mutated. This seems like a bug--probably we're not creating a copy of the byte array in the original values snapshot. This is why the second case throws.

So, the question for triage is, beyond fixing the bug, what guidance should we have for:

  • Modifying the current value of concurrency tokens in general
  • Modifying the current value of an automatic concurrency token, such as timestamp/rowversion

/cc @AndriySvyryd

Notes from triage:

  • Current behavior is good when automatic concurrency tokens are treated as immutable. Forcing a copy would by special casing resulting in extra overhead only for a negative case.
  • Putting on the backlog to consider documenting or updating the concurrency exception message to add a note to this effect when the concurrency token is a byte array.

Can we have a documentation for this in either the RowVersion or IsConcurrencyToken documentation?

Problem with the current behaviour that it does not work with detached or mapped Entities over an API.

extremly simplified

put(ClientModel model) {
  var entity = getFromEntityDbContext();
  map(model, entity);
  SaveChanges()
}

ClientModel and Entity has a [TimeStamp]. But if we update the TimeStampe (RowVersion) of Entity with the value from the client it is ignored. So the concurrency does not work.

So we are forced to do a manual concurrency check on the RowVersion, instead of letting EF Core to do the Work. This is not the behaviour we expected.

Yeah, dunno why was type-bug label removed.

With @MihaMarkic Infromation RowVersion[0]=1 a workaround can be created.

Something like this can be used in AutoMapper.

c# CreateMap<UpdateFieldSpecificationModel, FieldSpecification>() .ForMember(d => d.RowVersion, c => c.Ignore()) .AfterMap((source, destination) => { for (int i = 0; i < destination.RowVersion.Length; i++) { destination.RowVersion[i] = source.RowVersion[i]; } });
This works.

@DerAlbertCom Yep, I ended up with doing something like that as well. You could use

Buffer.BlockCopy(source.RowVersion, 0, 
    destination.RowVersion, 0, source.RowVersion.Length);

instead of for loop to make it a bit faster.

I also added few more checks, here is more complete code of mine

public static IMappingExpression<TDestination, TSource> WithRowVersioning<TDestination, TSource>(this IMappingExpression<TDestination, TSource> expression)
    where TDestination : EditModel
    where TSource : IRowVersioning
{
    return expression
        .ForMember(destEntity => destEntity.RowVersion, opt => opt.Ignore())
        .AfterMap((source, target) =>
        {
            if (source.RowVersion?.Length > 0)
            {
                if ((target.RowVersion?.Length ?? 0) == 0)
                {
                    throw new Exception($"Mapping existing model of {typeof(TDestination).Name} on new entity of {typeof(TSource).Name}. Entity should be an existing one.");
                }
                Buffer.BlockCopy(source.RowVersion, 0, target.RowVersion, 0, source.RowVersion.Length);
            }
            else
            {
                if ((target.RowVersion?.Length ?? 0) != 0)
                {
                    throw new Exception($"Mapping new model of {typeof(TDestination).Name} on existing entity of {typeof(TSource).Name}. Entity should be a new one.");
                }
            }
        });
}

@DerAlbertCom @MihaMarkic Would it be possible for one of you to put together a small, runnable project or complete code listing that shows why these workarounds are needed. I've read through the posts above several times and I'm not understanding what is going wrong here that would require mutating the array like this.

I can only speak for myself, and i am using it in Unit Tests. I am not testing database specifics, but how the rest of the infrastructure reacts on the occasion of such an exception. There are workarounds for this, or other ways to test, so its not an issue for us. That being said, I would also be interested to see in what circumstances, outside of testing purposes, such manipulations would be useful.

I will create a small sample in a unit tests.

But, this is usefull in Application where it is possible that several People are working on the same Database Record. And in a szenario where the Data is serialized. Like an ASP.NET Core Application. Serialized in the

or over Json in an API. So also the [TimeStamp] RowVersion MUST be serialized, because this is the way how EF Core together with SQL Server supports a simple optimistic Concurrency Check.

And if i have to serialize the RowVersion of the given DataRecord, i must be able to set the RowVersion on the entity. So that EF Core can fill the right parameter with the Value of the RowVersion at the for the SQL Update Query.

User A:
var version1 = getRecord('key');
write version in <form/>  (with RowVersion), user A sees the data
----
User B:
var version1 = getRecord('key');
write version in <form/> (with RowVersion) to user B sees the data
user B changes the data
user B creates save, makes an HTTP post to the server
Server gets current version of the Entity from the db
Server fill the data from the post in the in the entity (including RowVersion)
Server saves the data (Version 2 is stored)
----
user A changes the data
user A creates save, makes an HTTP post to the server
Server gets current version (now Version2) of the Entity from the db
Server fill the data from the post in the in the entity (including RowVersion)
EF Core should throw in Concurrency Exception, but does not. 
And this is the Problem. It simply overwrite the changes of User A.

Only with Updating the RowVersion byte by byte. EF Core works as expected. Or
if we manually check the RowVersion.

Current EF Core only work in Szenarios where the Entity are kept long, like in WinForms or WPF Applications.

Any Open Questions?

Just for pointing out a possible workaround for everyone else wondering what to do: you could SELECT the entity using a WHERE clause and the clients timestamp, modify the retrieved entity and perform an UPDATE. That way you can be sure that your scenario will still be detected (no entity will be SELECTed if timestamp has changed BEFORE, and UPDATE will throw DbConcurrencyException if Timestamp changed between SELECT and UPDATE).

Hope this makes sense.

Sorry, I'm lost. Why are we talking about possible workarounds? They are plenty of Workarounds.

With your Workaround we have to select the entity twice if there was a change before the first select.

Why should we Workaround?

You wrote, that Entity Framework is not working as expected, and i'm not sure about that. As i see it, you are not expected to modify RowVersion by hand, and that needs to be documented better.

I don't understand what you mean with "you have to select the entity twice", the amounts of SELECTs based on your scenario stays exactly the same. I will use your scenario to demonstrate what i mean.

Everytime you do Server gets current version of the Entity from the db in your scenario add WHERE Timestamp == TimestampFromUser to that query. That way you only get the entity if it hasn't been changed yet. If you get an entity, you now no longer need to set the Timestamp at is now exactly the same (or your query wouldn't return anything).

Next step Server fill the data from the post in the in the entity but you dont need to include the RowVersion. We assume it is the same. Now when you save it, and the Timestamp has changed in the Db, you will get a DbUpdateException.

@ThomasBergholdWieser This is a bug IMO. What is the difference or replacing (timestamp) array values with replacing array from developer perspective? It should be handled the same by EF engine. IMO - it has an array and it should compare it to original value regardless how it was set.
If it doesn't work like that, I guarantee you millions of devs bumping into the same pitfall.

@ThomasBergholdWieser If the entity was changed and i select with ID and the Old TimeStamp then NO entity is return. In that case i know the entity is deleted or changed. But i don't have an entity which i can update. So i have to select the Entity again by Id only. If a entity is returned then it was not deleted and i can update the entity. This is the reason I have to selected twice.

@MihaMarkic yes, most developers will fall in the pitfall. they write even blogpost https://dejanstojanovic.net/aspnet/2018/november/handling-data-concurrency-in-ef-core-and-aspnet-core-webapi this is clearly not tested.

@MihaMarkic I disagree. RowVersion (under different names) has been around for a few years, dating back to EF Classic, i don't see millions of users. But to the point, RowVersion is not a "regular" property, as its value is supposed to be generated by the database and not be modified by the User.

@DerAlbertCom I don't get your problem. If you need the entity, then dont include the Timestamp and compare in code for the same effect. Problem solved with no increase in SELECTs.

@ThomasBergholdWieser Of course it has to be modified by the user. How else would you deal with models otherwise, where you don't keep entities around, just models?
I have no idea how it worked before.

@MihaMarkic I mentioned a way in my reply to the mentioned scenario, please have a look.

@ThomasBergholdWieser You are right, that scenario works and is a valid one. It would fail if one caches such entities though.
However, I still think that the way EF behaves is not a correct one, or at least highly counter intuitive.

@ThomasBergholdWieser i know how to work around, the manual TimeStamp validation is that what we do. You suggested to select the entity including the Timestamp. And this is not a good workaround because of the case I described here https://github.com/aspnet/EntityFrameworkCore/issues/18505#issuecomment-561624397.

@DerAlbertCom I thought of that, too. Perhaps you would just load entity by id and then manually compare timestamp array to the model's one. That way you have both info with a single fetch.

@MihaMarkic I disagree. RowVersion (under different names) has been around for a few years, dating back to EF Classic, i don't see millions of users. But to the point, RowVersion is not a "regular" property, as its value is supposed to be generated by the database and not be modified by the User.

Precisely! You never need to update RowVersion so it's a perfect fit for concurrency check - the modified value of RowVersion property should be used only for concurrency check and not update the column in the database.

Pseudocode:

foo.Value = "bar";
foo.RowVersion = newValue;
db.SaveChanges(); // -> UPDATE foo SET Value = 'bar' WHERE Id = <id> AND RowVersion = <newValue>

@mariuszjamro exactly, MUST not be updated. But used for the check. I never expected a forced updated of a server generated timestamp.

I think the arguments why this is supposed to be a bug ignore the use case when the concurrency token is _not_ a row version. Let鈥檚 take ASP.NET Core Identity鈥檚 UserStore as an example: That one use a string-based concurrency token that鈥檚 just a Guid that gets updated on every write. So it basically looks like this (simplified):

```c#
user.Property = newValue;
user.ConcurrencyStamp = Guid.NewGuid().ToString();
await db.SaveChangesAsync();

Here, the generated query should be:

```sql
UPDATE user
SET Property = '<newValue>',
    ConcurrencyStamp = '<newGuid>'
WHERE Id = <userId>
  AND ConcurrencyStamp = '<oldGuid>'

So obviously, this mechanism needs a way to set the concurency token while also comparing against the old one.

That鈥檚 why examples like the following that was mentioned simply don鈥檛 work:

```c#
put(ClientModel model) {
var entity = GetFromEntityDbContext();
Map(model, entity);
SaveChanges();
}

By mapping the client model on top of the _current_ state of the database, you cannot compare against the memorized value of the concurrency token if you also want a mechanism to update the token.

So the solution has to be something like this:

```c#
put(ClientModel model) {
    var entity = GetFromEntityDbContext();

    // restore memorized concurrency token
    db.Entry(entity).Property(nameof(Entity.ConcurrencyToken)).OriginalValue = entity.ConcurrencyToken;

    // map the updatable properties, and reset the concurrency token
    MapEverythingExceptConcurrencyToken(model, entity);
    entity.ConcurrencyToken = GenerateNewToken();

    SaveChanges();
}

The bug mentioned in OP鈥檚 code is that party.RowVersion references the same array object as the original value. So mutating that array also mutates the original value of the concurrency token which means that a different value is being used in the WHERE to detect the conflict.

But the underlying issue is that you simply cannot utilize the concurrency check if you work with a _fresh_ entity without manually resetting its original concurrency token value back to what you want to validate against.

When the Use Case of the [Timestamp] is not a Concurrency Token then it should be documented otherwise, and EF Core should not throw an DbUpdateConcurrencyException if is changed. Yes, Workarounds are possible. Why is the RowVersion generated on the Client side?

Following the discussion I am not 100% what the best solution to this is. However what I think this is an undocumented breaking behavioral change in EF Core 3.0. Previously it was not possible to modify a RowVersion clientside. This would have thrown an DbConcurrentUpdateException. Now no exception is thrown and instead a given clientside rowversion is accepted as a new value (mapped from a disconnected entity).

So implementations that rely on the old behavior are now broken (implementations like this might not be optimal regarding above thread but I am not sure about this).

To me this seems as a big issue (despite the relative low anticipation this thread has), as it is relatively hard to detect. If there are no tests that explicitly check for concurrent updates, developers will assume their original code still work. Concurrent scenarios are especially difficult to test for single developers. So this behavioral change has the potential to stay undetected for users and developers alike. This also does not crash the app, instead there is "silent data loss", as concurrent updates just silently override each other.

Was this page helpful?
0 / 5 - 0 ratings