Runtime: LINQ bug: ConcatNIterator.LazyToArray() produces incorrect results

Created on 30 Aug 2017  路  34Comments  路  Source: dotnet/runtime

Here's a repro: https://github.com/gulbanana/repro-corefx-concat-toarray

banana@forsyth MINGW64 /c/code/repro-corefx-concat-toarray (master)
$ dotnet run -f netcoreapp1.0                                      
A B C                                                              

banana@forsyth MINGW64 /c/code/repro-corefx-concat-toarray (master)
$ dotnet run -f netcoreapp2.0                                      
A B A                                                              

The problem occurs when using Concat multiple times and then calling ToArray().
```c#
using System;
using System.Collections.Generic;
using System.Linq;

class Program
{
class A { public override string ToString() => "A"; }
class B { public override string ToString() => "B"; }
class C { public override string ToString() => "C"; }

static void Main(string[] args)
{
    var results = new List<A> { new A() }
        .Concat(new List<object> { new B() })
        .Concat(new List<C> { new C() })
        .ToArray();

    Console.WriteLine(string.Join(' ', results));
}

}
```

This program should print "A B C", but it instead prints "A B A". I believe the problem is in the specialised implementation of concatenation for 3+-enumerables-of-collections. I ran into it in an actual project (an IEnumerable containing multiple subclass instances) but I've worked around the bug by using ToList() instead.

[EDIT] Add C# syntax highlight by @karelz

area-System.Linq bug disabled-test tenet-compatibility

Most helpful comment

I think we should include the fix in 2.0.2 servicing release.

All 34 comments

cc: @jamesqo, @vsadov, @jonhanna

It does seem to be in LazyToArray().

Variant:

C# var concat = new List<string> {"A"}.Concat(new List<object> {"B"}).Concat(new List<string> {"C"}) .Concat(new List<string>{"D"}).Concat(new List<string>{"E"}); Assert.Equal(new object[]{"A", "B", "C", "D", "E"}, concat.ToArray()); // Actual "A", "B", "A", "C", "D"

Perhaps relevant that the objects added are not ICollection<T> for the collection T of the result, but are ICollection<T> for a more derived type.

In the above, after builder.ToArray() the contents of the array are A, null, A , C, D. It seems the final copy within that operation is wrong. Then the B gets correctly put where it should be.

@JonHanna You already seem to know the ins and outs of this issue. Since I'm working on a personal project right now, I'd be happy to let you handle it. Otherwise (since this is a bug, and I introduced it) I'll submit a PR within a few days.

@JonHanna If this happens in any other places like ToList(), please be sure those functions are fixed too.

This looks pretty bad - I wonder what is the impact of this bug. Which kind of scenarios is it going to show up?
@VSadov @OmarTawfik can you please dig into it as a priority? If it is impactful, we may need to fix it in 2.0 servicing.

@jamesqo I'm phone-access only for a few hours at least, but I'll take a look if you haven't found it by then

My scenario is part of an ORM - assembling a list of "operations" all subclassing a common base. Since triggering the bug requires all of concat(), toarray() and multiple-subtypes, it probably isn't common. It was very disconcerting though, seeing the knock-on effects when 1+1+1 was no longer equal to 3 :)

Looking into it.

In LargeArrayBuilder.cs changing for (; count > 0; row++, column = 0) to for (; count > 0; row++) fixes this and doesn't introduce another test failure, but that assignment to column is presumably there for a reason. @jamesqo ?

On the other hand, with it in the += to it within the loop is pointless. Is there a case where this should be set to zero?

@JonHanna the root cause is ConcatNIterator.LazyToArray(). It is calling builder.ReserveOrAdd() which in turn calls EnumerableHelpers.TryGetCount() which tests source is ICollection collection.
The problem with this specific bug is how the lists are defined (List vs List or List).
The second one (B) fails the check because source is ICollection<T> collection is using T=object.
You can simply change the check to source is ICollection collection to make it work as a workaround, but I'm investigating the larger scope.

Yeah, not finding the count should be sub-optimal, but still correct, and if it did get the count there'd still be cases that mixed cases that could and couldn't get it.

That's the root cause:

                TSource[] array = builder.ToArray();

                ArrayBuilder<Marker> markers = builder.Markers;
                for (int i = 0; i < markers.Count; i++)
                {
                    Marker marker = markers[i];
                    IEnumerable<TSource> source = GetEnumerable(deferredCopies[i]);
                    EnumerableHelpers.Copy(source, array, marker.Index, marker.Count);
                }

                return array;

This pattern (I'm finding it in a few spots) is not using SparseArrayBuilder correctly when deconstructing it. I'm looking into refactoring that.

@OmarTawfik beside having a fix (which is awesome btw), I would love to know what is the impact of this bug -- which scenarios will be impacted? How common they are? How hard it will be for affected developers to root cause it? (likely pretty hard)

This pattern is used from ToArray() on the result of SelectMany(), Concat(), Prepend(), or Append() (although I didn't try to reproduce the behavior on all cases). The issue happens when you have collections of different types; they are constructed using SparseArrayBuilder incorrectly. I'm not sure how common that scenario is.

@jamesqo basically, SparseArrayBuilder.CopyTo() has a bug. When it hits a marker , it advances arrayIndex, copied, and count (to skip the empty region covered by the marker). However, it doesn't modify position, so invalid items get copied. I don't think LargeArrayBuilder even exposes that.

Since SparseArrayBuilder is only used in ToArray() scenarios, is there a benefit from using this type here? why creating sparse arrays then copying them to normal arrays, then filling the empty spaces? why not just have a helper that operates on a single array directly? am I missing something?

Spent some time investigating other possibilities. The following is a repro for Concat().ToArray():

        [Fact]
        public void ConcatListsProduceOrderedResults()
        {
            var results = new TestEnumerable<int> (new int[] { 1 })
                .Concat(new List<int> { 2 })
                .Concat(new TestEnumerable<int>(new int[] { 3 }))
                .ToArray();

            Assert.Equal("1 2 3", string.Join(' ', results)); // Produces 1 2 1
        }

And this is for SelectMany().ToArray():

        [Fact]
        public void AppendToNonCollectionToArrayOrderedResults()
        {
            var results = new int[] { 4, 5, 6 }
            .SelectMany(i => i == 5
                ? (IEnumerable<int>)new List<int>() { i }
                : new TestEnumerable<int>(new int[] { i }))
            .ToArray();

            Assert.Equal("4 5 6", string.Join(' ', results));  // Produces 4 5 4
        }

I didn't find a possible path for Append().ToArray() or Prepend().ToArray(). The problem can happen because of two reasons:

  1. In the original bug reported: The list new List<object> { new B() }) skipped an erroneous check in EnumerableHelpers.TryGetCount(), which caused it to be marked as 'not-easy-to-get-count-for'.
  2. The wider cause: If one of the collections (for example, check TestEnumerable here) is not 'not-easy-to-get-count-for, it will be marked as such as well.

ToArray() on groups of these enumerators will fail to order them correctly, thus resulting in invalid elements.

On part one of that, the case of an enumerable having a count or length property through non-generic ICollection or for a different element type, but not ICollection<T> for the relevant T was considered but it was decided as likely rare enough that the cost of checking wouldn't be justified by the benefit, and I'm inclined to still agree.

Of course the second issue which is not merely slower than possible, but incorrect, is not a matter of weighing likelihood of inputs.

The assignment of zero to column is what is causing the manipulation within the array builder to go wrong, but having only had snatches of time to look at it I'm not sure that that assignment shouldn't happen in some cases, in which case removing it would just replace one bug with another.

Hi everyone, I'm on mobile rn so I can't really write a lot. I see you guys are fretting over this and the code I wrote isn't the most straightforward to fix, so I will take a look asap and submit a fix when I'm around a computer. Thanks.

I should be able to submit a fix tmrw for this. Thanks for all of the preliminary investigation @JonHanna and @OmarTawfik.

edit: After thinking about it in my mind for a while, I believe what's happening is the column = 0 part is causing the returned CopyPosition (which is supposed to represent the position copied up to) to be inaccurate. In the example in @JonHanna's 2nd comment, we should start copying after the null from (row 0, column 1) but instead do it from (row 0, column 0).

Thanks @jamesqo! but my question was about the reason we're using SparseArrayBuilder at all :)
From what I see, the existing code goes over all the underlying collections, and if it is cheap to get their count, store them in a separate array, or else enumerate/add them. Then does a second pass after calling SparseArrayBuilder.ToArray() to fill the resulting array.
Maybe I'm missing something, but why did we add SparseArrayBuilder at all? shouldn't we just be using LargeArrayBuilder directly since we're enumerating all the collections eventually anyways?

Edit: I see the difference in ordering now.

@OmarTawfik Thanks for the regression tests, I'm going to use them in my upcoming PR. Luckily I don't think this bug happens with Append / Prepend (the other place where SAB<T> is used), since you don't have any possibility of a gap surrounded by lazy enumerables there.

My group appears to have hit this as well while trying to move our project to netstandard2.0. We also hit it once with Concat and once with SelectMany.

With Concat, we had (roughly) GetValues1().Concat(GetValues2()).Concat(GetValuesN()).ToArray() and ended up with a null in the final set of values.

With SelectMany we had the following:

internal Dictionary<AAA, string[]> Get(MultiValueDictionary<AAA, BBB> arg)
    => arg.ToDictionary(kvp => kvp.Key,
                        kvp => kvp.Value.SelectMany(g => g.GetValues()).ToArray());

and saw the same behavior where a b c d became a b c a.

I think we should include the fix in 2.0.2 servicing release.

I think we should include the fix in 2.0.2 servicing release.

I agree. cc: @karelz, @Petermarcu

@markples

ended up with a null in the final set of values.

Can you please share a repro for your first example? That may be a separate bug on its own. As far as I can see, this bug should only incorrectly duplicate some items, e.g. in your second example a was duplicated. It shouldn't insert a null if none of the collections have null.

@jamesqo

Here is a repro (from @vuminhle). It seems very specific - reducing the list sizes or changing the construction in various ways loses the repro.

using System;
using System.Collections.Generic;
using System.Linq;

namespace repro {
    internal class Program {
        private static void Main(string[] args) {
            A[] list = List1().Concat(List2()).Concat(List3()).ToArray();
            foreach (A a in list) {
                Console.WriteLine(a.Value);
            }
        }

        internal static IEnumerable<A> List1() {
            for (var i = 0; i < 4; i++) {
                yield return new A(i);
            }
        }

        internal static IEnumerable<A> List2() {
            return Enumerable.Range(0, 2).Select(v => new A(v));
        }

        internal static IEnumerable<A> List3() {
            for (var i = 0; i < 5; i++) {
                yield return new A(i);
            }
        }

        internal class A {
            public A(int v) {
                Value = v;
            }

            public int Value { get; }
        }
    }
}

My scenario is part of an ORM - assembling a list of "operations" all subclassing a common base.

I've just realised this is what was causing a bug with a website I was porting from netfx to corefx, and finding it hard to reproduce because the order of the sources to the concatenation was non-deterministic and only sometimes in the order to trigger this, but I didn't think that would matter to the bug. Easily worked around now I know what it is, so thanks for that, @gulbanana

@OmarTawfik You can close this. I opened dotnet/corefx#23833 for the separate issue mentioned, which I intend to grab tmrw.

It was reopened to track 2.0.x port in PR dotnet/corefx#23865

@karelz this can be closed right?

Yes, now that the rel/2.0.0 PR is merged.

Was this page helpful?
0 / 5 - 0 ratings