Investigating a crash in Enumerable.LastOrDefault with a custom collection

 
 
  • Gérald Barré

This post is part of the series 'Crash investigations and code reviews'. Be sure to check out the rest of the blog posts of the series!

Have you ever seen the following thread-safety remark on some concurrent collections?

Thread-Safety remark for ConcurrentBag

Let me explain what it means. We got a crash in an application when using Enumerable.LastOrDefault on a custom collection that has the same thread-safety remark. Here's the call stack:

System.ArgumentOutOfRangeException: Specified argument was out of the range of valid values. (Parameter 'index')
   at System.Collections.Immutable.Requires.FailRange(String parameterName, String message)
   at System.Collections.Immutable.ImmutableList`1.Node.ItemRef(Int32 index)
   at System.Collections.Immutable.ImmutableList`1.get_Item(Int32 index)
   at System.Collections.Immutable.ImmutableList`1.System.Collections.Generic.IList<T>.get_Item(Int32 index)
   at MyApplication.CustomCollection`1.get_Item(Int32 index)
   at System.Linq.Enumerable.TryGetLast[TSource](IEnumerable`1 source, Boolean& found)
   at System.Linq.Enumerable.LastOrDefault[TSource](IEnumerable`1 source)
   at MyApplication.Program.DoJob[T](CustomCollection`1 items)

Here's the offending code:

C#
static void DoJob<T>(CustomCollection<T> items)
{
    var lastItem = list.LastOrDefault();
    Console.WriteLine(lastItem);
}

Here's the code for the custom collection:

C#
class CustomCollection<T> : IList<T>, IReadOnlyList<T>
{
    private readonly object _lock = new object();
    private ImmutableList<T> _list = ImmutableList.Create<T>();

    public T this[int index]
    {
        get => _list[index];
        set => throw new NotSupportedException();
    }

    public int Count => _list.Count;
    public bool IsReadOnly => false;
    public int IndexOf(T item) => _list.IndexOf(item);

    public IEnumerator<T> GetEnumerator() => _list.GetEnumerator();

    public void Add(T item)
    {
        lock (_lock)
        {
            _list = _list.Add(item);
        }
    }

    public bool Remove(T item)
    {
        lock (_lock)
        {
            var oldList = _list;
            _list = _list.Remove(item);
            return _list != oldList;
        }
    }

    // non-relevant methods omitted for brevity
}

The call stack suggests a thread-safety issue. There is no reason for a single-threaded application to crash here. As you can see, all methods of the collection are thread-safe, so the issue doesn't seem to originate from the collection itself. Additionally, enumerating an ImmutableList<T> while modifying the collection from another thread is safe.

The only remaining part is Enumerable.LastOrDefault. LastOrDefault calls TryGetLast. Let's look at the code of this method:

C#
// https://github.com/dotnet/runtime/blob/f86385a1a452a7ac08277f2ef66ebe302b3c90ca/src/libraries/System.Linq/src/System/Linq/Last.cs#L43
private static TSource TryGetLast<TSource>(this IEnumerable<TSource> source, out bool found)
{
    // null check and non-relevant branches omitted for brevity
    if (source is IList<TSource> list)
    {
        int count = list.Count;
        if (count > 0)
        {
            found = true;
            return list[count - 1];
        }
    }
    else
    {
        using (IEnumerator<TSource> e = source.GetEnumerator())
        {
            if (e.MoveNext())
            {
                TSource result;
                do
                {
                    result = e.Current;
                }
                while (e.MoveNext());

                found = true;
                return result;
            }
        }
    }

    found = false;
    return default!;
}

There is a special case when the collection implements IList<T>. In that case, it calls Count and uses the indexer to get the item. When multiple threads access the collection, there is no guarantee it remains unchanged between the calls to list.Count and list[count - 1]. If an item is removed between those two calls, an IndexOutOfRangeException is thrown. This means that even if the GetEnumerator method is thread-safe, the LastOrDefault method is not when the collection implements IList<T>.

C#
int count = list.Count;     // The collection can change between this call
if (count > 0)
    return list[count - 1]; // and this one. If one item is removed, this will crash

There are multiple ways to fix this issue. One approach is to add a method directly on the custom collection that operates on ImmutableList<T>:

C#
class CustomCollection<T> : IList<T>, IReadOnlyList<T>
{
    private ImmutableList<T> _list = ImmutableList.Create<T>();

    // This call is safe as it operates directly on the ImmutableList<T>
    // instead of the CustomCollection<T>
    public T LastOrDefault() => _list.LastOrDefault();
}

Alternatively, you can wrap the collection in a type that only implements IEnumerable<T>:

C#
customCollection.AsOnlyEnumerable().LastOrDefault();

static class EnumerableWrapperExtensions
{
    // Note that AsEnumerable has a different behavior, thus the name AsOnlyEnumerable
    // https://www.meziantou.net/what-s-the-usage-of-asenumerable.htm
    public static IEnumerable<T> AsOnlyEnumerable<T>(this IEnumerable<T> items)
    {
        foreach(var item in items)
            yield return item;
    }
}

Alternatively, you can implement it manually:

C#
static void DoJob<T>(CustomCollection<T> items)
{
    T lastItem = default;
    using var enumerator = items.GetEnumerator();
    while (enumerator.MoveNext())
    {
        lastItem = enumerator.Current;
    }

    Console.WriteLine(lastItem);
}

In multi-threaded applications, these subtle interactions between collection interfaces and LINQ operators can cause hard-to-reproduce crashes. Be careful.

#Additional resources

Do you have a question or a suggestion about this post? Contact me!

Follow me:
Enjoy this blog?