Home > Net >  Return Linq.Where object (IEnumerable) from within a lock - is it thread safe?
Return Linq.Where object (IEnumerable) from within a lock - is it thread safe?

Time:01-10

Consider the following code block

public class Data
{
    public bool Init { get; set; }
    public string Value {get; set; }
}

public class Example
{

    private Object myObject = new Object();
    private List<Data> myList = new List<Data>
    {
       new Data { Init = true, Value = "abc" },
       new Data { Init = false, Value = "def" },
    };

    public IEnumerable<string> Get()
    {
        lock(this.myObject)
        {
            return this.myList.Where(i => i.Init == true).Select(i => i.Value);
        }
    }

    public void Set(string value)
    {
        lock (this.myObject)
        {
            this.myList.Add(new Data { Init = false, Value = value });
        }
    }
}

If multiple threads are calling Get() - will the method be thread-safe?

In addition - will invoking .ToList() at the linq query will make it thread-safe?

return this.myList.Where(i => i.Init == true).Select(i => i.Value).ToList()

CodePudding user response:

Note that you do not lock here:

public void Set(string value)
{
    this.myList.Add(new Data { Init = false, Value = value });
}

So it's not thread-safe in any case.

Assuming you just forgot to do that - it's still not safe because Get returns "lazy" IEnumerable. It holds a reference to myList and it will enumerate it only when returned IEnumerable itself will be enumerated. So you are leaking reference to myList you are trying to protect with lock, outside of lock statement to arbitrary code.

You can test it like this:

var example = new Example();
var original = example.Get();
example.Clear(); // new method which clears underlying myList
foreach (var x in original)
    Console.WriteLine(x);

Here we call Get, then we clear myList and then we enumerate what we got from Get. Naively one may assume original will contain original 2 items we had, but it will not contain anything, because it's evaluated only when we enumerate original, and at that point in time - list has already been cleared and is empty.

If you use

public IEnumerable<string> Get()
{
    lock(this.myObject)
    {
        return this.myList.Where(i => i.Init == true).Select(i => i.Value).ToList();
    }
}

Then it will be "safe". Now you return not "lazy" IEnumerable but a new instance of List<> with copies of values you have in myList.

CodePudding user response:

You have to be aware about the difference between the potential to enumerate a sequence (= the IEnumerable) and the enumerated sequence itself (the List, Array, etc after you enumerated the sequence.

So you have a class Example, which internally holds a List<Data> in member MyList. Every Data has at least a string property 'Value`.

Class Example has Methods to extract Value, and to add new elements to the MyList.

I'm not sure if it is wise to call them Set and Get, these names are quite confusing. Maybe you've simplified your example (which by the way made it more difficult to talk about it).

You have an object of class Example and two threads, that both have access to this object. You worry, that while one thread is enumerating the elements of the sequence, that the other thread is adding elements of the sequence.

Your Get method returns the "potential to enumerate". The sequence is not enumerated yet after you return from Get, and after the Lock is disposed.

This means, that when you start enumerating the sequence, the Data is not locked anymore. If you've ever returned an IEnumerable from data that you fetched from a database, you probably have seen the same problem: the connection to the database is disposed before you start enumerating.

Solution 1: return enumerated data: inefficient

You already mention one solution: enumerate the data in a List before you return. This way, property MyList will not be accesses after you return from Get, so the lock is not needed anymore:

public IEnumerable<string> GetInitializedValues()
{
    lock(this.MyList)
    {
        return this.MyList
            .Where(data => data.Init == true)
            .Select(data => data.Value)
            .ToList();
    }
}

In words: Lock MyList, which is a sequence of Data. Keep only those Data in this sequence that have a true value for property Init. From every remaining Data, take the value of property Value and put them in a List. Dispose the lock and return the List.

This is not efficient if the caller doesn't need the complete list.

// Create an object of class Example which has a zillion Data in MyList:
Example example = CreateFilledClassExample();

// I only want the first element of MyList:
string onlyOneString = example.GetInitializedValues().FirstOrDefault();

GetInitializedValues creates a list of zillion elements, and returns it. The caller only takes the first initialized value and throws the rest of the list away. What a waste of processing power.

Solution 2: use yield return: only enumerate what must be enumerated

The keyword yield means something like: return the next element of the sequence. Keep everything alive, until the caller disposes the IEnumerator

public IEnumerable<string> GetInitializedValues()
{
    lock(this.MyList)
    {
        IEnumerable<string> initializedValues = this.MyList
            .Where(data => data.Init == true)
            .Select(data => data.Value);

        foreach (string initializedValue in initializedValues)
        {
            yield return initializedValue;
        }
    }
}

Because the yield is inside the lock, the lock remains active, until you dispose the enumerator:

List<string> someInitializedValues = GetInitializedValues()
   .Take(3)
   .ToList();

This one is save, and only enumerates the first three elements.

Deep inside it will do something like this:

List<string> someInitializedValues = new List<string>();
IEnumerable<string> enumerableInitializedValues = GetInitializedValues();
// MyList is not locked yet!

// Create the enumerator. This is Disposable, so use using statement:
using (IEnumerator<string> initializedValuesEnumerator = enumerableInitializedValues.GetEnumerator())
{
    // start enumerating the first 3 elements (remember: we used Take(3)
    while (initializedValuesEnumerator.MoveNext() && someInitializedValues.Count < 3)
    {
        // there is a next element, fetch it and add the fetched value to the list
        string fetchedInitializedValue = initializedValuesEnumerator.Current;
        someInitializedValues.Add(fetchedInitializedValue);
    }

    // the enumerator is not disposed yet, MyList is still locked.
}
// the enumerator is disposed. MyList is not locked anymore
  • Related