Proper way to synchronize a property's value in a multi-threaded application-CodePudding

I've recently started revisiting some of my old multi-threaded code and wondering if it's all safe and correct (No issues in production yet...). In particular am I handling object references correctly? I've read a ton of examples using simple primitives like integers, but not a lot pertaining to references and any possible nuances.

First, I recently learned that object reference assignments are atomic, at least on a 64 bit machine which is all I'm focused on for this particular application. Previously, I was locking class properties' get/sets to avoid corrupting the reference as I didn't realize reference assignments were atomic. For example:

    // Immutable collection of options for a Contact
    public class ContactOptions
    {
        public string Email { get; }
        public string PhoneNumber { get; }
    }
    
    // Sample class that implements the Options
    public class Contact
    {
        private readonly object OptionsLock = new object();
        private ContactOptions _Options;
        public ContactOptions Options { get { lock(OptionsLock) { return _Options; } }
            set { lock(OptionsLock) { _Options = value; } } };
    }

Now that I know that a reference assignment is atomic, I thought "great, time to remove these ugly and unnecessary locks!" Then I read further and learned of synchronization of memory between threads. Now I'm back to keeping the locks to ensure the data doesn't go stale when accessing it. For example, if I access a Contact's Options, I want to ensure I'm always receiving the latest set of Options assigned.

Questions:

Correct me if I'm wrong here, but the above code does ensure that I'm achieving the goal of getting the latest value of Options when I get it in a thread safe manner? Any other issues using this method?
I believe there is some overhead with the lock (Converts to Monitor.Enter/Exit). I thought I could use Interlocked for a nominal performance gain, but more importantly to me, a cleaner set of code. Would the following work to achieve synchronization?

    private ContactOptions _Options;
    public ContactOptions Options { 
        get { return Interlocked.CompareExchange(ref _Options, null, null); }
        set { Interlocked.Exchange(ref _Options, value); } }

Since a reference assignment is atomic, is the synchronization (using either lock or Interlocked) necessary when assigning the reference? If I omit the set logic and only maintain the get, will I still maintain atomicity and synchronization? My hopeful thinking is that the lock/Interlock usage in the get would provide the synchronization I'm looking for. I've tried writing sample programs to force stale value scenarios, but I couldn't get it done reliably.

    private ContactOptions _Options;
    public ContactOptions Options { 
        get { return Interlocked.CompareExchange(ref _Options, null, null); }
        set { _Options = value; } }

Side Notes:

The ContactOptions class is deliberately immutable as I don't want to have to synchronize or worry about atomicity within the options themselves. They may contain any kind of data type, so I think it's a lot cleaner/safer to assign a new set of Options when a change is necessary.
I'm familiar of the non-atomic implications of getting a value, working with that value, then setting the value. Consider the following snippet:

    public class SomeInteger
    {
        private readonly object ValueLock = new object();
        private int _Value;
        public int Value { get { lock(ValueLock) { return _Value; } }
            private set { lock(ValueLock) { _Value = value; } } };
        
        // WRONG
        public void manipulateBad()
        {
            Value  ;
        }
        
        // OK
        public void manipulateOk()
        {
            lock (ValueLock)
            {
                Value  ;
                // Or, even better: _Value  ; // And remove the lock around the setter
            }
        }
    }

Point being, I'm really only focused on the memory synchronization issue.

CodePudding user response：

Yes, the lock (OptionsLock) ensures that all threads will see the latest value of the Options, because memory barriers are inserted when entering and exiting the lock.
Replacing the lock with methods of the Interlocked or the Volatile class would serve equally well the latest-value-visibility goal. Memory barriers are inserted by these methods as well. I think that using the Volatile communicates better the intentions of the code:

public ContactOptions Options
{
    get { return Volatile.Read(ref _Options); }
    set { Volatile.Write(ref _Options, value); }
}

Omitting the synchronization in either the get or the set accessor puts you automatically in the big black forest of memory models, cache coherency protocols and CPU architectures. In order to know if it's safe to omit it, intricate knowledge of the targeted hardware/OS configuration is required. You will need either an expert's advice, or to become an expert yourself. If you prefer to stay in the realm of software development, don't omit the synchronization!

CodePudding user response：

Correct me if I'm wrong here, but the above code does ensure that I'm achieving the goal of getting the latest value of Options when I get it in a thread safe manner? Any other issues using this method?

Yes, locks will emit memory barriers, so it will ensure the value is read from memory. There are no real issues other than potentially being more conservative than it has to be. But I have a saying, if in doubt, use a lock.

I believe there is some overhead with the lock (Converts to Monitor.Enter/Exit). I thought I could use Interlocked for a nominal performance gain, but more importantly to me, a cleaner set of code. Would the following work to achieve synchronization?

Interlocked should also emit memory barriers, so I would think this should do more or less the same thing.

Since a reference assignment is atomic, is the synchronization (using either lock or Interlocked) necessary when assigning the reference? If I omit the set logic and only maintain the get, will I still maintain atomicity and synchronization? My hopeful thinking is that the lock/Interlock usage in the get would provide the synchronization I'm looking for. I've tried writing sample programs to force stale value scenarios, but I couldn't get it done reliably.

I would think that just making the field volatile should be sufficient in this scenario. As far as I understand it the problem with "stale values" is somewhat exaggerated, the cache coherency protocols should take care of most issues.

To my knowledge, the main problem is preventing the compiler from just put the value in a register and not do any subsequent load at all. And that volatile should prevent this, forcing the compiler to issue a load each time it is read. But this would mostly be an issue when repeatedly checking a value in a loop.

But it is not very useful to look at just a single property. Problems more often crop up when you have multiple values that needs to be synchronized. A potential issue is reordering instructions by the compiler or processor. Locks & memory barriers prevent such reordering, but if that is a potential issue, it is probably better to lock a larger section of code.

Overall I consider it prudent to be paranoid when dealing with multiple threads. It is probably better to use to much synchronization than to little. One exception would be deadlocks that may be caused by having too many locks. My recommendation regarding this is to be very careful what you are calling when holding a lock. Ideally a lock should only be held for a short, predictable, time.

Also keep using pure functions and immutable data structures. These are a great way to avoid worrying about threading issues.