Home > Blockchain >  Global variable concurrent use for write-once read-many usecase
Global variable concurrent use for write-once read-many usecase

Time:12-30

I want to use a global variable. During first access to read it, I want to write to this variable using an API. For any subsequent access to read it, it should not require locks.

This is the implementation I have written. But this seems like an overkill for such a small task.

  1. Is there a conventional way it is done in go?
  2. If I have multiple such global variables and I don't want to put them under this same struct, is there a way to do this without code duplication?
  3. Read this answer. One way for making single variable's use to be atomic is by using "sync/atomic" library. But, the functions are only there for integer types. How does community work with "string" type?

Please free to suggest any other unrelated changes.

PS: Using sync.Once didn't seem right. If the first time to fetch fails, the program will never get 'clusterName'. Also, couldn't think of a way to make other readers wait until sync.Once is complete. Any ideas?

package main

import (
    "fmt"
    "sync"
    "time"
)

type ClusterName struct {
    sync.RWMutex
    clusterName string
}

// Global variable. Read-many, write once.
var clusterName ClusterName

// Method to avoid locking during reads if 'clusterName' has been filled once.
func GetClusterName() (string, error) {
    // Take read lock to see if 'clusterName' is filled.
    clusterName.RLock()
    if clusterName.clusterName == "" {
        // 'clusterName' is not filled. Release read-lock and call method to fill it with write-lock.
        clusterName.RUnlock()
        return getClusterName()
    }

    defer clusterName.RUnlock()
    return clusterName.clusterName, nil
}

// Method to fetch and fill cluster name. Takes a write-lock.
func getClusterName() (string, error) {
    // Take write-lock.
    clusterName.Lock()
    defer clusterName.Unlock()

    // See if previous writer has already filled this. Just return if already filled.
    if clusterName.clusterName != "" {
        return clusterName.clusterName, nil
    }

    // Only 1 writer will ever reach here.
    var err error
    clusterName.clusterName, err = fetchClusterName()
    if err != nil {
        return "", err
    }

    return clusterName.clusterName, nil
}

func fetchClusterName() (string, error) {
    // API call.
    time.Sleep(time.Second)
    return "test-cluster-name", nil
}

func main() {
    for i := 0; i < 50; i   {
        fmt.Println(GetClusterName())
    }
}

Playground: https://go.dev/play/p/PV42PMXliRC

CodePudding user response:

Reading your code, it seems that sync.RWMutex is unsuitable here. From how it works, it seems that for any kind of read operation, there is a chance that it results in write operation too.

Use a sync.Mutex instead of RWMutex

Mutex fixes the flaw in the code, in which RLock can just be replaced with Lock. This is because in your code, you initially read the variable with RLock and when you realize that it is empty, release the read lock and do write lock instead. This very little moment during unlocking and relocking with write lock can cause race condition, so why not just do a full lock and unlock for every read? It also makes the code simpler. As you said, fetch can sometimes fail. This means that the first read operation does not always result in write to clusterName. Therefore, we have to assume that every read operation can result in write, which means that read locking only cannot be used here.

Here is a re-implementation for your code

func GetClusterName() (string, error) {
    // Take read lock to see if 'clusterName' is filled.
    clusterName.Lock()
    defer clusterName.Unlock()
    if clusterName.clusterName == "" {
        return getClusterName()
    }

    return clusterName.clusterName, nil
}

func getClusterName() (string, error) {
    // See if previous writer has already filled this. Just return if already filled.
    if clusterName.clusterName != "" {
        return clusterName.clusterName, nil
    }

    // Only 1 writer will ever reach here.
    var err error
    clusterName.clusterName, err = fetchClusterName()
    if err != nil {
        return "", err
    }

    return clusterName.clusterName, nil
}

So,

  1. You should use sync.Mutex. As you said, sync.Once should not be used here as it only ensures that an action is done once, regardless that it leads to successful operation or not. This is also the most conventional and simplest way
  2. This also depends on what you are trying to achieve and how much codes you want to share. I suggest creating multiple structs and an interface which contains methods to get, fetch, and set a value, and also lock and unlock. All structs should have mutex in it and a variable that you want to set. You can then create a function that does locking, get, fetch, set, and unlock.
  3. String is, well, different than integer. String is an array of character (integer), it is stored differently than int. Instead of using atomic, we usually just use a separate mutex to operate with string.

Here is my example. I assume that each of your variable has different ways of fetching so Fetch needs to also be abstract.

type LockSetter interface {
    Lock()
    Unlock()
    Set(value string)
    Get() string
    Fetch() (string, error)
}

type LockSetVariable struct {
    lock       *sync.Mutex
    myVariable string
}

func (l *LockSetVariable) Lock() {
    l.lock.Lock()
}

func (l *LockSetVariable) Unlock() {
    l.lock.Unlock()
}

func (l *LockSetVariable) Set(value string) {
    l.myVariable = value
}

func (l *LockSetVariable) Get() string {
    return l.myVariable
}

func (l *LockSetVariable) Fetch() (string, error) {
    now := time.Now()
    l.myVariable = now.Format("2006")
    return "", nil
}

func GetValue(param LockSetter) (string, error) {
    param.Lock()
    defer param.Unlock()
    
    v := param.Get()
    if v == "" {
        newV, err := param.Fetch()
        if err != nil {
            return "", err
        }
        
        param.Set(newV)
    }
    
    return v, nil
}

CodePudding user response:

You need a modified sync.Once implementation, if you look at sync.Once, it does this:

if (atomic.LoadUint32(&done) == 0) {
    m.Lock()
    if (done == 0) {
        f()
        atomic.StoreUint32(&done, 1)
    }
    m.Unlock()
}

Essentially, it checks if the operation has been executed before the check, then enter a lock to wait for/execute the operation. Furthermore, to work with data types other than the basic integer ones, wrap them in pointers. As a result, you can do:

var m sync.Mutex
var clusterName atomic.Pointer[string]

// Method to avoid locking during reads if 'clusterName' has been filled once.
func GetClusterName() (string, error) {
    // See if 'clusterName' is filled.
    cptr := clusterName.Load()
    if cptr == nil {
        return getClusterName()            
    }

    return *cptr, nil
}

// Method to fetch and fill cluster name. Takes a lock.
func getClusterName() (string, error) {
    // Take lock.
    m.Lock()
    defer m.Unlock()

    // See if previous writer has already filled this. Just return if already filled.
    cptr := clusterName.Load()
    if cptr != nil {
        return *cptr, nil
    }

    // Only 1 successful writer will ever reach here.
    c, err := fetchClusterName()
    if err != nil {
        // Fail, just abort, do not touch clusterName here
        return "", err
    }

    // Only release when succeed
    clusterName.Store(&c)
    return c, nil
}

func fetchClusterName() (string, error) {
    // API call.
    time.Sleep(time.Second)
    return "test-cluster-name", nil
}

The function GetClusterName can only return at 4 points. If it returns after checking the variable outside the lock, then there must be another goroutine having written to clusterName before, which guarantees a release-acquire semantics between these 2 threads. If it returns after checking the variable inside the lock, the same thing happens as before. If it returns after fetchClusterName, then the operation will encounter an error, and the result is an empty string, or the operation succeeded, and the goroutine is the sole writer of the clusterName variable.

  • Related