Home > database >  Lock-free array expansion in Java
Lock-free array expansion in Java

Time:09-28

I have an array to which many threads are writing. However each thread has a pre-assigned range of indices which it may write to. Further, nothing will be reading from the array until all threads are done.

So far, so thread-safe. The problem arises when I need to expand the array, by which of course I mean swap it out for a larger array which copies the first. This is only done occasionally (similar to an ArrayList).

Currently I'm acquiring a lock for every single write to the array. Even though there is no need to lock in order to keep the array consistent, I'm having to lock in case the array is currently being copied/swapped.

As there are very many writes I don't want to require a lock for them. I'm okay with a solution which requires locking for writer threads only while the array is being copied and swapped, as this is infrequent.

But I can't just impose write locks only when the copy/swap is in progress, as threads may already be committing writes to the old array.

I think I need some variety of barrier which waits for all writes to complete, then pauses the threads while I copy/swap the array. But CyclicBarrier would require me to know exactly how many threads are currently active, which is non-trivial and possibly susceptible to edge-cases in which the barrier ends up waiting forever, or lowers itself too early. In particular I'm not sure how I'd deal with a new thread coming in while the barrier is already up, or how to deal with threads which are currently polling a job queue, so will never decrement the barrier count while there are no new jobs.

I may have to implement something which (atomically) counts active threads and tries to pre-empt all the edge cases.

But this may well be a "solved" problem that I don't know about, so I'm hoping there may be a simpler (therefore better) solution than the Cyclic barrier/thread counting. Ideally one which uses an existing utility class.

By the way, I've considered CopyOnWriteArray. This is no use to me, as it copies for every write (a lot of them), not just array expansions.

Also note the structure written to pretty much has to be an array, or array-based.

Thanks

CodePudding user response:

Although it's technically not correct, you can probably use a ReadWriteLock. The threads that are writing to a single portion all use a read lock (this is the technically incorrect part, they're not reading...), and the resize uses a write lock. That way, all writing threads can work together. A resize has to wait until all portioned writes are done, which then blocks the entire array. Once that is done, all portioned writes can continue.

CodePudding user response:

There is a solution, although there will be some overhead, but no locking.

But first, I would recommend using a 2-D array (an array of arrays) unless you absolutely need a 1-D array. You can then expand the top-level array without affecting the contents of the lower-level arrays. You can also write a wrapper class for this to access the whole thing using 1-D indices if you wish.

But if you really want to have a 1-D array, I would recommend the following:

I am assuming each thread has some number which it knows which uniquely identifies itself and can be converted to a small index (else, I don't see how you index into the main array).

I also assume you have a reference to the main array called mainArray which is a static.

You need another static array currentArrays of length numberOfThreads. Each array element will contain a reference of the main array the thread is currently using.

Also, another static reference to your main array called previousArray.

Both mainArray and previousArray need to be declared volatile.

When you need to grow the array, copy the current array reference to previousArray, allocate a new array and write its reference to mainArray. You don't need to copy anything at this point.

Before accessing the main array in your threads you need to grab a local reference to it (i.e., a local variable) by assigning from mainArray.

Then compare the grabbed reference with the reference in currentArrays. If it is the same, carry on, being careful to use the local reference.

If it is different, call a method (that you will write) to copy the part of the previous array for your thread to the new array and then carry on as before. Write the new array reference to currentArrays for that thread. Again, use the local reference until you are done.

The old array should be garbage collected once all of the threads have finished copying their part of it, which means not until all threads have had at least one request requiring it.

There will be some initialisation code for first time use which should be obvious (all currentArrays elements are set to mainArray).

I believe this should work. There is obviously the overhead of comparing array references before you can access the array; however, if you do a lot of array accesses in a single transaction/request you can save the array reference that you grabbed, pass it around and only recheck when you need to grab it again. That should reduce the overhead.

Disclaimer: I haven't tested it. Comments welcome.

  • Related