I have this simple server-socket application using TCP that allows multiple clients to connect to a server and store and retrieve data from an ArrayList. Everytime a client connection is accepted into the server, a new thread is created to handle the client activity, and since all clients must interact with the same data structure I decided to create a static ArrayList as follows:
public class Repository {
private static List<String> data;
public static synchronized void insert(String value){
if(data == null){
data = new ArrayList<>();
}
data.add(value);
}
public static synchronized List<String> getData() {
if(data == null){
data = new ArrayList<>();
}
return data;
}
}
So, everytime a client inserts a value or reads the list they just call Repository.insert(value)
or Repository.getData()
from their respective threads.
My questions are:
- Making these methods synchronized is enough to make the operations thread safe?
- Are there any architectural or performance issues with this static List thing? I could also create an instance of the List in the server class (the one that accepts the connections) and send a reference via contructor to the Threads instead of using the static. Is this any better?
- Can
Collections.synchronizedList()
add any value to such a simple task? Is it necessary? - How could I test the thread-safety in this scenario? I tried just creating multiple clients and make them access the data and everything seems to work, but I'm just not convinced... Here is a short snippet of this test:
IntStream.range(0,10).forEach(i->{
Client client = new Client();
client.ConnectToServer();
try {
client.SendMessage("Hello from client " i);
} catch (IOException e) {
e.printStackTrace();
}});
//assertions on the size of the array
Thanks in advance! :)
CodePudding user response:
- Yes, see https://stackoverflow.com/a/2120409/3080094
- Yes (see comments further below) and yes. Using one object (a singleton if you like) is preferred over static methods (the latter are, in general, harder to maintain).
- Not necessary but preferred: it avoids you making mistakes. Also, instead of:
private static List<String> data;
you can use
private static final List<String> data = Collections.synchronizedList(new ArrayList<>());
which has three benefits: final
ensures all threads see this value (thread-safe), you can remove the code for checking on a null-value (which can be error-prone) and you no longer need to use synchronized
in your code since the list itself is now synchronized.
- Yes, there are ways to improve this to make it more likely you find bugs but when it comes to multi-threading you can never be entirely sure.
About the " architectural or performance issues": every read of the list has to be synchronized so when multiple clients want to read the list, they will all be waiting for a lock to read the entire list. Since you are only inserting at the end and reading the list, you can use a ConcurrentLinkedQueue
. This "concurrent" type (i.e. no need to use synchronized
- the queue is thread-safe) will not lock when the entire list is read, multiple threads can read the data at the same time. In addition, you should hide the details of your implementation which you can do, for example, with an Iterator
:
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
private final Queue<String> dataq = new ConcurrentLinkedQueue<>();
public Iterator<String> getData() {
return dataq.iterator();
}
About "testing the thread-safety": focus on the code that needs to be thread-safe. Using clients to connect to a server to test if the Repository
code is thread-safe is inefficient: most of the test will just be waiting on I/O and not actually using Repository
at the same time. Write a dedicated (unit) test just for the Repository
class. Keep in mind that your operating system determines when threads start and run (i.e. your code for threads to start and run are just hints to the operating system), so you will need the test to run for a while (i.e. 30 seconds) and provide some output (logging) to ensure threads are running at the same time. Output to console should only be shown at the end of the test: in Java output to console (System.out) is synchronized which in turn can make threads work one after the other (i.e. not at the same time on the class under test).
Finally, you can improve the test using a java.util.concurrent.CountDownLatch
to let all threads synchronize before executing the next statements concurrently (this improves the chance of finding race-conditions). That is a bit much for this already long answer to explain, so I'll leave you with a (admittedly complicated) example (focus on how the tableReady
variable is used).