How to speed up the population of a second level index map using multithreading?


I'm stuck on a performance issue and I'm hoping to get some help, ideas, or suggestions.

First things first, some context:

A foo() procedure is processed in parallel through multithreading. One goal of this procedure is to populate a map (from now on mainMap) that indexes some results obtained during the computation. The signature of the mainMap is Map<Pair<T, T>, Object>.

Given the nature of my problem, the values of the mainMap don't really matter, so let's focus on the keys.

The Pair object is really simple: it just packs up two objects of the same type.

public class Pair<F, S> { // note: a duplicated type parameter (class Pair<T, T>) is not legal Java
    private F first; //first member of pair
    private S second; //second member of pair

    public Pair(F first, S second) {
        this.first = first;
        this.second = second;
    }

    public boolean contains(Object elem) {
        return this.first.equals(elem) || this.second.equals(elem);
    }

    public boolean containsAsFirst(Object elem) {
        return this.first.equals(elem);
    }

    public boolean containsAsSecond(Object elem) {
        return this.second.equals(elem);
    }

    //Getters, Setters, equals() and hashCode()...
}
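
Since Pair is used as a HashMap key and stored in HashSets, it also needs value-based equals() and hashCode() implementations (elided above). A minimal sketch of what they could look like inside the Pair class:

// inside Pair<F, S>; requires java.util.Objects
@Override
public boolean equals(Object o) {
    if (this == o) return true;
    if (!(o instanceof Pair)) return false;
    Pair<?, ?> other = (Pair<?, ?>) o;
    return first.equals(other.first) && second.equals(other.second);
}

@Override
public int hashCode() {
    return Objects.hash(first, second); // fast to compute and spreads keys reasonably well
}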

Problem: For my processing purposes, I need to fetch from the mainMap all values whose key is a pair containing a specific T elem as either its first or second element.

Now I'll walk you through my approaches, up to the one I think might work.

Note: in the examples I'm going to use Strings as the implementation of the generic type T, but keep in mind that it could be anything.

  1. Dummy approach: I just iterate over mainMap.keySet() and filter the result using key.contains(elem). All the pairs that remain can then be used to query the mainMap (sketched below).

Results: This basic approach just doesn't fit my needs performance-wise, since my maps can be very large, with many thousands of entries.
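
Concretely, the scan looks something like this (a sketch with String standing in for T; the valuesFor name is mine):

import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Dummy approach: an O(n) scan over every key of the mainMap per lookup.
static List<Object> valuesFor(Map<Pair<String, String>, Object> mainMap, String elem) {
    return mainMap.keySet().stream()
            .filter(key -> key.contains(elem)) // keep pairs having elem as first or second
            .map(mainMap::get)                 // query the mainMap with the surviving pairs
            .collect(Collectors.toList());
}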

  2. Second-level indexing: I created 2 new maps with the signature Map<T, Set<Pair<T, T>>>, called firstToPairMap and secondToPairMap. They work as follows:

    Let's say that in the mainMap there are these 3 entries:

      {A, B} -> ...
      {C, B} -> ...
      {X, Z} -> ...
    

    The firstToPairMap would then be populated like this:

      A -> [{A, B}]
      C -> [{C, B}]
      X -> [{X, Z}]
    

    While the secondToPairMap would be like this:

      B -> [{A, B}, {C, B}]
      Z -> [{X, Z}]
    

As you can see, both of these new maps index the keys of the mainMap, so that querying some T elem against both second-level index maps directly yields all the pairs I need to query the mainMap.
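
The lookup then reduces to two O(1) map reads plus a set union. Roughly (a sketch; the pairsFor name is mine):

import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Query both second-level indexes and return the union of the matching pairs.
static <T> Set<Pair<T, T>> pairsFor(T elem,
                                    Map<T, Set<Pair<T, T>>> firstToPairMap,
                                    Map<T, Set<Pair<T, T>>> secondToPairMap) {
    Set<Pair<T, T>> result = new HashSet<>(firstToPairMap.getOrDefault(elem, Set.of()));
    result.addAll(secondToPairMap.getOrDefault(elem, Set.of()));
    return result; // every pair whose first or second element equals elem
}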

Results: The querying part works just fine. The main problem here is the population of these 2 second-level index maps. The mainMap is populated in parallel through multithreading, and so are the second-level index maps. But in order to populate these maps correctly I can't simply put new entries; I need to merge new values with pre-existing ones (look at the B case in the secondToPairMap). So, to get to the point: this merge operation is really slow.

Pursuing the second approach, I tried the following strategy, without any success yet. I queued all the Pairs to insert during the parallel computation and delayed the insertion, so that all that's left is to implement (better) this function:

private static <T> void dequeuePairsMap(Queue<Pair<T, T>> queue, Map<T, Set<Pair<T, T>>> secondToPairMap, Map<T, Set<Pair<T, T>>> firstToPairMap) {
    while (!queue.isEmpty()) {
        Pair<T, T> pair = queue.poll();
        // fetch the existing set for each index (or a fresh one), merge the pair in, and write it back
        Set<Pair<T, T>> secondToPairSet = secondToPairMap.getOrDefault(pair.getSecond(), new HashSet<>());
        Set<Pair<T, T>> firstToPairSet = firstToPairMap.getOrDefault(pair.getFirst(), new HashSet<>());
        secondToPairSet.add(pair);
        firstToPairSet.add(pair);
        secondToPairMap.put(pair.getSecond(), secondToPairSet);
        firstToPairMap.put(pair.getFirst(), firstToPairSet);
    }
}

As you can tell, this implementation doesn't solve my problem, because I'm still relying on a merge-style approach. It did allow me to prove that the performance problem is not due to concurrent access to the maps; it truly is the merge operation.

Can you help me figure out how to solve this performance issue?

EDIT: sharing a new implementation of dequeuePairsMap that actually speeds things up. The idea is to perform as many bulk inserts into the maps as possible while minimizing merge requests.

// requires java.util.stream.Collectors and java.util.stream.Stream
private static <T> void dequeuePairsMap(Queue<Pair<T, T>> queue, Map<T, Set<Pair<T, T>>> secondToPairMap, Map<T, Set<Pair<T, T>>> firstToPairMap) {
    while (!queue.isEmpty()) {
        Pair<T, T> pair = queue.poll();
        // collect every queued pair that shares this pair's first (resp. second)
        // element, so a whole group can be inserted with a single merge
        Set<Pair<T, T>> firstToPairSet = queue.stream()
                .filter(edge -> edge.containsAsFirst(pair.getFirst()))
                .collect(Collectors.toSet());
        Set<Pair<T, T>> secondToPairSet = queue.stream()
                .filter(edge -> edge.containsAsSecond(pair.getSecond()))
                .collect(Collectors.toSet());
        firstToPairSet.add(pair);
        secondToPairSet.add(pair);
        // skip the merge entirely when the map already contains the whole group
        // (this happens when a pair collected above is polled again later)
        Set<Pair<T, T>> firstToPairExistingSet = firstToPairMap.getOrDefault(pair.getFirst(), new HashSet<>());
        Set<Pair<T, T>> secondToPairExistingSet = secondToPairMap.getOrDefault(pair.getSecond(), new HashSet<>());
        if (!firstToPairExistingSet.containsAll(firstToPairSet))
            firstToPairMap.merge(pair.getFirst(), firstToPairSet,
                    (oldSet, newSet) -> Stream.concat(oldSet.stream(), newSet.stream()).collect(Collectors.toSet()));
        if (!secondToPairExistingSet.containsAll(secondToPairSet))
            secondToPairMap.merge(pair.getSecond(), secondToPairSet,
                    (oldSet, newSet) -> Stream.concat(oldSet.stream(), newSet.stream()).collect(Collectors.toSet()));
    }
}

CodePudding user response:

I don't know what your throughput is, or what you expect it to be, but the "merge" should be pretty efficient, since it is just inserting a value into a set. If its performance is not what you expect, then I'd suggest looking at the performance of the hashCode() method. It should be fast to calculate and should distribute your keys across a spread of values. A poor hashCode() method on custom objects can make hashing slow.

You can get bad performance due to locking and concurrency issues, in which case ConcurrentHashMap may help. It also offers a newKeySet() method for a concurrent set implementation.
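
For instance, a minimal sketch of that direction (the PairIndex wrapper and its index method are my names; it assumes the question's Pair class): with computeIfAbsent the merge disappears entirely, since each insert becomes a single thread-safe add.

import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Concurrent index maps that worker threads can populate directly,
// with no queue-and-dequeue step and no read-modify-write merge.
class PairIndex<T> {
    final Map<T, Set<Pair<T, T>>> firstToPairMap  = new ConcurrentHashMap<>();
    final Map<T, Set<Pair<T, T>>> secondToPairMap = new ConcurrentHashMap<>();

    void index(Pair<T, T> pair) {
        // computeIfAbsent builds the concurrent set at most once per key;
        // after that, each call is one thread-safe add()
        firstToPairMap.computeIfAbsent(pair.getFirst(), k -> ConcurrentHashMap.newKeySet()).add(pair);
        secondToPairMap.computeIfAbsent(pair.getSecond(), k -> ConcurrentHashMap.newKeySet()).add(pair);
    }
}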

Also, I would suggest that you try not using getOrDefault here. Instead, get and test for null. In the case where the key exists, you can then skip your put operation; otherwise construct the default explicitly and put it. Minimizing operations on your parent hashmaps is good for throughput. But we're only talking about a small factor, so I can't imagine this making much of a difference.
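
In code, that pattern would look something like this (a sketch; the addFirst helper name is mine, and the same shape applies to secondToPairMap):

import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// get-and-test-for-null: an existing key costs one get plus one set insert, no put
static <T> void addFirst(Map<T, Set<Pair<T, T>>> firstToPairMap, Pair<T, T> pair) {
    Set<Pair<T, T>> set = firstToPairMap.get(pair.getFirst());
    if (set == null) {
        set = new HashSet<>();
        firstToPairMap.put(pair.getFirst(), set); // put only on first sight of the key
    }
    set.add(pair);
}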

If you absolutely can't fix the bottleneck, then you can work around it by having multiple threads, each owning a portion of the data structure, and dividing keys by the range their hashCode() falls in. So instead of writing directly to something like firstToPairMap, you calculate a hashCode() and, based on it, send the pair to the correct queue, whose owning thread runs on its own CPU and manipulates its partition of the data structure.
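
A rough sketch of that sharding idea (the shard count, the floorMod routing, and all names here are my illustration); note that secondToPairMap would need its own router keyed on pair.getSecond():

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Route each pair to one of N single-writer queues by the hash of its first
// element, so each shard of firstToPairMap is only ever touched by one thread.
class ShardedRouter<T> {
    static final int SHARDS = 4; // e.g. one per core
    final List<BlockingQueue<Pair<T, T>>> queues = new ArrayList<>();

    ShardedRouter() {
        for (int i = 0; i < SHARDS; i++) queues.add(new ArrayBlockingQueue<>(1024));
    }

    void route(Pair<T, T> pair) throws InterruptedException {
        // floorMod keeps the shard index non-negative for negative hash codes
        int shard = Math.floorMod(pair.getFirst().hashCode(), SHARDS);
        queues.get(shard).put(pair); // the shard's dedicated worker drains this queue
    }
}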

That seems excessive though.
