Reconciliation between two List<Map<String, Object>> Objects

Time:08-18

I am attempting to reconcile (match) the data between two datasets, query1results and query2results. We will refer to query1 as the left-hand side (LHS) and to query2 as the right-hand side (RHS). The goal is to record how many matches and how many breaks we have.

  1. Records in both the LHS and the RHS dataset are considered matches
  2. Records in LHS that are not in the RHS are right-hand-breaks
  3. Records that remain unmatched in the RHS after the first two passes are left-hand-breaks

Here is my code with some of my previous attempts.

    @Override
    public void reconcile(LocalDate date) {
        List<Map<String, Object>> query1Records = executeQuery1(date).collect(Collectors.toList());
        List<Map<String, Object>> query2Records = executeQuery2(date).collect(Collectors.toList());

ATTEMPT #1 -> Only does LHS matching and no breaks

        List<Map<String, Object>> matching = query1Records.parallelStream().filter(searchData ->
                        query2Records.parallelStream().anyMatch(inputMap ->
                                searchData.get("instrument").equals(inputMap.get("instrument"))
                                        && String.valueOf(searchData.get("entity")).equals(inputMap.get("entity"))
                                        && searchData.get("party").equals(inputMap.get("party"))
                                        && ((BigDecimal) searchData.get("quantity")).compareTo((BigDecimal) inputMap.get("quantity")) == 0))
                .collect(Collectors.toList());

    }

ATTEMPT #2 -> Should only match if all values match on LHS and RHS

    List<String> keys = Arrays.asList("entity", "instrument", "party", "quantity");

    Function<Map<String, Object>, List<?>> getKey
            = m -> keys.stream().map(m::get).collect(Collectors.toList());

    Map<List<?>, Map<String, Object>> bpsKeys = query1Records.stream()
            .collect(Collectors.toMap(
                    getKey,
                    m -> m,
                    (a, b) -> {
                        throw new IllegalStateException("duplicate " + a + " and " + b);
                    },
                    LinkedHashMap::new));

    List<Map<String,Object>> matchinRecords = query2Records.stream()
            .filter(m -> bpsKeys.containsKey(getKey.apply(m)))
            .collect(Collectors.toList());

    matchinRecords.forEach(m -> bpsKeys.remove(getKey.apply(m)));
    List<Map<String,Object>> notMatchingRecords = new ArrayList<>(bpsKeys.values());

Any help on this would be greatly appreciated. Please let me know if you need any more information from me.

Answer:

The equals() contract of Map states that two maps are equal if both objects are of type Map and their entry sets are equal, so the getKey function you've used in your code is redundant.

Instead, we can compare the maps directly: because they are guaranteed to contain the same keys, the result will be the same. Generating a key object would make sense only if some keys in these maps had to be ignored (which is not the case here).
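One caveat when comparing the maps directly: Map.equals() delegates to equals() of each value, and BigDecimal.equals() also compares scale, unlike the compareTo() check in Attempt #1. The sketch below illustrates the difference and one possible way to work around it. The class ScaleCaveatSketch, the helper normalize, and the sample values are made up for illustration; whether the two queries actually return quantities with different scales is an assumption.

```java
import java.math.BigDecimal;
import java.util.HashMap;
import java.util.Map;

class ScaleCaveatSketch {

    // Hypothetical helper: returns a copy of the record in which every
    // BigDecimal value has its trailing zeros stripped, so that 100.0 and
    // 100.00 become equal under equals() and produce the same hashCode().
    static Map<String, Object> normalize(Map<String, Object> record) {
        Map<String, Object> copy = new HashMap<>(record);
        copy.replaceAll((key, value) ->
                value instanceof BigDecimal ? ((BigDecimal) value).stripTrailingZeros() : value);
        return copy;
    }

    public static void main(String[] args) {
        // Made-up sample records that differ only in the scale of "quantity".
        Map<String, Object> lhs = new HashMap<>();
        lhs.put("instrument", "ABC");
        lhs.put("quantity", new BigDecimal("100.0"));

        Map<String, Object> rhs = new HashMap<>();
        rhs.put("instrument", "ABC");
        rhs.put("quantity", new BigDecimal("100.00"));

        System.out.println(lhs.equals(rhs));                       // false: scales differ
        System.out.println(normalize(lhs).equals(normalize(rhs))); // true
    }
}
```

If both queries return quantities with the same scale, this normalization step is unnecessary and the maps can be placed in a Set as they are.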

Because the cost of a contains() check depends on the type of collection, we can use a hash-based Set, where a lookup takes constant time on average, instead of a List, where each lookup is linear.

To find the intersection (the matches), we keep only the objects from the LHS that are also present in the RHS.

To obtain the difference (the breaks on either side), we can generate a union by merging the data from both datasets and then keep only those maps that are not contained in the intersection.

    Set<Map<String, Object>> lhsSet = executeQuery1(date).collect(Collectors.toSet());
    Set<Map<String, Object>> rhsSet = executeQuery2(date).collect(Collectors.toSet());

    Set<Map<String, Object>> intersection = lhsSet.stream()
        .filter(rhsSet::contains)
        .collect(Collectors.toSet());

    Set<Map<String, Object>> diff = Stream.concat(lhsSet.stream(), rhsSet.stream())
        .filter(key -> !intersection.contains(key))
        .collect(Collectors.toSet());

The same can be done without streams, using the pre-Java 8 features of the Collections Framework:

    Set<Map<String, Object>> intersection = new HashSet<>(lhsSet);
    intersection.retainAll(rhsSet); // keep only the records that are also in the RHS

    Set<Map<String, Object>> diff = new HashSet<>(lhsSet); // or use `lhsSet` itself if you don't need it afterwards
    diff.addAll(rhsSet);          // union
    diff.removeAll(intersection); // difference
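The difference computed above mixes the breaks from both sides. If the two kinds of breaks from the question need to be counted separately, the same set operations can be applied per side. The following is a minimal sketch of that idea, not a definitive implementation: the class ReconciliationSketch, the methods matches and onlyIn, the helper record, and the sample data are all hypothetical names introduced for illustration.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ReconciliationSketch {

    // Records present on both sides (the matches).
    static <T> Set<T> matches(Collection<T> lhs, Collection<T> rhs) {
        Set<T> result = new HashSet<>(lhs);
        result.retainAll(new HashSet<>(rhs)); // hash lookups instead of list scans
        return result;
    }

    // Records present in `from` but absent from `other` (the breaks on one side).
    static <T> Set<T> onlyIn(Collection<T> from, Collection<T> other) {
        Set<T> result = new HashSet<>(from);
        result.removeAll(new HashSet<>(other));
        return result;
    }

    // Hypothetical helper that builds one sample record.
    static Map<String, Object> record(String instrument, String party) {
        Map<String, Object> m = new HashMap<>();
        m.put("instrument", instrument);
        m.put("party", party);
        return m;
    }

    public static void main(String[] args) {
        // Made-up data: one shared record and one unmatched record per side.
        List<Map<String, Object>> lhs = Arrays.asList(record("ABC", "P1"), record("DEF", "P2"));
        List<Map<String, Object>> rhs = Arrays.asList(record("DEF", "P2"), record("XYZ", "P3"));

        System.out.println("matches=" + matches(lhs, rhs).size());         // matches=1
        System.out.println("rightHandBreaks=" + onlyIn(lhs, rhs).size());  // rightHandBreaks=1
        System.out.println("leftHandBreaks=" + onlyIn(rhs, lhs).size());   // leftHandBreaks=1
    }
}
```

Here onlyIn(lhs, rhs) corresponds to the right-hand-breaks (in the LHS but not in the RHS) and onlyIn(rhs, lhs) to the left-hand-breaks, following the definitions in the question. Note that sets collapse identical records, so if either query can return exact duplicates, the counts will differ from a row-by-row comparison.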