How do I subtract two Object Lists based on the attributes?

Time:05-25

I have two Java Object Lists

Let's say dataToBeAdded, dataToBeSubtracted

The objects are of the same data type and have multiple attributes

DummyObject{ attr1, attr2, attr3, attr4 }

I want to merge these lists conditionally. If attr1, attr2 and attr3 match between the two lists,

subtract the attr4 values (dataToBeAdded minus dataToBeSubtracted) and add the resulting element to the output list.

If the attributes do not match:

  • if the element is from list1 (dataToBeAdded), add it to the output as is

  • if the element is from list2 (dataToBeSubtracted), make attr4 negative (multiply by -1) and add it to the output

This is something like a Full Outer Join kind of operation

I did something using Maps and Streams

        // Key each element by the concatenation of attr1, attr2 and attr3
        Map<String, DummyObject> dataToBeAddedMap = dataToBeAdded.stream()
                .collect(Collectors.toMap(obj -> obj.attr1() + obj.attr2() + obj.attr3(), item -> item));

        // Same key, but with attr4 negated so that merging can simply add the values
        Map<String, DummyObject> dataToBeSubtractedMap = dataToBeSubtracted.stream()
                .collect(Collectors.toMap(obj -> obj.attr1() + obj.attr2() + obj.attr3(), item ->
                        new DummyObject(item.attr1(), item.attr2(), item.attr3(), -1 * item.attr4())));

        Map<String, DummyObject> resultantData = Stream.of(dataToBeAddedMap, dataToBeSubtractedMap)
                .flatMap(map -> map.entrySet().stream())
                .collect(Collectors.toMap(
                        Map.Entry::getKey,
                        Map.Entry::getValue,
                        (v1, v2) -> new DummyObject(v1.attr1(),
                                v1.attr2(),
                                v1.attr3(),
                                v1.attr4() + v2.attr4())
                ));

        System.out.println(resultantData.values());

This gives me my desired result, but is there any more efficient way to achieve this?

Edit 1:

Adding Input and expected Output

        DummyObject a1 = new DummyObject("uuid1", "abcd", "mer1", 20D);
        DummyObject a2 = new DummyObject("uuid1", "pqrs", "mer1", 25D);
        DummyObject a3 = new DummyObject("uuid2", "xyz", "mer1", 18D);

        List<DummyObject> dataToBeAdded = ImmutableList.of(a1,a2,a3);

        DummyObject d1 = new DummyObject("uuid1", "abcd", "mer1", 5D);
        DummyObject d2 = new DummyObject("uuid1", "pqrs", "mer1", 2D);
        DummyObject d3 = new DummyObject("uuid3", "xyz", "mer2", 10D);

        List<DummyObject> dataToBeSubtracted = ImmutableList.of(d1,d2,d3);

        Desired Output
        [

            DummyObject("uuid1", "abcd", "mer1", 15D); // 20-5
            DummyObject("uuid1", "pqrs", "mer1", 23D); // 25-2
            DummyObject("uuid2", "xyz", "mer1", 18D); 
            DummyObject("uuid3", "xyz", "mer2", -10D); // only in dataToBeSubtracted, so negated

        ]    

CodePudding user response:

Instead of creating two extra Maps for the elements to add and the elements to subtract, you could directly concatenate the streams of the two Lists and map each element of the subtracting List to an element whose attr4 field is already negated.

Then, you could collect all the objects into a single Map with the collect(Collectors.toMap()) terminal operation. The key would be the concatenation of the first 3 fields and the value would be the object itself. Colliding keys could be handled by creating a new DummyObj with the same 3 fields you're grouping by and a fourth field equal to the sum of attr4 of the two colliding values (similar to what you were doing in your last stream).

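Another option is to build the key with the String.format() method instead of chaining strings with the + operator. Treat this as a readability choice rather than an optimization: a single + expression already produces only one resulting String, and String.format() is typically slower because it parses the format string on every call. Either way, consider putting a separator between the fields so that keys such as ("ab", "c") and ("a", "bc") cannot collide.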

Map<String, DummyObj> mapRes = Stream.concat(
                dataToBeAdded.stream(),
                // Negate attr4 on a copy so the elements of dataToBeSubtracted are not mutated
                dataToBeSubtracted.stream().map(obj ->
                        new DummyObj(obj.getAttr1(), obj.getAttr2(), obj.getAttr3(), -1 * obj.getAttr4())))
        .collect(Collectors.toMap(
                obj -> String.format("%s%s%s", obj.getAttr1(), obj.getAttr2(), obj.getAttr3()),
                Function.identity(),
                // On a key collision, keep the three key fields and sum attr4
                (obj1, obj2) -> new DummyObj(obj1.getAttr1(), obj1.getAttr2(), obj1.getAttr3(), obj1.getAttr4() + obj2.getAttr4())
        ));

Here is a link to test the code above:

https://ideone.com/K5qfsI

CodePudding user response:

If you truly want performance, you need to rewrite the pipeline: whatever upstream code leaves you with these two lists, change it so that you have 2 Maps instead, using the first 3 attributes as the key (perhaps make a new class that represents those 3 on their own).
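A minimal sketch of the key-class idea, assuming Java 16+ records. The names Key, KeyedSubtract and subtract are made up for illustration, and DummyObject is a stand-in for the question's class with attr4 assumed to be a double:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class KeyedSubtract {
    // Hypothetical stand-in for the question's class (attr4 assumed to be a double).
    record DummyObject(String attr1, String attr2, String attr3, double attr4) {}

    // Hypothetical key type: records get equals() and hashCode() for free,
    // so no string concatenation is needed to build a map key.
    record Key(String attr1, String attr2, String attr3) {}

    static Map<Key, Double> subtract(List<DummyObject> toAdd, List<DummyObject> toSubtract) {
        // LinkedHashMap keeps first-seen order; a plain HashMap works if order is irrelevant.
        Map<Key, Double> totals = new LinkedHashMap<>();
        for (DummyObject o : toAdd) {
            totals.merge(new Key(o.attr1(), o.attr2(), o.attr3()), o.attr4(), Double::sum);
        }
        for (DummyObject o : toSubtract) {
            // Adding the negated value covers both the matching and the non-matching case.
            totals.merge(new Key(o.attr1(), o.attr2(), o.attr3()), -o.attr4(), Double::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        List<DummyObject> added = List.of(
                new DummyObject("uuid1", "abcd", "mer1", 20D),
                new DummyObject("uuid1", "pqrs", "mer1", 25D),
                new DummyObject("uuid2", "xyz", "mer1", 18D));
        List<DummyObject> subtracted = List.of(
                new DummyObject("uuid1", "abcd", "mer1", 5D),
                new DummyObject("uuid1", "pqrs", "mer1", 2D),
                new DummyObject("uuid3", "xyz", "mer2", 10D));
        System.out.println(subtract(added, subtracted));
    }
}
```

If the upstream code can produce such keyed maps directly, the two loops collapse into merging one map into the other. Unlike the toMap-based version in the question, this sketch sums duplicates within a single list instead of throwing.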

But, if the pipeline cannot change, the fastest algorithm is:

  1. Write a Comparator that sorts these lists.
  2. Make 2 iterators (one for each).
  3. Make an output list.
  4. Loop, making a 'current item' pointer for both iterators.
  5. If current-1 is below current-2 (or current-2 is done), copy current-1 and advance iterator-1.
  6. If current-2 is below current-1 (or current-1 is done), flip the sign on current-2, add it to the output, and advance iterator-2.
  7. If current-1 and current-2 are identical, update attr4 appropriately, then advance both.
  8. If both are done, return the output list.

It'd be significantly more code, but it makes no new transient objects whatsoever, except 2 iterators and a comparator.
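The sorted-merge steps above could look roughly like the sketch below. It makes several assumptions that are not in the answer: DummyObject is an immutable record (Java 16+) with a double attr4, keys are unique within each list, the lists are copied before sorting because the inputs may be immutable, and index variables stand in for the two iterators. Because the record is immutable, this version allocates a new object for every negated or combined element, so it does not reach the "no transient objects" goal; that would need a mutable class with a setter for attr4.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class SortedMergeSubtract {
    // Hypothetical stand-in for the question's class (attr4 assumed to be a double).
    record DummyObject(String attr1, String attr2, String attr3, double attr4) {}

    // Step 1: a comparator over the three key attributes.
    static final Comparator<DummyObject> BY_KEY = Comparator
            .comparing(DummyObject::attr1)
            .thenComparing(DummyObject::attr2)
            .thenComparing(DummyObject::attr3);

    static List<DummyObject> subtract(List<DummyObject> toAdd, List<DummyObject> toSubtract) {
        // Sort copies so the callers' lists are left untouched.
        List<DummyObject> left = new ArrayList<>(toAdd);
        List<DummyObject> right = new ArrayList<>(toSubtract);
        left.sort(BY_KEY);
        right.sort(BY_KEY);

        List<DummyObject> out = new ArrayList<>(left.size() + right.size());
        int i = 0, j = 0; // positions in left and right, playing the role of the iterators
        while (i < left.size() || j < right.size()) {
            int cmp;
            if (i == left.size()) {
                cmp = 1;          // left is done: take from right
            } else if (j == right.size()) {
                cmp = -1;         // right is done: take from left
            } else {
                cmp = BY_KEY.compare(left.get(i), right.get(j));
            }

            if (cmp < 0) {
                // Only in dataToBeAdded: copy as is.
                out.add(left.get(i++));
            } else if (cmp > 0) {
                // Only in dataToBeSubtracted: negate attr4.
                DummyObject r = right.get(j++);
                out.add(new DummyObject(r.attr1(), r.attr2(), r.attr3(), -r.attr4()));
            } else {
                // Same key in both lists: subtract and advance both.
                DummyObject l = left.get(i++);
                DummyObject r = right.get(j++);
                out.add(new DummyObject(l.attr1(), l.attr2(), l.attr3(), l.attr4() - r.attr4()));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<DummyObject> added = List.of(
                new DummyObject("uuid2", "xyz", "mer1", 18D),
                new DummyObject("uuid1", "abcd", "mer1", 20D));
        List<DummyObject> subtracted = List.of(
                new DummyObject("uuid3", "xyz", "mer2", 10D),
                new DummyObject("uuid1", "abcd", "mer1", 5D));
        System.out.println(subtract(added, subtracted));
    }
}
```

Note that the two sorts cost O(n log n), while a hash-map merge is O(n) on average, so it is worth benchmarking both on realistic data before choosing.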
