I have two Java lists of objects, say dataToBeAdded and dataToBeSubtracted.
The objects are of the same type and have multiple attributes:
DummyObject{ attr1, attr2, attr3, attr4 }
I want to merge these lists conditionally. If attr1, attr2 and attr3 match between the two lists,
subtract the attr4 values and make the result part of the output list.
If the attributes do not match:
- if the element is from list 1 (dataToBeAdded), add it to the output as it is
- if the element is from list 2 (dataToBeSubtracted), make attr4 negative (multiply by -1)
This is something like a full outer join.
I did something using Maps and Streams:
Map<String, DummyObject> dataToBeAddedMap = dataToBeAdded.stream()
    .collect(Collectors.toMap(obj -> obj.attr1() + obj.attr2() + obj.attr3(), item -> item));
Map<String, DummyObject> dataToBeSubtractedMap = dataToBeSubtracted.stream()
    .collect(Collectors.toMap(obj -> obj.attr1() + obj.attr2() + obj.attr3(), item ->
        new DummyObject(item.attr1(), item.attr2(), item.attr3(), -1 * item.attr4())));
Map<String, DummyObject> resultantData = Stream.of(dataToBeAddedMap, dataToBeSubtractedMap)
    .flatMap(map -> map.entrySet().stream())
    .collect(Collectors.toMap(
        Map.Entry::getKey,
        Map.Entry::getValue,
        (v1, v2) -> new DummyObject(v1.attr1(),
            v1.attr2(),
            v1.attr3(),
            v1.attr4() + v2.attr4())
    ));
System.out.println(resultantData.values());
This gives me my desired result, but is there any more efficient way to achieve this?
Edit 1:
Adding Input and expected Output
DummyObject a1 = new DummyObject("uuid1", "abcd", "mer1", 20D);
DummyObject a2 = new DummyObject("uuid1", "pqrs", "mer1", 25D);
DummyObject a3 = new DummyObject("uuid2", "xyz", "mer1", 18D);
List<DummyObject> dataToBeAdded = ImmutableList.of(a1,a2,a3);
DummyObject d1 = new DummyObject("uuid1", "abcd", "mer1", 5D);
DummyObject d2 = new DummyObject("uuid1", "pqrs", "mer1", 2D);
DummyObject d3 = new DummyObject("uuid3", "xyz", "mer2", 10D);
List<DummyObject> dataToBeSubtracted = ImmutableList.of(d1,d2,d3);
Desired Output
[
DummyObject("uuid1", "abcd", "mer1", 15D); // 20-5
DummyObject("uuid1", "pqrs", "mer1", 23D); // 25-2
DummyObject("uuid2", "xyz", "mer1", 18D);
DummyObject("uuid3", "xyz", "mer2", -10D); // only in dataToBeSubtracted, so negated
]
CodePudding user response:
Instead of creating two extra Maps for the elements to add and the elements to subtract, you could immediately create a chained stream from the two Lists and map each element of the subtracting List to an element whose attr4 field is already negated.
Then, you could collect all the objects into a single Map with the collect(Collectors.toMap()) terminal operation. The key would be the concatenation of the first 3 fields and the value would be the object itself, while colliding keys could be handled by creating a new DummyObj with the same 3 fields you're grouping by and a fourth field given by the sum of attr4 of the first and second colliding values (similar to what you were doing in your last stream).
Another small change could be to build the key with the String.format() method instead of chaining strings with the + operator, so the key is produced by a single call. Whether that is actually less overhead is worth measuring, since String.format() has to parse its format string on every call.
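Map<String, DummyObj> mapRes = Stream.concat(
        dataToBeAdded.stream(),
        // Negate attr4 on a copy so the elements of dataToBeSubtracted are not modified
        dataToBeSubtracted.stream().map(obj ->
            new DummyObj(obj.getAttr1(), obj.getAttr2(), obj.getAttr3(), -1 * obj.getAttr4())))
    .collect(Collectors.toMap(
        obj -> String.format("%s%s%s", obj.getAttr1(), obj.getAttr2(), obj.getAttr3()),
        Function.identity(),
        // On a key collision, sum the two attr4 values (the subtracted one is already negative)
        (obj1, obj2) -> new DummyObj(obj1.getAttr1(), obj1.getAttr2(), obj1.getAttr3(), obj1.getAttr4() + obj2.getAttr4())
    ));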
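For a self-contained way to try the single-stream approach against the sample input from the question, something like the following sketch could work. It assumes DummyObject can be modelled as a record (Java 16+), which the question does not confirm, and the key() and negated() helpers and the MergeDemo class are hypothetical names introduced only for this illustration. The key uses a "|" delimiter, on the assumption that the attributes never contain that character, so that values such as "ab" + "c" and "a" + "bc" do not end up in the same group.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Assumption: DummyObject is modelled as a record; the real class may look different.
record DummyObject(String attr1, String attr2, String attr3, double attr4) {
    // Hypothetical helper: grouping key built from the first three attributes.
    String key() {
        return String.join("|", attr1, attr2, attr3);
    }

    // Hypothetical helper: copy of this object with the sign of attr4 flipped.
    DummyObject negated() {
        return new DummyObject(attr1, attr2, attr3, -attr4);
    }
}

class MergeDemo {
    static Map<String, DummyObject> merge(List<DummyObject> toAdd, List<DummyObject> toSubtract) {
        return Stream.concat(toAdd.stream(), toSubtract.stream().map(DummyObject::negated))
                .collect(Collectors.toMap(
                        DummyObject::key,
                        Function.identity(),
                        // Same key in both lists: sum attr4 (the subtracted value is already negative).
                        (a, b) -> new DummyObject(a.attr1(), a.attr2(), a.attr3(), a.attr4() + b.attr4())));
    }

    public static void main(String[] args) {
        List<DummyObject> dataToBeAdded = List.of(
                new DummyObject("uuid1", "abcd", "mer1", 20D),
                new DummyObject("uuid1", "pqrs", "mer1", 25D),
                new DummyObject("uuid2", "xyz", "mer1", 18D));
        List<DummyObject> dataToBeSubtracted = List.of(
                new DummyObject("uuid1", "abcd", "mer1", 5D),
                new DummyObject("uuid1", "pqrs", "mer1", 2D),
                new DummyObject("uuid3", "xyz", "mer2", 10D));
        // The iteration order of the resulting map is unspecified.
        merge(dataToBeAdded, dataToBeSubtracted).values().forEach(System.out::println);
    }
}
```

Like the original code, this relies on each key appearing at most once per list; duplicates within one list would simply be summed by the merge function.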
CodePudding user response:
If you truly want performance, you need to rewrite the pipeline, meaning whatever happens upstream that leads you to this question. Make it so that you instead have 2 Maps, using the first 3 attributes as the key (perhaps make a new class that represents those 3 on their own).
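As a rough sketch of that idea, assuming the upstream code can already hand over the data as maps from a key to the attr4 value: GroupKey and KeyedTotals below are hypothetical names, and a record (Java 16+) is used only because it provides equals() and hashCode() for free.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical key class holding the three grouping attributes.
record GroupKey(String attr1, String attr2, String attr3) {}

class KeyedTotals {
    // Returns a new map containing every key from either input,
    // with value = added value minus subtracted value (a missing side counts as 0).
    static Map<GroupKey, Double> difference(Map<GroupKey, Double> added,
                                            Map<GroupKey, Double> subtracted) {
        Map<GroupKey, Double> result = new HashMap<>(added);
        // Map.merge stores -value when the key is absent, otherwise sums it with the existing value.
        subtracted.forEach((key, value) -> result.merge(key, -value, Double::sum));
        return result;
    }
}
```

With this shape there is no string concatenation at all, and the full-outer-join behaviour falls out of a single pass over the second map.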
But, if the pipeline cannot change, the fastest algorithm is:
- Write a Comparator that sorts these lists.
- Make 2 iterators (one for each).
- Make an output list.
- Loop, making a 'current item' pointer for both iterators.
- If current-1 is below current-2 (or current-2 is done), copy current-1 and advance iterator-1.
- If current-2 is below current-1 (or current-1 is done), flip the sign of attr4 on current-2, add that to the output, and advance iterator-2.
- If current-1 and current-2 are identical on the first 3 attributes, add one item whose attr4 is current-1's attr4 minus current-2's attr4, then advance both.
- If both are done, return the output list.
It'd be significantly more code, but it makes no new transient objects whatsoever, except 2 iterators and a comparator.
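The steps above could be sketched roughly as follows. This is one possible reading and not a tuned implementation: it assumes DummyObject is an immutable record (Java 16+), so it creates new objects for negated and combined items instead of updating in place, it sorts copies so that immutable input lists are left alone, and it walks the lists by index rather than with explicit iterators. SortedMerge and BY_KEY are hypothetical names, and each key is assumed to appear at most once per list.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Assumption: DummyObject is modelled as a record; the real class may look different.
record DummyObject(String attr1, String attr2, String attr3, double attr4) {}

class SortedMerge {
    // Orders items by the three grouping attributes.
    static final Comparator<DummyObject> BY_KEY = Comparator
            .comparing(DummyObject::attr1)
            .thenComparing(DummyObject::attr2)
            .thenComparing(DummyObject::attr3);

    static List<DummyObject> merge(List<DummyObject> toAdd, List<DummyObject> toSubtract) {
        // Sort copies so the callers' lists are not modified.
        List<DummyObject> left = new ArrayList<>(toAdd);
        List<DummyObject> right = new ArrayList<>(toSubtract);
        left.sort(BY_KEY);
        right.sort(BY_KEY);

        List<DummyObject> out = new ArrayList<>(left.size() + right.size());
        int i = 0;
        int j = 0;
        while (i < left.size() || j < right.size()) {
            int cmp;
            if (j >= right.size()) {
                cmp = -1;                       // right side is done
            } else if (i >= left.size()) {
                cmp = 1;                        // left side is done
            } else {
                cmp = BY_KEY.compare(left.get(i), right.get(j));
            }

            if (cmp < 0) {
                out.add(left.get(i++));         // only in toAdd: copy as is
            } else if (cmp > 0) {
                DummyObject r = right.get(j++); // only in toSubtract: negate attr4
                out.add(new DummyObject(r.attr1(), r.attr2(), r.attr3(), -r.attr4()));
            } else {
                DummyObject l = left.get(i++);  // in both: subtract attr4
                DummyObject r = right.get(j++);
                out.add(new DummyObject(l.attr1(), l.attr2(), l.attr3(), l.attr4() - r.attr4()));
            }
        }
        return out;
    }
}
```

The output comes back sorted by the three key attributes, which the map-based versions do not guarantee.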