Getting all the POJO items that match max value of a POJO variable

Time:03-01

I have a POJO class, and I want to collect all the objects whose value for a given field equals the maximum value of that field.

I have the below POJO class

@Data
@AllArgsConstructor
@NoArgsConstructor
public class IPTraceData implements Serializable {

    private String factId;
    private long startIp;
    private long endIp;
    private int confidence;
    private LocalDate date;
    private long ipaddr;
}

I want to get all the objects whose confidence equals the maximum confidence.

I am able to get the results using the below code in Java 8.

int max = allTraces.stream()
            .max(Comparator.comparing(IPTraceData::getConfidence))
            .get()
            .getConfidence();

List<IPTraceData> traceData = allTraces
            .stream()
            .filter(m -> m.getConfidence() == max)
            .collect(Collectors.toList());

However, I would like to do this in Java 8 with a single stream statement. How can I achieve that?

CodePudding user response:

Technically it's possible, but this solution has the additional cost of allocating an intermediate map, Map<Integer, List<IPTraceData>>, in memory.

The approach of finding the max confidence first and then filtering the data set by it performs better.

List<IPTraceData> traceData = allTraces
            .stream()
            .collect(Collectors.groupingBy(IPTraceData::getConfidence))
            .entrySet().stream()
            .max(Map.Entry.comparingByKey())
            .map(Map.Entry::getValue)
            .orElse(Collections.emptyList());

Note:

  • Avoid using get() on an Optional when the code doesn't contain any checks ensuring that it is not empty. If you expect it not to be empty and want the code to fail in order to expose the problem, use orElseThrow() instead (in Java 8 this is the overload that takes an exception supplier; the no-argument form was added in Java 10). That makes your intention clearer.
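As a rough sketch of that note applied to the two-pass version from the question: the Trace class below is a hypothetical, trimmed-down stand-in for IPTraceData (so the snippet compiles without Lombok), and the exception type and message are just an assumption about what you might want to throw.

```java
import java.util.List;
import java.util.stream.Collectors;

class MaxConfidenceTwoPass {

    // Hypothetical stand-in for IPTraceData, reduced to two fields.
    static final class Trace {
        private final String factId;
        private final int confidence;

        Trace(String factId, int confidence) {
            this.factId = factId;
            this.confidence = confidence;
        }

        String getFactId() { return factId; }
        int getConfidence() { return confidence; }
    }

    static List<Trace> withMaxConfidence(List<Trace> allTraces) {
        // First pass: find the max. On empty input this fails with a
        // descriptive exception rather than the generic
        // NoSuchElementException that get()/getAsInt() would throw.
        int max = allTraces.stream()
                .mapToInt(Trace::getConfidence)
                .max()
                .orElseThrow(() -> new IllegalStateException("allTraces is empty"));

        // Second pass: keep every element that ties for the max.
        return allTraces.stream()
                .filter(t -> t.getConfidence() == max)
                .collect(Collectors.toList());
    }
}
```

If an empty input is a normal case rather than a bug, returning an empty list (as the groupingBy version does with orElse) is probably the better choice.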

CodePudding user response:

If you want to do it without additional intermediate memory usage, and fully parallelizable, you can use the general collect functionality. Something like this (untested; feel free to extract methods to make it more readable):

List<IPTraceData> traceData = allTraces.stream()
    .collect(
        ArrayList::new,
        (r, t) -> {
          if (r.isEmpty()) {
            r.add(t);
          } else {
            int currentMaxConfidence = r.get(0).getConfidence();
            if (t.getConfidence() == currentMaxConfidence) {
              r.add(t);
            } else if (t.getConfidence() > currentMaxConfidence) {
              r.clear();
              r.add(t);
            }
          }
        },
        (left, right) -> {
          if (left.isEmpty()) {
            left.addAll(right);
          } else if (!right.isEmpty()) {
            int leftMax = left.get(0).getConfidence();
            int rightMax = right.get(0).getConfidence();

            if (leftMax == rightMax) {
              left.addAll(right);
            } else if (leftMax < rightMax) {
              left.clear();
              left.addAll(right);
            }
          }
        }
    );
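One possible way to do the "extract methods" part is to wrap the same logic in a reusable Collector built with Collector.of. This is only a sketch: the names allMax and maxConfidence are made up for illustration, and Trace is a hypothetical stand-in for IPTraceData.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collector;

class AllMaxCollector {

    // Hypothetical stand-in for IPTraceData, reduced to two fields.
    static final class Trace {
        private final String factId;
        private final int confidence;

        Trace(String factId, int confidence) {
            this.factId = factId;
            this.confidence = confidence;
        }

        String getFactId() { return factId; }
        int getConfidence() { return confidence; }
    }

    // Collects every element that ties for the maximum under cmp.
    // The accumulated list only ever holds elements that compare equal.
    static <T> Collector<T, List<T>, List<T>> allMax(Comparator<? super T> cmp) {
        return Collector.<T, List<T>>of(
            ArrayList::new,
            (acc, t) -> {
                int c = acc.isEmpty() ? 0 : cmp.compare(t, acc.get(0));
                if (c > 0) acc.clear();   // new maximum: drop the old ties
                if (c >= 0) acc.add(t);   // new maximum or a tie: keep it
            },
            (left, right) -> {
                if (left.isEmpty()) return right;
                if (right.isEmpty()) return left;
                int c = cmp.compare(left.get(0), right.get(0));
                if (c < 0) return right;
                if (c == 0) left.addAll(right);
                return left;
            });
    }

    static List<Trace> maxConfidence(List<Trace> allTraces) {
        return allTraces.stream()
                .collect(allMax(Comparator.comparingInt(Trace::getConfidence)));
    }
}
```

With that in place the call site becomes a single collect(...) call, and an empty input simply yields an empty list.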

You can also do the same thing with immutable data structures using reduce() instead of collect(), but working with the standard Java lists in an immutable way is a bit of a pain.
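For illustration, a reduce()-based variant might look roughly like the following (not benchmarked). Trace is again a hypothetical stand-in for IPTraceData, and merge is a helper name made up here. No list is modified after it is created, which is what reduce() needs, but every tie copies the accumulated list, so input with many ties can become quadratic.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class MaxConfidenceReduce {

    // Hypothetical stand-in for IPTraceData, reduced to two fields.
    static final class Trace {
        private final String factId;
        private final int confidence;

        Trace(String factId, int confidence) {
            this.factId = factId;
            this.confidence = confidence;
        }

        String getFactId() { return factId; }
        int getConfidence() { return confidence; }
    }

    static List<Trace> maxConfidence(List<Trace> allTraces) {
        return allTraces.stream().reduce(
            Collections.<Trace>emptyList(),                         // identity
            (acc, t) -> merge(acc, Collections.singletonList(t)),   // accumulator
            MaxConfidenceReduce::merge);                            // combiner
    }

    // Merges two lists that each hold only elements of equal confidence.
    // Never mutates its arguments; returns one of them or a fresh copy.
    static List<Trace> merge(List<Trace> left, List<Trace> right) {
        if (left.isEmpty()) return right;
        if (right.isEmpty()) return left;
        int l = left.get(0).getConfidence();
        int r = right.get(0).getConfidence();
        if (l > r) return left;
        if (l < r) return right;
        List<Trace> both = new ArrayList<>(left.size() + right.size());
        both.addAll(left);
        both.addAll(right);
        return Collections.unmodifiableList(both);
    }
}
```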

Note that I'm aware that the code is quite unwieldy and probably not worth it, but I'm purely answering your question without any assumptions about how and where you want to use it.
