Java Stream of nested lists - Filter main list depending on nested object list property

Time:12-21

I have a rather complex problem I'm trying to figure out. I have a list of objects, each of which holds a nested list of objects, and I need to filter the main list by a property of an object nested inside that list.

public class TimeOffSlot {
    private long id;
    private LocalDate slotDate;
    private LocalTime slotTime;
    private Collection<PersonTimeOffSlot> personTimeOffList; 
}

public class PersonTimeOffSlot {
    private long timeOffId;
    private TimeOffCategory timeOffCategory;
    private Person person;
}

public class Person {
    private long personId;
    private String name;
}

There could be multiple TimeOffSlots (one for each hour), but I only want to show one TimeOffSlot per person per day. For example, if Person A has multiple TimeOffSlots selected for Monday, I only want to grab the first instance and disregard the rest. Hope this makes sense.

Collection<TimeOffSlot> filteredTimeOffSlots = timeOffSlots.stream()
    .filter(slot -> slot.getSlotDate().equals(LocalDate.now()))
    .collect(Collectors.toCollection(ArrayList::new));
            
filteredTimeOffSlots = filteredTimeOffSlots.stream()
    .filter(slot -> slot.getPersonTimeOffList().stream()
        .filter(distinctByKey(s -> s.getPerson().getPersonId())))
    .collect(Collectors.toCollection(ArrayList::new));

private static <T> Predicate<T> distinctByKey(
        Function<? super T, ?> keyExtractor) {

    Map<Object, Boolean> seen = new ConcurrentHashMap<>();
    return t -> seen.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null;
}

CodePudding user response:

To keep one TimeOffSlot per person, you can group the slots with the person's ID as the key. Here is how you can do it:

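Collection<TimeOffSlot> filteredTimeOffSlots = timeOffSlots.stream()
    .filter(slot -> slot.getSlotDate().equals(LocalDate.now()))
    .collect(Collectors.toCollection(ArrayList::new));

// PersonTimeOffSlot has no reference back to its TimeOffSlot,
// so pair each person ID with the slot it came from (Map.entry needs Java 9+)
Map<Long, TimeOffSlot> timeOffSlotsByPerson = filteredTimeOffSlots.stream()
    .flatMap(slot -> slot.getPersonTimeOffList().stream()
        .map(s -> Map.entry(s.getPerson().getPersonId(), slot)))
    .collect(Collectors.groupingBy(Map.Entry::getKey,
        Collectors.mapping(Map.Entry::getValue, Collectors.toList())))
    .entrySet().stream()
    // keep only the first slot encountered for each person
    .collect(Collectors.toMap(Map.Entry::getKey, entry -> entry.getValue().get(0)));

// a slot shared by several people appears once per person, so remove duplicates
filteredTimeOffSlots = timeOffSlotsByPerson.values().stream()
    .distinct()
    .collect(Collectors.toCollection(ArrayList::new));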
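To try this idea on sample data, here is a minimal self-contained sketch. The records are trimmed-down stand-ins for the classes in the question, and the FirstSlotPerPersonDemo class, the firstSlotPerPerson method and the sample values are made up for illustration (records need Java 16+). Instead of grouping and then taking element 0, it uses Collectors.toMap with a keep-first merge function, which should give the same result in one step.

```java
import java.time.LocalDate;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical, trimmed-down stand-ins for the classes in the question.
record Person(long personId, String name) {}
record PersonTimeOffSlot(long timeOffId, Person person) {}
record TimeOffSlot(long id, LocalDate slotDate, List<PersonTimeOffSlot> personTimeOffList) {}

class FirstSlotPerPersonDemo {

    // Returns the first slot (in encounter order) of each person on the given date.
    static List<TimeOffSlot> firstSlotPerPerson(List<TimeOffSlot> slots, LocalDate date) {
        Map<Long, TimeOffSlot> firstByPerson = slots.stream()
            .filter(slot -> slot.slotDate().equals(date))
            // pair every person ID with the slot it belongs to
            .flatMap(slot -> slot.personTimeOffList().stream()
                .map(p -> Map.entry(p.person().personId(), slot)))
            .collect(Collectors.toMap(
                Map.Entry::getKey,
                Map.Entry::getValue,
                (first, second) -> first,   // keep the earliest slot per person
                LinkedHashMap::new));       // preserve encounter order

        // A slot shared by several people appears once per person, so de-duplicate.
        return firstByPerson.values().stream().distinct().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        LocalDate day = LocalDate.of(2022, 12, 19);
        Person a = new Person(1, "A");
        Person b = new Person(2, "B");
        List<TimeOffSlot> slots = List.of(
            new TimeOffSlot(10, day, List.of(new PersonTimeOffSlot(100, a))),
            new TimeOffSlot(11, day, List.of(new PersonTimeOffSlot(101, a))),
            new TimeOffSlot(12, day, List.of(new PersonTimeOffSlot(102, b))));

        // prints 10, then 12
        firstSlotPerPerson(slots, day).forEach(s -> System.out.println(s.id()));
    }
}
```

Person A has two slots on the same day (10 and 11), and only the first one survives. The date is passed in as a parameter rather than calling LocalDate.now() inside the stream, which also makes the method easier to test.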

CodePudding user response:

If I understood the problem correctly, the resulting list should contain only TimeOffSlot objects whose associated Person objects are not present in any other TimeOffSlot in that list.

To discard the TimeOffSlot objects associated with repeated person ids, the data can be collected into an intermediate HashMap in which the person id serves as the key.

And since, while processing each TimeOffSlot, we need to check whether any of its person ids was encountered before, it makes sense to introduce a custom accumulation type wrapped around a Map. It would be responsible for performing this check and would contain all the other logic required to create a custom collector, so that the stream itself stays lean and readable.

For that, we can define a custom Collector using the static factory method Collector.of().

Here is how it might look:

List<TimeOffSlot> timeOffSlots = // initializing the list
        
LocalDate now = LocalDate.now(); // no need to invoke now() multiple times from the stream
        
List<TimeOffSlot> filteredTimeOffSlots = timeOffSlots.stream()
    .filter(slot -> slot.getSlotDate().equals(now))
    .collect(Collector.of(
        SlotAccumulator::new,     // supplier - is meant to provide an instance of the mutable container
        SlotAccumulator::accept,  // accumulator - changes the state of the mutable container
        SlotAccumulator::merge,   // combiner - merges the partial results produced during the parallel execution
        SlotAccumulator::getSlots // finisher - function performing the final transformation
    ));

The custom accumulation type SlotAccumulator might look like this. For convenience, I've implemented the Consumer interface, so that the method reference SlotAccumulator::accept can be used as the accumulator of the Collector.

public class SlotAccumulator implements Consumer<TimeOffSlot> {
    private Map<Long, TimeOffSlot> slotById = new HashMap<>();
    
    @Override
    public void accept(TimeOffSlot slot) {
        if (notSeen(slot)) {            // if none of the contained ids was encountered before
            slot.getPersonTimeOffList()
                .forEach(personSlot -> slotById.put(personSlot.getPerson().getPersonId(), slot));
        }
    }
    
    public SlotAccumulator merge(SlotAccumulator other) {
        other.slotById.forEach((id, slot) -> slotById.putIfAbsent(id, slot));
        return this;
    }
    
    public List<TimeOffSlot> getSlots() {
        return slotById.values().stream().distinct().toList(); // Java 16+; on earlier versions use collect(Collectors.toList())
    }
    
    private boolean notSeen(TimeOffSlot slot) {
        
        return slot.getPersonTimeOffList().stream()
            .map(PersonTimeOffSlot::getPerson)
            .map(Person::getPersonId)
            .noneMatch(slotById::containsKey);
    }
}

If you don't like the idea of introducing a separate class, it is possible to plug all the logic right into the stream (but the approach shown above is easier to reuse and maintain).

For that, we can use the three-argument version of collect() to build an intermediate Map, then create a stream over the Map's values and collect the elements into a list.

List<TimeOffSlot> filteredTimeOffSlots1 = timeOffSlots.stream()
    .filter(slot -> slot.getSlotDate().equals(now))
    .collect(
        HashMap::new,
        (Map<Long, TimeOffSlot> slotById, TimeOffSlot slot) -> {
            boolean idsNotSeen = slot.getPersonTimeOffList().stream()
                .map(PersonTimeOffSlot::getPerson)
                .map(Person::getPersonId)
                .noneMatch(slotById::containsKey);
            if (idsNotSeen) slot.getPersonTimeOffList()
                .forEach(personSlot -> slotById.put(personSlot.getPerson().getPersonId(), slot));
        },
        (left, right) -> right.forEach(left::putIfAbsent)
    )
    .values().stream()
    .distinct() // a slot with several people is stored once per person id
    .toList();

Note:

The API documentation discourages the use of stateful functions in a stream. An approach that avoids accumulating state outside the stream pipeline is preferable.
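As a small illustration of that note, here is a sketch of a generic keep-the-first-element-per-key helper that holds all of its state inside the collector, in contrast to a stateful predicate such as distinctByKey from the question. The FirstByKey class, the firstByKey method and the sample words are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class FirstByKey {

    // Keeps the first element seen for each key.
    // The map is created and filled by the collector, so no lambda touches outside state.
    static <T, K> List<T> firstByKey(Collection<T> items,
                                     Function<? super T, ? extends K> keyExtractor) {
        Map<K, T> firstPerKey = items.stream()
            .collect(Collectors.toMap(
                keyExtractor,
                Function.identity(),
                (first, second) -> first,   // on a key clash keep the earlier element
                LinkedHashMap::new));       // preserve encounter order
        return new ArrayList<>(firstPerKey.values());
    }

    public static void main(String[] args) {
        List<String> words = List.of("apple", "avocado", "banana", "blueberry", "cherry");
        // prints [apple, banana, cherry]
        System.out.println(firstByKey(words, w -> w.charAt(0)));
    }
}
```

Because the merge function decides which element wins, the result does not depend on a shared map being updated from inside a filter, which is the pattern the documentation warns about.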
