Home > Net >  How to fetch 3 objects having the highest values from a List with Stream API
How to fetch 3 objects having the highest values from a List with Stream API

Time:04-30

I have a method like this:

public String mostExpensiveItems() {
        List<Entry> myList = getList();
        List<Double> expensive = myList.stream()
                .map(Entry::getAmount)
                .sorted(Comparator.reverseOrder())
                .limit(3)
                .toList();

        return "";
    }

This method needs to return the product IDs of the 3 most expensive items as a string like this:

"item1, item2, item3"

I should be able to use only streams and I got stuck here. I should be able to sort the items by value then get the product IDs, but I can't seem to make it work.

Entry class

public class Entry {

    private String productId;
    private LocalDate date;
    private String state;
    private String category;
    private Double amount;

    public Entry(LocalDate orderDate, String state, String productId, String category, Double sales) {
        this.date = orderDate;
        this.productId = productId;
        this.state = state;
        this.category = category;
        this.amount = sales;
    }

    public String getProductId() {
        return productId;
    }

CodePudding user response:

Assuming product ID is inside Entry, it can be something like this.

public String mostExpensiveItems() {
    List<Entry> myList = getList();
    List<String> expensive = myList.stream()
            .sorted(Comparator.comparing(Entry::getAmount).reversed())
            .limit(3)
            .map(Entry::getProductID)
            .toList();

    return "";
}

NB: I didn't test this out yet, but this should be able to convey the idea.

CodePudding user response:

You don't to sort all the given data for this task. Because sorting is overkill, when only need to fetch 3 largest values.

Because sorting the hole data set will cost O(n log n) time. Meanwhile, this task can be done in a single pass through the list, maintaining only 3 largest previously encountered values in the sorted order. And time complexity will be very close to a linear time.

To implement the partial sorting with streams, you can define a custom collector (an object that is responsible for accumulating the data from the stream).

You can create a custom collector either inline by using one of the versions of the static method Collector.of() or by creating a class that implements the Collector interface.

These are parameters that you need to provide while defining a custom collector:

  • Supplier Supplier<A> is meant to provide a mutable container which store elements of the stream. In this case because we need to perform a partial sorting, PriorityQueue will be handy for that purpose as a mutable container.
  • Accumulator BiConsumer<A,T> defines how to add elements into the container provided by the supplier. For this task, the accumulator needs to guarantee that queue will not exceed the given size by rejecting values that are smaller than the lowest value previously added to the queue and by removing the lowest value if the size has reached the limit a new value needs to be added.
  • Combiner BinaryOperator<A> combiner() establishes a rule on how to merge two containers obtained while executing stream in parallel. Here combiner rely on the same logic that was described for accumulator.
  • Finisher Function<A,R> is meant to produce the final result by transforming the mutable container. The finisher function in the code below turns the queue into an immutable list.
  • Characteristics allow to provide additional information, for instance Collector.Characteristics.UNORDERED which is used in this case denotes that the order in which partial results of the reduction produced while executing in parallel is not significant, which can improve performance of this collector with parallel streams.

Note that with Collector.of() only supplier, accumulator and combiner are mandatory, other parameters are defined if needed.

The method below that generates a collector would be more reusable if we apply generic type parameter to it and declare it to expect a comparator as a parameter (will be used in the constructor of the PriorityQueue and while adding elements to the queue).

Custom collector:

public static <T> Collector<T, ?, List<T>> getMaxN(int size, Comparator<T> comparator) {
    
    return Collector.of(
        () -> new PriorityQueue<>(comparator),
        (Queue<T> queue, T next) -> tryAdd(queue, next, comparator, size),
        (Queue<T> left, Queue<T> right) -> {
            right.forEach(next -> tryAdd(left, next, comparator, size));
            return left;
        },
        (Queue<T> queue) -> queue.stream().toList(),
        Collector.Characteristics.UNORDERED);
}

public static <T> void tryAdd(Queue<T> queue, T next, Comparator<T> comparator, int size) {
    if (queue.size() == size && comparator.compare(queue.element(), next) < 0) queue.remove(); // if next value is greater than the smallest element in the queue and max size has been exceeded the smallest element needs to be removed from the queue
    if (queue.size() < size) queue.add(next);
}

Stream:

public static <T> String getMostExpensive(List<T> list, Function<T, String> function,
                                          Comparator<T> comparator, int limit) {
    
    return list.stream().parallel()
        .collect(getMaxN(limit, comparator))
        .stream()
        .map(function)
        .collect(Collectors.joining(", "));
}

main() - demo with dummy Entries that expects only amount as a parameter.

public static void main(String[] args) {
    List<Entry> entries =
        List.of(new Entry("item1", 2.6), new Entry("item2", 3.5), new Entry("item3", 5.7),
                new Entry("item4", 1.9), new Entry("item5", 3.2), new Entry("item6", 9.5),
                new Entry("item7", 7.2), new Entry("item8", 8.1), new Entry("item9", 7.9));

    System.out.println(getMostExpensive(entries, Entry::getProductId,
                                        Comparator.comparingDouble(Entry::getAmount), 3));
}

Output

[item9, item6, item8] // final result is not sorted PriorityQueue maintains the elements in unsorted order (sorting happens only while dequeue operation happens), if these values are requeted to be sorted it could be done by changing the finisher function
  • Related