Remove Objects from a List based on their ID and collect Objects that have been removed to another L

Time:06-21

I want to remove from a list of Employee objects (list1) every object whose id is not present in another list of Employee objects (list2), and add the removed objects to a third list (list3), using Java 8.

Example :

List<Employee> list1 = Stream.of(
                        new Employee("100","Boston","Massachusetts"),
                        new Employee("400","Atlanta","Georgia"),
                        new Employee("300","pleasanton","California"),
                        new Employee("200","Decatur","Texas"),
                        new Employee("500","Cumming","Atlanta"),
                        new Employee("98","sula","Maine"),
                        new Employee("156","Duluth","Ohio"))
                .collect(Collectors.toList());

From the above list, I need to remove Employee objects based on the ids in the list below.

List<Employee> list2 = Stream.of(
                        new Employee("100","Boston","Massachusetts"),
                        new Employee("800","pleasanton","California"),
                        new Employee("400","Atlanta","Georgia"),
                        new Employee("10","Decatur","Texas"),
                        new Employee("500","Cumming","Atlanta"),
                        new Employee("50","sula","Maine"),
                        new Employee("156","Duluth","Ohio"))
                .collect(Collectors.toList());

Expected Output : list1 and list3

        List<Employee> list1 = Stream.of(
                        new Employee("100","Boston","Massachusetts"),
                        new Employee("400","Atlanta","Georgia"),
                        new Employee("500","Cumming","Atlanta"),
                        new Employee("156","Duluth","Ohio"))
                .collect(Collectors.toList());

        List<Employee> list3 = Stream.of(
                        new Employee("300","pleasanton","California"),
                        new Employee("200","Decatur","Texas"),
                        new Employee("98","sula","Maine")
                        )
                .collect(Collectors.toList());

I tried the approach below, but it does not work as expected:

        List<Employee> list3 = new ArrayList<>();
        if(CollectionUtils.isNotEmpty(list1) && CollectionUtils.isNotEmpty(list2)){
            list2.stream().forEachOrdered( l2 -> {
                Optional<Employee> nonMatch = list1.stream().filter(l1 -> !l1.getId().equalsIgnoreCase(l2.getId())).findAny();
                if(nonMatch.isPresent()){
                    list3.add(nonMatch.get());
                    list1.removeIf(l1 -> l1.getId().equalsIgnoreCase(nonMatch.get().getId()));
                }
            });
        }

        System.out.println(list1);
        System.out.println(list3);

CodePudding user response:

Here are two possible solutions.

The first is short and concise, but it does not actually remove elements from list1; instead it uses a partitioning collector to create the two lists. Think of the partitioning collector as a two-way filter: if the predicate is fulfilled, the element is collected into one list; if not, into the other. The predicate in our case is "does list2 contain an employee with the same ID as the stream element from list1?". To reduce the overhead, the code collects the IDs from list2 up front.

        final List<String> list2Ids = list2.stream()
                .map(Employee::getId)
                .collect(Collectors.toList());

        Map<Boolean, List<Employee>> partitioned = list1.stream()
                .collect(Collectors.partitioningBy(e -> list2Ids.contains(e.getId())));

        list1 = partitioned.get(true);
        List<Employee> list3 = partitioned.get(false);
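
For anyone who wants to run the partitioning idea as-is, here is a minimal self-contained sketch. It makes two assumptions that are not in the snippet above: the Employee class is a hypothetical stand-in with a (id, city, state) constructor and a getId() getter, and the IDs are kept in a HashSet rather than a List so that each contains() check is a hash lookup. The class and method names (PartitionById, partitionByPresence, ids) are made up for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

final class PartitionById {

    // Hypothetical stand-in for the Employee class from the question.
    static final class Employee {
        private final String id;
        private final String city;
        private final String state;

        Employee(String id, String city, String state) {
            this.id = id;
            this.city = city;
            this.state = state;
        }

        String getId() {
            return id;
        }
    }

    // true  -> employees of `source` whose id also occurs in `reference`
    // false -> employees of `source` whose id does not occur in `reference`
    static Map<Boolean, List<Employee>> partitionByPresence(List<Employee> source,
                                                            List<Employee> reference) {
        // Assumption: a HashSet is used for the ids instead of a List.
        Set<String> referenceIds = reference.stream()
                .map(Employee::getId)
                .collect(Collectors.toCollection(HashSet::new));

        return source.stream()
                .collect(Collectors.partitioningBy(e -> referenceIds.contains(e.getId())));
    }

    // Helper for printing: extracts the ids in list order.
    static List<String> ids(List<Employee> employees) {
        return employees.stream().map(Employee::getId).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Employee> list1 = Arrays.asList(
                new Employee("100", "Boston", "Massachusetts"),
                new Employee("400", "Atlanta", "Georgia"),
                new Employee("300", "pleasanton", "California"),
                new Employee("200", "Decatur", "Texas"),
                new Employee("500", "Cumming", "Atlanta"),
                new Employee("98", "sula", "Maine"),
                new Employee("156", "Duluth", "Ohio"));
        List<Employee> list2 = Arrays.asList(
                new Employee("100", "Boston", "Massachusetts"),
                new Employee("800", "pleasanton", "California"),
                new Employee("400", "Atlanta", "Georgia"),
                new Employee("10", "Decatur", "Texas"),
                new Employee("500", "Cumming", "Atlanta"),
                new Employee("50", "sula", "Maine"),
                new Employee("156", "Duluth", "Ohio"));

        Map<Boolean, List<Employee>> partitioned = partitionByPresence(list1, list2);

        System.out.println(ids(partitioned.get(true)));  // [100, 400, 500, 156]
        System.out.println(ids(partitioned.get(false))); // [300, 200, 98]
    }
}
```

The partitioning collector always produces both the true and the false entry, so neither get() call returns null even when one side is empty.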

If you need to keep list1 (e.g. for memory footprint reasons) and really have to remove the elements from it, I'd say you will have to stick to the old-fashioned iterator, because an iterator lets you remove elements from a collection while iterating over it. The next sample does exactly this. Note that I again prepared a list of the IDs in list2 up front.

        final List<String> list2Ids = list2.stream()
                .map(Employee::getId)
                .collect(Collectors.toList());

        final List<Employee> list3 = new LinkedList<>();

        for (Iterator<Employee> iterator = list1.iterator(); iterator.hasNext();) {
            Employee next = iterator.next();

            if (!list2Ids.contains(next.getId())) {
                list3.add(next);
                iterator.remove();
            }
        }
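
The iterator loop above can be wrapped into a small runnable sketch as follows. The Employee class is again a hypothetical stand-in, the names RemoveInPlace, removeMissing and ids are made up for illustration, and the IDs are held in a HashSet (an assumption, the sample above uses a List). One thing worth noting: iterator.remove() only works on a modifiable list, and Collectors.toList() makes no guarantee about mutability, so the sketch copies the data into an ArrayList explicitly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

final class RemoveInPlace {

    // Hypothetical stand-in for the Employee class from the question.
    static final class Employee {
        private final String id;
        private final String city;
        private final String state;

        Employee(String id, String city, String state) {
            this.id = id;
            this.city = city;
            this.state = state;
        }

        String getId() {
            return id;
        }
    }

    // Removes from `target` every employee whose id is missing from `reference`
    // and returns the removed employees in their original order.
    static List<Employee> removeMissing(List<Employee> target, List<Employee> reference) {
        Set<String> referenceIds = new HashSet<>();
        for (Employee e : reference) {
            referenceIds.add(e.getId());
        }

        List<Employee> removed = new ArrayList<>();
        for (Iterator<Employee> it = target.iterator(); it.hasNext();) {
            Employee current = it.next();
            if (!referenceIds.contains(current.getId())) {
                removed.add(current);
                it.remove(); // safe removal while iterating
            }
        }
        return removed;
    }

    // Helper for printing: extracts the ids in list order.
    static List<String> ids(List<Employee> employees) {
        List<String> result = new ArrayList<>();
        for (Employee e : employees) {
            result.add(e.getId());
        }
        return result;
    }

    public static void main(String[] args) {
        // Copy into an ArrayList so that removal is guaranteed to be supported.
        List<Employee> list1 = new ArrayList<>(Arrays.asList(
                new Employee("100", "Boston", "Massachusetts"),
                new Employee("400", "Atlanta", "Georgia"),
                new Employee("300", "pleasanton", "California"),
                new Employee("200", "Decatur", "Texas"),
                new Employee("500", "Cumming", "Atlanta"),
                new Employee("98", "sula", "Maine"),
                new Employee("156", "Duluth", "Ohio")));
        List<Employee> list2 = Arrays.asList(
                new Employee("100", "Boston", "Massachusetts"),
                new Employee("800", "pleasanton", "California"),
                new Employee("400", "Atlanta", "Georgia"),
                new Employee("10", "Decatur", "Texas"),
                new Employee("500", "Cumming", "Atlanta"),
                new Employee("50", "sula", "Maine"),
                new Employee("156", "Duluth", "Ohio"));

        List<Employee> list3 = removeMissing(list1, list2);

        System.out.println(ids(list1)); // [100, 400, 500, 156]
        System.out.println(ids(list3)); // [300, 200, 98]
    }
}
```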

CodePudding user response:

Issues with your current code:

  • As a rule of thumb: every time you find yourself changing something outside the stream from within forEach, most likely something is going wrong (have a look at the API documentation). You can invoke the forEach() method on any type of collection without creating a stream, but keep in mind that using a multiline lambda inside forEach() doesn't bring you any advantage over a plain for loop.
  • It's highly advisable to give your variables clear, self-explanatory names, like emplDepartmentX, emplOldDB, etc. Names like list1 make it more difficult to reason about the code.

You can address this problem in the following steps (time complexity of each step is linear):

  • Create a Set of the ids contained in list2 (reminder: the time complexity of the contains check on a hash-based Set is O(1)).
  • Generate the list of Employee (denoted as nonCommonId in the code) from list1 that have no id in common with the Employee objects in list2, by checking every id from list1 against the Set obtained in the previous step.
  • Removing these employees from the list one at a time causes additional performance overhead, because each removal has linear time complexity. A better option is to use removeAll().
  • Apply removeAll() on list1 to discard all the employees that are present in the list obtained in the previous step.

The overall time complexity is O(n + m), where n and m are the numbers of elements in list1 and list2.

Set<String> idSet = list2.stream()
    .map(Employee::id)
    .collect(Collectors.toSet());
            
List<Employee> nonCommonId = list1.stream()
    .filter(emp ->  !idSet.contains(emp.id()))
    .collect(Collectors.toList());
            
// `nonCommonId` is wrapped in a HashSet to improve performance, because `removeAll()` relies on `contains()` checks
list1.removeAll(new HashSet<>(nonCommonId));
// displaying results
System.out.println("List1:");
list1.forEach(System.out::println);
            
System.out.println("nonCommonId:");
nonCommonId.forEach(System.out::println);

Output:

List1:
Employee[id=100, city=Boston, state=Massachusetts]
Employee[id=400, city=Atlanta, state=Georgia]
Employee[id=500, city=Cumming, state=Atlanta]
Employee[id=156, city=Duluth, state=Ohio]
nonCommonId:
Employee[id=300, city=pleasanton, state=California]
Employee[id=200, city=Decatur, state=Texas]
Employee[id=98, city=sula, state=Maine]
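
The snippet above calls Employee::id and prints Employee[id=...], which suggests Employee is declared as a record there, a feature that arrived after Java 8. Since the question asks for Java 8, here is a hedged sketch of the same steps with a plain class and a getId() getter. The Employee class and the names RemoveAllById, removeNonCommon and ids are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

final class RemoveAllById {

    // Hypothetical Java 8 stand-in for the Employee type.
    static final class Employee {
        private final String id;
        private final String city;
        private final String state;

        Employee(String id, String city, String state) {
            this.id = id;
            this.city = city;
            this.state = state;
        }

        String getId() {
            return id;
        }
    }

    // Removes from `target` every employee whose id does not occur in `reference`
    // and returns the removed employees in their original order.
    static List<Employee> removeNonCommon(List<Employee> target, List<Employee> reference) {
        // Step 1: ids present in the reference list.
        Set<String> idSet = reference.stream()
                .map(Employee::getId)
                .collect(Collectors.toSet());

        // Step 2: employees of `target` whose id is not in that set.
        List<Employee> nonCommonId = target.stream()
                .filter(emp -> !idSet.contains(emp.getId()))
                .collect(Collectors.toList());

        // Step 3: bulk removal. This Employee does not override equals/hashCode,
        // so the HashSet matches by identity; that is sufficient here because
        // nonCommonId holds the very same instances as `target`.
        target.removeAll(new HashSet<>(nonCommonId));
        return nonCommonId;
    }

    // Helper for printing: extracts the ids in list order.
    static List<String> ids(List<Employee> employees) {
        return employees.stream().map(Employee::getId).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // ArrayList copy so that removeAll() is guaranteed to be supported.
        List<Employee> list1 = new ArrayList<>(Arrays.asList(
                new Employee("100", "Boston", "Massachusetts"),
                new Employee("400", "Atlanta", "Georgia"),
                new Employee("300", "pleasanton", "California"),
                new Employee("200", "Decatur", "Texas"),
                new Employee("500", "Cumming", "Atlanta"),
                new Employee("98", "sula", "Maine"),
                new Employee("156", "Duluth", "Ohio")));
        List<Employee> list2 = Arrays.asList(
                new Employee("100", "Boston", "Massachusetts"),
                new Employee("800", "pleasanton", "California"),
                new Employee("400", "Atlanta", "Georgia"),
                new Employee("10", "Decatur", "Texas"),
                new Employee("500", "Cumming", "Atlanta"),
                new Employee("50", "sula", "Maine"),
                new Employee("156", "Duluth", "Ohio"));

        List<Employee> nonCommonId = removeNonCommon(list1, list2);

        System.out.println(ids(list1));       // [100, 400, 500, 156]
        System.out.println(ids(nonCommonId)); // [300, 200, 98]
    }
}
```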

