Home > Mobile >  Counting the amount of object elements in one ArrayList occur in another ArrayList
Counting the amount of object elements in one ArrayList occur in another ArrayList

Time:10-15

I have two ArrayLists that hold TCP Flow objects from a pcap file. One list is a reference list that holds each unique packet found (no doubles, just every unique packet found). The other list, is the list of Flow objects that have had their flow marked as completed in my program.

My goal: Using the reference list, count the number of times, or frequency, a packet appears in the completed flow list.

The TCP Flow Object is defined as follows:

public class Flow {
    String destIp;
    String sourceIp;
    String destPort;
    String srcPort;
    double arrivalTime;
    int completed;
}

Each TCP flow object can be identified using destIp, sourceIp, destPort, and srcPort.

I previously found I can use the frequency method in Collections, but I have to use 4 different tuples to identify a row and not just one. My original plan was to create a nested for loop that inspected a packet from the reference list, then inspected each packet in the completed flow list, such as:

for(Flow referenceList : refList) {
    for(Flow compList : cList) {
        if tuples match the one in cList add to count
    }
}

would there be an easier way or more efficient way to accomplish this?

CodePudding user response:

The time it'll take is N*M where N and M are the sizes of those two lists. That gets pricey, fast. Two lists each of 10k entries and this will start taking really, really long.

Assuming these lists are not tiny, your data types are messed up. If one of them is a Set instead, this is simply O(M), as in, then it'd only take about 10k steps, much better. The one you make only once and which is all unique sounds like it should be a Set.

Whichever way you go here, you must have an actual hashCode() and equals() method on this Flow class. You can search the web for tutorials on how to write them, or get the IDE to make them for you, or use Project Lombok to make them for you.

CodePudding user response:

You can do it with O(n) time using Map and save intermediate count to it. No need to use for twice.

public static class Flow {

    String destIp;
    String sourceIp;
    String destPort;
    String srcPort;
    double arrivalTime;
    int completed;

}

public static void main(String... args) {
    Function<Flow, String> getKey = flow ->
            String.format("%s|%s|%s|%s",
                    flow.sourceIp, flow.srcPort, flow.destIp, flow.destPort);
    List<Flow> refList = List.of();
    List<Flow> cList = List.of();
    Map<String, Long> histogram = histogram(refList, cList, getKey);
}

public static Map<String, Long> histogram(List<Flow> refList,
        List<Flow> cList, Function<Flow, String> getKey) {
    Map<String, Long> map =
            refList.stream()
                   .map(getKey)
                   .collect(Collectors.groupingBy(Function.identity(),
                           Collectors.counting()));
    
    Map<String, Long> res = new HashMap<>();

    cList.stream()
         .map(getKey)
         .filter(map::containsKey)
         .forEach(key -> res.put(key, map.get(key)   1));

    return map;
}
  • Related