How can I write a lambda expression that reads two different columns in a CSV file?

For this particular problem, "countRespondentsByAgeRange" receives a stream of strings (the lines of a CSV file) and returns a map with the count of respondents who watched any of the movies, grouped by age range ("18-29", "30-44", "45-60" or "> 60").

In this CSV file, the first two lines are column headers (so they basically don't matter). The column that records whether the respondent saw the movies ("Yes" or "No") is column 2, and the column that holds the age range is column 31.

I need to know how to filter the stream so that only respondents who answered "Yes" are kept, and then return the count of those respondents for each age range.

I only know how to copy one column into a List and look through that, but since you can't reuse a stream, I don't know how to get the information from both columns. My attempt returns a map of all the age ranges and how many respondents fall into each, but I don't know how to filter it to count only the ones who answered "Yes". My attempt:

public static final Function<Stream<String>, Map<String, Long>> countRespondentsWhoHaveWatchedAnyOfTheSixMoviesByAgeRange = a -> {
    // list of the age ranges from participants
    List<String> strToList = a.map(s -> s.split(",")[30]).collect(Collectors.toList());
    List<String> x = strToList.stream().filter(s -> s.startsWith("1")).collect(Collectors.toList());
    List<String> y = strToList.stream().filter(s -> s.startsWith("3")).collect(Collectors.toList());
    List<String> z = strToList.stream().filter(s -> s.startsWith("4")).collect(Collectors.toList());
    List<String> old = strToList.stream().filter(s -> s.startsWith(">")).collect(Collectors.toList());
    Long xVal = (long) x.size();
    Long yVal = (long) y.size();
    Long zVal = (long) z.size();
    Long oVal = (long) old.size();
    return Map.of("18-29", xVal, "30-44", yVal, "45-60", zVal, "> 60", oVal);
};

Does anyone know a shorter version of this lambda expression that just filters the stream and returns this kind of map?

CodePudding user response:

Your function iterates over the data five times (once to build the list and once for each of the four filters) and allocates several intermediate lists in memory along the way. All of that is redundant.

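Your solution can be improved in terms of both performance and readability if, instead of applying this multiline function, you process the data directly in the source stream: split each line once, keep only the rows whose second column is "Yes", and group the remaining rows by their age range.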

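        Map<String, Long> countByAgeRange = getRespondentsData() // source of the stream
                .skip(2)                                // the first two lines are column headers
                .map(s -> s.split(","))                 // split each line once, keeping all columns
                .filter(arr -> arr[1].startsWith("Y"))  // column 2: "Yes" or "No"
                .map(arr -> arr[30])                    // column 31: age range
                .collect(Collectors.groupingBy(Function.identity(),
                                               Collectors.counting()));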
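
For completeness, here is a minimal self-contained sketch of the same pipeline wrapped in the Function<Stream<String>, Map<String, Long>> shape from the question. The class name AgeRangeCounts, the constants, and the row helper are hypothetical and exist only for this demo. It assumes the layout described in the question (two header lines, the "Yes"/"No" answer in column 2, the age range in column 31) and that no field contains a quoted comma; a real CSV parser would be safer if that does not hold. Passing -1 to split keeps trailing empty fields, so short or partly blank rows can be skipped instead of throwing an ArrayIndexOutOfBoundsException.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class AgeRangeCounts {
    // Assumed layout, taken from the question (indexes are zero-based).
    static final int HEADER_LINES = 2;
    static final int WATCHED_COL = 1;   // column 2: "Yes" or "No"
    static final int AGE_COL = 30;      // column 31: age range

    static final Function<Stream<String>, Map<String, Long>> countRespondentsByAgeRange =
            lines -> lines
                    .skip(HEADER_LINES)
                    .map(line -> line.split(",", -1))            // -1 keeps trailing empty fields
                    .filter(cols -> cols.length > AGE_COL)       // ignore rows that are too short
                    .filter(cols -> cols[WATCHED_COL].equals("Yes"))
                    .map(cols -> cols[AGE_COL])
                    .filter(age -> !age.isEmpty())               // ignore respondents with no age range
                    .collect(Collectors.groupingBy(Function.identity(),
                                                   Collectors.counting()));

    // Hypothetical helper: builds a 31-column CSV line for the demo data.
    static String row(String watched, String ageRange) {
        String[] cols = new String[AGE_COL + 1];
        Arrays.fill(cols, "x");
        cols[WATCHED_COL] = watched;
        cols[AGE_COL] = ageRange;
        return String.join(",", cols);
    }

    public static void main(String[] args) {
        Stream<String> lines = Stream.of(
                row("header", "header"), row("header", "header"),
                row("Yes", "18-29"), row("No", "18-29"),
                row("Yes", "> 60"), row("Yes", "18-29"));
        // Two "Yes" respondents aged 18-29 and one aged > 60; the "No" row is not counted.
        System.out.println(countRespondentsByAgeRange.apply(lines));
    }
}
```

Note that groupingBy only creates keys for age ranges that actually occur among the "Yes" rows, so a range with no matching respondents is absent from the map rather than mapped to 0, unlike the Map.of version in the question.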