R: How to subset a data frame by the unique values in a column and apply a function to each subset?

Time:08-31

I want to write a loop that subsets the data by each unique value in the Category column, so that I get separate data frames where `Category == "A"`, `Category == "B"`, ..., `Category == "F"`:

data <- structure(list(Category = c("A", "A", "B", "B", "C", "C", "C", 
"D", "E", "E", "E", "C", "A", "B", "B", "F", "F"), Sales = c(34L, 
32L, 34L, 56L, 32L, 66L, 6L, 55L, 66L, 56L, 3L, 43L, 56L, 65L, 
34L, 65L, 43L), Year = c(2015L, 2015L, 2015L, 2015L, 2016L, 2016L, 
2016L, 2016L, 2017L, 2017L, 2017L, 2017L, 2018L, 2018L, 2018L, 
2018L, 2018L)), class = "data.frame", row.names = c(NA, -17L))

   Category Sales Year
1         A    34 2015
2         A    32 2015
3         B    34 2015
4         B    56 2015
5         C    32 2016
6         C    66 2016
7         C     6 2016
8         D    55 2016
9         E    66 2017
10        E    56 2017
11        E     3 2017
12        C    43 2017
13        A    56 2018
14        B    65 2018
15        B    34 2018
16        F    65 2018
17        F    43 2018

Then I want to pass each of these data frames to a function, for example:

divide <- function(df) {
  # Refer to the column through df; a bare `Sales` is not defined inside the function
  df$Sales <- (df$Sales - 32) * 5 / 9
  return(df)
}

data <- divide(data)

After looping over all of these data frames, I want to combine them again using rbind().

P.S. All of this needs to be automated: I can't subset the data frames manually for every unique value in the Category column, so I need to use the unique() function.

CodePudding user response:

You can:

  • Use split to split your data by the unique values of Category.
  • Use lapply to apply a function to each resulting data frame (here, I created a new variable, sumYear).
  • Use do.call with rbind to combine the pieces back into a single data frame:

split(data, data$Category) |>
  lapply(function(x) transform(x, sumYear = sum(Year))) |>
  do.call(rbind.data.frame, args = _)
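
Since the question asks specifically for a loop over unique(), the same three steps could also be sketched as below. This is only an illustration: it assumes the question's divide() is meant to transform df$Sales, and the names categories, pieces and ctg are made up here. It also avoids the `_` placeholder, which needs R 4.2 or later.

```r
# Sample data from the question
data <- data.frame(
  Category = c("A", "A", "B", "B", "C", "C", "C", "D", "E", "E", "E",
               "C", "A", "B", "B", "F", "F"),
  Sales = c(34L, 32L, 34L, 56L, 32L, 66L, 6L, 55L, 66L, 56L, 3L, 43L,
            56L, 65L, 34L, 65L, 43L),
  Year = c(2015L, 2015L, 2015L, 2015L, 2016L, 2016L, 2016L, 2016L, 2017L,
           2017L, 2017L, 2017L, 2018L, 2018L, 2018L, 2018L, 2018L)
)

# Assumed version of the question's function, operating on df$Sales
divide <- function(df) {
  df$Sales <- (df$Sales - 32) * 5 / 9
  df
}

# One list slot per unique category
categories <- unique(data$Category)
pieces <- vector("list", length(categories))
names(pieces) <- categories

for (ctg in categories) {
  subset_df <- data[data$Category == ctg, ]  # rows for this category only
  pieces[[ctg]] <- divide(subset_df)         # process and store the piece
}

# Stack the processed pieces back into one data frame
result <- do.call(rbind, pieces)
```

Note that the rows of result come back grouped by category, not in their original order.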

Note that this is not the most straightforward way to apply a function by group. You could use dplyr::group_by, or ave in base R, instead; see e.g. the answers to this question: Run a custom function on a data frame in R, by group.
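
As a rough sketch of the ave route for the sumYear example (base R only, no extra packages), something like the following should give the same per-category sums while leaving the rows in their original order:

```r
# Sample data from the question
data <- data.frame(
  Category = c("A", "A", "B", "B", "C", "C", "C", "D", "E", "E", "E",
               "C", "A", "B", "B", "F", "F"),
  Sales = c(34L, 32L, 34L, 56L, 32L, 66L, 6L, 55L, 66L, 56L, 3L, 43L,
            56L, 65L, 34L, 65L, 43L),
  Year = c(2015L, 2015L, 2015L, 2015L, 2016L, 2016L, 2016L, 2016L, 2017L,
           2017L, 2017L, 2017L, 2018L, 2018L, 2018L, 2018L, 2018L)
)

# ave() applies FUN within each Category group and returns a vector
# aligned with the original rows, so no split/rbind step is needed.
# FUN must be passed by name, because unnamed arguments are treated
# as grouping variables.
data$sumYear <- ave(data$Year, data$Category, FUN = sum)
```

Here every row of category D, for instance, gets the sum of Year over the D rows only.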

CodePudding user response:

Isn't this as simple as:

data$Sales <- (data$Sales - 32) * 5 / 9

If you want separate data frames:

DFs <- split(data, data$Category)

but I don't see the need, because you are applying the same operation to every category.
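
To check that claim, here is a small sketch comparing the one-line version with the split-and-recombine version. It assumes a divide() that transforms df$Sales (the question's version would need that change), and it uses unsplit() to put the pieces back in the original row order; the names vectorised, pieces and regrouped are just for illustration.

```r
# Sample data from the question
data <- data.frame(
  Category = c("A", "A", "B", "B", "C", "C", "C", "D", "E", "E", "E",
               "C", "A", "B", "B", "F", "F"),
  Sales = c(34L, 32L, 34L, 56L, 32L, 66L, 6L, 55L, 66L, 56L, 3L, 43L,
            56L, 65L, 34L, 65L, 43L),
  Year = c(2015L, 2015L, 2015L, 2015L, 2016L, 2016L, 2016L, 2016L, 2017L,
           2017L, 2017L, 2017L, 2018L, 2018L, 2018L, 2018L, 2018L)
)

# Assumed version of the question's function, operating on df$Sales
divide <- function(df) {
  df$Sales <- (df$Sales - 32) * 5 / 9
  df
}

# Vectorised: one operation on the whole column at once
vectorised <- divide(data)

# Split by category, apply to each piece, then reverse the split
pieces <- lapply(split(data, data$Category), divide)
regrouped <- unsplit(pieces, data$Category)

# Compare the two Sales columns
all.equal(vectorised$Sales, regrouped$Sales)
```

If the two approaches agree, all.equal returns TRUE, which is what you would expect whenever the function treats each row independently of its group.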
