How to write a for loop to combine several data frames made with the forward pipe operator in R?

Time:02-20

I need to build a new data frame, but I don't know how to use a for loop to reduce the repetition.

This is my original data frame:

ID t1 t2 t4 t5 t6
1 S B 11 1 1
1 S B 11 2 0
1 S B 12 3 1
1 S B 12 4 1
1 S B 13 5 0
1 S B 14 6 1
1 S B 14 7 1
1 S B 15 8 0
2 S B 11 1 1
2 S B 12 2 1
2 S B 13 3 1
2 S B 14 4 0
2 S B 15 5 1
3 S G 11 1 1
3 S G 12 2 1
3 S G 12 3 0
3 S G 13 4 0
3 S G 14 5 1
3 S G 15 6 1
4 S G 11 1 1
4 S G 12 2 0
4 S G 13 3 0
4 S G 14 4 1
4 S G 15 5 0
5 N B 11 1 1
5 N B 12 2 1
5 N B 13 3 1
6 N B 11 1 1
6 N B 12 2 1
6 N B 13 3 1
6 N B 13 4 1
6 N B 14 5 0
6 N B 15 6 1
7 N G 11 1 0
7 N G 12 2 1
8 N G 11 1 0
8 N G 11 2 1
8 N G 11 3 0
8 N G 12 4 1
8 N G 12 5 0
8 N G 13 6 1
8 N G 13 7 1
8 N G 13 8 1
8 N G 14 9 1
8 N G 14 10 0
8 N G 15 11 1
8 N G 15 12 1
8 N G 15 13 0
8 N G 15 14 0

The following is the code I have written to extract my new data frames:

t=levels(as.factor(df$t4))

df11<- df %>%
  filter(t4==11) %>%
  group_by(ID) %>%
  mutate(num=seq_along(ID)) %>% 
  as.data.frame

df.11.new<- df11 %>%
  group_by(t2, num) %>%
  summarise(mean=mean(t6), count=n())

df.11.new$t7="d11"

I need to repeat this code for every level of t4 ("11", "12", "13", "14" and "15")

and finally combine the results, as in the following code:

df.all<-rbind(df.11.new, df.12.new, df.13.new, df.14.new, df.15.new)

But I don't know how to write the for loop.

Answer:

Instead of filtering on each level, add 't4' as a grouping variable. Every level is then summarised in a single pipeline, so there is no need to filter repeatedly in a loop and rbind the outputs afterwards.

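library(stringr)
library(dplyr)
df.all <- df %>% 
   # number the rows within each ID for every level of t4
   group_by(ID, t4) %>% 
   mutate(num = row_number()) %>%
   # summarise per level of t4, per t2 and per row position
   group_by(t4, t2, num) %>%
   summarise(mean = mean(t6), count = n(), 
      t7 = str_c('d', first(t4)), .groups = 'drop')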

Checking against the OP's output for t4 = 11:

> df.all %>% 
   filter(t4 == 11)
# A tibble: 5 × 6
     t4 t2      num  mean count t7   
  <int> <chr> <int> <dbl> <int> <chr>
1    11 B         1   1       4 d11  
2    11 B         2   0       1 d11  
3    11 G         1   0.5     4 d11  
4    11 G         2   1       1 d11  
5    11 G         3   0       1 d11  
> df.11.new
# A tibble: 5 × 4
# Groups:   t2 [2]
  t2      num  mean count
  <chr> <int> <dbl> <int>
1 B         1   1       4
2 B         2   0       1
3 G         1   0.5     4
4 G         2   1       1
5 G         3   0       1
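
If a literal loop is still preferred, the per-level pipeline from the question can be wrapped in a function and called once per level. The following is only a sketch under some assumptions: `summarise_level`, `lvls` and `pieces` are hypothetical names that do not appear in the original post, and the data frame holds just a few rows copied from the question (IDs 1 and 3) so that the block runs on its own.

```r
library(dplyr)

# A few rows copied from the question's data (IDs 1 and 3 only).
df <- read.table(header = TRUE, text = "
ID t1 t2 t4 t5 t6
1 S B 11 1 1
1 S B 11 2 0
1 S B 12 3 1
1 S B 12 4 1
3 S G 11 1 1
3 S G 12 2 1
3 S G 12 3 0
")

# Hypothetical helper: the question's two pipelines for one level of t4,
# with the level passed in as an argument instead of being hard-coded.
summarise_level <- function(data, level) {
  data %>%
    filter(t4 == level) %>%
    group_by(ID) %>%
    mutate(num = row_number()) %>%
    group_by(t2, num) %>%
    summarise(mean = mean(t6), count = n(), .groups = "drop") %>%
    mutate(t7 = paste0("d", level))
}

# Store one result per level in a pre-allocated list, then bind once at the end.
lvls <- sort(unique(df$t4))
pieces <- vector("list", length(lvls))
for (i in seq_along(lvls)) {
  pieces[[i]] <- summarise_level(df, lvls[i])
}
df.all <- bind_rows(pieces)
```

Collecting the pieces in a list and binding them once avoids growing a data frame with rbind inside the loop. The grouped pipeline remains the shorter option, since it needs neither the helper nor the loop.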

With rowid() from data.table, the row number can be computed directly inside group_by(), so the separate group_by(ID, t4) and mutate() step is no longer needed:

library(data.table)
df %>% 
  group_by(t4, t2, num = rowid(ID, t4)) %>% 
  summarise(mean = mean(t6), count = n(), 
     t7 = str_c('d', first(t4)), .groups = 'drop')
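
To see what rowid() contributes here, a minimal sketch on two made-up vectors may help. The names `id`, `t4`, `within_id` and `within_id_t4` are illustrative only and are not taken from the post.

```r
library(data.table)

# Illustrative vectors, not the question's data.
id <- c(1, 1, 2, 1, 2)
t4 <- c(11, 11, 11, 12, 12)

# rowid() numbers the rows within each distinct combination of its arguments.
within_id    <- rowid(id)       # 1 2 1 3 2
within_id_t4 <- rowid(id, t4)   # 1 2 1 1 1
```

This is the same numbering that group_by(ID, t4) followed by mutate(num = row_number()) produces, which is why the first grouping step can be dropped.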