R: Creating a new row for each group, with values being the difference of existing entries in the gr

Time:07-08

Region Age Student Type Values
A 17 Any 32
A 17 Full time 24
A 18 Any 27
A 18 Full time 19
B 17 Any 22
B 17 Full time 14
B 18 Any 80
B 18 Full time 75

I am working with this dataset in R. I would like to create a new row for each region and age, with Student Type set to "Part time" and Values equal to the "Any" value minus the "Full time" value. It seems I could use lag for this, but I would prefer to state explicitly that the calculation is "Any" - "Full time": this dataset is well organized, but other datasets may have the entries in reverse order.

Ideally the result would look something like

Region Age Student Type Values
A 17 Any 32
A 17 Full time 24
A 17 Part time 8
A 18 Any 27
A 18 Full time 19
A 18 Part time 8
B 17 Any 22
B 17 Full time 14
B 17 Part time 8
B 18 Any 80
B 18 Full time 75
B 18 Part time 5

Thank you!

CodePudding user response:

You may try

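library(dplyr)

# Assumes each (Region, Age) group has exactly two rows: "Any" and "Full time".
# abs(diff(Values)) ignores row order, but it would also hide a negative difference.
df %>%
  group_by(Region, Age) %>%
  summarize(Student.Type = "Part time",
            Values = abs(diff(Values)),
            .groups = "drop") %>%
  bind_rows(df) %>%
  arrange(Region, Age, Student.Type)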

   Region   Age Student.Type Values
   <chr>  <int> <chr>         <int>
 1 A         17 Any              32
 2 A         17 Full time        24
 3 A         17 Part time         8
 4 A         18 Any              27
 5 A         18 Full time        19
 6 A         18 Part time         8
 7 B         17 Any              22
 8 B         17 Full time        14
 9 B         17 Part time         8
10 B         18 Any              80
11 B         18 Full time        75
12 B         18 Part time         5
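
If you want the subtraction spelled out as "Any" - "Full time" regardless of row order, as the question asks, one option is to index Values by Student.Type. This is only a minimal sketch: the data frame below is rebuilt from the question with one pair of rows deliberately swapped, the column name Student.Type is assumed, and each group is assumed to contain exactly one "Any" row and one "Full time" row.

```r
library(dplyr)

# Data from the question; the first two rows are swapped on purpose
# to show that the result does not depend on row order.
df <- data.frame(
  Region = c("A", "A", "A", "A", "B", "B", "B", "B"),
  Age = c(17L, 17L, 18L, 18L, 17L, 17L, 18L, 18L),
  Student.Type = c("Full time", "Any", "Any", "Full time",
                   "Any", "Full time", "Any", "Full time"),
  Values = c(24L, 32L, 27L, 19L, 22L, 14L, 80L, 75L)
)

part_time <- df %>%
  group_by(Region, Age) %>%
  summarise(
    # Compute Values first: it must see the original Student.Type column,
    # before Student.Type is overwritten on the next line.
    Values = Values[Student.Type == "Any"] - Values[Student.Type == "Full time"],
    Student.Type = "Part time",
    .groups = "drop"
  )

# bind_rows() matches columns by name, so the column order of part_time is irrelevant.
result <- bind_rows(df, part_time) %>%
  arrange(Region, Age, Student.Type)

print(result)
```

Unlike abs(diff(Values)), this keeps the sign, so a group where "Full time" exceeds "Any" shows up as a negative value instead of being silently flipped.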

CodePudding user response:

With dplyr, you could use group_modify() together with add_row().

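library(dplyr)

df %>%
  group_by(Region, Age) %>%
  group_modify(~ {
    .x %>%
      # -diff(Values) is Values[1] - Values[2], i.e. "Any" - "Full time"
      # only when "Any" is listed first within each group
      summarise(StudentType = "Part time", Values = -diff(Values)) %>%
      add_row(.x, .)
  }) %>%
  ungroup()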

# # A tibble: 12 × 4
#    Region   Age StudentType    Values
#    <chr>  <int> <chr>           <int>
#  1 A         17 Any                32
#  2 A         17 Full time          24
#  3 A         17 Part time           8
#  4 A         18 Any                27
#  5 A         18 Full time          19
#  6 A         18 Part time           8
#  7 B         17 Any                22
#  8 B         17 Full time          14
#  9 B         17 Part time           8
# 10 B         18 Any                80
# 11 B         18 Full time          75
# 12 B         18 Part time           5
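
Since -diff(Values) depends on "Any" coming before "Full time", another way to make the subtraction explicit is to reshape with tidyr, so that each student type becomes its own column. The following is a sketch under assumptions rather than a tested drop-in: it rebuilds the question's data with the column name StudentType used above, and assumes every (Region, Age) pair has exactly one row per student type.

```r
library(dplyr)
library(tidyr)

# Data from the question, with the column named StudentType as in the code above.
df <- data.frame(
  Region = c("A", "A", "A", "A", "B", "B", "B", "B"),
  Age = c(17L, 17L, 18L, 18L, 17L, 17L, 18L, 18L),
  StudentType = rep(c("Any", "Full time"), times = 4),
  Values = c(32L, 24L, 27L, 19L, 22L, 14L, 80L, 75L)
)

result <- df %>%
  # One row per (Region, Age), with columns `Any` and `Full time`
  pivot_wider(names_from = StudentType, values_from = Values) %>%
  # The subtraction is now written out by column name
  mutate(`Part time` = Any - `Full time`) %>%
  # Back to the long layout, three rows per (Region, Age)
  pivot_longer(
    cols = c(Any, `Full time`, `Part time`),
    names_to = "StudentType",
    values_to = "Values"
  )

print(result)
```

Here the row order within a group has no effect, because the values are looked up by column name after pivot_wider().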