How to compute with values in a group of a dataframe by another value in said frame

Time:10-06

I have a data frame like so:

Topic1        Topic2        Val1        Val2
Fruit         A               12        9
Fruit         B               10        7
Fruit         Final           16        9
Shirt         X               40        1
Shirt         Y               10        10
Shirt         A               30        3
Shirt         Final           100       20

Within each Topic1 group, I need to divide each Val1 value by that group's "Final" Val1 value (the row where Topic2 is "Final"). The end result would look something like this:

Topic1        Topic2        Val1        Val2      Calc
Fruit         A               12        9         .75
Fruit         B               10        7         .44
Fruit         Final           16        9         1
Shirt         X               40        1         .4
Shirt         Y               10        10        .1
Shirt         A               30        3         .3
Shirt         Final           100       20         1

How can I do that?

CodePudding user response:

To divide each Val1 value by the "Final" Val1 value within its Topic1 group, you can use dplyr:

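library(dplyr)

dat |>
    group_by(Topic1) |>
    # Val1[Topic2 == "Final"] is the group's "Final" value (assumes exactly one such row per group)
    mutate(Calc = Val1 / Val1[Topic2 == "Final"])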

# # A tibble: 7 x 5
# # Groups:   Topic1 [2]
#   Topic1 Topic2  Val1  Val2  Calc
#   <chr>  <chr>  <int> <int> <dbl>
# 1 Fruit  A         12     9 0.75 
# 2 Fruit  B         10     7 0.625
# 3 Fruit  Final     16     9 1
# 4 Shirt  X         40     1 0.4
# 5 Shirt  Y         10    10 0.1
# 6 Shirt  A         30     3 0.3
# 7 Shirt  Final    100    20 1

This does not quite match your expected output: row 2, for example, is 0.625 rather than 0.44. However, Val1 in that row is 10 and the "Final" Val1 value for Fruit is 16, so 10 / 16 = 0.625 seems correct. Let me know if I have misunderstood what you are asking.
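As a cross-check on those numbers, here is a minimal base R sketch of the same calculation without dplyr. The names `is_final` and `final_val1` are made up for illustration, and the sketch assumes each Topic1 group has exactly one "Final" row.

```r
# Same sample data as in the question
dat <- data.frame(
  Topic1 = c("Fruit", "Fruit", "Fruit", "Shirt", "Shirt", "Shirt", "Shirt"),
  Topic2 = c("A", "B", "Final", "X", "Y", "A", "Final"),
  Val1   = c(12L, 10L, 16L, 40L, 10L, 30L, 100L),
  Val2   = c(9L, 7L, 9L, 1L, 10L, 3L, 20L)
)

# Named lookup vector: the "Final" Val1 for each Topic1
# (assumption: exactly one "Final" row per group)
is_final <- dat$Topic2 == "Final"
final_val1 <- setNames(dat$Val1[is_final], as.character(dat$Topic1[is_final]))

# Divide each row's Val1 by the "Final" Val1 of its own group
dat$Calc <- dat$Val1 / unname(final_val1[as.character(dat$Topic1)])
dat
```

Looking up the denominator by group name keeps the division explicit, which makes it easy to confirm by eye that row 2 is 10 / 16.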

Data

dat <- structure(list(
    Topic1 = c("Fruit", "Fruit", "Fruit", "Shirt", "Shirt", "Shirt", "Shirt"),
    Topic2 = c("A", "B", "Final", "X", "Y", "A", "Final"),
    Val1 = c(12L, 10L, 16L, 40L, 10L, 30L, 100L),
    Val2 = c(9L, 7L, 9L, 1L, 10L, 3L, 20L)
), class = "data.frame", row.names = c(NA, -7L))
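
The one-liner in the answer relies on every Topic1 group having exactly one "Final" row, which holds for the data above. If that might not hold in your real data, a more defensive variant could look like the sketch below. The helper `add_calc` and the `demo` data frame are hypothetical, not part of the original answer, and returning NA for such groups is just one possible choice.

```r
library(dplyr)

# Hypothetical helper: adds Calc, giving NA for any group that does not
# contain exactly one "Final" row
add_calc <- function(df) {
  df |>
    group_by(Topic1) |>
    mutate(
      Calc = if (sum(Topic2 == "Final") == 1) {
        Val1 / Val1[Topic2 == "Final"]
      } else {
        NA_real_
      }
    ) |>
    ungroup()
}

# Made-up example: the "Hat" group has no "Final" row
demo <- data.frame(
  Topic1 = c("Fruit", "Fruit", "Hat"),
  Topic2 = c("A", "Final", "A"),
  Val1   = c(12, 16, 5)
)
res <- add_calc(demo)
res
```

Calling `add_calc(dat)` on the data above should give the same Calc column as the answer, since every group there has a single "Final" row.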