Home > Software design >  Summing rows if shared value and adding up values of another column
Summing rows if shared value and adding up values of another column

Time:10-10

I'm an amateur user of R and I'm currently working data from an imaginary bet. I have two questions about these data:

  1. How can I sum up row values, in case they share the same Date and Participants values. For example, the first two rows are on the same date "11-06-2020" and the same Participant "Mike". I would like to just have an individual row which contains this information and which sums up the Points column values. This is, the first row will be 11-06-2020 | Mike | 2, the second row will be 11-06-2020 | Paul | 4, etc.
  2. My second question was about adding up values. This is, if the Participant Mike obtained 2 points from the Date 11-06-2020, this value will be summed up to the 12-06-2020. Therefore, the value of the Points column of the Participant Miguel on the Date 12-06-2020, will be 8 (2 6). Here's the data:
Date <- c("11-06-2020", "11-06-2020", "11-06-2020", "11-06-2020",
          "11-06-2020", "11-06-2020", "12-06-2020", "12-06-2020",
          "12-06-2020", "13-06-2020", "13-06-2020", "13-06-2020",
          "13-06-2020", "13-06-2020", "13-06-2020", "13-06-2020",
          "13-06-2020", "13-06-2020")

Participants <- c("Mike", "Mike", "Paul", "Paul", "Sarah", "Sarah",
                  "Mike", "Paul", "Sarah", "Mike", "Mike", "Mike",
                  "Paul", "Paul", "Paul", "Sarah", "Sarah", "Sarah")

Points <- c(0, 2, 2, 2, 4, 2, 6, 0, 2, 2, 10, 0, 4, 6, 0, 6, 4, 2)

bet <- data.frame(Date, Participants, Points)

# Output from the console:

Date Participants Points
1  11-06-2020         Mike      0
2  11-06-2020         Mike      2
3  11-06-2020         Paul      2
4  11-06-2020         Paul      2
5  11-06-2020        Sarah      4
6  11-06-2020        Sarah      2
7  12-06-2020         Mike      6
8  12-06-2020         Paul      0
9  12-06-2020        Sarah      2
10 13-06-2020         Mike      2
11 13-06-2020         Mike     10
12 13-06-2020         Mike      0
13 13-06-2020         Paul      4
14 13-06-2020         Paul      6
15 13-06-2020         Paul      0
16 13-06-2020        Sarah      6
17 13-06-2020        Sarah      4
18 13-06-2020        Sarah      2

CodePudding user response:

Using dplyr:

Question 1:

bet2 <- count(bet, Date, Participants, wt = Points, name = "Points")
bet2

        Date Participants Points
1 11-06-2020         Mike      2
2 11-06-2020         Paul      4
3 11-06-2020        Sarah      6
4 12-06-2020         Mike      6
5 12-06-2020         Paul      0
6 12-06-2020        Sarah      2
7 13-06-2020         Mike     12
8 13-06-2020         Paul     10
9 13-06-2020        Sarah     12

Question 2:

bet2 |> 
  group_by(Participants) |> 
  mutate(CumuPoints = cumsum(Points)) |>
  ungroup()

  Date       Participants Points CumuPoints
1 11-06-2020 Mike              2          2
2 11-06-2020 Paul              4          4
3 11-06-2020 Sarah             6          6
4 12-06-2020 Mike              6          8
5 12-06-2020 Paul              0          4
6 12-06-2020 Sarah             2          8
7 13-06-2020 Mike             12         20
8 13-06-2020 Paul             10         14
9 13-06-2020 Sarah            12         20
  • Related