Home > Software design >  Sum of column based on a condition a R
Sum of column based on a condition a R

Time:11-08

I would like to print out total amount for each date so that my new dataframe will have date and and total amount columns. My data frame looks like this

permitnum amount
6/1/2022 na
ascas 30.00
olic 40.41
6/2/2022 na
avrey 17.32
fev 32.18
grey 12.20

any advice on how to go about this will be appreciated

CodePudding user response:

Here is another tidyverse option, where I convert to date (and then reformat), then we can fill in the date, so that we can use that to group. Then, get the sum for each date.

library(tidyverse)

df %>% 
  mutate(permitnum = format(as.Date(permitnum, "%m/%d/%Y"), "%m/%d/%Y")) %>% 
  fill(permitnum, .direction = "down") %>% 
  group_by(permitnum) %>% 
  summarise(total_amount = sum(as.numeric(amount), na.rm = TRUE))

Output

  permitnum  total_amount
  <chr>             <dbl>
1 06/01/2022         70.4
2 06/02/2022         61.7

Data

df <- structure(list(permitnum = c("6/1/2022", "ascas", "olic", "6/2/2022", 
"avrey", "fev", "grey"), amount = c("na", "30.00", "40.41", "na", 
"17.32", "32.18", "12.20")), class = "data.frame", row.names = c(NA, 
-7L))

CodePudding user response:

Here is an option. Split the data by the date marked by a row with a number, then summarize the total in amount and combine the date and all rows.

library(tidyverse)

dat <- read_table("permitnum    amount
6/1/2022    na
ascas   30.00
olic    40.41
6/2/2022    na
avrey   17.32
fev 32.18
grey    12.20")

dat |>
  group_split(id = cumsum(grepl("\\d", permitnum))) |>
  map_dfr(\(x){
    date <- x$permitnum[[1]]
    x |>
      slice(-1) |>
      summarise(date = date,
                total_amount = sum(as.numeric(amount)))
  })
#> # A tibble: 2 x 2
#>   date     total_amount
#>   <chr>           <dbl>
#> 1 6/1/2022         70.4
#> 2 6/2/2022         61.7
  • Related