Add column with row number count of grouped column values while excluding NA's

Time:02-12

I am trying to add a column to a data frame that numbers the rows within each group, skipping NA values. I have grouped by Day and want row_num to count only the rows where test1 is not NA, restarting at 1 for each Day. Instead, I get a column that numbers the NA rows first and then the non-NA rows.

Code so far:

library(dplyr)

df <- data.frame(
  Day   = c("1", "1", "1", "1", "1", "2", "2", "2", "2", "2", "3", "3", "3", "3", "3"),
  test1 = c("1.2", NA, "3.2", NA, "7.2", "1.2", "2.5", "0.98", NA, "1.6", "1.2", NA, "4.1", "9.2", "0.78")
)

# Attempt: this numbers the NA rows first, which is not what I want
df <- df %>%
  group_by(Day) %>%
  mutate(row_num = row_number(!is.na(test1)))
print(df)

# A tibble: 15 x 3
# Groups:   Day [3]
   Day   test1 row_num
   <chr> <chr>       <int>
 1 1     1.2             3
 2 1     NA              1
 3 1     3.2             4
 4 1     NA              2
 5 1     7.2             5
 6 2     1.2             2
 7 2     2.5             3
 8 2     0.98            4
 9 2     NA              1
10 2     1.6             5
11 3     1.2             2
12 3     NA              1
13 3     4.1             3
14 3     9.2             4
15 3     0.78            5

It should be:

   Day   test1 row_num
   <chr> <chr>   <int>
 1 1     1.2         1
 2 1     NA         NA
 3 1     3.2         2
 4 1     NA         NA
 5 1     7.2         3
 6 2     1.2         1
 7 2     2.5         2
 8 2     0.98        3
 9 2     NA         NA
10 2     1.6         4
11 3     1.2         1
12 3     NA         NA
13 3     4.1         2
14 3     9.2         3
15 3     0.78        4

CodePudding user response:

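We can use replace() to write a running count into the non-NA positions only. Starting from an all-NA integer vector (rather than from test1, which is character) keeps row_num an integer column:

library(dplyr)
df %>%
  group_by(Day) %>%
  mutate(row_num = replace(rep(NA_integer_, n()), !is.na(test1),
                           seq_len(sum(!is.na(test1))))) %>%
  ungroup()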

Or with data.table, where rows excluded by the filter are left as NA in the new column:

library(data.table)
setDT(df)[!is.na(test1), row_num := seq_len(.N), by = Day]
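
If neither package is available, the same result can be sketched in base R. This is an illustration rather than part of the answer above: it rebuilds the df from the question, uses ave() to take a per-Day running count of the non-NA values, and then blanks out the positions where test1 is NA. The helper names not_na and running are made up for the example.

```r
# Same data as in the question (Day written with rep() for brevity)
df <- data.frame(
  Day   = rep(c("1", "2", "3"), each = 5),
  test1 = c("1.2", NA, "3.2", NA, "7.2",
            "1.2", "2.5", "0.98", NA, "1.6",
            "1.2", NA, "4.1", "9.2", "0.78")
)

# TRUE where test1 holds a value
not_na <- !is.na(df$test1)

# Within each Day, count how many non-NA values have been seen so far
running <- ave(as.integer(not_na), df$Day, FUN = cumsum)

# Keep the count only on non-NA rows; NA rows get NA
df$row_num <- ifelse(not_na, running, NA_integer_)
print(df)
```

The cumsum() approach relies on the rows already being in the order you want to count them in; it does not sort within the group.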

CodePudding user response:

df %>%
  group_by(Day) %>%
  mutate(row_num = row_number(NA^is.na(test1)))

   Day   test1 row_num
   <chr> <chr>   <int>
 1 1     1.2         1
 2 1     NA         NA
 3 1     3.2         2
 4 1     NA         NA
 5 1     7.2         3
 6 2     1.2         1
 7 2     2.5         2
 8 2     0.98        3
 9 2     NA         NA
10 2     1.6         4
11 3     1.2         1
12 3     NA         NA
13 3     4.1         2
14 3     9.2         3
15 3     0.78        4
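
To see why this works while the attempt in the question does not, here is a small base R sketch on the Day 1 values only. It assumes, as the dplyr documentation describes, that row_number(x) behaves like rank(x, ties.method = "first") with missing values left as NA; rank() is used here as a stand-in so the example needs no packages. The names x, naive, key and fixed are made up for the illustration.

```r
x <- c("1.2", NA, "3.2", NA, "7.2")

# The question's approach ranks a logical vector. FALSE sorts before TRUE,
# so the NA rows (FALSE) receive the lowest ranks.
naive <- rank(!is.na(x), ties.method = "first")
print(naive)   # 3 1 4 2 5

# NA^0 is 1 and NA^1 is NA, so the key is 1 where x has a value
# and NA where x is NA.
key <- NA^is.na(x)
print(key)     # 1 NA 1 NA 1

# Ranking the key breaks ties by position and keeps NA where the key is NA.
fixed <- rank(key, ties.method = "first", na.last = "keep")
print(fixed)   # 1 NA 2 NA 3
```

The values of naive match the row_num column the question reported for Day 1, and fixed matches the desired result.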