R - Summarize dataframe to avoid NAs

Time:01-30

Having a dataframe like:

id = c(1,1,1)
A = c(3,NA,NA)
B = c(NA,5,NA)
C= c(NA,NA,2)
df = data.frame(id,A,B,C)

  id  A  B  C
1  1  3 NA NA
2  1 NA  5 NA
3  1 NA NA  2

I want to summarize the whole dataframe into one row that contains no NAs. It should look like:

  id  A  B  C
1  1  3  5  2

It should also work when the dataframe is bigger and contains more ids, following the same logic (one row per id).

I didn't find the right function for that and tried some variations of summarise().

CodePudding user response:

You can group_by id and use max with na.rm = TRUE:

library(dplyr)
df %>% 
  group_by(id) %>% 
  # use a lambda: passing na.rm through across()'s ... is deprecated since dplyr 1.1.0
  summarise(across(everything(), ~ max(.x, na.rm = TRUE)))

     id     A     B     C
1     1     3     5     2

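If an id has several non-NA values in the same column, max keeps only the largest one, which may not be what you want; you can use sum instead to add them up. Note that for an id whose column is entirely NA, max with na.rm = TRUE returns -Inf with a warning.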
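
If neither max nor sum fits, another option is to keep the first non-NA value per id. The following is only a sketch under assumptions: the data frame df2 and the helper first_non_na are made-up names for illustration and do not come from the question. It also shows the "more ids" case, and leaves NA where an id has no value at all in a column.

```r
library(dplyr)

# Hypothetical example data: two ids, values spread over separate rows
df2 <- data.frame(
  id = c(1, 1, 1, 2, 2),
  A  = c(3, NA, NA, NA, 7),
  B  = c(NA, 5, NA, 4, NA),
  C  = c(NA, NA, 2, NA, NA)
)

# Hypothetical helper: first non-NA value of x, or NA if x has none
first_non_na <- function(x) x[!is.na(x)][1]

result <- df2 %>%
  group_by(id) %>%
  summarise(across(everything(), first_non_na))

print(result)
```

Here id 2 has no value in column C, so that cell stays NA instead of becoming -Inf as it would with max.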
