Filtering NAs by group

I have this data frame:

df <- data.frame(
    id = c(1L,1L,1L,2L,2L,3L),
    keyword = c("car","hospital",NA,"cat",NA,NA))

I would like to get this:

df <- data.frame(
    id = c(1L,1L,2L,3L),
    keyword = c("car","hospital","cat",NA))

For each id, I would like to keep the rows that have a keyword. If an id has no keyword at all, I would like to keep a single NA row for it.

I tried something like this:

df %>% group_by(id) %>% filter(!is.na(keyword) | keyword != " ")

CodePudding user response:

A possible solution: first remove all rows where keyword is NA, then join the distinct ids back in, so that any id that lost all of its rows reappears once (with NA in every other column):

library(dplyr)

df %>%
  filter(!is.na(keyword)) %>%              # drop rows with a missing keyword
  full_join(distinct(df, id), by = "id")   # restore ids that lost every row

Returns:

  id  keyword
1  1      car
2  1 hospital
3  2      cat
4  3     <NA>
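
One caveat, shown as a hedged sketch: full_join() appends the restored ids at the end of the result, so the original id order is not preserved in general. The example below uses a hypothetical data frame df2 (not from the question) in which id 2 has only NA keywords, and adds arrange(id) to put the rows back in id order.

```r
library(dplyr)

# Hypothetical data, not from the question: id 2 has only NA keywords
df2 <- data.frame(
  id = c(1L, 2L, 2L, 3L),
  keyword = c("car", NA, NA, "cat"))

res <- df2 %>%
  filter(!is.na(keyword)) %>%
  full_join(distinct(df2, id), by = "id") %>%  # id 2 is appended after id 3
  arrange(id)                                  # sort to restore the id order

res
```

Without the final arrange(id), the row for id 2 would come last. Whether that matters depends on what is done with the result afterwards.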

CodePudding user response:

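You can filter conditionally within each group: if every keyword in the group is NA, keep only the first row; otherwise keep the non-NA rows.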

library(dplyr)

df %>%
  group_by(id) %>%
  # all-NA groups keep their first row; other groups keep their non-NA rows
  filter(if (all(is.na(keyword))) row_number() == 1 else !is.na(keyword)) %>%
  ungroup()

#    id keyword 
#  <int> <chr>   
#1     1 car     
#2     1 hospital
#3     2 cat     
#4     3 NA
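
As for why the attempt in the question drops id 3 entirely: for a missing keyword, !is.na(keyword) is FALSE and keyword != " " is NA, so the combined condition is NA, and filter() only keeps rows where the condition is TRUE. A small sketch of that logic, using a throwaway vector kw introduced here purely for illustration:

```r
# Illustrative vector, not from the question: one keyword and one missing value
kw <- c("car", NA)

# Same condition as in the question's filter() call.
# "car": TRUE | TRUE  gives TRUE
# NA:    FALSE | NA   gives NA, which filter() treats as "drop this row"
cond <- !is.na(kw) | kw != " "
cond
```

So every NA row is removed regardless of group, which is why a per-group check such as all(is.na(keyword)) is needed.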