I have data columns like:
Weather1 Weather.2 Weather3 Sunny Sunny NA Cloudy NA some cloud Hazy Hazy Hazy Warm NA Warm NA NA NA
Here is a snippet of code to recreate the data:
example = data.frame(Weather1 = c('Sunny','Cloudy','Hazy','Warm','NA'),
Weather.2 = c('Sunny','NA','Hazy','NA','NA'),
Weather3 = c('NA','some cloud','Hazy','NA', 'NA'))
I want to combine the three variables like the following:
Weather1 Weather.2 Weather3 combine_weather Sunny Sunny NA Sunny Cloudy NA some cloud Cloudy Hazy Hazy Hazy Hazy Warm NA NA Warm NA NA NA NA
I want to use value from first variable if there is any mismatch.
CodePudding user response:
You can use coalesce
to fill in NA
horizontally with the first non-missing value.
Note I have changed your string "NA" into real NA
first.
There are a few ways to specify columns in "tidy-select" style:
- If your columns that need to be acted on all start with the string "Weather", you can do
select(., starts_with("Weather"))
. - If they are ordered next to each other, you can do
select(., Weather1:Weather3)
. - Or select them manually
select(., Weather1, Weather.2, Weather3)
library(dplyr)
example %>% mutate(across(everything(), ~ifelse(.x == "NA", NA, .x)),
combine_weather = coalesce(!!!select(., everything())))
Weather1 Weather.2 Weather3 combine_weather
1 Sunny Sunny <NA> Sunny
2 Cloudy <NA> some cloud Cloudy
3 Hazy Hazy Hazy Hazy
4 Warm <NA> <NA> Warm
5 <NA> <NA> <NA> NA
CodePudding user response:
Another way could be first define how your combine_weather
could be in pattern:
then extract
after unite
:
library(tidyverse)
pattern <- c("Sunny|Cloudy|Hazy|Warm")
example %>%
na_if("NA") %>%
unite(combine_weather, starts_with("Weather"), sep = " ", na.rm = TRUE, remove = FALSE) %>%
mutate(combine_weather = str_extract(combine_weather, pattern))
combine_weather Weather1 Weather.2 Weather3
1 Sunny Sunny Sunny <NA>
2 Cloudy Cloudy <NA> some cloud
3 Hazy Hazy Hazy Hazy
4 Warm Warm <NA> <NA>
5 <NA> <NA> <NA> <NA>