Home > Net >  Merge multiple categorical variables into one using value from another value in r
Merge multiple categorical variables into one using value from another value in r

Time:05-09

I have data columns like:

Weather1    Weather.2   Weather3        
Sunny       Sunny       NA              
Cloudy      NA          some cloud      
Hazy        Hazy        Hazy            
Warm        NA          Warm            
NA          NA          NA              

Here is a snippet of code to recreate the data:

example = data.frame(Weather1 = c('Sunny','Cloudy','Hazy','Warm','NA'),
                     Weather.2 = c('Sunny','NA','Hazy','NA','NA'),
                     Weather3 = c('NA','some cloud','Hazy','NA', 'NA'))

I want to combine the three variables like the following:

Weather1    Weather.2   Weather3        combine_weather
Sunny       Sunny       NA              Sunny
Cloudy      NA          some cloud      Cloudy
Hazy        Hazy        Hazy            Hazy
Warm        NA          NA              Warm
NA          NA          NA              NA

I want to use value from first variable if there is any mismatch.

CodePudding user response:

You can use coalesce to fill in NA horizontally with the first non-missing value.

Note I have changed your string "NA" into real NA first.

There are a few ways to specify columns in "tidy-select" style:

  1. If your columns that need to be acted on all start with the string "Weather", you can do select(., starts_with("Weather")).
  2. If they are ordered next to each other, you can do select(., Weather1:Weather3).
  3. Or select them manually select(., Weather1, Weather.2, Weather3)
library(dplyr)

example %>% mutate(across(everything(), ~ifelse(.x == "NA", NA, .x)),
                   combine_weather = coalesce(!!!select(., everything())))

  Weather1 Weather.2   Weather3 combine_weather
1    Sunny     Sunny       <NA>           Sunny
2   Cloudy      <NA> some cloud          Cloudy
3     Hazy      Hazy       Hazy            Hazy
4     Warm      <NA>       <NA>            Warm
5     <NA>      <NA>       <NA>              NA

CodePudding user response:

Another way could be first define how your combine_weather could be in pattern: then extract after unite:

library(tidyverse)

pattern <- c("Sunny|Cloudy|Hazy|Warm")

example %>% 
  na_if("NA") %>% 
  unite(combine_weather, starts_with("Weather"), sep = " ", na.rm = TRUE,  remove = FALSE) %>%
  mutate(combine_weather = str_extract(combine_weather, pattern))
  combine_weather Weather1 Weather.2   Weather3
1           Sunny    Sunny     Sunny       <NA>
2          Cloudy   Cloudy      <NA> some cloud
3            Hazy     Hazy      Hazy       Hazy
4            Warm     Warm      <NA>       <NA>
5            <NA>     <NA>      <NA>       <NA>
  •  Tags:  
  • r
  • Related