I am using a dataset that stores country-specific information in different variables, and I want to combine these into a single variable. It feels like this should be an easy exercise, but I cannot figure it out and I can't find any answers here.
Say the data looks like this:
df <- data.frame(country = c("BE", "BE", "BE", "NL", "NL", "NL"),
                 year = c(2010, 2010, 2010, 2010, 2010, 2010),
                 party_NL = c(NA, NA, NA, "A", "B", "B"),
                 party_BE = c("C", "D", "E", NA, NA, NA))
  country year party_NL party_BE
1      BE 2010     <NA>        C
2      BE 2010     <NA>        D
3      BE 2010     <NA>        E
4      NL 2010        A     <NA>
5      NL 2010        B     <NA>
6      NL 2010        B     <NA>
What I need is the following:
  country year party_NL party_BE party
1      BE 2010     <NA>        C     C
2      BE 2010     <NA>        D     D
3      BE 2010     <NA>        E     E
4      NL 2010        A     <NA>     A
5      NL 2010        B     <NA>     B
6      NL 2010        B     <NA>     B
I am guessing some kind of loop would have to be applied. Once again, it sounds so easy that I apologize in advance.
Thanks
CodePudding user response:
You can use coalesce() from dplyr, which returns the first non-missing value at each position:
library(dplyr)
df %>%
  mutate(party = coalesce(party_NL, party_BE))
Output:
  country year party_NL party_BE party
1      BE 2010     <NA>        C     C
2      BE 2010     <NA>        D     D
3      BE 2010     <NA>        E     E
4      NL 2010        A     <NA>     A
5      NL 2010        B     <NA>     B
6      NL 2010        B     <NA>     B
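If the real dataset has more countries than the two shown here, typing every column name into coalesce() gets tedious. A minimal sketch of one way to generalize, assuming every relevant column starts with the prefix party_ (the helper name party_cols is made up for illustration, and stringsAsFactors = FALSE is set so the columns are character on any R version):

```r
library(dplyr)

# Example data from the question, with strings kept as character
df <- data.frame(country = c("BE", "BE", "BE", "NL", "NL", "NL"),
                 year = c(2010, 2010, 2010, 2010, 2010, 2010),
                 party_NL = c(NA, NA, NA, "A", "B", "B"),
                 party_BE = c("C", "D", "E", NA, NA, NA),
                 stringsAsFactors = FALSE)

# Collect all columns whose names start with "party_" (assumed naming scheme)
party_cols <- grep("^party_", names(df), value = TRUE)

# Pass those columns to coalesce() as separate, unnamed arguments;
# for each row the first non-missing value wins
df$party <- do.call(coalesce, unname(as.list(df[party_cols])))

print(df)
```

This relies on each row having at most one non-missing party_ column, as in the example. If several were filled in, the leftmost one would be used.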
CodePudding user response:
Use mutate() from dplyr together with ifelse(); this will do the trick:
library(dplyr)
df %>%
  mutate(party = ifelse(is.na(party_NL), party_BE, party_NL))
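One caveat with ifelse(): if the party columns are factors (the default for data.frame() before R 4.0.0), ifelse() drops the factor attributes and returns the underlying integer codes instead of the labels. A hedged sketch of a workaround, which converts the columns to character first and then uses dplyr's stricter if_else(); stringsAsFactors = TRUE is set here only to reproduce the factor case:

```r
library(dplyr)

# Same example data, but with factor columns to mimic the pre-4.0 default
df <- data.frame(country = c("BE", "BE", "BE", "NL", "NL", "NL"),
                 year = c(2010, 2010, 2010, 2010, 2010, 2010),
                 party_NL = c(NA, NA, NA, "A", "B", "B"),
                 party_BE = c("C", "D", "E", NA, NA, NA),
                 stringsAsFactors = TRUE)

df <- df %>%
  mutate(
    # Convert the party columns to character so labels, not codes, are combined
    across(starts_with("party_"), as.character),
    # if_else() checks that both branches have compatible types
    party = if_else(is.na(party_NL), party_BE, party_NL)
  )

print(df)
```

With character columns the plain ifelse() version above gives the same result, so this only matters when factors are involved.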