I am trying to combine multiple columns in R into a single one.
My data looks like this:
age gender a b c d e f race insured
1 13 Female 0 0 0 0 0 0 white 0
2 12 Female 0 0 0 0 0 0 white 1
3 19 Male 0 0 0 0 0 1 other 0
4 19 Female 0 1 0 0 0 0 white 0
5 13 Female 1 1 0 0 0 1 white 0
This is what my desired output looks like:
age gender race insured value
1 13 Female white 0 none
2 12 Female white 1 none
3 19 Male other 0 f
4 19 Female white 0 b
5 13 Female white 0 a
5 13 Female white 0 b
5 13 Female white 0 f
I've tried using gather to create a long data frame but this doesn't add the data to the dataframe like i want. How can I do this with the dplyr or tidyverse package?
CodePudding user response:
A solution using tidyverse
. dat4
is the final output.
library(tidyverse)
dat2 <- dat %>%
mutate(ID = 1:n())
dat3 <- dat2 %>%
pivot_longer(a:f, names_to = "value", values_to = "number") %>%
filter(number == 1) %>%
select(-number)
dat4 <- dat2 %>%
left_join(dat3) %>%
select(-ID, -c(a:f)) %>%
replace_na(list(value = "none"))
dat4
# age gender race insured value
# 1 13 Female white 0 none
# 2 12 Female white 1 none
# 3 19 Male other 0 f
# 4 19 Female white 0 b
# 5 13 Female white 0 a
# 6 13 Female white 0 b
# 7 13 Female white 0 f
DATA
dat <- read.table(text = " age gender a b c d e f race insured
1 13 Female 0 0 0 0 0 0 white 0
2 12 Female 0 0 0 0 0 0 white 1
3 19 Male 0 0 0 0 0 1 other 0
4 19 Female 0 1 0 0 0 0 white 0
5 13 Female 1 1 0 0 0 1 white 0",
header = TRUE)
CodePudding user response:
One more way
df <- read.table(text = ' row_num age gender a b c d e f race insured
1 13 Female 0 0 0 0 0 0 white 0
2 12 Female 0 0 0 0 0 0 white 1
3 19 Male 0 0 0 0 0 1 other 0
4 19 Female 0 1 0 0 0 0 white 0
5 13 Female 1 1 0 0 0 1 white 0', header = T)
df
#> row_num age gender a b c d e f race insured
#> 1 1 13 Female 0 0 0 0 0 0 white 0
#> 2 2 12 Female 0 0 0 0 0 0 white 1
#> 3 3 19 Male 0 0 0 0 0 1 other 0
#> 4 4 19 Female 0 1 0 0 0 0 white 0
#> 5 5 13 Female 1 1 0 0 0 1 white 0
library(tidyverse)
nm <- c('a', 'b', 'c', 'd', 'e', 'f')
df %>% mutate(across(a:f, as.logical)) %>%
nest(value = c(a, b, c, d, e, f)) %>%
mutate(value = map(value, ~ ifelse(length(nm[unlist(.)]) == 0,
'none',
paste(unlist(nm[unlist(.)]), collapse = ',')
)
)) %>%
unnest(value) %>%
separate_rows(value, sep = ',')
#> # A tibble: 7 x 6
#> row_num age gender race insured value
#> <int> <int> <chr> <chr> <int> <chr>
#> 1 1 13 Female white 0 none
#> 2 2 12 Female white 1 none
#> 3 3 19 Male other 0 f
#> 4 4 19 Female white 0 b
#> 5 5 13 Female white 0 a
#> 6 5 13 Female white 0 b
#> 7 5 13 Female white 0 f
Created on 2021-11-16 by the reprex package (v2.0.0)
CodePudding user response:
(not allowed to comment, also being a fairly new user)
The information you give it quite minimum. If you just want to combine columns check out the merge command, using this command you need to bind the two (your df1 and the new Column1) by column name, first add an empty column with the right name to your dataframe and then merge them:
names(df1)[11]<- "Value"
New_df <- merge(df1,Column1, by = "Value")