I have a column in a dataframe holding subjects:
sub <- c("A", "A", "B", "C", "C", "C", "D", "E", "F", "F")
subjects <- data.frame(sub)
I have another data frame containing columns of subjects (where subjects are only found in one column):
one <- c("A", "C", "F")
two <- c("B", "D", NA)
three <- c("E", NA, NA)
newsubjects <- data.frame(one, two, three)
I want to rename the subjects in the first dataframe to the name of the column in the second dataframe that contains that subject.
So for example, I want the A, C, and F subjects in the first dataframe to be renamed 'one'. Doing this manually would take a long time, so I'm hoping there's a way to use the columns in the second data frame to do this.
I've tried a bunch of things with forcats::fct_recode and levels, but nothing works because I'm not using these functions correctly. For example, if I recall correctly, one of my attempts looked something like this:
subjects %>%
mutate(new_var = forcats::fct_recode(sub,
!!! setNames(as.character(subjects$sub), newsubjects$one)))
Which I know is completely wrong. Part of the problem is that it's difficult for me to articulate my problem in a way that returns relevant search results. Thank you for any help you can provide, I appreciate it.
CodePudding user response:
Using purrr::map(), derive a list pairing column names with values from newsubjects. Then unpack this inside forcats::fct_collapse() to recode values in subjects.
library(purrr)
library(forcats)
new_ids <- map(newsubjects, ~ .x[!is.na(.x)])
subjects$sub <- fct_collapse(subjects$sub, !!!new_ids)
subjects
sub
1 one
2 one
3 two
4 one
5 one
6 one
7 two
8 three
9 one
10 one
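Note that the code above overwrites subjects$sub and leaves it as a factor. If you would rather keep the original column and get a plain character column, a minimal sketch could look like this (the column name new_sub is just an illustrative choice, and stringsAsFactors = FALSE is set so it behaves the same on R versions before 4.0):

```r
library(purrr)
library(forcats)

# Example data from the question
sub <- c("A", "A", "B", "C", "C", "C", "D", "E", "F", "F")
subjects <- data.frame(sub, stringsAsFactors = FALSE)
newsubjects <- data.frame(
  one = c("A", "C", "F"),
  two = c("B", "D", NA),
  three = c("E", NA, NA),
  stringsAsFactors = FALSE
)

# Named list: each element holds the non-missing subjects of one column
new_ids <- map(newsubjects, ~ .x[!is.na(.x)])

# Collapse into the new groups, then convert the factor back to character
# and store it in a separate (hypothetically named) column
subjects$new_sub <- as.character(fct_collapse(subjects$sub, !!!new_ids))
```

This way subjects$sub stays untouched, which makes it easier to check the mapping afterwards.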
CodePudding user response:
If you reshape newsubjects longer, you could join the two tables:
library(tidyverse)
subjects %>%
left_join(newsubjects %>%
pivot_longer(everything(), names_to = "new_sub", values_to = "sub"))
Joining, by = "sub"
sub new_sub
1 A one
2 A one
3 B two
4 C one
5 C one
6 C one
7 D two
8 E three
9 F one
10 F one
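A slightly more explicit variant of the same idea, as a sketch: naming the join key with by = "sub" should avoid the "Joining, by" message, and values_drop_na = TRUE drops the NA padding from the lookup before joining. The intermediate names lookup and result are only for illustration.

```r
library(dplyr)
library(tidyr)

# Example data from the question
sub <- c("A", "A", "B", "C", "C", "C", "D", "E", "F", "F")
subjects <- data.frame(sub, stringsAsFactors = FALSE)
newsubjects <- data.frame(
  one = c("A", "C", "F"),
  two = c("B", "D", NA),
  three = c("E", NA, NA),
  stringsAsFactors = FALSE
)

# One row per (new_sub, sub) pair, without the NA padding
lookup <- newsubjects %>%
  pivot_longer(everything(),
               names_to = "new_sub",
               values_to = "sub",
               values_drop_na = TRUE)

# Keep every row of subjects and attach the matching column name
result <- subjects %>%
  left_join(lookup, by = "sub")
```

Because this is a left join, any subject missing from newsubjects would be kept with NA in new_sub rather than dropped.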
CodePudding user response:
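Provided that one, two and three all have the same length, you could also create a lookup table:
library(dplyr)
sub <- c("A", "A", "B", "C", "C", "C", "D", "E", "F", "F")
subjects <- data.frame(sub)
one <- c("A", "C", "F")
two <- c("B", "D", NA)
three <- c("E", NA, NA)
# Use a named list so the three groups stay separate; c() would flatten them into one vector
additions <- list(one = one, two = two, three = three)
lookup <- data.frame(
sub = unlist(additions, use.names = FALSE),
# Repeat each group name once per element (this relies on the equal lengths)
value = rep(names(additions), each = length(additions[[1]])))
# Join on the subject and keep only the new name
subjects %>% inner_join(lookup, by = "sub") %>% select(value)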
CodePudding user response:
In base R:
gsub("\\d", "", names(unlist(newsubjects))[match(subjects$sub, unlist(newsubjects))])
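The one-liner above relies on unlist() appending a row number to each column name (one1, one2, ...) and then stripping all digits, so it would also mangle column names that contain digits themselves. A sketch of a base R alternative that builds a named lookup vector instead (the names values, groups, keep, lookup and new_sub are only illustrative):

```r
# Example data from the question
sub <- c("A", "A", "B", "C", "C", "C", "D", "E", "F", "F")
subjects <- data.frame(sub, stringsAsFactors = FALSE)
newsubjects <- data.frame(
  one = c("A", "C", "F"),
  two = c("B", "D", NA),
  three = c("E", NA, NA),
  stringsAsFactors = FALSE
)

# Flatten the columns one after another, and record which column each value came from
values <- unlist(newsubjects, use.names = FALSE)
groups <- rep(names(newsubjects), each = nrow(newsubjects))

# Drop the NA padding, then build a named vector: subject -> column name
keep <- !is.na(values)
lookup <- setNames(groups[keep], values[keep])

# Index the lookup by subject; unname() removes the leftover subject names
subjects$new_sub <- unname(lookup[subjects$sub])
```

Subjects that do not appear anywhere in newsubjects would end up as NA in new_sub.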