I am looking to form a final data frame by iterating over two data frames.
The first df looks like this
employee_name <- c("rob", "peter")
employee_attribute1 <- c("10", "5")
employee_attribute2 <- c("5", "5")
employee_df <- data.frame(employee_name, employee_attribute1,employee_attribute2)
the second one looks like this. Employees can and do belong to multiple groups
employee_group <- c("1", "2","3" )
employee_attributes <- c("employee_attribute2", "employee_attribute2","employee_attribute1" )
group_att_mapping_df <- data.frame(employee_group,employee_attributes)
I would like to create a data frame like this. The rating condition is if employee attribute score in employee_df
for the employee_attributes
( from group_att_mapping_df) = 10 then the rating is 1 else it is 2.
employee_group| employee_name | employee_rating
1 | Rob | 2
1 | Peter | 2
2 | Rob | 2
2 |Peter | 2
3 | Rob | 1
3 | Peter | 2
Need help writing valid R code to iterate over the two data frames. I will attach a screenshot of the loop I have written so far. Many thanks in advance.
CodePudding user response:
You don't need to iterate over the dataframes. You can use pivoting and joining to get the expected output:
employee_df %>%
tidyr::pivot_longer(
contains("attribute"),
names_to = "employee_attributes",
values_to = "attribute_score"
) %>%
left_join(group_att_mapping_df) %>%
mutate(employee_rating = ifelse(attribute_score == 10, 1, 2)) %>%
select(employee_group, employee_name, employee_rating) %>%
arrange(employee_group)
#> Joining, by = "employee_attributes"
#> # A tibble: 6 x 3
#> employee_group employee_name employee_rating
#> <chr> <chr> <dbl>
#> 1 1 rob 2
#> 2 1 peter 2
#> 3 2 rob 2
#> 4 2 peter 2
#> 5 3 rob 1
#> 6 3 peter 2
Explaination:
- First we pivot
employee_df
to get all the attributes variables as one column. - Next we left join with
group_att_mapping_df
to get theemployee_group
column. - Then we create the
employee_rating
column based on your logic: 1 if the attribute score is 10, and 2 otherwise. - Finally I just make the data look the same as your expected output, by selecting the columns you want and sorting them by
employee_group
.
Data without the ...
employee_name <- c("rob", "peter")
employee_attribute1 <- c("10", "5")
employee_attribute2 <- c("5", "5")
employee_df <- data.frame(employee_name, employee_attribute1,employee_attribute2)
employee_group <- c("1", "2","3")
employee_attributes <- c("employee_attribute2", "employee_attribute2","employee_attribute1")
group_att_mapping_df <- data.frame(employee_group,employee_attributes)