I would like to reshape the following data frame using dplyr::mutate()
: The goal is to create a new column variable percent
but ensuring that the values correspond to the correct subject
. The data is currently in the "long" format. Detailed: I would like to extract rows pertaining the pattern "percentQ" and create a new column called percent
based on each subject
to ensure that the score aligns with the corresponding subject
.
df_long <- structure(list(id = c(NA,NA, "scoreQ1", NA, "scoreQ2", NA, NA,"percentQ1", "percentQ2", NA, "GPA"),
subject = c(NA,NA, "Chris", NA, "Liz", NA, NA, "Chris","Liz", NA, NA),
grade = c(NA,NA, 45L, NA, 60L, NA, NA, 75L, 100L, NA)), row.names = c(NA,-11L), class = c("data.table", "data.frame"))
print(df_long)
#id subject grade
#<NA> <NA> NA
#<NA> <NA> NA
#scoreQ1 Chris 45
#<NA> <NA> NA
#scoreQ2 Liz 60
#<NA> <NA> NA
#<NA> <NA> NA
#percentQ1 Chris 75
#percentQ2 Liz 100
#<NA> <NA> NA
#GPA <NA> NA
Please suggest R scripts that would allow to reshape the data frame to the following:
As can be seen, the percent
value corresponds to the correct subject
, in this instance 75
for Chris and 100
for Liz. I have been running into issues, where I was unable to assign the percent
values to the correct subject
.
df_wide <- structure(list(id = c(NA,NA, "scoreQ1", NA, "scoreQ2", NA, NA, NA, "GPA"),
subject = c(NA,NA, "Chris", NA, "Liz", NA, NA, NA, NA),
grade = c(NA,NA, 45L, NA, 60L, NA, NA, NA, NA),
percent = c(NA,NA, 75L, NA, 100L, NA, NA, NA, NA)), row.names = c(NA,-9L), class = c("data.table", "data.frame"))
print(df_wide)
#id subject grade percent
#<NA> <NA> NA NA
#<NA> <NA> NA NA
#scoreQ1 Chris 45 75
#<NA> <NA> NA NA
#scoreQ2 Liz 60 100
#<NA> <NA> NA NA
#<NA> <NA> NA NA
#<NA> <NA> NA NA
#GPA <NA> NA NA
CodePudding user response:
I think your task can be completed with a join process.
Here, we can filter out the values with percent in the id columns, then
left_join
it to the data_frame after using rename
.
library(dplyr)
percent <- grepl("percent", df_long$id)
df_long |>
filter(!percent) |>
left_join(
filter(df_long, percent) |>
rename(percent = "grade") |>
select(-id), by = "subject"
)
id subject grade percent
1: <NA> <NA> NA NA
2: <NA> <NA> NA NA
3: scoreQ1 Chris 45 75
4: <NA> <NA> NA NA
5: scoreQ2 Liz 60 100
6: <NA> <NA> NA NA
7: <NA> <NA> NA NA
8: <NA> <NA> NA NA
9: GPA <NA> NA NA
Data
df_long <- structure(list(id = c(NA, NA, "scoreQ1", NA, "scoreQ2", NA, NA,
"percentQ1", "percentQ2", NA, "GPA"), subject = c(NA, NA, "Chris",
NA, "Liz", NA, NA, "Chris", "Liz", NA, NA), grade = c(NA, NA,
45L, NA, 60L, NA, NA, 75L, 100L, NA, NA)), row.names = c(NA,
-11L), class = c("data.table", "data.frame"))
CodePudding user response:
I may be missing something, but you aren't really moving from long form to wide form, you are just adding an additional column. Assuming that the top mark is 60 then:
library(tidyverse)
df2 <-
df_long |>
mutate(percent = (grade/60)*100)
should do the trick.