I am looking for a way to transform the following data frame using dplyr:
item <- c('A','B','C')
one <- c(2, 1, 2)
two <- c(1,1,2)
data <- data.frame(item,one,two)
data
item | one | two |
---|---|---|
A | 2 | 1 |
B | 1 | 1 |
C | 2 | 2 |
Here, the column "one" holds the number of ratings with the value 1, and the column "two" holds the number of ratings with the value 2. My ideal data frame after the transformation would look like this:
item | rating |
---|---|
A | 1 |
A | 1 |
A | 2 |
B | 1 |
B | 2 |
C | 1 |
C | 1 |
C | 2 |
C | 2 |
Any idea how I could get to this output (it doesn't have to be dplyr)? I know how to use pivot_longer from the tidyr package, but that doesn't solve the problem of repeating each row the required number of times...
CodePudding user response:
library(dplyr)
library(tidyr) # pivot_longer
nums <- c(one = 1, two = 2, three = 3)
data %>%
  pivot_longer(-item) %>%
  group_by(item) %>%
  summarize(rating = rep(name, times = value)) %>%
  ungroup() %>%
  mutate(rating = nums[rating])
# # A tibble: 9 x 2
# item rating
# <chr> <dbl>
# 1 A 1
# 2 A 1
# 3 A 2
# 4 B 1
# 5 B 2
# 6 C 1
# 7 C 1
# 8 C 2
# 9 C 2
I had to define nums because I couldn't find (in my haste) an easy way to convert "one" to 1 in a programmatic way. You'll need to make sure it goes out at least as far as you need; I added three=3 for demonstration. If you truly only have one and two, then you should be good as-is.
(Related to that topic: Convert written number to number in R)
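A possible shortcut for the same idea, as a rough sketch rather than a definitive answer: tidyr's uncount() repeats each row according to a count column, which could stand in for the group_by/summarize/rep step. The lookup vector `words` below is a hypothetical name playing the same role as nums (position in the vector is taken as the numeric rating), so it would need to be extended to cover all your column names.

```r
library(dplyr)
library(tidyr)

data <- data.frame(item = c("A", "B", "C"), one = c(2, 1, 2), two = c(1, 1, 2))

# hypothetical lookup: the position of a name in this vector is its rating
words <- c("one", "two", "three")

result <- data %>%
  pivot_longer(-item, names_to = "rating", values_to = "count") %>%
  mutate(rating = match(rating, words)) %>%  # "one" becomes 1, "two" becomes 2
  uncount(count)                             # repeat each row `count` times, then drop `count`

result
```

Here match() returns integers, so rating ends up as an integer column instead of a double.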
CodePudding user response:
Maybe you could convert it from wide to long format with the gather() function and then replace the string values "one" and "two" with integers:
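library(tidyverse)
item <- c('A','B','C')
one <- c(2, 1, 2)
two <- c(1,1,2)
data <- data.frame(item,one,two)
# wide to long: one row per (item, rating) pair with its count
long_df <- gather(data, rating, count, one:two)
new_df <- tibble()
# loop over every row of the long data (seq_len, not range, gives 1..n in R)
for (i in seq_len(nrow(long_df))) {
  # repeat row i as many times as its count says
  new_df <- rbind(new_df, do.call("rbind", replicate(long_df[i, "count"], long_df[i, ], simplify = FALSE)))
}
new_df <- new_df %>%
  select(-count) %>%
  mutate(rating = match(rating, c("one", "two"))) %>%  # "one" -> 1, "two" -> 2
  arrange(item, rating)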
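Since the question says it doesn't have to be dplyr, the row repetition in the loop can probably also be done with plain rep() calls in base R. This is only a sketch, and it assumes the count columns appear in rating order (first column is rating 1, second is rating 2); the names `counts`, `times` and `expanded` are made up for illustration.

```r
data <- data.frame(item = c("A", "B", "C"), one = c(2, 1, 2), two = c(1, 1, 2))

# counts as a matrix: one row per item, one column per rating
counts <- as.matrix(data[, c("one", "two")])

# transpose first so that as.vector() walks item by item, rating by rating
times <- as.vector(t(counts))

expanded <- data.frame(
  # each item appears once per rating column, then repeated by its count
  item = rep(rep(data$item, each = ncol(counts)), times = times),
  # ratings 1..k cycle within each item, then repeated by the same counts
  rating = rep(rep(seq_len(ncol(counts)), times = nrow(counts)), times = times)
)

expanded
```

This avoids growing a data frame inside a loop, which tends to get slow for larger inputs.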