I have a dataset of ratings in which each rater and each ratee has a unique identifier. I would like to get the interrater reliability for each item, but I am running into a problem with how the data is structured. Because each ratee was rated 4-5 times, I am able to group the data by ratee ID. Unfortunately, because every rater ID is unique, I can't reshape the dataset into the format the irr package expects.
My data looks something like this:
Rater | Ratee | Rating |
---|---|---|
11111 | 12345 | 1 |
12112 | 12345 | 1 |
12232 | 12345 | 0 |
12457 | 12345 | 0 |
16794 | 12345 | 1 |
55555 | 16454 | 0 |
66666 | 16454 | 1 |
77777 | 16454 | 1 |
88888 | 16454 | 0 |
99999 | 16454 | 1 |
I would like some way to go through each group iteratively and rename the rater's unique identifier to something I can use to pivot the data into the right format. For example, within each group of ratee IDs, assign the rater a new value: r1 for the first row, r2 for the second, and so on, restarting at r1 whenever a new group begins. The end result would hopefully look something like this:
Rater | Ratee | Rating |
---|---|---|
r1 | 12345 | 1 |
r2 | 12345 | 1 |
r3 | 12345 | 0 |
r4 | 12345 | 0 |
r5 | 12345 | 1 |
r1 | 16454 | 0 |
r2 | 16454 | 1 |
r3 | 16454 | 1 |
r4 | 16454 | 0 |
r5 | 16454 | 1 |
Can anyone help me do this? I am at a loss and have exhausted my R repertoire.
CodePudding user response:
I think you want this:
    library(dplyr)

    your_data %>%
      group_by(Ratee) %>%
      mutate(new_rater_column = paste0("r", row_number())) %>%
      ungroup()
I used a new column name instead of overwriting the old Rater column, just in case the information there is useful.
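If the next step is pivoting for irr, something like the following might work. This is only a sketch under a few assumptions: the example data frame is typed in from the question, `wide` is a name I made up, tidyr is available for `pivot_wider()`, and `kappam.fleiss()` is my guess at a suitable statistic given that the raters differ from ratee to ratee (the question does not say which measure is wanted).

```r
library(dplyr)
library(tidyr)

# Example data typed in from the question
your_data <- data.frame(
  Rater  = c(11111, 12112, 12232, 12457, 16794,
             55555, 66666, 77777, 88888, 99999),
  Ratee  = rep(c(12345, 16454), each = 5),
  Rating = c(1, 1, 0, 0, 1, 0, 1, 1, 0, 1)
)

# `wide` is a hypothetical name: one row per ratee, one column per rater slot
wide <- your_data %>%
  group_by(Ratee) %>%
  mutate(new_rater_column = paste0("r", row_number())) %>%
  ungroup() %>%
  # drop the original Rater ID so it does not split each ratee across rows
  select(Ratee, new_rater_column, Rating) %>%
  pivot_wider(names_from = new_rater_column, values_from = Rating)

wide

# Assumption: Fleiss' kappa is the statistic you are after.
# Only runs if the irr package is installed.
if (requireNamespace("irr", quietly = TRUE)) {
  print(irr::kappam.fleiss(select(wide, -Ratee)))
}
```

Ratees with only 4 ratings will end up with an NA in the r5 column after pivoting. As far as I know, `kappam.fleiss()` drops incomplete rows, so check how many ratees are left before trusting the result.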