How to create an ID column for duplicate rows based on data from another column?

I have a dataset that looks like this:

  Study_ID   ear
1      100  Left
2      100 Right
3      200  Left
4      200 Right
5      300  Left
6      300 Right

Every patient appears twice (each Study_ID occurs in two rows), once for each ear (left and right). I want to create a new variable that identifies which row is for the left ear and which is for the right.

My desired output would look like this:

  Study_ID   ear ear_ID
1      100  Left  100_L
2      100 Right  100_R
3      200  Left  200_L
4      200 Right  200_R
5      300  Left  300_L
6      300 Right  300_R

The first part of the new variable is the study ID, and the second part is 'L' or 'R' for the left or right ear.

How can I go about doing this?

Reproducible Data:

data <- data.frame(Study_ID = c("100", "100", "200", "200", "300", "300"), ear = c("Left", "Right", "Left", "Right", "Left", "Right"))

User response:

In base R, paste the study ID together with the first letter of ear:

transform(data, ear_ID = paste(Study_ID, substr(ear, 1, 1), sep = "_"))

  Study_ID   ear ear_ID
1      100  Left  100_L
2      100 Right  100_R
3      200  Left  200_L
4      200 Right  200_R
5      300  Left  300_L
6      300 Right  300_R
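
One thing to keep in mind: transform() returns a modified copy and does not change data in place, so the result has to be assigned to keep the new column. A minimal sketch using the reproducible data from the question (the uniqueness check at the end is an optional extra, not part of the original answer):

```r
# Reproducible data from the question
data <- data.frame(
  Study_ID = c("100", "100", "200", "200", "300", "300"),
  ear = c("Left", "Right", "Left", "Right", "Left", "Right")
)

# transform() returns a new data frame, so assign it back to keep ear_ID
data <- transform(
  data,
  ear_ID = paste(Study_ID, substr(ear, 1, 1), sep = "_")
)

# Optional sanity check: every ear_ID should occur exactly once
stopifnot(anyDuplicated(data$ear_ID) == 0)
```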

Note that in the tidyverse you may not need a new column at all: if you group by both columns (Study_ID and ear), each combination is treated as a unique identifier of the ear.
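
A rough dplyr sketch of both options, assuming the dplyr package is installed. The names with_id, per_ear and n_rows are only illustrative, and the summarise() step stands in for whatever per-ear computation you actually need:

```r
library(dplyr)

# Reproducible data from the question
data <- data.frame(
  Study_ID = c("100", "100", "200", "200", "300", "300"),
  ear = c("Left", "Right", "Left", "Right", "Left", "Right")
)

# Option 1: build the same ear_ID column with mutate()
with_id <- data %>%
  mutate(ear_ID = paste(Study_ID, substr(ear, 1, 1), sep = "_"))

# Option 2: skip the new column and group by both columns instead;
# each Study_ID/ear pair then forms its own group
per_ear <- data %>%
  group_by(Study_ID, ear) %>%
  summarise(n_rows = n(), .groups = "drop")
```

With this data, per_ear has one row per patient and ear, which is the same granularity that ear_ID provides.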