Home > Enterprise >  How to spread one column and fill with responses from another column in R
How to spread one column and fill with responses from another column in R

Time:05-24

I have the following dataset

df1 <- data.frame(
  ACCESSION=c("123456","123456", "234567", "234567","345678", "345678","456789","456789"),
  MRN=c("1", "1", "2", "2", "3", "3", "4", "4"),
  SOURCE=c("BAL", "BAL", "BAL","BAL", "BAL", "BAL", "BW", "BW"),
  TASK_TYPE=c("Stain", "Culture","Stain", "Culture","Stain", "Culture","Stain", "Culture"),
  RESULT=c("Neg", "Pos","Neg", "Pos","Neg", "Pos","Neg", "Pos"))

But I would like to spread the TASK_TYPE column into two columns and fill with the responses from the RESULT column based on ACCESSION (while keeping all other columns as is). So I would like it to look like this

df2 <- data.frame(
  ACCESSION=c("123456","234567","345678", "456789"),
  MRN=c("1", "2", "3", "4"),
  SOURCE=c("BAL", "BAL", "BAL", "BW"),
  STAIN=c("Neg","Neg","Neg","Neg"),
  CULTURE=c("Pos","Pos","Pos","Pos"))
 
ACCESSION      MRN       SOURCE      STAIN       CULTURE
123456         1         BAL         Neg          Pos
234567         2         BAL         Neg          Pos
345678         3         BAL         Neg          Pos
456789         4         BW          Neg          Pos

I have looked everywhere and have come close but I haven't been able to find the exact solution. Thank you in advance!

CodePudding user response:

Use pivot_wider from the tidyr package:

library(tidyr)
df1 %>% pivot_wider(names_from = TASK_TYPE, values_from = RESULT)

Output:

# ACCESSION MRN   SOURCE Stain Culture
# <chr>     <chr> <chr>  <chr> <chr>  
# 1 123456    1     BAL    Neg   Pos    
# 2 234567    2     BAL    Neg   Pos    
# 3 345678    3     BAL    Neg   Pos    
# 4 456789    4     BW     Neg   Pos    

CodePudding user response:

Using dcast

library(data.table)
dcast(setDT(df1), ... ~ TASK_TYPE, value.var = 'RESULT')

-output

Key: <ACCESSION, MRN, SOURCE>
   ACCESSION    MRN SOURCE Culture  Stain
      <char> <char> <char>  <char> <char>
1:    123456      1    BAL     Pos    Neg
2:    234567      2    BAL     Pos    Neg
3:    345678      3    BAL     Pos    Neg
4:    456789      4     BW     Pos    Neg

CodePudding user response:

A base R option using reshape

reshape(
    df1,
    direction = "wide",
    idvar = c("ACCESSION", "MRN", "SOURCE"),
    timevar = "TASK_TYPE"
)

gives

  ACCESSION MRN SOURCE RESULT.Stain RESULT.Culture
1    123456   1    BAL          Neg            Pos
3    234567   2    BAL          Neg            Pos
5    345678   3    BAL          Neg            Pos
7    456789   4     BW          Neg            Pos
  • Related