Number order of events

Time:04-29

I am working on a project using patients' medication histories, and I would like to ask for your help. The database contains the start dates of medications in random order, and I would like to number the medications in order of use.

So I would like to transform:

ID 001 002 003
medA 2001    2005    2003
medB 1999    2000    2015
medC 2019    2014    2000

To:

ID 001 002 003
medA 1 3 2
medB 1 2 3
medC 3 2 1

The real database has 700 subjects and 10 medications.

Is there a way to do this in R?

Thanks in advance for your help!

NB this is my first post, please let me know if I'm doing something wrong forum-wise :)

CodePudding user response:

If you want to keep the original columns:

df[, paste0("rank", 1:3)] <- t(apply(df[,2:4], 1, rank))
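
A self-contained sketch of that one-liner on the example data, in case it helps to run it. The data frame name df and its column layout are assumed from the question. The arguments ties.method = "min" and na.last = "keep" are optional additions that are not part of the answer above; they only matter if two start dates are equal or a date is missing.

```r
# Example data from the question (layout assumed: an ID column
# followed by three columns of start years)
df <- data.frame(
  ID = c("medA", "medB", "medC"),
  `001` = c(2001, 1999, 2019),
  `002` = c(2005, 2000, 2014),
  `003` = c(2003, 2015, 2000),
  check.names = FALSE
)

# Rank the years within each row. apply() returns the result for each
# row as a column, so t() turns it back into one row per row of df.
# Extra arguments after the function name are passed on to rank().
ranks <- t(apply(df[, 2:4], 1, rank, ties.method = "min", na.last = "keep"))

# Keep the original columns and add the ranks next to them
df[, paste0("rank", 1:3)] <- ranks
df
```

With ties.method = "min", two equal start years would both get the lower rank instead of an averaged value such as 1.5, and na.last = "keep" leaves a missing date as NA rather than ranking it last.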

CodePudding user response:

Here's an approach:

library(tidyverse)

tribble(
  ~ID, ~"001", ~"002", ~"003",
  "medA", 2001, 2005, 2003,
  "medB", 1999, 2000, 2015,
  "medC", 2019, 2014, 2000
) |> 
  pivot_longer(- ID) |> 
  group_by(ID) |> 
  mutate(rank = rank(value)) |> 
  select(-value) |> 
  pivot_wider(names_from = name, values_from = rank)
#> # A tibble: 3 × 4
#> # Groups:   ID [3]
#>   ID    `001` `002` `003`
#>   <chr> <dbl> <dbl> <dbl>
#> 1 medA      1     3     2
#> 2 medB      1     2     3
#> 3 medC      3     2     1

Created on 2022-04-28 by the reprex package (v2.0.1)

CodePudding user response:

Another approach in base R:

#Your data
mydf <- structure(list(
ID = c("medA", "medB", "medC"), 
`001` = c(2001L,1999L, 2019L), 
`002` = c(2005L, 2000L, 2014L), 
`003` = c(2003L, 2015L, 2000L)),
class = "data.frame", 
row.names = c(NA, -3L))

# Transform
mydf[,2:4] <- t(apply(mydf[,2:4], 1, rank))

# Result
mydf
    ID 001 002 003
1 medA   1   3   2
2 medB   1   2   3
3 medC   3   2   1

In case more explanation is helpful:

  • rank is a function that returns the rank of each element of a numeric vector, such as a numeric row or a numeric column. For example: rank(c(6,4,5)) returns 3 1 2. Note that order is a different function: it returns the positions of the elements in sorted order, so order(c(6,4,5)) returns 2 3 1. On this example data the two happen to give the same result, but rank is the one that numbers the values in order of occurrence.

  • mydf[, 2:4] means the second column up to the fourth column of mydf.

  • apply is a function that applies another function to each row or each column of a data frame or a matrix. In your case, rank is applied to each row of mydf[, 2:4], so the index 1 is used. To apply a function to each column, use the index 2 instead.

  • t is a function that transposes a matrix or a data frame. apply returns the result for each row as a column, so t is used to turn those columns back into rows.
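
One caveat worth checking before running this on the full database: order and rank give the same result for every row of the example data, but not in general. A minimal sketch with a hypothetical row of four start years (values made up for illustration) shows the difference:

```r
# Hypothetical start years for one row (not from the question's data)
years <- c(2010, 2001, 2005, 2003)

# order(): the positions of the elements in sorted order
# (the earliest year is at position 2, the latest at position 1)
order(years)
#> [1] 2 4 3 1

# rank(): the rank of each element, kept in its original position
# (2010 is the 4th in time, 2001 the 1st, 2005 the 3rd, 2003 the 2nd)
rank(years)
#> [1] 4 1 3 2

# For a vector without ties, applying order() twice reproduces the ranks
order(order(years))
#> [1] 4 1 3 2
```

Numbering medications in order of use corresponds to the rank output, so rank (or order applied twice) is the safer choice when a row has more than a few values.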
