Calculating Percent Change in R for Multiple Variables

Time:07-20

I'm trying to calculate percent change in R, where each time point is encoded in the column name (table below). I have dplyr loaded, and my dataset is loaded in R under the name data. The code below is what I'm using, but it's not calculating correctly. I want to create a new data frame called data_per_chg that contains, for each variable, the percent change from "v1". For instance, for the wbc variable, I would like to calculate the percent change of wbc.v1 from wbc.v1, wbc.v2 from wbc.v1, wbc.v3 from wbc.v1, etc., and do the same for all the remaining variables in my dataset. I assume I can use a loop to do this, but I'm pretty new to R, so I'm not sure how to proceed. Any guidance would be greatly appreciated.

id wbc.v1 wbc.v2 wbc.v3 rbc.v1 rbc.v2 rbc.v3 hct.v1 hct.v2 hct.v3
a1 23 63 30 23 56 90 13 89 47
a2 81 45 46 N/A 18 78 14 45 22
a3 NA 27 14 29 67 46 37 34 33

data_per_chg <- data %>%
      group_by(id) %>%
      arrange(id) %>%
      mutate(change = (wbc.v2 - wbc.v1) / wbc.v1)
data_per_chg

CodePudding user response:

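The example data contain both true NA values and the string "N/A", so first convert "N/A" to NA and re-detect the column types, then compute the change of every other column relative to its matching 'v1' column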

library(dplyr)
library(stringr)
data <- data %>%
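   # apply na_if() per character column (recent dplyr versions reject a whole data frame here)
   mutate(across(where(is.character), ~ na_if(.x, "N/A"))) %>%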
   type.convert(as.is = TRUE) %>%
   mutate(across(-c(id, matches("\\.v1$")), ~ {
    v1 <- get(str_replace(cur_column(), "v\\d+$", "v1"))
   (.x - v1)/v1}, .names = "{.col}_change"))

-output

data
 id wbc.v1 wbc.v2 wbc.v3 rbc.v1 rbc.v2 rbc.v3 hct.v1 hct.v2 hct.v3 wbc.v2_change wbc.v3_change rbc.v2_change rbc.v3_change hct.v2_change hct.v3_change
1 a1     23     63     30     23     56     90     13     89     47     1.7391304     0.3043478      1.434783     2.9130435    5.84615385     2.6153846
2 a2     81     45     46     NA     18     78     14     45     22    -0.4444444    -0.4320988            NA            NA    2.21428571     0.5714286
3 a3     NA     27     14     29     67     46     37     34     33            NA            NA      1.310345     0.5862069   -0.08108108    -0.1081081
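
Since the question mentions a loop, the same calculation can also be sketched in base R. This is only a minimal sketch, not the method used in the answer above: it assumes every measurement column is named <prefix>.v<number>, it rebuilds the example data inline, and the names clean, prefixes and baseline are made up for illustration. Note that it multiplies by 100 to give a percentage, whereas the output above is a proportion.

```r
# Example data from the question (rbc.v1 holds the string "N/A")
data <- data.frame(
  id = c("a1", "a2", "a3"),
  wbc.v1 = c(23L, 81L, NA), wbc.v2 = c(63L, 45L, 27L), wbc.v3 = c(30L, 46L, 14L),
  rbc.v1 = c("23", "N/A", "29"), rbc.v2 = c(56L, 18L, 67L), rbc.v3 = c(90L, 78L, 46L),
  hct.v1 = c(13L, 14L, 37L), hct.v2 = c(89L, 45L, 34L), hct.v3 = c(47L, 22L, 33L),
  stringsAsFactors = FALSE
)

# Replace "N/A" with NA and make every measurement column numeric
clean <- data
clean[-1] <- lapply(clean[-1], function(x) {
  x[x %in% "N/A"] <- NA
  as.numeric(x)
})

# Variable prefixes taken from the column names: "wbc", "rbc", "hct"
prefixes <- unique(sub("\\.v[0-9]+$", "", names(clean)[-1]))

# Start from the id column and add one percent-change column per measurement
data_per_chg <- clean["id"]
for (p in prefixes) {
  baseline <- clean[[paste0(p, ".v1")]]
  cols <- grep(paste0("^", p, "\\.v[0-9]+$"), names(clean), value = TRUE)
  for (col in cols) {
    data_per_chg[[paste0(col, "_pct")]] <- 100 * (clean[[col]] - baseline) / baseline
  }
}

data_per_chg
```

Rows with a missing v1 value produce NA for every column of that variable, which matches the behaviour of the dplyr version.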

If we want to keep the 'v1' columns as well

data %>%
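   # apply na_if() per character column (recent dplyr versions reject a whole data frame here)
   mutate(across(where(is.character), ~ na_if(.x, "N/A"))) %>%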
   type.convert(as.is = TRUE) %>% 
   mutate(across(ends_with('.v1'), ~ .x - .x, 
    .names = "{str_replace(.col, 'v1', 'v1change')}")) %>% 
   transmute(id, across(ends_with('change')),
    across(-c(id, matches("\\.v1$"), ends_with('change')),
     ~ {
    v1 <- get(str_replace(cur_column(), "v\\d+$", "v1"))
   (.x - v1)/v1}, .names = "{.col}_change")) %>%
    select(id, starts_with('wbc'), starts_with('rbc'), starts_with('hct'))

-output

 id wbc.v1change wbc.v2_change wbc.v3_change rbc.v1change rbc.v2_change rbc.v3_change hct.v1change hct.v2_change hct.v3_change
1 a1            0     1.7391304     0.3043478            0      1.434783     2.9130435            0    5.84615385     2.6153846
2 a2            0    -0.4444444    -0.4320988           NA            NA            NA            0    2.21428571     0.5714286
3 a3           NA            NA            NA            0      1.310345     0.5862069            0   -0.08108108    -0.1081081
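
A different route, shown here only as a hedged sketch and not part of the answer above, is to reshape the data to long format with tidyr so that the baseline lookup becomes a grouped operation. It assumes the tidyr package is installed and that column names split cleanly on the dot; long_chg, variable, visit and pct_change are illustrative names. As a design choice, the result is left in long format (one row per id, variable and visit) and expressed as a percentage.

```r
library(dplyr)
library(tidyr)

# Example data from the question (rbc.v1 holds the string "N/A")
data <- data.frame(
  id = c("a1", "a2", "a3"),
  wbc.v1 = c(23L, 81L, NA), wbc.v2 = c(63L, 45L, 27L), wbc.v3 = c(30L, 46L, 14L),
  rbc.v1 = c("23", "N/A", "29"), rbc.v2 = c(56L, 18L, 67L), rbc.v3 = c(90L, 78L, 46L),
  hct.v1 = c(13L, 14L, 37L), hct.v2 = c(89L, 45L, 34L), hct.v3 = c(47L, 22L, 33L),
  stringsAsFactors = FALSE
)

long_chg <- data %>%
  # make every measurement numeric; the string "N/A" becomes NA
  mutate(across(-id, ~ as.numeric(na_if(as.character(.x), "N/A")))) %>%
  # "wbc.v1" is split into variable = "wbc" and visit = "v1"
  pivot_longer(-id, names_to = c("variable", "visit"), names_sep = "\\.") %>%
  group_by(id, variable) %>%
  # percent change of each visit relative to that group's v1 value
  mutate(pct_change = 100 * (value - value[visit == "v1"]) / value[visit == "v1"]) %>%
  ungroup()

long_chg
```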

data

data <- structure(list(id = c("a1", "a2", "a3"), wbc.v1 = c(23L, 81L, 
NA), wbc.v2 = c(63L, 45L, 27L), wbc.v3 = c(30L, 46L, 14L), rbc.v1 = c("23", 
"N/A", "29"), rbc.v2 = c(56L, 18L, 67L), rbc.v3 = c(90L, 78L, 
46L), hct.v1 = c(13L, 14L, 37L), hct.v2 = c(89L, 45L, 34L), hct.v3 = c(47L, 
22L, 33L)), class = "data.frame", row.names = c(NA, -3L))