Applying math operations conditionally on columns using R

Time:09-15

I have tried to apply the method described in the solutions of this post, but I can't seem to get it to work for my use case.

I have a dataframe like so:

TS                          Wafer(1)Radius(06)  Wafer(2)Radius(06)  Wafer(3)Radius(06)  Wafer(1)_max        Wafer(2)_max        Wafer(3)_max        Wafer(1)_min        Wafer(2)_min        Wafer(3)_min
2022-06-29T02:54:33.537582  698.827305          699.153166          701.153731          699.17035           699.183843          701.545892          698.572553          698.678988          699.444565
2022-06-29T02:54:40.987582  696.241402          696.336327          700.313207          696.241402          696.411087          700.435095          695.253749          695.655695          696.047009
2022-06-29T02:54:48.429972  691.987146          691.803447          697.176958          691.987146          691.803447          697.187276          690.879083          690.706554          690.284588
2022-06-29T02:54:55.877582  686.561008          686.295043          692.386884          686.561008          686.295043          692.386884          684.639355          684.388604          684.443958
2022-06-29T02:55:03.327582  680.716974          680.377037          686.803004          680.716974          680.377037          686.803004          678.071563          677.826257          677.984677
2022-06-29T02:55:10.777582  674.501714          674.25299           680.702401          674.501714          674.25299           680.702401          672.066944          671.429438          671.354129

I would like to subtract the Radius(06) of each wafer from its corresponding max, and subtract the min of each wafer from its corresponding Radius(06). The resulting columns would be:

Wafer(1)_max-Wafer(1)Radius(06)
Wafer(2)_max-Wafer(2)Radius(06)
Wafer(3)_max-Wafer(3)Radius(06)
Wafer(1)Radius(06)-Wafer(1)_min
Wafer(2)Radius(06)-Wafer(2)_min
Wafer(3)Radius(06)-Wafer(3)_min

Is there any way to achieve this using grep-like pattern matching that includes regex characters? (I will need to keep the parentheses in the column names.) If there is a much easier way to do this that requires removing the parentheses, I am willing to rename the columns at the beginning of my algorithm.

For convenience, the code to generate the dataset is here:

df <- data.frame( TS = c("2022-06-29T02:54:33.537582","2022-06-29T02:54:40.987582","2022-06-29T02:54:48.429972","2022-06-29T02:54:55.877582","2022-06-29T02:55:03.327582","2022-06-29T02:55:10.777582"),
                  `Wafer(1)Radius(06)` = c(698.827305,696.241402,691.987146,686.561008,680.716974,674.501714),
                  `Wafer(2)Radius(06)` = c(699.153166,696.336327,691.803447,686.295043,680.377037,674.25299),
                  `Wafer(3)Radius(06)` = c(701.153731,700.313207,697.176958,692.386884,686.803004,680.702401),
                  `Wafer(1)_max` = c(699.17035,696.241402,691.987146,686.561008,680.716974,674.501714),
                  `Wafer(2)_max` = c(699.183843,696.411087,691.803447,686.295043,680.377037,674.25299),
                  `Wafer(3)_max` = c(701.545892,700.435095,697.187276,692.386884,686.803004,680.702401),
                  `Wafer(1)_min` = c(698.572553,695.253749,690.879083,684.639355,678.071563,672.066944),
                  `Wafer(2)_min` = c(698.678988,695.655695,690.706554,684.388604,677.826257,671.429438),
                  `Wafer(3)_min` = c(699.444565,696.047009,690.284588,684.443958,677.984677,671.354129))

CodePudding user response:

I'm not sure if this is what you meant, but this is what I wrote for what you asked for:

library(tidyverse)

df %>% 
  mutate(Wafer.1.Subtract.Max = Wafer.1._max - Wafer.1.Radius.06.,   # max minus radius
         Wafer.2.Subtract.Max = Wafer.2._max - Wafer.2.Radius.06.,
         Wafer.3.Subtract.Max = Wafer.3._max - Wafer.3.Radius.06.,
         Wafer.1.Subtract.Min = Wafer.1.Radius.06. - Wafer.1._min,   # radius minus min
         Wafer.2.Subtract.Min = Wafer.2.Radius.06. - Wafer.2._min,
         Wafer.3.Subtract.Min = Wafer.3.Radius.06. - Wafer.3._min)

CodePudding user response:

You will make your life simpler if you stop using parentheses in your column names. R will convert them to "." unless you add the argument check.names = FALSE when you create the data frame. Here's another way to get what you want:

Max <- df[, 5:7] - df[, 2:4]
colnames(Max) <- paste0("Wafer.", 1:3, ".Subtract.Max")
Min <- df[, 2:4] - df[, 8:10]
colnames(Min) <- paste0("Wafer.", 1:3, ".Subtract.Min")

If you want them in a single data frame:

df.all <- cbind(df, Max, Min)
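
If you do want to keep the parentheses and match columns by pattern, as the question asks, something along these lines might work. This is only a sketch: it assumes the data frame is built with check.names = FALSE so the original names survive, it uses a cut-down two-wafer copy of the data, and the helper names (radius_cols, wafers, max_cols, min_cols) are made up for illustration. The parentheses have to be escaped as \\( and \\) in the regex.

```r
# Cut-down copy of the question's data; check.names = FALSE keeps the parentheses.
df <- data.frame(
  TS = c("2022-06-29T02:54:33.537582", "2022-06-29T02:54:40.987582"),
  `Wafer(1)Radius(06)` = c(698.827305, 696.241402),
  `Wafer(2)Radius(06)` = c(699.153166, 696.336327),
  `Wafer(1)_max` = c(699.17035, 696.241402),
  `Wafer(2)_max` = c(699.183843, 696.411087),
  `Wafer(1)_min` = c(698.572553, 695.253749),
  `Wafer(2)_min` = c(698.678988, 695.655695),
  check.names = FALSE
)

# Find the radius columns; literal parentheses are escaped in the pattern.
radius_cols <- grep("^Wafer\\([0-9]+\\)Radius\\(06\\)$", names(df), value = TRUE)

# Strip the suffix to get each wafer prefix, e.g. "Wafer(1)".
wafers <- sub("Radius\\(06\\)$", "", radius_cols)

# Build the matching max/min column names from the prefixes.
max_cols <- paste0(wafers, "_max")
min_cols <- paste0(wafers, "_min")

# max - radius and radius - min, named as requested in the question.
df[paste0(max_cols, "-", radius_cols)] <- df[max_cols] - df[radius_cols]
df[paste0(radius_cols, "-", min_cols)] <- df[radius_cols] - df[min_cols]
```

Because the max and min names are derived from the radius names, this should pick up additional wafers without changes, as long as every wafer has all three columns. The new columns have to be accessed with backticks or [[ ]], e.g. df[["Wafer(1)_max-Wafer(1)Radius(06)"]].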