Reverse standardization after removing rows

Time:12-05

I have been working with R for about six months, so I am still something of a novice. I have a large dataset of 260 columns and 1000 rows, and I need to convert the data to standard deviation units and then remove the outliers that do not meet the set SD criterion. I have managed to convert the data and remove the necessary rows; however, after doing this I need to convert the data back to its original values. When I try, it throws an error every time, and I am not sure how to get past it. I assume this is because the dataset is now a different size than it was before I standardised it, but I can't think of a way to work around this.

I have looked through past questions on this issue but have not found anything that solves my problem, so any help would be greatly appreciated.

Here is a sample of what I am trying to do and what is failing:

y = 30
C = 30
ds <- matrix(data = NA, nrow = y, ncol = C)

for (i in 1:y) {
  ds[i, ] <- sample(1:100, C, TRUE)
}

ds_z <- scale(ds, center = TRUE, scale = TRUE)
no_out <- ds_z[!rowSums(ds_z >2),]
revrs = t(apply(no_out, 1, function(r) r * attr(no_out, 'scaled:scale') + attr(no_out, 'scaled:center')))

CodePudding user response:

Try

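i1 <- !rowSums(ds_z > 2)
no_out <- ds_z[i1, , drop = FALSE]
# scale() stores one center and one scale value per column, so take them from ds_z as they are
ctr <- attr(ds_z, "scaled:center")
scl <- attr(ds_z, "scaled:scale")
# undo the scaling column by column: x * scale + center
no_out2 <- sweep(sweep(no_out, 2, scl, "*"), 2, ctr, "+")
no_out2 <- round(no_out2)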
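
As a quick check on any reversal, the result can be compared against the matching rows of the original matrix. The following is only a minimal sketch, not a definitive implementation: the seed, the planted outlier in `ds[1, 1]`, and the names `keep`, `ctr`, `scl` and `restored` are assumptions added for illustration. It relies on two things. Subsetting a matrix with `[` keeps only `dim` and `dimnames`, so the `scaled:` attributes are missing from the subset. And `scale()` stores one center and one scale value per column, so they are applied along columns with `sweep()`.

```r
set.seed(1)                                  # assumed seed, only for reproducibility
ds <- matrix(sample(1:100, 30 * 30, TRUE), nrow = 30, ncol = 30)
ds[1, 1] <- 1000                             # assumed outlier so that row 1 is removed

ds_z <- scale(ds, center = TRUE, scale = TRUE)

# rows with no value above 2 SD (same criterion as in the question)
keep   <- rowSums(ds_z > 2) == 0
no_out <- ds_z[keep, , drop = FALSE]

# the subset no longer carries the scaling attributes
is.null(attr(no_out, "scaled:center"))

# read the per-column center and scale from the full scaled matrix
ctr <- attr(ds_z, "scaled:center")
scl <- attr(ds_z, "scaled:scale")

# reverse the scaling column by column: x * scale + center
restored <- sweep(sweep(no_out, 2, scl, "*"), 2, ctr, "+")

# compare with the original values of the rows that were kept
all.equal(restored, ds[keep, , drop = FALSE], check.attributes = FALSE)
```

If `all.equal()` returns TRUE, the back-transformed rows match the original data for the rows that were kept, regardless of how many rows were removed.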