How to remove parts of a column from a matrix in R

Let's say I have a matrix

     [,1] [,2] [,3] [,4]
[1,]   10   11   12   13
[2,]    9   10   15    4
[3,]    5    7    4   10
[4,]    1    2    6    2

I want to remove the part of each column that starts at the first value <= 5. Even if there is a higher value in a later row of the column (e.g. [3,4] is 10, but it comes after [2,4], which is 4), it should be removed as well and become NA, so I should be left with:

     [,1] [,2] [,3] [,4]
[1,]   10   11   12   13
[2,]    9   10   15   NA
[3,]   NA    7   NA   NA
[4,]   NA   NA   NA   NA

The matrix was created by using a for-loop to iterate a population 100 times, so my matrix is 100x100. I tried to use an if statement in the for-loop to remove parts of the column, but instead it just removed all columns after the first one.

if(matrix[,col]<=5) break

CodePudding user response：

Here's a way to replace the required values in a matrix with NA:

# Create a random matrix with 20 rows and 20 columns
m <- matrix(floor(runif(400, min = 0, max = 101)), nrow = 20)

# Function that iterates through a vector and replaces values <= 5
# and the following values with NA
f <- function(x) {
  fillNA <- FALSE
  for (i in seq_along(x)) {
    if (fillNA || x[i] <= 5) {
      x[i] <- NA
      fillNA <- TRUE
    }
  }
  x
}

# Apply the function column-wise
apply(m, 2, f)
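
As a quick check, the same function can be run on the 4x4 matrix from the question. This is only a sketch: the names q and res are illustrative, and f is repeated so the snippet runs on its own.

```r
# Same function as above, repeated so this snippet is self-contained
f <- function(x) {
  fillNA <- FALSE
  for (i in seq_along(x)) {
    if (fillNA || x[i] <= 5) {
      x[i] <- NA
      fillNA <- TRUE
    }
  }
  x
}

# The matrix from the question; matrix() fills column by column
q <- matrix(c(10,  9,  5,  1,
              11, 10,  7,  2,
              12, 15,  4,  6,
              13,  4, 10,  2), nrow = 4)

# Apply f to each column (MARGIN = 2)
res <- apply(q, 2, f)
res
```

The result should match the desired output shown in the question. Note that apply returns a new matrix, so it needs to be assigned if you want to keep it.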

CodePudding user response：

We can do this in base R. Let's assume that your matrix is called m. The function below does the following:

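1. Check each element to see if it is <= 5, producing TRUE/FALSE values.
2. Cumulatively sum the TRUE/FALSE values down the column, so the sum becomes non-zero at the first TRUE and stays non-zero afterwards.
3. Replace every element whose cumulative sum is non-zero with NA.
4. Use apply to perform this operation per column of the matrix.

This fits on one line (the \(x) lambda shorthand requires R 4.1 or later):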

m2 <- apply(m, 2, \(x) ifelse(cumsum(x <= 5), NA, x))

     [,1] [,2] [,3] [,4]
[1,]   10   11   12   13
[2,]    9   10   15   NA
[3,]   NA    7   NA   NA
[4,]   NA   NA   NA   NA
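
To see why this works, the intermediate steps can be traced on a single column. The sketch below uses the fourth column of the example matrix; the names x, below, running and out are illustrative only.

```r
# Fourth column of the example matrix
x <- c(13, 4, 10, 2)

# Step 1: which values are <= 5?
below <- x <= 5            # FALSE  TRUE FALSE  TRUE

# Step 2: running count of TRUE values seen so far
running <- cumsum(below)   # 0 1 1 2

# Step 3: blank out everything from the first TRUE onward
out <- ifelse(running > 0, NA, x)
out                        # 13 NA NA NA
```

Here the comparison running > 0 is written out explicitly; the one-liner relies on ifelse coercing the non-zero counts to TRUE, which gives the same result.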

CodePudding user response：

# Set the seed for reproducibility
set.seed(123)

# Create a random matrix with 100 rows and 100 columns, values between 0 and 100
m <- matrix(runif(10000, min = 0, max = 100), nrow = 100)

# In each column, replace the first value <= 5 and everything below it with NA
m <- apply(m, 2, function(x) replace(x, cumsum(x <= 5) > 0, NA))

# View the modified matrix
m

This code needs no extra packages. It first sets the seed for reproducibility, so that the same random matrix is generated every time the code is run. Next, it creates a random matrix with 100 rows and 100 columns using the runif function, drawing uniform numbers between 0 and 100 (with the default range of 0 to 1, every value would be <= 5). Finally, it uses the apply function with MARGIN = 2 to process each column: cumsum(x <= 5) becomes positive at the first value <= 5 and stays positive for every row below it, so those positions are replaced with NA. The matrix is stored as m rather than matrix to avoid shadowing the base matrix function.