Let's say I have a matrix:
     [,1] [,2] [,3] [,4]
[1,]   10   11   12   13
[2,]    9   10   15    4
[3,]    5    7    4   10
[4,]    1    2    6    2
I want to remove the part of each column that starts at the first value <= 5. That value and everything below it in the column should go, even if a later row holds a higher value (e.g. [3,4] is 10, but it comes after [2,4], which is 4). Those entries become NA, so I should be left with:
     [,1] [,2] [,3] [,4]
[1,]   10   11   12   13
[2,]    9   10   15   NA
[3,]   NA    7   NA   NA
[4,]   NA   NA   NA   NA
The matrix was created by using a for-loop to iterate a population 100 times, so my actual matrix is 100x100. I tried to use an if statement inside the for-loop to cut off the rest of a column, but instead it just removed all columns after the first one.
if(matrix[,col]<=5) break
CodePudding user response:
Here's a way to replace the required values in a matrix with NA:
# Create a random matrix with 20 rows and 20 columns
m <- matrix(floor(runif(400, min = 0, max = 101)), nrow = 20)
# Function that iterates through a vector and replaces values <= 5
# and the following values with NA
f <- function(x) {
  fillNA <- FALSE
  # seq_along() is safe even if x has length zero
  for (i in seq_along(x)) {
    if (fillNA || x[i] <= 5) {
      x[i] <- NA
      fillNA <- TRUE
    }
  }
  x
}
# Apply the function column-wise
apply(m, 2, f)
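To check this against the matrix from the question, here is a small sketch. The name m4 is just a label chosen for this example, and f is the function from above, repeated so the snippet runs on its own:

```r
# The 4 x 4 example matrix from the question, filled row by row
m4 <- matrix(c(10, 11, 12, 13,
                9, 10, 15,  4,
                5,  7,  4, 10,
                1,  2,  6,  2),
             nrow = 4, byrow = TRUE)

# Same function as above: once a value <= 5 is seen,
# that value and everything after it becomes NA
f <- function(x) {
  fillNA <- FALSE
  for (i in seq_along(x)) {
    if (fillNA || x[i] <= 5) {
      x[i] <- NA
      fillNA <- TRUE
    }
  }
  x
}

# Apply column-wise and print the result
res <- apply(m4, 2, f)
res
```

This should reproduce the expected output shown in the question, with the NA block starting at the first value <= 5 in each column.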
CodePudding user response:
We can do this in base R. Let's assume that your matrix is called m. The function below does the following:
- Check each element to see if it is <= 5, producing TRUE/FALSE values.
- Cumulatively sum the TRUE/FALSE values.
- Replace any non-zero cumulative values with NA.
- Use apply to perform this operation per column of the matrix.
This fits on one line (the \(x) shorthand for function(x) needs R 4.1 or later):
m2 <- apply(m, 2, \(x) ifelse(cumsum(x <= 5), NA, x))
     [,1] [,2] [,3] [,4]
[1,]   10   11   12   13
[2,]    9   10   15   NA
[3,]   NA    7   NA   NA
[4,]   NA   NA   NA   NA
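To see why the cumulative sum does the job, here is the same logic unrolled on a single column. This is only a sketch: x is column 4 of the example matrix, and below, hits and out are names picked for illustration:

```r
x <- c(13, 4, 10, 2)             # column 4 of the question's matrix

below <- x <= 5                  # FALSE  TRUE FALSE  TRUE
hits  <- cumsum(below)           # 0 1 1 2 (stays above 0 after the first TRUE)
out   <- ifelse(hits > 0, NA, x) # keep x only while no value <= 5 has been seen

out                              # 13 NA NA NA
```

The 10 in the third position is dropped because the running count is already 1 by then, which is exactly the "everything after the first low value" rule.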
CodePudding user response:
# Set the seed for reproducibility
set.seed(123)
# Create a random 100 x 100 matrix with values between 0 and 100
m <- matrix(runif(10000, min = 0, max = 100), nrow = 100)
# In each column, replace the first value <= 5 and everything below it with NA
m[apply(m <= 5, 2, cumsum) > 0] <- NA
# View the modified matrix
m
This code needs no extra packages. It first sets the seed for reproducibility, so that the same random matrix is generated every time the code is run. Next, it creates a random matrix with 100 rows and 100 columns using the runif function, with min = 0 and max = 100 so that only some values fall at or below 5 (with the default range of 0 to 1, every value would be <= 5). Finally, it uses the apply function to take a cumulative sum of the m <= 5 test down each column. Any position where that sum is above zero sits at or below the first value <= 5 in its column, and is replaced with NA. The matrix is named m rather than matrix to avoid masking the base function of the same name.
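Whichever approach you use, it may be worth checking the result on a 100 x 100 matrix before trusting it. The sketch below assumes random data in place of the real population matrix; na_only_at_bottom is a hypothetical helper name, and the seed is arbitrary:

```r
set.seed(1)  # arbitrary seed, only so the check is repeatable

# Stand-in data: 100 x 100 values between 0 and 100
m <- matrix(runif(10000, min = 0, max = 100), nrow = 100)

# Column-wise: NA from the first value <= 5 downwards
res <- apply(m, 2, function(x) ifelse(cumsum(x <= 5) > 0, NA, x))

# TRUE if, in every column, the NAs form a single block at the bottom
na_only_at_bottom <- function(mat) {
  all(apply(is.na(mat), 2, function(col) all(diff(as.integer(col)) >= 0)))
}

na_only_at_bottom(res)       # TRUE when no value survives below an NA
all(res > 5, na.rm = TRUE)   # TRUE when every surviving value is above 5
```

If either check returns FALSE, the replacement is probably being applied per row, or to the whole matrix, instead of down each column.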