I have a data frame with integers (strictly speaking a matrix, since it is built with cbind), like so:
# generate data frame
df = cbind(c(0,102,0,40,0,0), c(22,0,0,0,12,4), c(23,101,55,0,0,0),
c(0,0,0,414,0,0), c(0,0,61,0,0,112), c(0,0,0,0,20,0))
colnames(df) = c('A', 'T', 'C', 'G', 'N', 'Del')
rownames(df) = c('Pos1', 'Pos2', 'Pos3', 'Pos4', 'Pos5', 'Pos6')
df
       A  T   C   G   N Del
Pos1   0 22  23   0   0   0
Pos2 102  0 101   0   0   0
Pos3   0  0  55   0  61   0
Pos4  40  0   0 414   0   0
Pos5   0 12   0   0   0  20
Pos6   0  4   0   0 112   0
I also have a vector with integers (which correspond to column indices of df):
# generate vector
cols = c(2,3,5,4,6,5)
Now, for each row of df, I want to set to zero the value in the column whose index is given by the corresponding element of cols. For example, in the first row I want to reset column 2 to zero, in the second row column 3, and so on.
I solved this with the following piece of code:
for (i in seq_len(nrow(df))) {
  j = cols[[i]]   # column to reset in row i
  df[i, j] = 0
}
df
       A  T  C G N Del
Pos1   0  0 23 0 0   0
Pos2 102  0  0 0 0   0
Pos3   0  0 55 0 0   0
Pos4  40  0  0 0 0   0
Pos5   0 12  0 0 0   0
Pos6   0  4  0 0 0   0
As you can see, my code behaves as intended. However, it turns out to be very inefficient on large datasets. I therefore wondered whether there is an alternative that will be considerably faster than using a for-loop.
Note that it may look as if I am resetting the maximum value in each row, but that is not the case: in some rows it is the smaller of the two values that gets reset. So I cannot simply reset the min or max of each row to zero.
CodePudding user response:
You can use cbind to create a matrix of row and column positions and replace those with 0 as follows.
rows <- seq_len(nrow(df))
df[cbind(rows, cols)] <- 0
Result
df
# A T C G N Del
#Pos1 0 0 23 0 0 0
#Pos2 102 0 0 0 0 0
#Pos3 0 0 55 0 0 0
#Pos4 40 0 0 0 0 0
#Pos5 0 12 0 0 0 0
#Pos6 0 4 0 0 0 0
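To see why this works: when a matrix is indexed by a two-column numeric matrix, R treats each row of the index as one (row, column) pair, so exactly one cell per pair is selected. Here is a minimal sketch with the data from the question; idx and picked are just illustrative names, not part of the code above.

```r
# Rebuild the example data from the question
df <- cbind(c(0,102,0,40,0,0), c(22,0,0,0,12,4), c(23,101,55,0,0,0),
            c(0,0,0,414,0,0), c(0,0,61,0,0,112), c(0,0,0,0,20,0))
colnames(df) <- c('A', 'T', 'C', 'G', 'N', 'Del')
rownames(df) <- c('Pos1', 'Pos2', 'Pos3', 'Pos4', 'Pos5', 'Pos6')
cols <- c(2, 3, 5, 4, 6, 5)

# One (row, column) pair per row of df
idx <- cbind(seq_len(nrow(df)), cols)

# Reading with the index matrix returns the cells that are about to be reset
picked <- df[idx]
picked
# 22 101 61 414 20 112

# Assigning through the same index overwrites only those cells
df[idx] <- 0
```

Looking at picked before assigning is a cheap way to check that cols points at the cells you expect.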
CodePudding user response:
One solution involving dplyr could be:
library(dplyr)

df <- as.data.frame(df)
# Note: cur_data() is deprecated as of dplyr 1.1.0 (pick() is its replacement)
df %>%
  mutate(across(everything(),
                ~ replace(., cols == match(cur_column(), names(cur_data())), 0)))
A T C G N Del
Pos1 0 0 23 0 0 0
Pos2 102 0 0 0 0 0
Pos3 0 0 55 0 0 0
Pos4 40 0 0 0 0 0
Pos5 0 12 0 0 0 0
Pos6 0 4 0 0 0 0
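The across() call visits one column at a time and zeroes the rows whose entry in cols points at that column. Roughly the same logic can be written in base R. This is only a sketch of the idea, not the dplyr internals, and out and j are just illustrative names.

```r
# Rebuild the example data from the question
df <- cbind(c(0,102,0,40,0,0), c(22,0,0,0,12,4), c(23,101,55,0,0,0),
            c(0,0,0,414,0,0), c(0,0,61,0,0,112), c(0,0,0,0,20,0))
colnames(df) <- c('A', 'T', 'C', 'G', 'N', 'Del')
rownames(df) <- c('Pos1', 'Pos2', 'Pos3', 'Pos4', 'Pos5', 'Pos6')
cols <- c(2, 3, 5, 4, 6, 5)

out <- df
for (j in seq_len(ncol(out))) {
  # cols == j is TRUE for the rows whose target column is j
  out[cols == j, j] <- 0
}
out
```

The loop runs once per column rather than once per row, so I would expect it to matter less on data with many rows and few columns, though I have not timed it.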