Match Column to Column Names, add value to row/column of Matches
My first question, so be sure to teach me a lesson
Given Data Frame
df<- structure(list(ID = c("ID001", "ID001", "ID003", "ID004", "ID003",
"ID004"), ID001 = c(1L, 0L, 1L, 0L, 1L, 1L), ID002 = c(0L,
0L, 0L, 0L, 0L, 0L), ID003 = c(1L, 0L, 1L, 1L, 0L, 0L), ID004 = c(1L,
1L, 0L, 0L, 1L, 1L)), class = "data.frame", row.names = c(NA, -6L))
ID ID001 ID002 ID003 ID004
1 ID001 1 0 1 1
2 ID001 0 0 0 1
3 ID003 1 0 1 0
4 ID004 0 0 1 0
5 ID003 1 0 0 1
6 ID004 1 0 0 1
I have an inefficient for loop that, for each row, finds the column whose name matches the row's 'ID' value and adds 1 to that entry:
for(rows in 1:nrow(df)) {
  df[rows, match(df[rows,'ID'], names(df))] <- df[rows, match(df[rows,'ID'], names(df))] + 1
}
df
ID ID001 ID002 ID003 ID004
1 ID001 2 0 1 1
2 ID001 1 0 0 1
3 ID003 1 0 2 0
4 ID004 0 0 1 1
5 ID003 1 0 1 1
6 ID004 1 0 0 2
This is the desired output, but I need to run this on millions of rows and it's slow. I'm guessing this can be improved in more than one way, maybe with apply or similar, but I haven't attempted that and am hoping to see how it's done.
CodePudding user response:
Here is a dplyr way:
library(dplyr)
df %>%
  mutate(across(-ID, ~ifelse(ID == cur_column(), . + 1, .)))
ID ID001 ID002 ID003 ID004
1 ID001 2 0 1 1
2 ID001 1 0 0 1
3 ID003 1 0 2 0
4 ID004 0 0 1 1
5 ID003 1 0 1 1
6 ID004 1 0 0 2
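To see what the per-column comparison amounts to, the same idea can be sketched in base R as a logical mask added to the value columns. This is only an illustration under assumptions: the toy data and the names `value_cols` and `hit` are made up here, and the mask has one cell per row and column, so memory grows with both.

```r
# Toy data in the same layout as the question (values are illustrative only).
df <- data.frame(
  ID    = c("ID001", "ID003", "ID004"),
  ID001 = c(1L, 1L, 0L),
  ID003 = c(1L, 1L, 1L),
  ID004 = c(1L, 0L, 0L)
)

value_cols <- names(df)[-1]            # every column except ID

# hit[i, j] is TRUE when row i's ID equals the name of value column j.
hit <- outer(df$ID, value_cols, "==")

# Adding a logical matrix adds 1 where TRUE and 0 where FALSE.
df[value_cols] <- as.matrix(df[value_cols]) + hit
```

Each row gains 1 only in the column named by its own ID; all other cells are unchanged.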
CodePudding user response:
Create a matrix of row/column position indices and then do the assignment:
i1 <- cbind(seq_len(nrow(df)), match(df$ID, names(df)[-1]))
df[-1][i1] <- df[-1][i1] + 1
-output
> df
ID ID001 ID002 ID003 ID004
1 ID001 2 0 1 1
2 ID001 1 0 0 1
3 ID003 1 0 2 0
4 ID004 0 0 1 1
5 ID003 1 0 1 1
6 ID004 1 0 0 2
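A step-by-step sketch of the same index-matrix technique may help show why it avoids the loop: a two-column matrix used as a subscript selects one cell per row of that matrix. The toy data, the names `col_idx`, `keep`, `idx` and `m`, and the guard for IDs that match no column (the hypothetical "ID999" row) are assumptions added for illustration; the question's data has no unmatched IDs.

```r
# Toy data; the last ID deliberately matches no column.
df <- data.frame(
  ID    = c("ID001", "ID003", "ID004", "ID999"),
  ID001 = c(1L, 1L, 0L, 0L),
  ID003 = c(1L, 1L, 1L, 0L),
  ID004 = c(1L, 0L, 0L, 1L)
)

value_cols <- names(df)[-1]            # candidate column names, excluding ID
col_idx <- match(df$ID, value_cols)    # column position per row, NA if no match
keep <- !is.na(col_idx)                # rows whose ID names an existing column

# Two-column index matrix: each row is one (row, column) cell to update.
idx <- cbind(which(keep), col_idx[keep])

m <- as.matrix(df[value_cols])         # integer matrix of the value columns
m[idx] <- m[idx] + 1L                  # add 1 to exactly those cells
df[value_cols] <- m                    # write the updated values back
```

Working on a matrix copy keeps the update to a single vectorised assignment; the row with the unmatched ID is simply left as it was.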