I have a dataframe with the following fields
class bed_1 chair_1 bottle_1 table_1 ...
bed TRUE FALSE FALSE FALSE ...
chair FALSE TRUE FALSE FALSE ...
sofa FALSE FALSE TRUE FALSE ...
table FALSE FALSE FALSE FALSE ...
I want to compare class column with all other columns. The following is the expected output
class new_col
bed bed_1
chair chair_1
bottle bottle_1
table
So, essentially, I need to pickup column name with TRUE value for specific class.
The solution i tried takes long time due to large number of records, I am looking for an efficient way of doing this. Here is my solution.
new_df <- data.frame(class = df$class, new_col = NA)
for (row_n in 1:length(df$class) ){
indx <- which (df[row_n, ] == 'TRUE')
new_df$new_col[row_n] <- ifelse (length(indx) > 0, colnames(df)[idx], '')
}
CodePudding user response:
Get the data in long format and filter
the TRUE
values.
library(dplyr)
library(tidyr)
df %>%
pivot_longer(cols = -class) %>%
filter(value) %>%
select(-value)
# class name
# <chr> <chr>
#1 bed bed_1
#2 chair chair_1
#3 sofa bottle_1
If you have one TRUE
value in each row you can use max.col
.
new_df <- cbind(df[1], new_col = names(df[-1])[max.col(df[-1])])
To extract only one TRUE
value you may do -
df %>%
pivot_longer(cols = -class) %>%
group_by(class) %>%
slice(match(TRUE, value)) %>%
select(-value)
CodePudding user response:
Here is a base R way that returns the empty string ""
in case all values in a row are FALSE
.
y <- apply(df1[-1], 1, \(i, x = names(df1[-1])) {
y <- x[i]
if(length(y)) y[1] else ""
})
df2 <- data.frame(class = df1$class, new_col = y)
df2
# class new_col
#1 bed bed_1
#2 chair chair_1
#3 sofa bottle_1
#4 table
Data
df1 <-
structure(list(class = c("bed", "chair", "sofa", "table"),
bed_1 = c(TRUE, FALSE, FALSE, FALSE),
chair_1 = c(FALSE, TRUE, FALSE, FALSE),
bottle_1 = c(FALSE, FALSE, TRUE, FALSE),
table_1 = c(FALSE, FALSE, FALSE, FALSE)),
class = "data.frame", row.names = c(NA, -4L))