I am trying to do something quite odd. I would like to somehow determine if the row of a separate data.frame is mostly negative or positive based on row numbers in a column of a different data.frame... I have included my example data frames (data.frame1 and data.frame2) and the desired output
>data.frame1
col_a col_b col_c
1 1 1 1
2 1 -1 -1
3 1 1 1
4 -1 -1 -1
5 1 1 -1
>data.frame2
col_a
1 2
2 3
3 5
Example output/result
col_a col_b
1 2 negative
2 3 positive
3 5 positive
Another example output could be
Example 2 output/result
col_a col_b
1 2 -1
2 3 1
3 5 1
CodePudding user response:
Subset first dataframe rows based on second dataframe, then check the sign, then get the sum for the rows, then get sign again:
cbind(data.frame2,
result = sign(rowSums(sign(data.frame1[ data.frame2$col_a, ]))))
# col_a result
# 1 2 -1
# 2 3 1
# 3 5 1
Note: if the first dataframe has only values -1 and 1, then we can drop the inner sign step:
sign(rowSums(data.frame1[ data.frame2$col_a, ]))
CodePudding user response:
This is my solution using rowMeans
instead of RowSums
.
#making the example
df1 <- data.frame("a" = c(1,1,1,-1,1),
"b" = c(1,-1,1,-1,1),
"c"= c(1,-1,1,-1,-1))
df2 <- data.frame("col_a" = c(2,3,5))
# Check if all indices of rows in d2 are in df1
stopifnot(all(df2$col_a %in% c(1:nrow(df1))))
# no second sign to get a logical value
df2$col_b <- rowMeans(sign(df1[df2$col_a,])) >= 0
# make to factor
df2$col_b <- ifelse(df2$col_b,"positive","negative")