How do I go from the first table to the second?
I do have vectors that I'm adhering to:
high_vector <- c("740", "742", "744")
all_vector <- c("736", "738")
- Notice how 'high_vector' has an input, 744, that I don't use.
If this helps, I have some code from an earlier project in which I gather all inputs of a "Yes" within select variables. It differs from this question since here I'm trying to add up their occurrences:
PurposeCols <- c("NEW_CAR", "USED_CAR", "FURNITURE", "RADIO/TV", "EDUCATION", "RETRAINING")
CD$PURPOSE <- PurposeCols[apply(CD[PurposeCols],1, function(x) match("Yes", x))] %>%
replace_na("OTHER") %>% str_to_title() %>% as.factor()
In summary, I want one column that counts the occurrences of any value from either of my vectors, and a separate column that counts only the occurrences of values from 'high_vector'.
I'm performing this on a much, much larger dataset, but I plan on using group_by.
Thank You.
Data
foo <- data.frame(
ID = c("one", "one", "one", "one", "two", "two"),
first = c("736", "738", "997","200", "408", "675"),
second = c("800", "842", "740", "301", "742", "682"),
third = c("980", NA, NA, "742", "975", "738")
)
bar <- data.frame(
all = c(4,2),
high = c(2,1)
)
rownames(bar) <- c("one", "two")
CodePudding user response:
Reshape the data to long format with pivot_longer, group by 'ID', and then use summarise to count the elements of the value column that appear in the combined vector of 'high_vector' and 'all_vector'. The count is the sum of the logical vector returned by %in%. The same sum over 'high_vector' alone gives the second column.
library(dplyr)
library(tidyr)
library(tibble)
foo %>%
pivot_longer(cols = -ID) %>%
group_by(ID) %>%
summarise( all = sum(value %in% c(high_vector, all_vector)),
high = sum(value %in% high_vector)) %>%
column_to_rownames('ID')
-output
    all high
one   4    2
two   2    1
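To see why summing the result of %in% gives the counts, here is a minimal sketch that repeats the step by hand on the values for ID "one" from foo. The names vals_one, in_any, in_high, all_count and high_count are introduced here only for illustration and are not part of the original answer. Note that the NA entries in the third column are simply not counted, because neither lookup vector contains NA.

```r
high_vector <- c("740", "742", "744")
all_vector  <- c("736", "738")

# Values of first, second and third for ID "one" in foo
# (vals_one is a hypothetical helper name, typed out by hand)
vals_one <- c("736", "738", "997", "200",
              "800", "842", "740", "301",
              "980", NA, NA, "742")

# %in% returns TRUE/FALSE for each element; NA yields FALSE here
# because neither lookup vector contains NA
in_any  <- vals_one %in% c(high_vector, all_vector)
in_high <- vals_one %in% high_vector

# sum() on a logical vector counts the TRUE values
all_count  <- sum(in_any)   # 4: "736", "738", "740", "742"
high_count <- sum(in_high)  # 2: "740", "742"
```

These match the first row of bar (all = 4, high = 2), which is what summarise computes per group after pivot_longer.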