Say I have a dataframe-
Column A | Column B |
---|---|
id1 | blue |
id1 | red |
id1 | grey |
id2 | red |
id3 | red |
id3 | grey |
I would like this output-
Column A | Column B |
---|---|
id1 | all.mixed |
id2 | red |
id3 | red.grey |
I tried this-
table1 <- mydf %>% group_by(ColA, ColB) %>% count(ColB)
and came to this-
ColA ColB n
<chr> <chr> <int>
1 id1 blue 1
2 id1 red 1
3 id1 grey 1
4 id2 red 1
5 id3 red 1
6 id3 grey1 1
But I am kind of lost after this. I thought of group_by and summing up the rows in the colB, but then if I have a situation such that-
Column A | Column B |
---|---|
id5 | grey |
id5 | grey |
Then what do i do?
CodePudding user response:
It's not clear how you are determining the values for column B in the output (e.g. "all.mixed"), but let's assume there is a calcMagicValue
function. The general flow would be something like this:
calcMagicValue <- function(vals){
# psuedo code
if(all(possibleVals %in% vals)){
"all.mixed"
} else if( hasRedAndGrey){
"red.grey"
} else if(onlyRed) {
"red"
} else {
#...other conditions...
}
}
myDf |>
group_by(ColA) |>
summarize(newColB = calcMagicValue(ColB))
After grouping by, summarize will iterate through the unique values in the grouped columns, and pass the matching rows the functions called (in this case calcMagicValue
). In this case, your result will then have one row for each unique combination/values in the grouped columns(s).
CodePudding user response:
A slightly different approach using group_by
summarise
which uses an indicator to replace the all.mixed
cases in an additional mutate
step:
library(dplyr)
dat %>%
group_by(Column.A) |>
summarise(all.mixed = all(unique(dat$Column.B) %in% Column.B),
Column.B = paste(Column.B, collapse = "."), .groups = "drop") |>
mutate(Column.B = if_else(all.mixed, "all.mixed", Column.B)) |>
select(-all.mixed)
#> # A tibble: 3 × 2
#> Column.A Column.B
#> <chr> <chr>
#> 1 id1 all.mixed
#> 2 id2 red
#> 3 id3 red.grey
DATA
dat <- data.frame(
Column.A = c("id1", "id1", "id1", "id2", "id3", "id3"),
Column.B = c("blue", "red", "grey", "red", "red", "grey")
)