I have a dataframe with a column for a name and a column for an event. Some people attended multiple events, so a person can appear on several rows, and an event can appear on several rows too, once for each person who attended it. Here is an example, df1:
library(tibble)
df1 <- tribble(
  ~name, ~event,
  "jon", "beach",
  "jon", "party",
  "mark", "beach",
  "sam", "concert")
I would like to make a new dataframe where each event becomes a column holding a 1 if the person attended and a 0 if they did not, so that there is only one row per person. Something that looks like this:
df2 <- tribble(
  ~name, ~beach, ~party, ~concert,
  "jon", 1, 1, 0,
  "mark", 1, 0, 0,
  "sam", 0, 0, 1)
I tried using mutate() to create the new columns, with an ifelse() to pick out the rows that match both the person and the event (example below), but I'm not sure I'm going about it the right way!
df2 <- df %>% mutate(`beach` = ifelse(group_by(name) %>% filter(event=="beach" ,1,0))
CodePudding user response:
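You can use pivot_wider() from tidyr in your case: add a column of 1s, spread the events into columns, and fill the missing name/event combinations with 0.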
library(tidyverse)
df1 %>%
  mutate(value = 1) %>%
  pivot_wider(names_from = event,
              values_from = value,
              values_fill = 0)
# A tibble: 3 × 4
  name  beach party concert
  <chr> <dbl> <dbl>   <dbl>
1 jon       1     1       0
2 mark      1     0       0
3 sam       0     0       1
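One caveat, in case the real data has the same name/event pair on more than one row (the example above does not): pivot_wider() would then warn that the values are not uniquely identified and return list-columns. A minimal sketch, assuming a made-up df_dup with one repeated row, that drops the duplicates with distinct() before pivoting:

```r
library(dplyr)
library(tidyr)
library(tibble)

# Hypothetical data: "jon"/"beach" appears twice (not in the original example)
df_dup <- tribble(
  ~name, ~event,
  "jon", "beach",
  "jon", "beach",
  "jon", "party",
  "mark", "beach",
  "sam", "concert")

wide <- df_dup %>%
  distinct(name, event) %>%   # keep one row per name/event pair
  mutate(value = 1) %>%       # attendance flag
  pivot_wider(names_from = event,
              values_from = value,
              values_fill = 0) # absent combinations become 0

wide
```

If counts per event are wanted instead of 0/1 flags, the distinct() step can be left out and an aggregation supplied through the values_fn argument of pivot_wider().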
CodePudding user response:
Another option is using dcast from the reshape2 package, or table, like this:
library(tibble)
df1 <- tribble(
  ~name, ~event,
  "jon", "beach",
  "jon", "party",
  "mark", "beach",
  "sam", "concert")
library(reshape2)
dcast(df1, name~event, fun.aggregate=length)
#> Using event as value column: use value.var to override.
#>   name beach concert party
#> 1  jon     1       0     1
#> 2 mark     1       0     0
#> 3  sam     0       1     0
with(df1, table(name, event))
#>       event
#> name   beach concert party
#>   jon      1       0     1
#>   mark     1       0     0
#>   sam      0       1     0
Created on 2022-07-16 by the reprex package (v2.0.1)
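Note that both dcast with fun.aggregate=length and table return counts, so a repeated name/event row would show up as 2 rather than 1, and table returns a table object rather than a data frame. A rough base R sketch of one way to get a plain data frame of 0/1 flags; df_dup is a hypothetical variant of df1 with one repeated row, used only to show the capping:

```r
# Hypothetical data: "jon"/"beach" appears twice (not in the original example)
df_dup <- data.frame(
  name  = c("jon", "jon", "jon", "mark", "sam"),
  event = c("beach", "beach", "party", "beach", "concert"))

# Cross-tabulate, then turn the table into a regular data frame of counts
counts <- as.data.frame.matrix(table(df_dup$name, df_dup$event))

# Cap every count at 1 so the columns are attendance flags
counts[] <- lapply(counts, function(x) as.integer(x > 0))

# Move the row names into a proper name column
flags <- data.frame(name = rownames(counts), counts, row.names = NULL)
flags
```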