I want to change the format of my data frame. It is currently in long format, but I want to convert it to wide format so that each sample
has its own column indicating whether the virus is present or absent, based on the information in cond.
Present should be coded as 1 and absent as 0.
In:
virus sample cond
1 virusA A Present
2 virusB A Present
3 virusC A Absent
4 virusA B Absent
5 virusB B Present
6 virusC B Present
df <- structure(list(virus = c("virusA", "virusB", "virusC", "virusA",
"virusB", "virusC"), sample = c("A", "A", "A", "B", "B", "B"),
cond = c("Present", "Present", "Absent", "Absent", "Present",
"Present")), class = "data.frame", row.names = c(NA, -6L))
Out:
> df.out
virus A B
1 virusA 1 0
2 virusB 1 1
3 virusC 0 1
Answer:
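Use pivot_wider with values_fn. The function passed to values_fn counts the "Present" entries for each virus/sample pair, which gives 1 or 0 here because each pair occurs exactly once.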
library(tidyr)
pivot_wider(df, names_from = sample, values_from = cond,
values_fn = list(cond = ~ sum(. == 'Present')))
Output:
# A tibble: 3 × 3
virus A B
<chr> <int> <int>
1 virusA 1 0
2 virusB 1 1
3 virusC 0 1
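
If you would rather not depend on tidyr, the same reshaping can probably be done in base R. The following is a minimal sketch, not the method from the answer above: it first recodes cond to 0/1 and then calls reshape(). The helper column name present is an assumption chosen only for illustration.

```r
# Example data from the question
df <- data.frame(
  virus  = c("virusA", "virusB", "virusC", "virusA", "virusB", "virusC"),
  sample = c("A", "A", "A", "B", "B", "B"),
  cond   = c("Present", "Present", "Absent", "Absent", "Present", "Present")
)

# Recode cond as 1 (Present) / 0 (anything else); 'present' is a hypothetical helper column
df$present <- as.integer(df$cond == "Present")

# Long -> wide: one row per virus, one column per sample
df.out <- reshape(df[, c("virus", "sample", "present")],
                  idvar = "virus", timevar = "sample", direction = "wide")

# reshape() names the new columns "present.A", "present.B"; strip the prefix
names(df.out) <- sub("^present\\.", "", names(df.out))
rownames(df.out) <- NULL

df.out
```

The result should have the same layout as df.out in the question (columns virus, A, B). Note that this sketch treats every value other than "Present" as 0, so missing or misspelled entries in cond would silently become 0.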