I have a dataset with a set of values dispersed over multiple columns:
ID | Priority 1 | Priority 2 | Priority 3 |
---|---|---|---|
2308 | Writing | Reading | Reading |
0329 | Reading | Communication | Writing |
2389 | Communication | Writing | Communication |
2934 | Writing | Communication | Writing |
And I would like the output to be a table where the first column are the unique values found (Writing, Reading, Communication) and the rest of the columns are the priorities (Priority 1, Priority 2, Priority 3). In each column should be the count of the priority witin that instance. The output should look like:
Priority Type | Priority 1 | Priority 2 | Priority 3 |
---|---|---|---|
Writing | 2 | 1 | 2 |
Reading | 1 | 1 | 1 |
Communication | 1 | 2 | 1 |
In my actual dataset, there are many priorities so if possible could we include 1:n for columns?
Thank you in advance.
*Edit Table has been updated with new column for clarification. I would like to ignore the ID column completely and only have a count of the priorities in each of the priorities columns.
CodePudding user response:
table(stack(df))
ind
values Priority 1 Priority 2 Priority 3
Communication 1 2 1
Reading 1 1 1
Writing 2 1 2
If you want it as a dataframe:
as.data.frame.matrix(table(stack(df)))
Priority 1 Priority 2 Priority 3
Communication 1 2 1
Reading 1 1 1
Writing 2 1 2
CodePudding user response:
Try sapply
sapply( dat, table )
Priority1 Priority2 Priority3
Communication 1 2 1
Reading 1 1 1
Writing 2 1 2
CodePudding user response:
Here is a tidyverse
solution using values_fn = max
argument from pivot_wider
function of tidyr
package:
library(dplyr)
library(tidyr)
df %>%
pivot_longer(
cols= everything()
) %>%
group_by(name) %>%
add_count(value) %>%
pivot_wider(
names_from = name,
values_from =n,
values_fn = max
)
value Priority1 Priority2 Priority3
<chr> <int> <int> <int>
1 Writing 2 1 2
2 Reading 1 1 1
3 Communication 1 2 1