Let's say I have the following dataframe, where I have 3 columns (color (Yellow, Green or Red), label (A, B or C) and val(a numeric value)).
For each colour, there is 100 entries. This results in a 3 x 900 dataframe.
col | label | val |
---|---|---|
yellow1 | A | 23 |
yellow1 | B | 2 |
yellow1 | C | 12 |
green1 | A | 42 |
green1 | B | 55 |
green1 | C | 21 |
yellow2 | A | 33 |
yellow2 | B | 56 |
yellow2 | C | 45 |
How can I change this dataframe so there is a seperate column for each colour and label. For example...
Yellow.A | Yellow.B | Yellow.C | Green.A | Green.B | Green.C | Red.A |
---|---|---|---|---|---|---|
23 | 2 | 12 | 42 | 55 | 21 | 64 |
33 | 56 | 45 | 24 | 52 | 42 | 24 |
23 | 2 | 232 | 53 | 32 | 24 | 112 |
Note: I do not want separate columns for Yellow1.A, Yellow2.A...etc. All 'A's for 'Yellow' should be within a single column.
I use dyplr/tidyverse as well as base.
CodePudding user response:
base R
dat$col <- sub("[0-9] $", "", dat$col)
tmp <- lapply(split(dat, dat[c("col", "label")]), `[[`, "val")
tmp <- lapply(tmp, `length<-`, max(lengths(tmp)))
as.data.frame(tmp)
# green.A yellow.A green.B yellow.B green.C yellow.C
# 1 42 23 55 2 21 12
# 2 NA 33 NA 56 NA 45
Data
dat <- structure(list(col = c("yellow1", "yellow1", "yellow1", "green1", "green1", "green1", "yellow2", "yellow2", "yellow2"), label = c("A", "B", "C", "A", "B", "C", "A", "B", "C"), val = c(23L, 2L, 12L, 42L, 55L, 21L, 33L, 56L, 45L)), class = "data.frame", row.names = c(NA, -9L))