Home > Blockchain >  Partial group dataframe by two conditions
Partial group dataframe by two conditions

Time:05-19

Let's say I have the following dataframe, where I have 3 columns (color (Yellow, Green or Red), label (A, B or C) and val(a numeric value)).

For each colour, there is 100 entries. This results in a 3 x 900 dataframe.

col label val
yellow1 A 23
yellow1 B 2
yellow1 C 12
green1 A 42
green1 B 55
green1 C 21
yellow2 A 33
yellow2 B 56
yellow2 C 45

How can I change this dataframe so there is a seperate column for each colour and label. For example...

Yellow.A Yellow.B Yellow.C Green.A Green.B Green.C Red.A
23 2 12 42 55 21 64
33 56 45 24 52 42 24
23 2 232 53 32 24 112

Note: I do not want separate columns for Yellow1.A, Yellow2.A...etc. All 'A's for 'Yellow' should be within a single column.

I use dyplr/tidyverse as well as base.

CodePudding user response:

base R

dat$col <- sub("[0-9] $", "", dat$col)
tmp <- lapply(split(dat, dat[c("col", "label")]), `[[`, "val")
tmp <- lapply(tmp, `length<-`, max(lengths(tmp)))
as.data.frame(tmp)
#   green.A yellow.A green.B yellow.B green.C yellow.C
# 1      42       23      55        2      21       12
# 2      NA       33      NA       56      NA       45

Data

dat <- structure(list(col = c("yellow1", "yellow1", "yellow1", "green1", "green1", "green1", "yellow2", "yellow2", "yellow2"), label = c("A", "B", "C", "A", "B", "C", "A", "B", "C"), val = c(23L, 2L, 12L, 42L, 55L, 21L, 33L, 56L, 45L)), class = "data.frame", row.names = c(NA, -9L))
  • Related