Let's say I have a given dataframe df, that looks like this:
df
Group <- c("A","B","C","D","E","B","D","E")
Value <- c(2,3,2,2,1,5,4,4)
df <- data.frame(Group, Value)
df
Group Value
1 A 2
2 B 3
3 C 2
4 D 2
5 E 1
6 B 5
7 D 4
8 E 4
I want to look for duplicates in Group
and then put the two Values
that go with that Group
in seperate columns.
I also want to remove any row or Group
with only one Value
(per group).
So the resulting dataframe in this case would look like this:
new_df
Group Value1 Value2
1 B 3 5
2 D 2 4
3 E 1 4
I would be very happy, if someone could help me with this!
Thank you very much in advance!
CodePudding user response:
The basic reshaping component of this is well-informed by Reshape multiple value columns to wide format, though for those steps to work we need to bring in a "counter" column to filter and expand on.
dplyr and tidyr
library(dplyr)
library(tidyr) # pivot_wider
df %>%
group_by(Group) %>%
mutate(rn = paste0("Value", row_number())) %>%
filter(n() > 1) %>%
ungroup() %>%
pivot_wider(Group, names_from = rn, values_from = Value)
# # A tibble: 3 x 3
# Group Value1 Value2
# <chr> <dbl> <dbl>
# 1 B 3 5
# 2 D 2 4
# 3 E 1 4
base R and reshape2
df$rn <- paste0("Value", ave(seq_len(nrow(df)), df$Group, FUN = seq_along))
subset(df, ave(df$Group, df$Group, FUN = length) > 1) |>
reshape2::dcast(Group ~ rn, value.var = "Value")
# Group Value1 Value2
# 1 B 3 5
# 2 D 2 4
# 3 E 1 4
CodePudding user response:
Another possible solution, based on tidyverse
:
library(tidyverse)
df %>%
group_by(Group) %>%
filter(n() != 1) %>%
group_split() %>%
map_dfr(~ pivot_wider(mutate(.x, name = str_c("Value", 1:n())),
Group, values_from = "Value"))
#> # A tibble: 3 × 3
#> Group Value1 Value2
#> <chr> <dbl> <dbl>
#> 1 B 3 5
#> 2 D 2 4
#> 3 E 1 4