So I have a df containing codes and names associated with these codes. Basically, the codes are unique but may have multiple names. So, if you want to left_join()
with another table without creating duplicate rows, you need to reorganize the table. I want to do it so that it is grouped by codes and the columns are "name1", "name2" etc.
input df
df<-data.frame(code=c(1,1,2),
name=c("a", "b", "c"))
result df
resdf<-data.frame(code=c(1,2),
name1=c("a", "c"),
name2=c("b",""))
I don't care if the unique values are filled with NAs or ""s or whatever.
I can't find a good solution using aggregate etc. and am stumped.
CodePudding user response:
You can pivot_wider
after creating a variable that identifies the sequential numbers for each code.
library(tidyr)
library(dplyr)
df |>
group_by(code) |>
mutate(sequential = row_number()) |>
pivot_wider(names_prefix = "name",
names_from = "sequential",
values_from = "name")
#> # A tibble: 2 × 3
#> # Groups: code [2]
#> code name1 name2
#> <dbl> <chr> <chr>
#> 1 1 a b
#> 2 2 c <NA>
Created on 2022-07-07 by the reprex package (v2.0.1)
CodePudding user response:
Without external packages, you could use reshape()
.
reshape(transform(df, id = ave(code, code, FUN = seq_along)),
direction = "wide", idvar = "code", timevar = "id", v.names = "name")
# code name.1 name.2
# 1 1 a b
# 3 2 c <NA>
CodePudding user response:
Another way of solving your problem based on data.table package:
library(data.table)
mx = max(setDT(df)[, .N, code]$N)
df[, setNames(as.list(name)[1:mx], paste0("name", 1:mx)), code]
code name1 name2
1: 1 a b
2: 2 c <NA>