[R] I am trying to reformat my data frame (df) so that each column's name is appended to every observation in that column. For example, I have:
Soccer_Brand | Basketball_Brand |
---|---|
Adidas | Nike |
Nike | Under Armour |
and I want it to look like this:
Soccer_Brand | Basketball_Brand |
---|---|
Adidas_Soccer_Brand | Nike_Basketball_Brand |
Nike_Soccer_Brand | Under_Armour_Basketball_Brand |
I'm attempting a market basket analysis and will eventually need to remove the column names. However, unless the column names are appended to the observations themselves, I will lose the information about which sport each brand belongs to. Essentially, I won't be able to tell whether a 'Nike' entry belongs to soccer or basketball.
So far I've used Excel formulas to hack together a solution, but I want my R script to be self-contained. I haven't found any solutions for this in R.
CodePudding user response:
You can paste() a column's name onto its contents; just iterate over all the columns. Doing so with lapply() allows the one-liner below (the \(i) lambda shorthand needs R 4.1 or later; write function(i) on older versions):
df[] <- lapply(seq_along(df), \(i) paste(df[[i]], names(df)[i], sep = "_"))
resulting in
df
#>          Soccer_Brand              Basketball_Brand
#> 1 Adidas_Soccer_Brand         Nike_Basketball_Brand
#> 2   Nike_Soccer_Brand Under Armour_Basketball_Brand
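Note that this keeps the space in "Under Armour", while the desired output in the question shows Under_Armour_Basketball_Brand. If that underscore matters, one possible variation (a sketch, not the only way) is to run gsub() over each column before pasting. Map() is used here simply as an alternative to indexing with seq_along(); the argument names col and nm are just illustrative.

```r
# Data from the question
df <- data.frame(Soccer_Brand = c("Adidas", "Nike"),
                 Basketball_Brand = c("Nike", "Under Armour"))

# Map() walks the columns and their names in parallel.
# gsub() swaps every space for an underscore before the name is appended.
df[] <- Map(function(col, nm) paste(gsub(" ", "_", col, fixed = TRUE), nm, sep = "_"),
            df, names(df))

df$Basketball_Brand
```

The last line should now give "Nike_Basketball_Brand" and "Under_Armour_Basketball_Brand".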
Data from the question in reproducible format:
df <- data.frame(Soccer_Brand = c("Adidas", "Nike"),
                 Basketball_Brand = c("Nike", "Under Armour"))
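Since the question mentions dropping the column names afterwards for the market basket analysis, here is a rough sketch of that step. It assumes each row of df is one basket, which the question does not state explicitly, and the name baskets is made up for illustration.

```r
# Data from the question
df <- data.frame(Soccer_Brand = c("Adidas", "Nike"),
                 Basketball_Brand = c("Nike", "Under Armour"))

# Append each column's name to its values (same idea as the one-liner above,
# written with function(i) so it also runs on R versions before 4.1)
df[] <- lapply(seq_along(df), function(i) paste(df[[i]], names(df)[i], sep = "_"))

# Assumption: one row = one basket.
# unlist(..., use.names = FALSE) drops the column names from each row.
baskets <- lapply(seq_len(nrow(df)), function(r) unlist(df[r, ], use.names = FALSE))

baskets[[1]]
```

Each element of baskets is then a plain character vector of tagged items, so the sport is still recoverable from the suffix even though the column names are gone.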