I have a data frame with media spending for different media channels:
TV <- c(200,500,700,1000)
Display <- c(30,33,47,55)
Social <- c(20,21,22,23)
Facebook <- c(30,31,32,33)
Print <- c(50,51,52,53)
Newspaper <- c(60,61,62,63)
df_media <- data.frame(TV,Display,Social,Facebook, Print, Newspaper)
My goal is to calculate the row sums of specific columns based on their name. For example: Per definition Facebook falls into the category of Social, so I want to add the Facebook column to the Social column and just have the Social column left. The same goes for Newspaper which should be added to Print and so on.
The challenge is that the names and the number of columns that belong to one category change from data set to data set, e.g. the next data set could contain Social, Facebook and Instagram which should be all summed up to Social.
There is a list of rules, which define which media types (column names) belong to each other, but I have to admit that I'm a bit clueless and can only think about a long set of if commands right now, but I hope there is a better solution.
I'm thinking about putting all the names that belong to each other in vectors and use them to find and summarize the relevant columns, but I have no idea, how to execute this. Any help is appreciated.
CodePudding user response:
You could something along those lines, which allows columns to not be part of every data set (with intersect
and setdiff
).
Define a set of rules, i.e. those columns that are going to be united/grouped together.
Create a vector
d
of the remaining columnsCompute the
rowSums
of every subset of the data set defined in the rulesappend
the remaining columnscbind
the columns of the list usingdo.call
.
#Rules
rules = list(social = c("Social", "Facebook", "Instagram"),
printed = c("Print", "Newspaper"))
d <- setdiff(colnames(df_media), unlist(rules)) #columns that are not going to be united
#data frame
lapply(rules, function(x) rowSums(df_media[, intersect(colnames(df_media), x)])) |>
append(df_media[, d]) |>
do.call(cbind.data.frame, args = _)
social printed TV Display
1 50 110 200 30
2 52 112 500 33
3 54 114 700 47
4 56 116 1000 55