I have a data frame in which I want to apply a calculation to a varying number of specified columns and store the results in separate objects, to be combined at the end.
A minimal example would look like:
Name <- c("Case 1", "Case 2", "Case 3", "Case 4", "Case 5")
Base <- c(0, 0, 0, 1, 1)
C1 <- c(1, 0, 1, 1, 0)
C2 <- c(0, 1, 1, 1, 0)
C3 <- c(0, 1, 0, 0, 0)
C4 <- c(1, 1, 0, 1, 0)
Data <- data.frame(Name, Base, C1, C2, C3, C4)
score.calc <- function(data, col.names){
# This is how I would do it outside a function and without a loop:
Score1 <- sum(pmin(Data$C1, pmin(Data$Base)))/sum(pmin(Data$Base))
Score2 <- sum(pmin(Data$C2, pmin(Data$Base)))/sum(pmin(Data$Base))
Score3 <- sum(pmin(Data$C3, pmin(Data$Base)))/sum(pmin(Data$Base))
Scores <- c(Score1, Score2, Score3)
}
new.score <- score.calc(Data,
col.names= c("C1", "C2", "C3"))
And should return:
> new.score
[1] 0.5 0.5 0.0
Anyone with an idea? Many thanks!
CodePudding user response:
Try this:
score.calc <- function(data, col.names, base = "Base") {
  # Use the `data` argument (not the global `Data`) so the function works on any frame
  sapply(subset(data, select = col.names),
         function(z) sum(pmin(z, data[[base]])) / sum(data[[base]]))
}
score.calc(Data, c("C1", "C2", "C3"))
# C1 C2 C3
# 0.5 0.5 0.0
The changes I made:
- Changed from the hard-coded `$C1` (etc.) to a more dynamic `data[[nm]]` mindset;
- Changed the hard-coded `$Base` to one based on the arguments, with a default value so that, while you shouldn't need to change it, you can still apply the function to different data (with a different "Base" column);
- Did it dynamically with `sapply`, which will return (here) a vector the same length as `col.names` ... assuming that all columns provided are present in the data and `numeric`-like;
- Used `subset(., select=)` instead of `[`, since the latter can drop to a vector instead of a single-column frame in some circumstances (i.e., `base::[.data.frame` and a simple `data.frame`, not a `tbl_df`);
- Removed two extraneous `pmin` calls. With a single argument, `pmin` is a no-op: its job is to provide the element-wise minimum between two or more vectors, so applying it to a single vector does nothing.