I am having trouble with my function 'subtract_within'. The goal is for it to take each value in a variable and sum the differences between that value and all of the other values in the variable.
When I call it, it returns the correct value for the first element in the for loop, but not for the rest.
Here is my code:
library(dplyr) # provides %>% and mutate()
x <- c(1, 2, 3, 4, 5, 6) # the value I want to calculate with
y <- c(1, 2, 3, 4, 5, 6) # the identifier of the value
this_df <- data.frame(x, y)
this_df$z <- 0 # a new variable that is 0 for now but will be filled in later
this_df %>%
  mutate(z = subtract_within(x, y)) # this is how I will ideally call the function 'subtract_within'
# start of my function 'subtract_within'
subtract_within <- function(x, y) { # this function takes in two variables from a dataframe
  dataframe <- data.frame(x, y) # I locally make it a dataframe
  dataframe$z <- 0 # I add the variable that I will be returning
  for (i in 1:nrow(dataframe)) {
    dataframe$z[i] <- (dataframe$x[i] * (length(which(dataframe$y != dataframe$y[i])))) - sum(dataframe$x[which(dataframe$y != dataframe$y[i])])
    return(dataframe$z)
  }
  return(dataframe$z)
}
My output is as follows:
x y z
1 1 1 -15
2 2 2 0
3 3 3 0
4 4 4 0
5 5 5 0
Ideally, my output would be:
x y z
1 1 1 -15
2 2 2 -5
3 3 3 0
4 4 4 5
5 5 5 10
CodePudding user response:
The problem is the return inside the for loop: it exits the function during the first iteration. If we keep only the last return, it should work:
subtract_within <- function(x, y) { # this function takes in two variables from a dataframe
  dataframe <- data.frame(x, y) # I locally make it a dataframe
  dataframe$z <- 0 # I add the variable that I will be returning
  for (i in 1:nrow(dataframe)) {
    dataframe$z[i] <- (dataframe$x[i] * (length(which(dataframe$y != dataframe$y[i])))) - sum(dataframe$x[which(dataframe$y != dataframe$y[i])])
    # return(dataframe$z)
  }
  return(dataframe$z)
}
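To see the effect in isolation, here is a minimal sketch with two made-up helper functions (fill_with_early_return and fill_with_single_return are hypothetical names, not part of the question's code). They differ only in whether a return sits inside the loop:

```r
# Hypothetical helpers, used only to illustrate the behaviour of return().
fill_with_early_return <- function(n) {
  out <- numeric(n)            # starts as all zeros
  for (i in seq_len(n)) {
    out[i] <- i * 10
    return(out)                # exits the function during the first iteration
  }
  out
}

fill_with_single_return <- function(n) {
  out <- numeric(n)
  for (i in seq_len(n)) {
    out[i] <- i * 10
  }
  out                          # reached only after the loop has finished
}

early <- fill_with_early_return(3)   # 10 0 0
full <- fill_with_single_return(3)   # 10 20 30
```

The first version fills in only the first element and leaves the rest at their initial 0, which matches the pattern in the question's output.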
CodePudding user response:
I think the issue is that you have a return statement inside your loop:
for (i in 1:nrow(dataframe)) {
  dataframe$z[i] <- (dataframe$x[i] * (length(which(dataframe$y != dataframe$y[i])))) - sum(dataframe$x[which(dataframe$y != dataframe$y[i])])
  return(dataframe$z) # THIS RIGHT HERE
}
return(dataframe$z)
This makes the function return dataframe$z during the first iteration of the loop, when only z[1] has been computed and every other element is still 0. If you remove that inner return, I think it should work:
for (i in 1:nrow(dataframe)) {
  dataframe$z[i] <- (dataframe$x[i] * (length(which(dataframe$y != dataframe$y[i])))) - sum(dataframe$x[which(dataframe$y != dataframe$y[i])])
}
return(dataframe$z)
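As a side note, the loop may not be needed at all. The sketch below is one possible vectorized rewrite in base R, not the original poster's method; the name subtract_within_vec is made up for illustration, and it assumes x is numeric with no NA values. It relies on the fact that the sum over rows with a different y equals the overall sum minus the sum within the same y group:

```r
# Hypothetical vectorized variant of subtract_within (assumes no NA in x).
subtract_within_vec <- function(x, y) {
  n_same <- ave(x, y, FUN = length)   # number of rows sharing each row's y
  sum_same <- ave(x, y, FUN = sum)    # sum of x over rows sharing each row's y
  # x[i] * (count of rows with a different y) - (sum of x over those rows)
  x * (length(x) - n_same) - (sum(x) - sum_same)
}

x <- c(1, 2, 3, 4, 5, 6)
y <- c(1, 2, 3, 4, 5, 6)
z <- subtract_within_vec(x, y)
# z is -15 -9 -3 3 9 15 for the six-element x and y from the question
```

Because it returns a vector of the same length as x, it should also drop into mutate(z = subtract_within_vec(x, y)) the same way the loop version does.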