If I run this code with my data, I get the output I expect:
my_mean <- function(simulated_data){
  if(is.numeric(simulated_data)){
    return(sum(simulated_data)/length(simulated_data))
  }else{
    return('Non-numeric data')
  }
}
for (i in colnames(simulated_data)){
  cat(paste(colnames(simulated_data[i]), "Mean:", my_mean(simulated_data[[i]]), "\n"))
}
This is the output:
total_cost Mean: 1897.21529700626
product_line Mean: Non-numeric data
day Mean: Non-numeric data
calander_week Mean: 25.5
quantity Mean: 113.759646705788
But when I generalize the function (as I have to do for my assignment), I run the following code:
means_function <- function(input_data){
  for(i in colnames(input_data)){
    if(is.numeric(input_data)){
      cat(paste(colnames(input_data[i]), "Mean:", mean(input_data[i]),"\n"))
    }else{
      cat(paste(colnames(input_data[i]), "Mean:", 'Non-numeric data', "\n"))
    }
  }
}
means_function(simulated_data)
And then I got the following output:
total_cost Mean: Non-numeric data
product_line Mean: Non-numeric data
day Mean: Non-numeric data
calander_week Mean: Non-numeric data
quantity Mean: Non-numeric data
Can someone tell me what I'm doing wrong? I have to use a for loop, an if statement, and the mean function.
CodePudding user response:
You were nearly there. Here is the data frame I used for this example:
df <- data.frame(movieID = c("A","A","A","B","B","C","C","C"),
crewID = c("Z","Y","X","Z","V","V","X","Y"),
Rating = c(7.3,7.3,7.3,2.1,2.1,9,9,9))
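Looping over the columns is fine. The problem is that is.numeric(input_data) tests the whole data frame instead of a single column, and mean(input_data[i]) receives a one-column data frame instead of a vector. The version below loops over column positions and uses [[i]] for both the check and the mean: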
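means_function <- function(input_data){
  # seq_len() also behaves correctly when the data frame has no columns
  for(i in seq_len(ncol(input_data))){
    # [[i]] extracts the column as a vector, so is.numeric() and mean() work
    if(is.numeric(input_data[[i]])){
      cat(paste(colnames(input_data[i]), "Mean:", mean(input_data[[i]]),"\n"))
    }else{
      cat(paste(colnames(input_data[i]), "Mean:", 'Non-numeric data', "\n"))
    }
  }
}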
Call:
means_function(df)
Output:
movieID Mean: Non-numeric data
crewID Mean: Non-numeric data
Rating Mean: 6.6375
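To see why the double brackets matter, here is a small sketch comparing the two indexing forms on the example data frame above. The names one_col_df and rating_vec are only illustrative.

```r
# Example data frame from this answer
df <- data.frame(movieID = c("A","A","A","B","B","C","C","C"),
                 crewID = c("Z","Y","X","Z","V","V","X","Y"),
                 Rating = c(7.3,7.3,7.3,2.1,2.1,9,9,9))

one_col_df <- df["Rating"]    # single brackets keep the data.frame wrapper
rating_vec <- df[["Rating"]]  # double brackets extract the column itself

class(one_col_df)       # "data.frame"
class(rating_vec)       # "numeric"
is.numeric(one_col_df)  # FALSE
is.numeric(rating_vec)  # TRUE
mean(rating_vec)        # 6.6375
```

Only the vector form passes the is.numeric check and gives a usable mean.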
CodePudding user response:
A for loop runs its body once for each element it iterates over. So in the case below:
my_mean <- function(simulated_data){
  if(is.numeric(simulated_data)){
    return(sum(simulated_data)/length(simulated_data))
  }else{
    return('Non-numeric data')
  }
}
for (i in colnames(simulated_data)){
  cat(paste(colnames(simulated_data[i]), "Mean:", my_mean(simulated_data[[i]]), "\n"))
}
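the my_mean function is applied to a vector (simulated_data[[i]] returns a vector), so is.numeric tests one column at a time. In the generalized version the same check is applied to the whole data.frame, which does not do what you need.
The reason is the if statement: is.numeric(input_data) checks the entire data.frame, which is a list of columns, so it always returns FALSE and every column falls through to the else branch. Even if the check passed, mean(input_data[i]) would receive a one-column data.frame rather than a vector: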
means_function <- function(input_data){
  for(i in colnames(input_data)){
    if(is.numeric(input_data)){
      cat(paste(colnames(input_data[i]), "Mean:", mean(input_data[i]),"\n"))
    }else{
      cat(paste(colnames(input_data[i]), "Mean:", 'Non-numeric data', "\n"))
    }
  }
}
means_function(simulated_data)
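The behaviour can be checked in isolation. The sketch below uses a made-up data frame (all_numeric is hypothetical, not your data) in which every column is numeric, to show that the check on the whole data.frame still fails and that mean() on a one-column data.frame does not return a usable value.

```r
# Hypothetical data frame in which every column is numeric
all_numeric <- data.frame(x = c(1, 2, 3), y = c(10, 20, 30))

# The data.frame as a whole is a list of columns, so this is FALSE
whole_check <- is.numeric(all_numeric)

# Checking column by column gives TRUE for both x and y
per_column <- sapply(all_numeric, is.numeric)

# mean() on a one-column data.frame warns and returns NA
res <- suppressWarnings(mean(all_numeric["x"]))

whole_check  # FALSE
per_column   # x: TRUE, y: TRUE
is.na(res)   # TRUE
```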
To overcome this, you can modify means_function as below, so that both the is.numeric check and the mean are applied to a column vector and not to the data.frame:
means_function <- function(input_data){
  for(i in colnames(input_data)){
    if(is.numeric(input_data[, i])){
      cat(paste(colnames(input_data[i]), "Mean:", mean(input_data[, i]),"\n"))
    }else{
      cat(paste(colnames(input_data[i]), "Mean:", 'Non-numeric data', "\n"))
    }
  }
}
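As a variation, here is a minimal sketch of the same function written with [[i]], which returns the column vector without relying on input_data[, i] dropping a single column to a vector. Using i directly as the label and the test data frame df are my own choices for illustration, not part of your assignment.

```r
means_function <- function(input_data){
  for(i in colnames(input_data)){
    column <- input_data[[i]]  # always the column itself, never a data.frame
    if(is.numeric(column)){
      cat(paste(i, "Mean:", mean(column), "\n"))
    }else{
      cat(paste(i, "Mean:", 'Non-numeric data', "\n"))
    }
  }
}

# Small illustrative data frame with two text columns and one numeric column
df <- data.frame(movieID = c("A","A","A","B","B","C","C","C"),
                 crewID = c("Z","Y","X","Z","V","V","X","Y"),
                 Rating = c(7.3,7.3,7.3,2.1,2.1,9,9,9))

# capture.output() collects the printed lines so they can be inspected
out <- capture.output(means_function(df))
print(out)
```

The captured output has one line per column: the two text columns are reported as non-numeric and Rating gets its mean.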