I have this dataframe.
Sub <- c(1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2)
trial <- c(1,1,1,1,2,2,2,2,2,2,1,1,1,1,2,2,2,2,2,2)
One <- c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1)
Two <- c(1,0,0,0,1,0,0,0,0,0,1,0,0,0,1,1,1,0,0,1)
Three <- c(2,0,0,1,3,0,0,0,0,1,7,8,0,0,0,1,1,1,1,0)
Four <- c(3,4,5,4,3,4,5,6,7,8,6,5,4,5,6,7,6,5,6,5)
Five <- c(3,4,5,4,6,7,5,4,3,2,3,4,5,4,3,5,7,4,3,5)
Six <- c(3,4,5,4,6,7,5,4,3,2,3,4,5,4,3,5,7,4,3,5)
Seven <- c(3,4,5,4,9,7,5,4,3,2,3,4,5,4,3,5,7,4,3,5)
dat <- data.frame(Sub, trial, One, Two, Three, Four, Five, Six, Seven)
I created this function to calculate the correlation between the first variable and each of the other variables.
# Correlate the first argument with each of the remaining six
fun <- function(a, b, c, d, e, f, g) {
  v  <- cor(a, b)
  v1 <- cor(a, c)
  v2 <- cor(a, d)
  v3 <- cor(a, e)
  v4 <- cor(a, f)
  v5 <- cor(a, g)
  return(c(v, v1, v2, v3, v4, v5))
}
I need to apply this function to each group of my dataset, grouped by Sub and trial.
library(dplyr)

dat %>%
  group_by(Sub, trial) %>%
  summarize(as.data.frame(matrix(fun(One, Two, Three, Four, Five, Six, Seven), nrow = 1)))
However, I got this result:
    Sub trial    V1    V2    V3    V4    V5    V6
  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1     1     1    NA    NA    NA    NA    NA    NA
2     1     2    NA    NA    NA    NA    NA    NA
3     2     1    NA    NA    NA    NA    NA    NA
4     2     2    NA    NA    NA    NA    NA    NA
Sub and trial are grouped correctly, but I get NA for every correlation column.
Do you have any advice?
Thank you.
CodePudding user response:
The solution by user @user438383 is the correct one.
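The reason you get NA has nothing to do with how the function is applied to the groups. One is constant (always 1) in every group, so its standard deviation is zero and its correlation with any other variable is undefined.
Since you get the warning that the standard deviation is zero, you may want to look at this related question: R - Warning message: "In cor(...): the standard deviation is zero"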
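To see where the warning comes from, it can help to look at the standard deviation of the inputs within each group. The following is a minimal sketch, assuming dplyr is installed and using the data from the question; the names group_sd, n, sd_One and sd_Two are chosen here for illustration only.

```r
library(dplyr)

# Data from the question (only the columns needed for this check)
Sub   <- c(1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2)
trial <- c(1,1,1,1,2,2,2,2,2,2,1,1,1,1,2,2,2,2,2,2)
One   <- c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1)
Two   <- c(1,0,0,0,1,0,0,0,0,0,1,0,0,0,1,1,1,0,0,1)
dat   <- data.frame(Sub, trial, One, Two)

# Group size and standard deviation of One and Two within each group
group_sd <- dat %>%
  group_by(Sub, trial) %>%
  summarise(n = n(), sd_One = sd(One), sd_Two = sd(Two), .groups = "drop")

group_sd
```

Pearson's correlation divides by the product of the two standard deviations, so a group in which sd_One is zero cannot produce a defined cor(One, ...) value, whatever the second variable is.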
Here is an example:
library(dplyr)

# generate a list of data frames, one per (Sub, trial) group:
my_list <- dat %>%
  group_by(Sub, trial) %>%
  group_split()
my_list
[[1]]
# A tibble: 5 x 9
    Sub trial   One   Two Three  Four  Five   Six Seven
  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1     1     1     1     1     2     3     3     3     3
2     1     1     1     0     0     4     4     4     4
3     1     1     1     0     0     5     5     5     5
4     1     1     1     0     1     4     4     4     4
5     1     1     1     1     7     6     3     3     3

[[2]]
# A tibble: 6 x 9
    Sub trial   One   Two Three  Four  Five   Six Seven
  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1     1     2     1     1     3     3     6     6     9
2     1     2     1     0     0     4     7     7     7
3     1     2     1     0     0     5     5     5     5
4     1     2     1     0     0     6     4     4     4
5     1     2     1     0     0     7     3     3     3
6     1     2     1     0     1     8     2     2     2

[[3]]
# A tibble: 3 x 9
    Sub trial   One   Two Three  Four  Five   Six Seven
  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1     2     1     1     0     8     5     4     4     4
2     2     1     1     0     0     4     5     5     5
3     2     1     1     0     0     5     4     4     4

[[4]]
# A tibble: 6 x 9
    Sub trial   One   Two Three  Four  Five   Six Seven
  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1     2     2     1     1     0     6     3     3     3
2     2     2     1     1     1     7     5     5     5
3     2     2     1     1     1     6     7     7     7
4     2     2     1     0     1     5     4     4     4
5     2     2     1     0     1     6     3     3     3
6     2     2     1     1     0     5     5     5     5
Now apply cor to the first group:
my_list[[1]] %>%
  summarise(across(Two:Seven, ~cor(One, .)))
# gives:
# A tibble: 1 x 6
    Two Three  Four  Five   Six Seven
  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1    NA    NA    NA    NA    NA    NA
Warning messages:
1: In cor(One, Two) : the standard deviation is zero
2: In cor(One, Three) : the standard deviation is zero
3: In cor(One, Four) : the standard deviation is zero
4: In cor(One, Five) : the standard deviation is zero
5: In cor(One, Six) : the standard deviation is zero
6: In cor(One, Seven) : the standard deviation is zero
# or the correlation of just two columns, One and Two, in group one
cor(my_list[[1]]$One, my_list[[1]]$Two)
# gives:
[1] NA
Warning message:
In cor(my_list[[1]]$One, my_list[[1]]$Two) : the standard deviation is zero
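If the aim is to run the grouped computation without triggering warnings, one possible approach is a small wrapper that returns NA explicitly whenever the correlation is undefined. This is only a sketch under stated assumptions: safe_cor is a hypothetical helper (not part of dplyr or base R), the data contain no missing values, and Four is used as the reference column purely to show a case where every group yields a defined value.

```r
library(dplyr)

# Data from the question
Sub   <- c(1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2)
trial <- c(1,1,1,1,2,2,2,2,2,2,1,1,1,1,2,2,2,2,2,2)
One   <- c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1)
Two   <- c(1,0,0,0,1,0,0,0,0,0,1,0,0,0,1,1,1,0,0,1)
Three <- c(2,0,0,1,3,0,0,0,0,1,7,8,0,0,0,1,1,1,1,0)
Four  <- c(3,4,5,4,3,4,5,6,7,8,6,5,4,5,6,7,6,5,6,5)
Five  <- c(3,4,5,4,6,7,5,4,3,2,3,4,5,4,3,5,7,4,3,5)
Six   <- c(3,4,5,4,6,7,5,4,3,2,3,4,5,4,3,5,7,4,3,5)
Seven <- c(3,4,5,4,9,7,5,4,3,2,3,4,5,4,3,5,7,4,3,5)
dat <- data.frame(Sub, trial, One, Two, Three, Four, Five, Six, Seven)

# Hypothetical helper: return NA (without a warning) when cor() is undefined,
# i.e. fewer than two observations or a constant input.
# Assumes x and y have equal length and contain no missing values.
safe_cor <- function(x, y) {
  if (length(x) < 2 || sd(x) == 0 || sd(y) == 0) {
    return(NA_real_)
  }
  cor(x, y)
}

# One is constant in every group, so every result is NA, but silently
res_one <- dat %>%
  group_by(Sub, trial) %>%
  summarise(across(Two:Seven, ~ safe_cor(One, .x)), .groups = "drop")

# With a reference column that varies within each group, values are defined
res_four <- dat %>%
  group_by(Sub, trial) %>%
  summarise(across(c(Five, Six, Seven), ~ safe_cor(Four, .x)), .groups = "drop")

res_one
res_four
```

The wrapper does not change the underlying result: a constant column such as One still has no defined correlation with anything. It only makes the NA explicit and keeps the console free of warnings.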
An analogous example with the mtcars dataset:
mtcars %>%
  relocate(cyl, vs, everything()) %>%
  group_by(cyl, vs) %>%
  summarise(across(hp:carb, ~cor(., mpg)))
    cyl    vs     hp   drat     wt    qsec     am   gear   carb
  <dbl> <dbl>  <dbl>  <dbl>  <dbl>   <dbl>  <dbl>  <dbl>  <dbl>
1     4     0     NA     NA     NA      NA     NA     NA     NA
2     4     1 -0.522  0.466 -0.721  -0.296  0.557  0.442 -0.189
3     6     0     -1      1 -0.101   0.931     NA     -1     -1
4     6     1 -0.248 -0.249 -0.936 -0.0424     NA -0.442 -0.442
5     8     0 -0.284 0.0479 -0.650  -0.104 0.0496 0.0496 -0.394
Warning messages:
1: In cor(am, mpg) : the standard deviation is zero
2: In cor(am, mpg) : the standard deviation is zero
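The NA values in the mtcars output can be traced in the same way. Here is a minimal sketch, assuming dplyr is installed, that counts the rows in each group and checks the spread of am; the names mtcars_check, n and sd_am are chosen for this example only.

```r
library(dplyr)

# Group size and standard deviation of am within each (cyl, vs) group
mtcars_check <- mtcars %>%
  group_by(cyl, vs) %>%
  summarise(n = n(), sd_am = sd(am), .groups = "drop")

mtcars_check
```

The group cyl = 4, vs = 0 contains a single car, so no correlation can be computed for any of its columns. In both cyl = 6 groups am is constant, which accounts for the two "standard deviation is zero" warnings.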