ggscatter displays a positive pearson correlation coefficient in kaggle notebook instead of a negati

Time:03-13

I'm working on my first Kaggle notebook and ran into the issue stated in the title. The minus sign is missing only in Kaggle, as you can see in the pictures. Because I want to demonstrate a negative correlation, it is important that the plot shows R = -0.6.

[Image: plot in Kaggle notebook]
[Image: plot in RStudio]

Here is the code I use in both pictures:

library(ggpubr)  # provides ggscatter(); also attaches ggplot2 for labs()

ggscatter(activity_sleep, x = "TotalMinutesAsleep",
          y = "SedentaryMinutes", shape = 21, add = "loess",
          add.params = list(color = "blue", fill = "darkgrey"),
          conf.int = TRUE, cor.coef = TRUE, cor.method = "pearson") +
  labs(title = "Sedentary Minutes vs. Minutes Asleep")

Is there a way to fix this?

User response:

This looks like a bug in Kaggle's graphics driver. Running this code in Kaggle also leaves out the minus sign:

plot(1, main = expression(-1))

On the other hand, this works:

plot(1, main = "-1")
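
The two calls above differ only in how the title is drawn: as an expression or as an ordinary character string. A minimal base-R sketch of a workaround along those lines, assuming the problem really is limited to expressions: compute Pearson's r yourself and pass it to the plot as a plain string. The helper name `r_label` and the sample data are made up for illustration, and I have not checked this on Kaggle itself.

```r
# Hypothetical helper: format Pearson's r as a plain character string,
# so the sign is an ordinary "-" character rather than part of an expression.
r_label <- function(x, y) {
  r <- cor(x, y, method = "pearson")
  sprintf("R = %.2f", r)
}

# Made-up data with a perfectly negative linear relationship.
x <- 1:10
y <- -2 * x + 5

label <- r_label(x, y)
print(label)

# A character title is drawn as ordinary text, like plot(1, main = "-1").
plot(x, y, main = label)
```

The same string could be used anywhere a plot accepts plain text, for example as a subtitle or an annotation.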

According to the docs for ggpubr::ggscatter(), you should be able to choose to display the stats in text rather than using an expression, but it didn't work when I tried this:

library(ggplot2)  # for labs()

activity_sleep <- tibble::tibble(TotalMinutesAsleep = rnorm(20),
   SedentaryMinutes = rnorm(20) - TotalMinutesAsleep)
ggpubr::ggscatter(activity_sleep, x = "TotalMinutesAsleep",
                  y = "SedentaryMinutes", shape = 21, add = "loess",
                  add.params = list(color = "blue", fill = "darkgrey"),
                  conf.int = TRUE,
                  cor.coef = TRUE, cor.method = "pearson",
                  cor.coef.args = list(output.type = "text")) +
  labs(title = "Sedentary Minutes vs. Minutes Asleep")

Bugs everywhere!

User response:

@user2554330 Thank you for your help! Based on your input I figured out how to solve the issue following this doc.

The code now looks like this:

ggscatter(activity_sleep, y = "Calories", x = "SedentaryMinutes",
      shape = 20,
      add = "reg.line",
      add.params = list(color = "blue", fill = "darkgrey"),
      conf.int = TRUE) +
  stat_cor(
      aes(label = ..r.label..), label.x = 3,
      method = "pearson",
      output.type = "text")  # plain text instead of an expression

If output.type = "expression", Kaggle does not show the minus sign. All other options work.
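
For completeness, here is a sketch of passing the same setting through ggscatter() directly instead of adding stat_cor() by hand. As far as I can tell from the ggpubr documentation, the argument is spelled `cor.coeff.args` (double "f"), so the earlier attempt with `cor.coef.args` may simply have been ignored. The data below is made up, and I have not confirmed that this restores the minus sign on Kaggle.

```r
library(ggpubr)  # also attaches ggplot2, which provides labs()

# Made-up data with a negative relationship, for illustration only.
set.seed(1)
activity_sleep <- data.frame(TotalMinutesAsleep = rnorm(20))
activity_sleep$SedentaryMinutes <-
  rnorm(20) - activity_sleep$TotalMinutesAsleep

p <- ggscatter(activity_sleep, x = "TotalMinutesAsleep",
               y = "SedentaryMinutes", shape = 21, add = "loess",
               add.params = list(color = "blue", fill = "darkgrey"),
               conf.int = TRUE,
               cor.coef = TRUE, cor.method = "pearson",
               # Note the spelling: cor.coeff.args, not cor.coef.args.
               cor.coeff.args = list(output.type = "text")) +
  labs(title = "Sedentary Minutes vs. Minutes Asleep")

print(p)
```

The arguments in that list are handed on to stat_cor(), so this should be equivalent to the explicit stat_cor() call above, apart from showing the p-value next to R.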
