I have 3 different datasets that I have been plotting this way:
Each dataset was imported from a file into a data frame (called vues, likes and commentaires respectively) and contains the dates and the corresponding values (views, likes or comments) for each date.
Now, I'd like to plot both linear models onto my graph (likes ~ views and comments ~ views).
Starting with the red one, I entered the following code:
abline(lm(likes$X2022.12.30 ~ vues$X2022.12.30,data=c(likes,vues)),col="red")
And this is what RStudio plots:
I can't tell whether the problem comes from the dataset or from somewhere else: if I remove the data parameter, or pass just one of the two data frames, I get exactly the same result. That is, each of the following draws the same line:
abline(lm(likes$X2022.12.30 ~ vues$X2022.12.30,data=likes),col="red")
abline(lm(likes$X2022.12.30 ~ vues$X2022.12.30,data=vues),col="red")
abline(lm(likes$X2022.12.30 ~ vues$X2022.12.30),col="red")
Here is my data:
> vues$X2022.12.30
[1] 15900 8245 4531 546800 7149 10600 7774 45600 157100
[10] 348300 15000 7363 24000 6073 6469 5848 13100 185600
[19] 18700 7622 483800 6373 12000 7839 17100 10800 9846
[28] 5671 10100 8330 9031 183000 17600 5153 117700 39600
[37] 10300 27900 11200 29500 387800 15000 8968 465800 72500
[46] 9501 5816 9761 5814 16200 269700 8905 16300 14700
[55] 149600 7547 422600 40700 71100 18900 942000 12100 13400
[64] 551900 16500 12000 8648 131900 10700 18400 183700 13500
[73] 21500 1203000 14300 14700 108400 5233 388800 368400 1411000
[82] 286400 17900 261500 1049000 13500 11200 74300 1312000 6044
[91] 22200 9467 5975 143200 4552 502700 3971 9755 32000
[100] 46800 8844 31600 3671 60700 8249 20100 14500 3475
[109] 5745 2420 193700 2305 13500 90200 5746 5520 29200
[118] 7803 2502 4559 2120 3233 242100 5616 1371 1109
[127] 2123 2097 4019 1444 1515 2350 34600 2642 148000
[136] 2139 541400 13700 52600 421700 9876 3671 33600 6388
[145] 12300 3014 50200 2033 45900 5878 2221 1479
> likes$X2022.12.30
[1] 1572 935 229 39000 471 944 472 2149 15400 42000 1346
[12] 517 1977 488 569 462 1940 17200 2121 588 84800 587
[23] 987 618 1229 862 947 278 1048 628 795 19200 1529
[34] 319 9050 3119 868 2840 780 1912 40100 1130 759 47800
[45] 4197 815 470 786 502 1068 33200 698 1145 1442 11200
[56] 534 41600 3740 5119 2376 91700 904 983 20800 812 869
[67] 571 6653 807 1356 7332 1005 1597 104700 1171 982 14300
[78] 367 14900 29800 103500 11900 1073 22700 67700 872 894 3673
[89] 116800 251 2229 593 392 20400 267 29200 449 569 1933
[100] 2260 1031 3035 311 6370 1014 812 956 241 641 116
[111] 6543 113 503 5505 450 410 2067 494 76 350 155
[122] 122 11400 350 51 42 109 96 200 62 53 98
[133] 1207 153 15500 101 56900 718 4498 23600 619 248 1803
[144] 437 983 234 4188 147 2623 591 176 138
And here is the code I used for plotting the graph if that is relevant:
plot.new()
par(mar=c(4,4,4,4))
par(new=TRUE)
par(bg="#FFECDE")
rect(par("usr")[1], par("usr")[3],
par("usr")[2], par("usr")[4],
col = c("#E1DEFF"))
par(new=TRUE)
plot(vues$X2022.12.30,likes$X2022.12.30,col="red",axes=FALSE,xlab="",ylab="",
main="Nombre de j'aime et de commentaires en fonction du nombre de vues",
pch=-0x2022,bg="red")
axis(2,ylim=c(0,120000),col="red", col.axis="red",at=seq(0, 120000, by=20000))
mtext("Nombre de j'aime",side=2,line=2.5,col="red")
box()
par(new=TRUE)
plot(vues$X2022.12.30,commentaires$X2022.12.30,col="blue",axes=FALSE,xlab="",
ylab="",ylim=c(0,1500),pch=-0x2022,bg="blue")
axis(4,col="blue",col.axis="blue",at=seq(0, 1500, by=250))
mtext("Nombre de commentaires",side=4,line=2.5,col="blue")
axis(1,xlim=c(0,1500000),ylim=c(0,145000),col="black",col.axis="black",
at=seq(0, 1400000, by=100000))
mtext("Nombre de vues",side=1,line=2.5,col="black")
legend(x="topleft",legend=c("J'aime","Commentaires"),
text.col=c("black","black"),pch=c(-0x2022,-0x2022),col=c("red","blue"),
bg=c("#C9FFF1"),inset=0.02)
Answer:
Using par(new=TRUE) and overplotting the comments data changes the y-axis scale. abline() draws in whatever coordinate system is current, so a line fitted on the likes scale ends up in the wrong place once the comments scale (0 to 1500) is in effect.
The simple solution is to call abline() to add the regression line before you add the comments data.
Example:
dd <- data.frame(vues=
c(15900,8245,4531,546800,7149,10600,7774,45600,157100,
348300,15000,7363,24000,6073,6469,5848,13100,185600,
18700,7622,483800,6373,12000,7839,17100,10800,9846,
5671,10100,8330,9031,183000,17600,5153,117700,39600,
10300,27900,11200,29500,387800,15000,8968,465800,72500,
9501,5816,9761,5814,16200,269700,8905,16300,14700,
149600,7547,422600,40700,71100,18900,942000,12100,13400,
551900,16500,12000,8648,131900,10700,18400,183700,13500,
21500,1203000,14300,14700,108400,5233,388800,368400,1411000,
286400,17900,261500,1049000,13500,11200,74300,1312000,6044,
22200,9467,5975,143200,4552,502700,3971,9755,32000,
46800,8844,31600,3671,60700,8249,20100,14500,3475,
5745,2420,193700,2305,13500,90200,5746,5520,29200,
7803,2502,4559,2120,3233,242100,5616,1371,1109,
2123,2097,4019,1444,1515,2350,34600,2642,148000,
2139,541400,13700,52600,421700,9876,3671,33600,6388,
12300,3014,50200,2033,45900,5878,2221,1479),
likes = c(1572,935,229,39000,471,944,472,2149,15400,42000,1346,
517,1977,488,569,462,1940,17200,2121,588,84800,587,
987,618,1229,862,947,278,1048,628,795,19200,1529,
319,9050,3119,868,2840,780,1912,40100,1130,759,47800,
4197,815,470,786,502,1068,33200,698,1145,1442,11200,
534,41600,3740,5119,2376,91700,904,983,20800,812,869,
571,6653,807,1356,7332,1005,1597,104700,1171,982,14300,
367,14900,29800,103500,11900,1073,22700,67700,872,894,3673,
116800,251,2229,593,392,20400,267,29200,449,569,1933,
2260,1031,3035,311,6370,1014,812,956,241,641,116,
6543,113,503,5505,450,410,2067,494,76,350,155,
122,11400,350,51,42,109,96,200,62,53,98,
1207,153,15500,101,56900,718,4498,23600,619,248,1803,
437,983,234,4188,147,2623,591,176,138))
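set.seed(101)
## stand-in for the comments data, on a 0-1500 scale
dd$other <- runif(nrow(dd), min = 0, max = 1500)
plot(likes ~ vues, data = dd)
## correct: drawn while the likes scale is still in effect
abline(lm(likes ~ vues, data = dd))
par(new = TRUE)
plot(other ~ vues, data = dd, axes = FALSE, col = 2)
## misplaced: same fit, but drawn on the new (0-1500) y scale
abline(lm(likes ~ vues, data = dd), col = 4, lwd = 2)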
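For the second model from the question (comments ~ views), the same idea should work in the other direction: draw each regression line while its own y scale is in effect. Here is a minimal sketch, assuming simulated toy data. The data frame d, its values and the names fit_likes and fit_comm are made up for illustration and are not the real vues/likes/commentaires data.

```r
## Simulated stand-in data (hypothetical values, not the real dataset)
set.seed(1)
d <- data.frame(vues = runif(50, min = 1000, max = 1e6))
d$likes        <- 0.08  * d$vues + rnorm(50, sd = 5000)
d$commentaires <- 0.001 * d$vues + rnorm(50, sd = 100)

## Fit both models up front, using bare column names plus data =
fit_likes <- lm(likes ~ vues, data = d)
fit_comm  <- lm(commentaires ~ vues, data = d)

## First layer: likes on the left axis, line added on the likes scale
plot(likes ~ vues, data = d, col = "red")
abline(fit_likes, col = "red")

## Second layer: comments on the right axis, line added on the comments scale
par(new = TRUE)
plot(commentaires ~ vues, data = d, col = "blue",
     axes = FALSE, xlab = "", ylab = "")
abline(fit_comm, col = "blue")
axis(4, col = "blue", col.axis = "blue")
```

Because the formulas use bare column names, data = d is what supplies the variables. With terms such as likes$X2022.12.30 the vectors are taken directly from the workspace and the data argument has nothing to look up, which is why changing it made no difference in the question.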