how to make a dot plot based on each column and highlight the beginning and the end

Time:12-10

I have data like this:

df<- structure(list(Number = 1:23, Value1 = c(0.054830335, 1.19531842, 
3.27820329, 1.03530176, 5.77430976, 3.72944, -0.683513395, 0.029550239, 
2.487922644, 0.533448117, 0.098825565, -1.089022938, 2.301631235, 
-0.095666867, -1.359480317, -1.359480317, 1.089441628, 3.307589929, 
4.67838434, 3.562761178, 2.630726653, 1.795107015, 2.616255192
), Value2 = c(-0.296874921, 1.491747294, 2.951219257, 1.258677675, 
-8.68096591, 3.361029751, -1.824459195, -1.445827538, 1.889631269, 
-15.47774216, 3.085461276, -1.078286963, 0.948056999, -2.109354753, 
-1.36703068, -1.36703068, 1.074642842, 2.945589842, 3.757911793, 
2.765225717, 2.44452491, 1.784451022, 1.158493893)), class = "data.frame", row.names = c(NA, 
-23L))

I am trying to make a dot plot with Value1 versus Number in one color and Value2 versus Number in another. Then I want to show the first 5 values and the last 5 values at a bigger size.

I tried to plot it like this

df$Number <- factor(df$Number, levels = paste0("D", 1:23), ordered = TRUE)
ggplot(df, aes(x = Value1, y = Value2, color = Number)) +
  geom_text() +
  theme_classic()

I can plot one of them like this

ggplot(data = df, aes(x = Number, y = Value1)) +
  geom_point()

When it comes to adding the second one to the same plot, I get a bit lost.

I can put them together in this way

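library(tidyr)    # provides gather()
library(ggplot2)

# wide to long format: every column except the first (Number) is stacked
plotDf <- gather(df, Group, Myvalue, -1)
# plot both value columns, colored by their source column
ggplot(plotDf, aes(Number, Myvalue, col = Group)) +
  geom_point()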

I still don't know how to show the first 5 values and the last 5 values at a bigger size.

By the first 5 and the last 5, I mean these:

 df
       Number      Value1      Value2
    1       1  0.05483034  -0.2968749
    2       2  1.19531842   1.4917473
    3       3  3.27820329   2.9512193
    4       4  1.03530176   1.2586777
    5       5  5.77430976  -8.6809659
    6       6  3.72944000   3.3610298
    7       7 -0.68351339  -1.8244592
    8       8  0.02955024  -1.4458275
    9       9  2.48792264   1.8896313
    10     10  0.53344812 -15.4777422
    11     11  0.09882557   3.0854613
    12     12 -1.08902294  -1.0782870
    13     13  2.30163123   0.9480570
    14     14 -0.09566687  -2.1093548
    15     15 -1.35948032  -1.3670307
    16     16 -1.35948032  -1.3670307
    17     17  1.08944163   1.0746428
    18     18  3.30758993   2.9455898
    19     19  4.67838434   3.7579118
    20     20  3.56276118   2.7652257
    21     21  2.63072665   2.4445249
    22     22  1.79510701   1.7844510
    23     23  2.61625519   1.1584939

These are the first 5

    1       1  0.05483034  -0.2968749
    2       2  1.19531842   1.4917473
    3       3  3.27820329   2.9512193
    4       4  1.03530176   1.2586777
    5       5  5.77430976  -8.6809659

and these are the last 5

    19     19  4.67838434   3.7579118
    20     20  3.56276118   2.7652257
    21     21  2.63072665   2.4445249
    22     22  1.79510701   1.7844510
    23     23  2.61625519   1.1584939

Answer:

Using the original data (without factor):

ggplot(df, aes(Number, Value1, size = (Number <= 5 | Number > 18))) +
  geom_point() +
  geom_point(aes(y = Value2)) +
  scale_size_manual(name = NULL, values = c("TRUE" = 2, "FALSE" = 0.5)) +
  scale_x_continuous(breaks = function(z) do.call(seq, as.list(round(z, 0))))


Because size= is mapped to a logical condition, the manual values assigned to it must be named with the character versions of the values observed: the logicals TRUE and FALSE become "TRUE" and "FALSE". My choice of 2 and 0.5 is arbitrary.

Feel free to name the legend better with name="some name" if desired. If you want no legend (which makes sense), you can use

... +
  scale_size_manual(guide = "none", values = c("TRUE" = 2, "FALSE" = 0.5))

instead.
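
The cutoffs 5 and 18 are hard-coded for a 23-row data frame. As a rough sketch of one way to avoid that, the flag can be computed up front from the data. The names `n` and `highlight` and the stand-in data below are made up for this illustration and are not part of the original answer; substitute the `df` from the question.

```r
library(ggplot2)

# Stand-in data with the same shape as the question's df (hypothetical values)
df <- data.frame(Number = 1:23,
                 Value1 = seq(-1, 1, length.out = 23),
                 Value2 = seq(1, -1, length.out = 23))

n <- 5  # assumption: how many rows to emphasize at each end

# TRUE for the n smallest and the n largest Number values, FALSE otherwise
df$highlight <- df$Number <= n | df$Number > max(df$Number) - n

p <- ggplot(df, aes(Number, Value1, size = highlight)) +
  geom_point() +
  geom_point(aes(y = Value2)) +
  scale_size_manual(guide = "none", values = c("TRUE" = 2, "FALSE" = 0.5))
p
```

Keeping the flag as a column also makes it easy to check with `table(df$highlight)` that exactly the intended rows are marked.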

Another alternative, in case you want to distinguish the dots by which value they represent, is to melt the data into a long format before plotting.

ggplot(reshape2::melt(df, "Number"),
       aes(Number, value, color = variable,
           size = (Number <= 5 | Number > 18))) +
  geom_point() +
  scale_size_manual(guide = "none", values = c("TRUE" = 2, "FALSE" = 0.5))

[Figure: ggplot2 plot with colored points for Value1 and Value2]

One can use tidyr::pivot_longer or data.table::melt with similar results; see "Reshaping data.frame from wide to long format".
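
For reference, a sketch of what the tidyr::pivot_longer variant might look like. The stand-in data and the object name `long` are assumptions for illustration; the `names_to` and `values_to` arguments are set to `variable` and `value` only so the plotting code matches the melt version.

```r
library(ggplot2)
library(tidyr)

# Stand-in data with the same shape as the question's df (hypothetical values)
df <- data.frame(Number = 1:23,
                 Value1 = seq(-1, 1, length.out = 23),
                 Value2 = seq(1, -1, length.out = 23))

# Wide to long: one row per (Number, variable) pair
long <- pivot_longer(df, cols = c(Value1, Value2),
                     names_to = "variable", values_to = "value")

ggplot(long, aes(Number, value, color = variable,
                 size = (Number <= 5 | Number > 18))) +
  geom_point() +
  scale_size_manual(guide = "none", values = c("TRUE" = 2, "FALSE" = 0.5))
```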
