How do I plot data points that are very close together in R?

Time:06-14

Here's a sample of the three data frames I'm working with. The full dataset contains 1,087 rows.

          Day Length  Category
1          1  33.807    Red
2          2  33.909    Red
3          3  34.011    Red
4          4  34.556    Red
5          5  34.789    Red
6          6  35        Red

          Day Length  Category
1          1  33.737    Blue
2          2  33.898    Blue
3          3  34.211    Blue
4          4  34.657    Blue
5          5  34.714    Blue
6          6  34.912    Blue

          Day Length  Category
1          1  33.631    Green
2          2  33.777    Green
3          3  34.101    Green
4          4  34.244    Green
5          5  34.590    Green
6          6  34.128    Green

My current code is as follows:

ggplot(data = df, aes(x = Day, y = Length, group = Category)) +
  geom_line(aes(color = Category, alpha = 1), size = 2)

But this results in three lines that are overlapping. Is there a better solution for this? Again, this dataset is a sample and the full dataset is much larger. So a solution that would work for a dataset of any size would be appreciated!

CodePudding user response:

If you want to focus on the difference, plot the difference:

dd = data.frame(Day = d1$Day, diff = d1$Length - d2$Length)

library(ggplot2)
ggplot(dd, aes(x = Day, y = diff)) +
  geom_hline(yintercept = 0, lwd = 1) +
  geom_line() +
  geom_point() +
  labs(title = "Difference (first - second)")


Using this data:

d1 = read.table(text = '          Day Length  
1          1  33.807
2          2  33.909
3          3  34.011
4          4  34.556
5          5  34.789
6          6  35', header = T)

d2 = read.table(text = '          Day Length  
1          1  33.737
2          2  33.898
3          3  34.211
4          4  34.657
5          5  34.714
6          6  34.912', header = T)
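
The question has three categories in one long data frame rather than two separate tables. The following is only a sketch of the same "plot the difference" idea for that case, assuming it is acceptable to compare each series with the per-day mean of all series. The Deviation column name is made up for this example, and the values are the sample rows from the question.

```r
library(ggplot2)

# Sample values from the question, in long format (one row per Day and Category)
df <- data.frame(
  Day = rep(1:6, times = 3),
  Length = c(33.807, 33.909, 34.011, 34.556, 34.789, 35,
             33.737, 33.898, 34.211, 34.657, 34.714, 34.912,
             33.631, 33.777, 34.101, 34.244, 34.590, 34.128),
  Category = rep(c("Red", "Blue", "Green"), each = 6)
)

# Assumption: the per-day mean of all series is a sensible baseline.
# ave() returns, for every row, the mean Length of the rows sharing its Day.
df$Deviation <- df$Length - ave(df$Length, df$Day, FUN = mean)

p <- ggplot(df, aes(x = Day, y = Deviation, color = Category)) +
  geom_hline(yintercept = 0) +
  geom_line() +
  geom_point() +
  labs(title = "Deviation from the per-day mean")
print(p)
```

Because the baseline is computed per Day, this should work unchanged for more rows or more categories. If one category is the natural reference, subtracting that series instead of the mean may be easier to read.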

CodePudding user response:

The typical ggplot2 workflow would combine the two data sets into one table, with a column distinguishing the source, which you could then map to the color aesthetic, for instance. You might also add text labels if the distinction in the underlying data is important to show.

library(tidyverse)

tribble(
  ~Source, ~Day, ~Length,
"A",        1,  33.807,
"A",        2,  33.909,
"A",        3,  34.011,
"A",        4,  34.556,
"A",        5,  34.789,
"A",        6,  35,
"B",        1,  33.737,
"B",        2,  33.898,
"B",        3,  34.211,
"B",        4,  34.657,
"B",        5,  34.714,
"B",        6,  34.912) %>%
  
  ggplot(aes(Day, Length, color = Source)) +
  geom_line() +
  ggrepel::geom_text_repel(aes(label = Length),
                           direction = "y", box.padding = 0.5)

Or, if your data is in two tables like the d1 and d2 in the answer from @gregor-thomas, you could use something like this to combine them:

bind_rows("A" = d1, "B" = d2, .id = "Source") %>%
  ggplot(aes(Day, Length, color = Source)) +
  geom_line() +
  ggrepel::geom_text_repel(aes(label = Length), 
                           direction = "y", box.padding = 0.5)



Edit:

If it's a visual readability issue, you might try variations like ggshadow::geom_shadowline to highlight overlaps:

devtools::install_github("marcmenem/ggshadow")
df %>%
  ggplot(aes(Day, Length, color = Category)) +
  ggshadow::geom_shadowline(size = 2)


Using this data:

df = read.table(text = 
'Day  Length   Category
1  33.807    Red
2  33.909    Red
3  34.011    Red
4  34.556    Red
5  34.789    Red
6  35        Red
1  33.737    Blue
2  33.898    Blue
3  34.211    Blue
4  34.657    Blue
5  34.714    Blue
6  34.912    Blue
1  33.631    Green
2  33.777    Green
3  34.101    Green
4  34.244    Green
5  34.590    Green
6  34.128    Green', header = T)
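
If installing a package from GitHub is not an option, another variation that needs only ggplot2 is to give each category its own panel and draw all series in grey behind it. This is a rough sketch, not part of the original answer. The background data frame and its Series column are names invented for this example.

```r
library(ggplot2)

# Sample values from the question, in long format
df <- data.frame(
  Day = rep(1:6, times = 3),
  Length = c(33.807, 33.909, 34.011, 34.556, 34.789, 35,
             33.737, 33.898, 34.211, 34.657, 34.714, 34.912,
             33.631, 33.777, 34.101, 34.244, 34.590, 34.128),
  Category = rep(c("Red", "Blue", "Green"), each = 6)
)

# Copy of the data without the faceting column (Category).
# A layer whose data lacks the faceting variable is drawn in every panel.
background <- data.frame(Day = df$Day, Length = df$Length, Series = df$Category)

p <- ggplot(df, aes(x = Day, y = Length)) +
  geom_line(data = background, aes(group = Series), color = "grey80") +
  geom_line(aes(color = Category)) +
  facet_wrap(~ Category)
print(p)
```

Each panel highlights one category against the others, so overlapping lines no longer hide each other. With many categories the number of panels grows accordingly, so this probably suits a handful of series better than dozens.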