I wish to show on my scatter plot:
- how far the values are from the plot origin [0,0], and
- whether they are closer to the center, the X axis, the Y axis, both (XY), or far away, where the plot is split into sectors by two lines (at 33 degrees) and by circle-like sections.
To estimate how far the points are from the origin [0,0], I can easily calculate the Euclidean distance.
But I am not sure how to classify my points based on their distance to the origin [0,0] and on their position relative to the X axis, the Y axis, or both. I think my issue here is that I can't simply set the classification rule as:
Center = if X < 0.5 & Y < 0.5
as this represents a square. Rather, my values should follow the Euclidean distance here, e.g. Center = X < 0.5 & Y < 0.5 & Euclid_dist < 0.5?
But how to get the classification for the 'X', 'Y', 'XY' and their 'far' alternatives, considering at the same time both lines and circles as sectors? It is likely a simple trigonometric question, but I can't figure it out.
Here is my ideal case:
And my dummy example:
library(dplyr)
library(ggplot2)

dd <- data.frame(x = runif(10, min=0, max=2),
                 y = runif(10, min=0, max=2))
# Get euclidean distance
euclidean <- function(a, b) sqrt(sum((a - b)^2))
dd <- dd %>%
  mutate(euclid_dist = euclidean(x, y))
dd %>%
  ggplot(aes(x = x,
             y = y)) +
  geom_point() +
  theme_bw() +
  theme(aspect.ratio=1)
CodePudding user response:
This is probably not the most efficient way to do it, but it's a start.
library(dplyr)
library(ggplot2)

set.seed(4242)
dd <- data.frame(x = runif(20, min=0, max=2),
                 y = runif(20, min=0, max=2))
# I changed your euclidean distance function to return distance of each point from the origin
euclidean <- function(a, b) {
  sqrt(a^2 + b^2)
}
# Define the slopes of the lines that divide the area into x, y, xy
slope1 <- 0.5
slope2 <- 2
# Define the radii of the circles that separate the distance bands,
# which I've called near, mid, far
r1 <- 0.5
r2 <- 1.5
dd2 <- dd %>%
  mutate(
    euclid_dist = euclidean(x, y),
    computed_y1 = x * slope1,
    computed_y2 = x * slope2,
    dist = cut(euclid_dist, breaks=c(0, r1, r2, 5), labels=c('near', 'mid', 'far'))
  )
# There's presumably a way to do this within the above mutate function using case_when()
dd2$pos <- 'xy'
dd2$pos[dd2$y < dd2$computed_y1] <- 'x'
dd2$pos[dd2$y > dd2$computed_y2] <- 'y'
dd2$pos <- as.factor(dd2$pos)
ggplot(dd2) +
  geom_point(aes(x = x, y = y, col=dist, shape=pos), size=3) +
  annotate("path",
           x = r1*cos(seq(0,2*pi,length.out=100)),
           y = r1*sin(seq(0,2*pi,length.out=100))
  ) +
  annotate("path",
           x = r2*cos(seq(0,2*pi,length.out=100)),
           y = r2*sin(seq(0,2*pi,length.out=100))
  ) +
  geom_abline(intercept = 0, slope=0.5, col='red') +
  geom_abline(intercept = 0, slope=2, col='blue') +
  scale_x_continuous(expand = c(0, 0), limits = c(0, 2)) +
  scale_y_continuous(expand = c(0, 0), limits = c(0, 2)) +
  theme_classic()
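The comment above about case_when() can be sketched roughly as follows. This is only one possible way to fold the position labels into the same mutate() call; the slopes, radii, and seed are copied from this answer, and using Inf as the last break (instead of 5) is my own assumption so that no point can fall outside the bands.

```r
library(dplyr)

# Values copied from the answer above
slope1 <- 0.5
slope2 <- 2
r1 <- 0.5
r2 <- 1.5

set.seed(4242)
dd <- data.frame(x = runif(20, min = 0, max = 2),
                 y = runif(20, min = 0, max = 2))

dd2 <- dd %>%
  mutate(
    # distance of each point from the origin
    euclid_dist = sqrt(x^2 + y^2),
    # distance band; Inf as the upper break is an assumption, not in the original
    dist = cut(euclid_dist, breaks = c(0, r1, r2, Inf),
               labels = c("near", "mid", "far"), include.lowest = TRUE),
    # case_when() checks the conditions in order; the last line is the fallback
    pos = factor(case_when(
      y < x * slope1 ~ "x",
      y > x * slope2 ~ "y",
      TRUE ~ "xy"
    ))
  )
```

The resulting dd2 has the same dist and pos columns as before, so it should work with the plotting code unchanged.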
CodePudding user response:
library(tidyverse)
library(ggthemes)
set.seed(123)
dd <- data.frame(x = runif(200, min=0, max=2),
y = runif(200, min=0, max=2))
slope <- 33  # degrees
dd %>%
  # calculate distance from origin
  mutate(orig_dist = sqrt(x^2 + y^2)) %>%
  # calculate position (origin, far, etc.)
  mutate(position = case_when(orig_dist < 0.5 ~ "origin",
                              orig_dist >= 1 ~ "-far",
                              TRUE ~ "")) %>%
  # calculate XY label (the angle is measured from the Y axis)
  mutate(labelXY = case_when((180*atan(x / y) / pi) < slope ~ "Y",
                             (180*atan(x / y) / pi) > (90 - slope) ~ "X",
                             TRUE ~ "XY")) %>%
  # create group category
  mutate(group = ifelse(position == "origin",
                        position,
                        paste0(labelXY, position))) %>%
  # plot
  ggplot(aes(x = x, y = y, color = group)) +
  geom_point() +
  ggthemes::scale_color_colorblind() +
  theme_bw() +
  theme(aspect.ratio=1)
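The same rule can also be written in polar terms: the distance from the origin picks the ring, and the angle from the X axis picks the sector. Below is a minimal base-R sketch of that idea. The function name classify_sector and its arguments are hypothetical, and the default thresholds (33 degrees, 0.5, 1) are simply the values used in this answer. It uses atan2(y, x), which measures the angle from the X axis, so the comparisons are mirrored relative to the atan(x / y) version.

```r
# Hypothetical helper; name, arguments and defaults are illustrative assumptions
classify_sector <- function(x, y, angle_deg = 33, r_origin = 0.5, r_far = 1) {
  r <- sqrt(x^2 + y^2)               # distance from the origin
  theta <- atan2(y, x) * 180 / pi    # angle from the X axis, in degrees
  # sector: close to the X axis, close to the Y axis, or in between
  axis <- ifelse(theta < angle_deg, "X",
                 ifelse(theta > 90 - angle_deg, "Y", "XY"))
  # ring: points at or beyond r_far get the "-far" suffix
  ring <- ifelse(r >= r_far, "-far", "")
  # points inside the inner circle are "origin" regardless of angle
  ifelse(r < r_origin, "origin", paste0(axis, ring))
}

# A few hand-picked points, one per kind of region
labels <- classify_sector(x = c(0.1, 1.5, 0.2, 0.6, 1.2),
                          y = c(0.1, 0.2, 1.8, 0.6, 1.2))
print(labels)
```

Because it is vectorised, it could be called inside mutate(), e.g. mutate(group = classify_sector(x, y)), assuming first-quadrant data as in the example.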