I have a plot with a lot of data bunched into the lower values, and a few outliers make the scale of the X and Y axes huge.
I don't want to delete these outliers, as they are not incorrect, but I would like to create a separate plot focusing on the data up to 2,000,000 views on the X axis and 5,000,000 subscribers on the Y axis.
CodePudding user response:
I recommend using a log scale. Alternatively, you can set the x and y axis limits by specifying the minimum and maximum values of each axis.
The log argument of plot() indicates whether the x axis, the y axis, or both should be plotted on a log scale (points with zero or negative values cannot be drawn on a log axis):
- log = "x"
- log = "y"
- log = "xy"
Or set the axis limits:
- xlim: the limit of x axis; format : xlim = c(min, max)
- ylim: the limit of y axis; format: ylim = c(min, max)
# --- sample data
view_count <- c(1,2,3,4,5,6,7,8,9,90,100,110)
subscriber_count <- c(1,2,3,4,5,6,7,8,9,190,200,210)
covviddf <- data.frame(view_count, subscriber_count)
covviddf
plot(covviddf$view_count, covviddf$subscriber_count,
     main = "Comparison between channel subscribers and video views",
     xlab = "Views", ylab = "Subscribers")
# set log scale for x and y ----------------------------------------------------
plot(covviddf$view_count, covviddf$subscriber_count,
     main = "Comparison between channel subscribers and video views",
     xlab = "Views", ylab = "Subscribers", log = "xy")
# set x and y axis limits by specifying the minimum and the maximum ------------
plot(covviddf$view_count, covviddf$subscriber_count,
     main = "Comparison between channel subscribers and video views",
     xlab = "Views", ylab = "Subscribers", xlim = c(1, 80), ylim = c(1, 180))
Log scale result: (plot image not included)
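For the separate plot asked about in the question, one more option is to subset the data to the range of interest instead of only narrowing the axes. This is a minimal sketch, not the only way to do it: the counts below are made up for illustration and `zoomed` is a hypothetical name, while the 2,000,000 and 5,000,000 cut-offs are taken from the question.

```r
# --- made-up data on the scale described in the question (assumption)
view_count       <- c(5e4, 2e5, 4e5, 9e5, 1.5e6, 1.9e6, 3e7, 8e7)
subscriber_count <- c(1e5, 3e5, 8e5, 2e6, 3.5e6, 4.8e6, 6e7, 9e7)
covviddf <- data.frame(view_count, subscriber_count)

# keep only the rows inside the range of interest;
# covviddf itself is not modified, so the outliers are not deleted
zoomed <- subset(covviddf,
                 view_count <= 2e6 & subscriber_count <= 5e6)

# plot the subset, pinning the axes to the same range
plot(zoomed$view_count, zoomed$subscriber_count,
     main = "Subscribers vs. views (zoomed in)",
     xlab = "Views", ylab = "Subscribers",
     xlim = c(0, 2e6), ylim = c(0, 5e6))
```

Passing only xlim and ylim to the original plot gives the same picture; the subset is mainly useful if you also want to summarise or model the zoomed-in rows separately.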