Plotting in R ggplot- beginner

Time:04-07

I have a project at work where I am trying to plot data on movies.

My goal is to plot 'definite interest' and 'total awareness' together on the Y-axis against 'estimated admissions' on the X-axis. I will be using ggplot2, with the two Y series in different colors.

The main issue I am having is filtering the movies by 'year' and 'window' in ggplot. The window I want is T-0 and the year of release should be 2018. Since I only know how to map values to the X- and Y-axes in ggplot without conditions, I could use some guidance.

How should I go about filtering the data and plotting the X and Y's? The excel sheet is attached. Also, I understand the code may not be correct, but I come from a Java background, so no idea what I am doing.

Excel input

If my_data$window = T-0 {
  my_data$window <- TRUE 
} else { 
FALSE 
}

I expected this to set 'window' to TRUE where the window is T-0. I attempted the same for 'year'.

x <- my_data$estimatedAdmissions
y1 <- my_data$definiteInterest 
y2 <- my_data$totalAwareness
plot(x, y1, y2, filter 1, filter 2)

I expect to plot those values and will change y1 & y2 to different colors later.

CodePudding user response:

Please provide a reproducible example next time - we can't copy your screenshot into R, so I had to recreate your dataset myself like so.

dat <- data.frame(year = 2018,
                  title = rep(c("a", "b"), each = 6),
                  estimatedAdmissions = rep(c(287351, 29518), each = 6),
                  window = rep(c("T+2", "T+1", "T-0", "T-1", "T-2", "T-3"), 2),
                  definiteInterest = c(13, 15, 25, 22, 16, 27, 15, 14, 23, 20, 28, 29),
                  totalAwareness = round(runif(n = 12, min = 4, max = 31)))

You can use subset() to filter rows, and then ggplot() with geom_point() to create a scatterplot. ggplot2 works best with data in long format, so I first reshaped your data with pivot_longer() from tidyr.

library(tidyr)
library(ggplot2)

dat |>
  subset(year == 2018 & window == "T-0") |>
  pivot_longer(cols = c(definiteInterest, totalAwareness),
               names_to = "measure", values_to = "percentage") |>
  ggplot(aes(x = estimatedAdmissions, y = percentage, colour = measure)) +
  geom_point()
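
Since you mention coming from Java, it may help to see why no if/else is needed for the filter. In R a comparison on a column is evaluated for every row at once and returns a vector of TRUE/FALSE values, which can then be used to pick rows. Below is a minimal sketch of that idea in base R. The small my_data frame is made up for illustration (it is not your spreadsheet), and the names keep, filtered and same are only placeholders.

```r
# Stand-in data frame; the values are invented for illustration only.
my_data <- data.frame(
  year   = c(2018, 2018, 2017, 2018),
  window = c("T-0", "T-1", "T-0", "T-0"),
  estimatedAdmissions = c(287351, 287351, 50000, 29518),
  definiteInterest    = c(25, 22, 30, 23)
)

# A comparison on a column yields one TRUE/FALSE per row.
# `&` combines the two conditions element by element.
keep <- my_data$year == 2018 & my_data$window == "T-0"

# Indexing the rows with that logical vector keeps only the TRUE rows.
filtered <- my_data[keep, ]

# subset() expresses the same filter without repeating `my_data$`.
same <- subset(my_data, year == 2018 & window == "T-0")
```

Note the double `==` for comparison and the quotes around "T-0". Without quotes R would try to evaluate T-0 as arithmetic (T is shorthand for TRUE), which is not what you want.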