Scatterplot of a binary variable (ggplot)

Time:09-22

I need some help making a scatterplot with ggplot. In the dataset below, I want the percent female on the x-axis and the unit variable on the y-axis, in two panels by conference year (see the picture for reference).

I tried subsetting the dataset to only females and then plotting the graph, but I was not sure how to do this.

Could someone help me?

Thanks!

structure(list(gender = c("Male", "Male", "Female", "Male", "Female", 
"Female", "Male", "Female", "Female", "Unknown"), race_ethnicity = c("Latino or Hispanic American", 
"Black, Afro-Caribbean, or African American", "Latino or Hispanic American", 
"East Asian or Asian American", "Latino or Hispanic American", 
"Non-Hispanic White or Euro-American", "Non-Hispanic White or Euro-American", 
"Non-Hispanic White or Euro-American", "Non-Hispanic White or Euro-American", 
"No Response"), year_of_birth = c("1979", "1976", "1981", "1977", 
"1985", "No Response", "No Response", "1961", "1978", "No Response"
), primary_field = c("American Politics", "American Politics", 
"American Politics", "American Politics", "American Politics", 
"American Politics", "American Politics", "American Politics", 
"International Politics", "No Response"), role_s = c("Chair Presenter Author", 
"Discussant", "Author", "Author", "Author", "Discussant", "Chair", 
"Discussant", "Author", "Author"), unit = c("Elections, Public Opinion, and Voting Behavior", 
"Elections, Public Opinion, and Voting Behavior", "Elections, Public Opinion, and Voting Behavior", 
"Elections, Public Opinion, and Voting Behavior", "Elections, Public Opinion, and Voting Behavior", 
"Political Communication", "Political Communication", "Political Communication", 
"Political Communication", "Political Communication"), conference_year = c(2017L, 
2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L
)), row.names = c(NA, 10L), class = "data.frame")

CodePudding user response:

For each year and unit, you can calculate the proportion of females and then draw a scatter plot with one facet per year.

library(dplyr)
library(ggplot2)

# Proportion of rows with gender == 'Female' for each year and unit
df %>%
  group_by(conference_year, unit) %>%
  summarise(percent_female = mean(gender == 'Female')) %>%
  ggplot(aes(unit, percent_female)) +
  geom_point() +
  facet_wrap(~conference_year)
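Since the question asks for percent female on the x-axis and unit on the y-axis, a variant with the axes swapped and the proportion scaled to 0-100 might look like the sketch below. The small `df` here is only a stand-in built from the gender, unit, and conference_year values in the question, and `plot_data` is a name introduced for illustration. Note that rows with gender "Unknown" stay in the denominator, which may or may not be what you want.

```r
library(dplyr)
library(ggplot2)

# Stand-in data: only the three columns the plot needs, values taken from the question
df <- data.frame(
  gender = c("Male", "Male", "Female", "Male", "Female",
             "Female", "Male", "Female", "Female", "Unknown"),
  unit = rep(c("Elections, Public Opinion, and Voting Behavior",
               "Political Communication"), each = 5),
  conference_year = rep(2017L, 10)
)

# Percent female (0-100) per year and unit; "Unknown" rows count in the denominator
plot_data <- df %>%
  group_by(conference_year, unit) %>%
  summarise(percent_female = 100 * mean(gender == "Female"), .groups = "drop")

# Percent on the x-axis, unit on the y-axis, one panel per conference year
p <- ggplot(plot_data, aes(x = percent_female, y = unit)) +
  geom_point() +
  scale_x_continuous(limits = c(0, 100)) +
  facet_wrap(~conference_year) +
  labs(x = "Percent female", y = "Unit")

print(p)
```

With the full dataset covering two conference years, `facet_wrap()` would produce the two panels described in the question; the sample shown only contains 2017, so it yields a single panel.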