Why does the y-axis of my stacked bar chart not match the values in my data frame?

Time:02-08

I have a data frame (190 x 27) where each sample has a certain proportion of different cell types. Below is a snapshot of this data frame.

# A tibble: 3 × 5
  projid   age_cat Monocytes_EPIC Neutrophils_EPIC Vascular_endothelial_cells
  <chr>    <fct>            <dbl>            <dbl>                      <dbl>
1 02525608 95-99            0.062            0.832                      0.066
2 03052480 85-89            0.075            0.762                      0.051
3 03283241 75-79            0.021            0.876                      0.031

I need to create a stacked bar chart where the x-axis is age category, the bars are colored/filled by cell type, and the y-axis reflects the proportion of each cell type within each age category.

I tried the code below and generated the following data frame and stacked bar chart:

df_long <- df %>% gather( CellType, Proportion, 3:27 )
df_long
# A tibble: 4,750 × 4
   projid   age_cat CellType       Proportion
   <chr>    <fct>   <chr>               <dbl>
 1 02525608 95-99   Monocytes_EPIC      0.062
 2 03052480 85-89   Monocytes_EPIC      0.075
 3 03283241 75-79   Monocytes_EPIC      0.021
 4 03430444 75-79   Monocytes_EPIC      0.032
 5 03806878 85-89   Monocytes_EPIC      0.127
 6 03889845 85-89   Monocytes_EPIC      0.025
 7 04576591 90-94   Monocytes_EPIC      0.129
 8 05498462 85-89   Monocytes_EPIC      0.021
 9 05522533 75-79   Monocytes_EPIC      0    
10 06073025 80-84   Monocytes_EPIC      0.07 
# … with 4,740 more rows
ggplot( df_long, aes( x = age_cat, y = Proportion, fill = CellType ) ) +
  geom_col( position = position_stack( ) ) +
  scale_fill_manual( values = mycolors, name = "Cell Type" ) +
  scale_y_continuous( expand = c( 0.01,1 ) ) +
  theme( axis.text.x = element_text( angle = -90 ) ) +
  xlab( "Age Category" ) +
  ylab( "Proportion" ) +
  labs( title = "Proportion of Cell Types" ) +
  theme( axis.text.x = element_text( angle = 0 ) )

[Image: Stacked Bar Chart]

As you can see, the Y-axis seems to reflect a count of the number of samples within each category, not the proportion of each cell type. Can someone please explain why this is happening and how I can obtain a similar graph where the Y-axis reflects the proportion of each cell type within each age category?

CodePudding user response:

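geom_col() stacks the Proportion value of every row, so each bar's height is the sum of Proportion over all samples and cell types in that age category. If each sample's proportions add up to about 1, that total is roughly the number of samples in the category, which is why the y-axis looks like a count. To show each cell type's share within an age category instead (each bar summing to 1.00), change the position argument of geom_col to "fill":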

geom_col(position = "fill")
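
For a self-contained illustration, here is a minimal sketch on a made-up three-sample data frame (the values and the two-column layout are assumptions for the example, not the asker's data). It uses pivot_longer() in place of gather() purely as a substitution for the sketch, and also computes the mean proportion per age category, which should give the same bar segments as position = "fill" as long as every sample's proportions sum to the same total.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Toy stand-in for the real data; all values are invented for illustration.
df <- tibble(
  projid           = c("a", "b", "c"),
  age_cat          = factor(c("75-79", "85-89", "85-89")),
  Monocytes_EPIC   = c(0.10, 0.20, 0.30),
  Neutrophils_EPIC = c(0.90, 0.80, 0.70)
)

# Reshape to one row per sample and cell type.
df_long <- df %>%
  pivot_longer(-c(projid, age_cat),
               names_to = "CellType", values_to = "Proportion")

# What position_stack() effectively plots: the total per age category.
# Here that is 1 for "75-79" (one sample) and 2 for "85-89" (two samples).
df_total <- df_long %>%
  group_by(age_cat) %>%
  summarise(Total = sum(Proportion), .groups = "drop")

# Explicit alternative: mean proportion of each cell type per age category.
df_mean <- df_long %>%
  group_by(age_cat, CellType) %>%
  summarise(Proportion = mean(Proportion), .groups = "drop")

# position = "fill" rescales each bar so its segments sum to 1.
p <- ggplot(df_long, aes(x = age_cat, y = Proportion, fill = CellType)) +
  geom_col(position = "fill") +
  labs(x = "Age Category", y = "Proportion",
       title = "Proportion of Cell Types")
```

Printing p draws the chart. Plotting df_mean with a plain geom_col() is an option if you would rather have the averaged values available in a data frame than rely on the rescaling done by "fill".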