Suppose I have a data frame:
df<-data.frame(specie=c('apple','banana','orange'),total=c(1000,500,850),fresh=c(NA,250,700),processed=c(NA,250,150))
Note that for apple, only the total value is available. I would like to create a bar plot that shows both the fresh and processed values for each specie, and shows only the total value for apple, where the fresh and processed values are NA.
I tried the code below, but didn't get the plot I want:
library(dplyr)
library(ggplot2)
p <- df %>% ggplot(aes(x = total, y = specie))
p + geom_col() + geom_col(aes(x = fresh, fill = "fresh")) + geom_col(aes(x = processed, fill = "processed"))
Why does the total bar extend beyond the fresh and processed bars?
CodePudding user response:
So, you have three categories. Your code overlays the bars rather than stacking them: every geom_col() layer starts at zero, so ggplot draws the total bars first, then the fresh bars on top of them, and finally the processed bars on top of those. That is why the total bar extends beyond the other two. I imagine that you want stacked bars that add up to the totals. For that you will want a longer data frame with a column for the type, and you don't need the totals (except for apple, where only the total is known, so its type is NA). Then you map fill onto the type as you plot. Like so:
df<-data.frame(specie=c('apple','banana','orange', 'banana','orange'),type = c(NA, "fresh", "fresh", "processed","processed"), num=c(1000, 250, 700,250,150))
df %>% ggplot(aes(x = num, y = specie, fill = type)) + geom_col(position = "stack")
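To make the overlay-versus-stack difference concrete, here is a small base R check on the data frame from the question. It is only an illustrative sketch; the variable names `overlaid_max` and `stacked_len` are made up for this example.

```r
# Data frame from the question
df <- data.frame(specie = c("apple", "banana", "orange"),
                 total = c(1000, 500, 850),
                 fresh = c(NA, 250, 700),
                 processed = c(NA, 250, 150))

# Overlaid bars all start at zero, so the coloured part only reaches
# the larger of the two values (NA for apple)
overlaid_max <- pmax(df$fresh, df$processed)

# Stacked bars are placed end to end, so their combined length is the sum
stacked_len <- df$fresh + df$processed

print(data.frame(specie = df$specie, total = df$total,
                 overlaid_max, stacked_len))
```

For banana and orange the stacked length matches the total, while the overlaid length falls short of it, which is the gap visible in the original plot.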
CodePudding user response:
You need to reshape your data set so that the column names become the levels of a single variable and all the values are in the same column.
df<-data.frame(specie=c('apple','banana','orange'),unknow=c(1000,NA,NA),fresh=c(NA,250,700),processed=c(NA,250,150))
df <- df %>% tidyr::pivot_longer(cols = c("unknow", "fresh", "processed"),names_to = "type")
ggplot2::ggplot() +
  ggplot2::geom_bar(data = df,
                    mapping = ggplot2::aes(x = specie,
                                           y = value,
                                           fill = type),
                    stat = "identity")
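If you would rather not retype the data, the same long format can probably be derived from the original data frame in the question. The sketch below assumes that the total should be kept only for rows where both fresh and processed are NA; the column name `total_only` is a hypothetical label chosen for this example, not something from the question.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Original wide data frame from the question
df <- data.frame(specie = c("apple", "banana", "orange"),
                 total = c(1000, 500, 850),
                 fresh = c(NA, 250, 700),
                 processed = c(NA, 250, 150))

long <- df %>%
  # Keep the total only where no breakdown is available (assumption)
  mutate(total_only = if_else(is.na(fresh) & is.na(processed),
                              total, NA_real_)) %>%
  select(-total) %>%
  # One row per specie/type combination
  pivot_longer(cols = c(total_only, fresh, processed),
               names_to = "type", values_to = "value") %>%
  # Drop the empty combinations so ggplot does not warn about them
  filter(!is.na(value))

# Stacked horizontal bars: geom_col() stacks by default
p <- ggplot(long, aes(x = value, y = specie, fill = type)) +
  geom_col()
print(p)
```

With this layout apple gets a single bar for its total, while banana and orange get fresh and processed segments that stack up to their totals.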