How to graph all the occurrences of NA by bins of positions

Time:12-19

I am a little new to R, and I would like to plot the occurrences of NA against position as a histogram. I am starting to use ggplot, but I'm not sure where to begin.

df <- data.frame(position=c(3, 5, 6, 7, 16, 17, 19, 20, 21, 35, 46, 78, 98, 100, 
                            145, 146, 147), 
                 petal15=c(0, 1, 1, 1, 0, NA, 1, NA, NA, NA, 1, 0, 0, 1, NA, 
                           0, NA), 
                 petal20=c(1, 0, 0, 1, 1, 1, 1, 1, NA, NA, 1, 1, NA, NA, 0, 
                           0, 0))

CodePudding user response:

If I understand you correctly, you are looking for this (a histogram of the number of NAs per row):

hist(rowSums(is.na(df)), xlab='occurrences', col=4, main="NA's")
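
To see what that histogram is built from, it may help to look at the per-row counts directly. This is only a small sketch on the question's data; na_per_row is a name I made up for illustration.

```r
df <- data.frame(
  position = c(3, 5, 6, 7, 16, 17, 19, 20, 21, 35, 46, 78, 98, 100, 145, 146, 147),
  petal15  = c(0, 1, 1, 1, 0, NA, 1, NA, NA, NA, 1, 0, 0, 1, NA, 0, NA),
  petal20  = c(1, 0, 0, 1, 1, 1, 1, 1, NA, NA, 1, 1, NA, NA, 0, 0, 0)
)

# is.na(df) is a logical matrix; rowSums() counts the TRUEs in each row
na_per_row <- rowSums(is.na(df))

# How many rows have 0, 1 or 2 missing values
table(na_per_row)
# na_per_row
# 0 1 2
# 9 6 2
```

Note that this counts NAs per row and ignores where along position they occur.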


Edit

To implement the idea from your comment, we first build a list holding the positions that belong to each bin of ten.

(pos <- lapply((0:14)*10 + 1, \(x) (0:9) + x))
# [[1]]
# [1]  1  2  3  4  5  6  7  8  9 10
# 
# [[2]]
# [1] 11 12 13 14 15 16 17 18 19 20
#
# [...]
#
# [[15]]
# [1] 141 142 143 144 145 146 147 148 149 150
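
As a possible alternative to listing the positions by hand, base R's cut() can assign each position to a bin. This is a sketch, not part of the approach above; the bin column and na15_by_bin are hypothetical names, and I am assuming right-closed bins (0,10], (10,20], and so on, which match 1-10, 11-20, ... for integer positions.

```r
df <- data.frame(
  position = c(3, 5, 6, 7, 16, 17, 19, 20, 21, 35, 46, 78, 98, 100, 145, 146, 147),
  petal15  = c(0, 1, 1, 1, 0, NA, 1, NA, NA, NA, 1, 0, 0, 1, NA, 0, NA),
  petal20  = c(1, 0, 0, 1, 1, 1, 1, 1, NA, NA, 1, 1, NA, NA, 0, 0, 0)
)

# Assumption: bins of width 10 covering 1..150, closed on the right
df$bin <- cut(df$position, breaks = seq(0, 150, by = 10))

# Count the NAs of one column per bin; table() keeps empty bins as 0
na15_by_bin <- table(df$bin[is.na(df$petal15)])
na15_by_bin
```

Because bin is a factor with all 15 levels, bins without any NA still show up with a count of zero.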

Using sapply we create the desired matrix and pipe it to barplot().

sapply(pos, \(i) colSums(is.na(df[df$position %in% i, -1]))) |>
  barplot(beside=TRUE, xlim=c(1, 48), ylim=c(0, 2.5), col=3:4,
          xlab='Occurrences', ylab='Freq', main="NA's",
          legend.text=names(df[-1]), args.legend=list(x='topright'),
          names.arg=Reduce(\(x, y) paste0(x, '-', y), 
                           as.data.frame(t(sapply(pos, range)))),
          cex.names=.8) 
box()
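
If you want to check the numbers before plotting, the matrix can be stored and inspected first. A minimal sketch on the same data; na_mat is a name chosen here for illustration, and function(...) is used instead of the \(...) shorthand so it also runs on R versions before 4.1.

```r
df <- data.frame(
  position = c(3, 5, 6, 7, 16, 17, 19, 20, 21, 35, 46, 78, 98, 100, 145, 146, 147),
  petal15  = c(0, 1, 1, 1, 0, NA, 1, NA, NA, NA, 1, 0, 0, 1, NA, 0, NA),
  petal20  = c(1, 0, 0, 1, 1, 1, 1, 1, NA, NA, 1, 1, NA, NA, 0, 0, 0)
)

# Positions per bin: 1-10, 11-20, ..., 141-150
pos <- lapply((0:14) * 10 + 1, function(x) (0:9) + x)

# One column per bin, one row per petal variable
na_mat <- sapply(pos, function(i) colSums(is.na(df[df$position %in% i, -1])))

# Label the columns with the bin ranges for readability
colnames(na_mat) <- sapply(pos, function(i) paste0(min(i), "-", max(i)))
na_mat
```

Each row of na_mat becomes one group of bars in the barplot, so a quick look at it shows which bins contain NAs at all.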


Note (the native pipe |> and the \(x) function shorthand used above require R 4.1.0 or later):

R.version.string
# [1] "R version 4.1.2 (2021-11-01)"

Data:

df <- structure(list(position = c(3, 5, 6, 7, 16, 17, 19, 20, 21, 35, 
46, 78, 98, 100, 145, 146, 147), petal15 = c(0, 1, 1, 1, 0, NA, 
1, NA, NA, NA, 1, 0, 0, 1, NA, 0, NA), petal20 = c(1, 0, 0, 1, 
1, 1, 1, 1, NA, NA, 1, 1, NA, NA, 0, 0, 0)), class = "data.frame", row.names = c(NA, 
-17L))

CodePudding user response:

A tidyverse solution: pivot to long format, keep only the rows whose value is NA, then plot a histogram of their positions as suggested above, but with ggplot.

library(tidyverse)

df_na <- 
  df %>%
  pivot_longer(-position) %>%
  filter(is.na(value)) 
  
ggplot(df_na, aes(position, fill = name)) +
  geom_histogram(binwidth = 5, position = position_dodge()) +
  scale_x_continuous(breaks = seq(0, 150, 10))

Created on 2021-12-19 by the reprex package (v2.0.1)
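
If the bins should line up with 1-10, 11-20 and so on, one option might be to pass explicit breaks to geom_histogram() instead of a binwidth. This is a sketch under that assumption, using the same data; p is just a name for the plot object.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

df <- data.frame(
  position = c(3, 5, 6, 7, 16, 17, 19, 20, 21, 35, 46, 78, 98, 100, 145, 146, 147),
  petal15  = c(0, 1, 1, 1, 0, NA, 1, NA, NA, NA, 1, 0, 0, 1, NA, 0, NA),
  petal20  = c(1, 0, 0, 1, 1, 1, 1, 1, NA, NA, 1, 1, NA, NA, 0, 0, 0)
)

# Long format: one row per (position, variable); keep only the missing values
df_na <- df %>%
  pivot_longer(-position) %>%
  filter(is.na(value))

# Assumption: bin edges at 0, 10, ..., 150 so the bars match bins of ten
p <- ggplot(df_na, aes(position, fill = name)) +
  geom_histogram(breaks = seq(0, 150, 10), position = position_dodge()) +
  scale_x_continuous(breaks = seq(0, 150, 10))
p
```

With explicit breaks the bin edges coincide with the axis ticks, which makes the bars easier to read than bins of width 5 centred on multiples of 5.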
