Split vector by each NA in R

Time:12-04

I have the following vector called input:

input <- c(1,2,1,NA,3,2,NA,1,5,6,NA,2,2)

[1]  1  2  1 NA  3  2 NA  1  5  6 NA  2  2

I would like to split this vector into multiple vectors by each NA. So the desired output should look like this:

> output
[[1]]
[1] 1 2 1

[[2]]
[1] 3 2

[[3]]
[1] 1 5 6

[[4]]
[1] 2 2

As you can see, every time an NA appears, a new vector starts. Does anyone know how to split a vector at each NA into multiple vectors?

CodePudding user response:

One way is as follows:

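  1. identify the NAs with is.na()
  2. take the cumulative sum of that logical vector with cumsum(), so each NA starts a new group id
  3. split the vector according to the cumulative sums
  4. remove the NAs from each piece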
input <- c(1,2,1,NA,3,2,NA,1,5,6,NA,2,2)
tmp <- cumsum(is.na(input))
lapply(split(input, tmp), na.omit)
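
To make the steps above concrete, here is a minimal sketch that exposes the intermediate group ids. The names grp and pieces are only illustrative. It also swaps na.omit for plain logical indexing, on the assumption that bare vectors are wanted: na.omit attaches an "na.action" attribute to every piece it actually removes an NA from.

```r
input <- c(1, 2, 1, NA, 3, 2, NA, 1, 5, 6, NA, 2, 2)

# Each NA bumps the running count, so every element gets a group id.
# group ids: 0 0 0 1 1 1 2 2 2 2 3 3 3
grp <- cumsum(is.na(input))

# Split by group id, then drop the NA that leads each later piece.
# Logical indexing is used instead of na.omit to avoid extra attributes.
pieces <- lapply(split(input, grp), function(x) x[!is.na(x)])

# split() names the pieces "0", "1", ...; unname() gives [[1]], [[2]], ...
pieces <- unname(pieces)
```

After unname(), the list prints with [[1]] to [[4]] as in the desired output.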

CodePudding user response:

Using a similar logic to @tpetzoldt, but removing the NAs before the split:

split(na.omit(input), cumsum(is.na(input))[!is.na(input)])

$`0`
[1] 1 2 1

$`1`
[1] 3 2

$`2`
[1] 1 5 6

$`3`
[1] 2 2
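
The order of the two operations matters once the input has leading or consecutive NAs. The sketch below uses a made-up vector x (not from the question) to compare both orders; the names keep, grp, drop_first and split_first are illustrative only.

```r
x <- c(NA, 1, NA, NA, 2)  # hypothetical input: leading and consecutive NAs

keep <- !is.na(x)
grp  <- cumsum(is.na(x))

# Drop NAs first: a group with no remaining values never shows up.
drop_first <- unname(split(x[keep], grp[keep]))

# Split first: a group made only of NA survives as an empty vector.
split_first <- unname(lapply(split(x, grp), function(v) v[!is.na(v)]))

length(drop_first)   # 2 pieces
length(split_first)  # 3 pieces, the middle one is numeric(0)
```

Which behavior is preferable depends on whether empty pieces carry meaning in your data.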

CodePudding user response:

This one is more verbose and arguably overcomplicated, but I find it easier to reason about such problems this way, so I wanted to share it:

library(tidyverse)

tibble(input) %>% 
  group_by(id = cumsum(is.na(input))) %>% 
  na.omit %>% 
  group_split() %>% 
  map(~ select(.x, -id)) %>% 
  map(pull)

[[1]]
[1] 1 2 1

[[2]]
[1] 3 2

[[3]]
[1] 1 5 6

[[4]]
[1] 2 2

CodePudding user response:

Here's a solution using recursion:

split_by_na <- function(x) {
  sep <- match(NA, x)
  
  if (is.na(sep)) {
    list(x)
  } else {
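    # seq_len() yields an empty index when the NA is the first element
    head <- x[seq_len(sep - 1)]
    # a negative index drops everything up to and including the NA
    tail <- x[-seq_len(sep)]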
    
    append(list(head), split_by_na(tail))
  }
}

split_by_na(input)
# [[1]]
# [1] 1 2 1
# 
# [[2]]
# [1] 3 2
# 
# [[3]]
# [1] 1 5 6
# 
# [[4]]
# [1] 2 2
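
For vectors containing many NAs, a recursive function may run into R's nesting limits, since each NA adds one level of recursion. As a rough sketch under that assumption, the same idea can be written without recursion by computing the start and end index of every piece up front. The helper name split_at_na is hypothetical and not part of any package.

```r
# Hypothetical helper: split at each NA without recursion.
split_at_na <- function(x) {
  na_pos <- which(is.na(x))
  starts <- c(1, na_pos + 1)          # first index of each piece
  ends   <- c(na_pos - 1, length(x))  # last index of each piece

  # seq_len() keeps a piece empty when an NA is leading, trailing,
  # or directly followed by another NA (end is then start - 1).
  Map(function(s, e) x[seq_len(e - s + 1) + s - 1], starts, ends)
}

input <- c(1, 2, 1, NA, 3, 2, NA, 1, 5, 6, NA, 2, 2)
result <- split_at_na(input)
```

For the input from the question, result holds the same four vectors as the output shown above.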