Create columns if missing in dataframe with R (tidyverse)-CodePudding

I have a df with this structure

What I would like is to complete it as I was need it to conform to a format that goes on columns from 1 thru 6, so the expected result would be

id 1 2 3 4 5 6 
1  1 0 2 0 0 0
2  3 4 1 0 0 0
3  1 1 0 0 2 0

This is an example 'missing' columns can vary in this example it was 4 and 6, so the idea is that if the column is missing it will be created and filled with zeros.

Thanks!

CodePudding user response：

One way you could do this would be to reshape long, use tidyr::complete to get the range of column names, then reshape wide. Since id is unknown for the new column, I also drop the id = NA row.

Note, R doesn't always play well with numeric column names, and they are not considered syntactic. https://stat.ethz.ch/R-manual/R-devel/library/base/html/make.names.html

A syntactically valid name consists of letters, numbers and the dot or underline characters and starts with a letter or the dot not followed by a number.

But we can make a dataframe with numeric strings as the column names if we tell R not to check:

library(tidyverse)
data.frame(                 
  check.names = FALSE,          
           id = c(1L, 2L, 3L),
          `1` = c(1L, 3L, 1L),
          `2` = c(0L, 4L, 1L),
          `3` = c(2L, 1L, 0L),
          `5` = c(0L, 0L, 2L)
) %>% 
  pivot_longer(-id, names_transform = as.numeric) %>%
  complete(name = 1:6) %>%
  pivot_wider(names_from = name, values_from = value, values_fill = 0) %>%
  drop_na(id)

Result

# A tibble: 3 × 7
     id   `1`   `2`   `3`   `4`   `5`   `6`
  <int> <int> <int> <int> <int> <int> <int>
1     1     1     0     2     0     0     0
2     2     3     4     1     0     0     0
3     3     1     1     0     0     2     0