I'm processing multiple files in R. Some have 3 columns:
id v_1 v_2 v_3
1 1 2 3
Others have more, some fewer:
id v_1 v_2
1 1 7
Is it possible to "force" every file to have 10 columns (not counting the id column), with the extra ones left without values, so that you get the following?
For the first example:
id v_1 v_2 v_3 v_4 v_5 v_6 v_7 v_8 v_9 v_10
1 1 2 3
And for the second example:
id v_1 v_2 v_3 v_4 v_5 v_6 v_7 v_8 v_9 v_10
1 1 7
So basically, regardless of the number of columns in the input file, you always get 10 columns.
CodePudding user response:
Here is a possibility using bind_rows, where you first create an empty data frame with the desired columns and then bind it to your data:
library(dplyr)
df_cols <- setNames(data.frame(matrix(ncol = 11, nrow = 0)), c("id", paste0("v_", 1:10)))
bind_rows(df_cols, df)
Another (probably faster) option is to use data.table:
library(data.table)
rbindlist(list(df_cols, df), fill = TRUE)
Another option is to use plyr:
plyr::rbind.fill(df_cols, df)
Output
id v_1 v_2 v_3 v_4 v_5 v_6 v_7 v_8 v_9 v_10
1 1 1 2 3 NA NA NA NA NA NA NA
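If you would rather avoid extra packages, the same padding can be sketched in base R. This is only a minimal sketch, not part of the approaches above: pad_columns is a hypothetical helper name, and it assumes the columns are always called id and v_1 to v_10. Note that it also drops any column outside that set.

```r
# Hypothetical helper: pad a data frame so it has id plus v_1..v_n.
pad_columns <- function(df, n = 10) {
  wanted <- c("id", paste0("v_", seq_len(n)))
  missing_cols <- setdiff(wanted, names(df))
  # Add each absent column, filled with NA
  if (length(missing_cols) > 0) {
    df[missing_cols] <- NA
  }
  # Reorder (and restrict) to the wanted layout so every input looks the same
  df[wanted]
}

df <- read.table(text = "id v_1 v_2 v_3
1 1 2 3", header = TRUE)
padded <- pad_columns(df)
padded
```

Because the result is indexed by the full list of wanted names, the column order is the same for every input, whichever columns were originally present.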
If you have a list of data frames, then you could use map to update all the data frames in the list:
library(tidyverse)
map(list(df, df2), bind_rows, df_cols)
#[[1]]
# id v_1 v_2 v_3 v_4 v_5 v_6 v_7 v_8 v_9 v_10
#1 1 1 2 3 NA NA NA NA NA NA NA
#[[2]]
# id v_1 v_2 v_3 v_4 v_5 v_6 v_7 v_8 v_9 v_10
#1 1 1 2 NA NA NA NA NA NA NA NA
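Since the question is about files, here is a rough sketch of how the padding could be applied while reading them. The temporary directory, the file names a.txt and b.txt, and the .txt pattern are all assumptions made so the example can run on its own; replace them with your own paths. It uses base R only.

```r
# Write two small example files to a temporary directory
# (stand-ins for the real input files)
tmp_dir <- tempfile("pad_demo_")
dir.create(tmp_dir)
writeLines(c("id v_1 v_2 v_3", "1 1 2 3"), file.path(tmp_dir, "a.txt"))
writeLines(c("id v_1 v_2", "1 1 7"), file.path(tmp_dir, "b.txt"))

# Target layout: id plus v_1..v_10
wanted <- c("id", paste0("v_", 1:10))

files <- list.files(tmp_dir, pattern = "\\.txt$", full.names = TRUE)
padded <- lapply(files, function(path) {
  d <- read.table(path, header = TRUE)
  # Add any column that this file does not have, filled with NA
  missing_cols <- setdiff(wanted, names(d))
  if (length(missing_cols) > 0) {
    d[missing_cols] <- NA
  }
  d[wanted]
})
names(padded) <- basename(files)
padded
```

Each element of padded then has the same 11 columns, so the list can be stacked with do.call(rbind, padded) if a single table is needed.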
Data
df <- read.table(text = "id v_1 v_2 v_3
1 1 2 3", header = TRUE)
df2 <- read.table(text = "id v_1 v_2
1 1 2", header = TRUE)