How to add a new row with an extra column in R?

Time:03-13

I am trying to add the results of a for loop to a data frame as new rows, but I get an error whenever a result has more columns than the original data frame. How can I append such a result so that the extra column names are also added to the original data frame?

e.g. original dataframe:

   A B C
x1 1 1 1
x2 2 2 2
x3 3 3 3

I want to get

   A B C  D
x1 1 1 1 NA
x2 2 2 2 NA
x3 3 3 3 NA
x4 4 4 4  4

I tried rbind (Error in rbind(deparse.level, ...) : numbers of columns of arguments do not match), rbind.fill (Error: All inputs to rbind.fill must be data.frames), and bind_rows (Argument 2 must have names).

CodePudding user response:

In base R, this can be done by creating a new column 'D' filled with NA and then assigning the new row 'x4' the value 4, which is recycled across all columns.

df1$D <- NA
df1['x4', ] <- 4

Output:

> df1
   A B C  D
x1 1 1 1 NA
x2 2 2 2 NA
x3 3 3 3 NA
x4 4 4 4  4

Or in a single line

rbind(cbind(df1, D = NA), x4 = 4)
   A B C  D
x1 1 1 1 NA
x2 2 2 2 NA
x3 3 3 3 NA
x4 4 4 4  4
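
The two snippets above hard-code the column name 'D' and the value 4. For a loop whose results may bring arbitrary new columns, the same idea can be generalized. The following is a minimal base R sketch, assuming each loop result is a named vector; the helper name add_row_with_new_cols is hypothetical and not part of the original answer.

```r
# Hypothetical helper (not from the answer): append a named vector as a new row,
# first creating any columns that the data frame does not have yet.
add_row_with_new_cols <- function(df, new_row, row_name) {
  # Columns that appear in new_row but not in df are added and filled with NA
  extra <- setdiff(names(new_row), names(df))
  for (col in extra) df[[col]] <- NA
  # Assign by column name, so the order of new_row does not matter
  df[row_name, names(new_row)] <- as.list(new_row)
  df
}

df1 <- data.frame(A = 1:3, B = 1:3, C = 1:3,
                  row.names = c("x1", "x2", "x3"))
res <- add_row_with_new_cols(df1, c(A = 4, B = 4, C = 4, D = 4), "x4")
res
```

Growing a data frame one row at a time is convenient but can be slow for long loops; collecting the results first and binding once is usually the cheaper design.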

Regarding the error in bind_rows: it occurs when the output of the for loop is not a named vector.

library(dplyr)
> vec1 <- c(4, 4, 4, 4)
> bind_rows(df1, vec1)
Error: Argument 2 must have names.
Run `rlang::last_error()` to see where the error occurred.

If it is a named vector, it works. Note that the appended row gets an auto-generated name (...4 below) rather than x4.

> vec1 <- c(A = 4, B = 4, C = 4, D = 4)
> bind_rows(df1, vec1)
     A B C  D
x1   1 1 1 NA
x2   2 2 2 NA
x3   3 3 3 NA
...4 4 4 4  4
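
If the loop produces unnamed vectors, names can be attached before binding. This is a small sketch using base R's setNames; it assumes the columns are named with consecutive capital letters, as in the example data, which may not hold for real data.

```r
# Unnamed loop result, as in the failing bind_rows example
vec1 <- c(4, 4, 4, 4)

# Assumption: columns are named A, B, C, ... in order of position
named_vec <- setNames(vec1, LETTERS[seq_along(vec1)])
named_vec
```

The resulting named_vec has the same form as the named vector passed to bind_rows above.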

Data:

df1 <- structure(list(A = 1:3, B = 1:3, C = 1:3), 
class = "data.frame", row.names = c("x1", 
"x2", "x3"))

CodePudding user response:

If you collect the results of your for loop in a list, you probably have something like this:

(l <- list(x1, x2, x3, x4, x5))
# [[1]]
# [1] 1 1 1
# 
# [[2]]
# [1] 2 2 2 2
# 
# [[3]]
# [1] 3 3
# 
# [[4]]
# [1] 4
# 
# [[5]]
# NULL

Multiple elements can be row-bound using a do.call(rbind, .) approach. Your problem is how to rbind multiple elements that differ in length.

There is a `length<-` function that adjusts the length of a vector, padding it with NA when the new length is larger. To find the target length, another function, lengths, gives you the length of each list element; you are interested in the maximum.
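
To see the two building blocks in isolation, here is a short sketch on a small list made up for illustration (it is not the list l from this answer):

```r
# Illustrative list with elements of different lengths
ex <- list(c(1, 1, 1), c(2, 2, 2, 2), c(3, 3))

# lengths() returns the length of every list element
lens <- lengths(ex)
ml <- max(lens)

# `length<-` called as an ordinary function pads the vector with NA
padded <- `length<-`(c(3, 3), ml)
padded
```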

I also handle the special case of an element that is NULL (the 5th element of l): since the length of NULL cannot be changed, those elements are replaced with NA first.

So altogether you may do:

do.call(rbind, lapply(replace(l, lengths(l) == 0L, NA), `length<-`, max(lengths(l))))
#       [,1] [,2] [,3] [,4]
# [1,]    1    1    1   NA
# [2,]    2    2    2    2
# [3,]    3    3   NA   NA
# [4,]    4   NA   NA   NA
# [5,]   NA   NA   NA   NA

Or, since you probably want a data frame with pretty row and column names:

ml <- max(lengths(l))
do.call(rbind, lapply(replace(l, lengths(l) == 0L, NA), `length<-`, ml)) |>
  as.data.frame() |> `dimnames<-`(list(paste0('x', seq_along(l)), LETTERS[seq_len(ml)]))
#     A  B  C  D
# x1  1  1  1 NA
# x2  2  2  2  2
# x3  3  3 NA NA
# x4  4 NA NA NA
# x5 NA NA NA NA

Note: This uses the native pipe |>, which requires R >= 4.1.
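
The steps above can also be wrapped in a function and combined with a loop that stores its results in a list. This is only a sketch: the name bind_ragged, its prefix argument, and the loop body are hypothetical, and the column naming assumes at most 26 columns. It avoids the pipe, so it should also run on older R versions.

```r
# Hypothetical helper wrapping the padding and binding steps
bind_ragged <- function(l, prefix = "x") {
  ml <- max(lengths(l))
  # NULL elements cannot be lengthened, so swap them for NA first
  l <- replace(l, lengths(l) == 0L, NA)
  m <- do.call(rbind, lapply(l, `length<-`, ml))
  df <- as.data.frame(m)
  # Assumption: at most 26 columns, named A, B, C, ...
  dimnames(df) <- list(paste0(prefix, seq_along(l)), LETTERS[seq_len(ml)])
  df
}

# Collect the loop results in a preallocated list, then bind once at the end
results <- vector("list", 4)
for (i in 1:4) {
  results[[i]] <- rep(i, i)  # stand-in for the real loop body
}
res <- bind_ragged(results)
res
```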


Data:

x1 <- rep(1, 3); x2 <- rep(2, 4); x3 <- rep(3, 2); x4 <- rep(4, 1); x5 <- NULL