How Do You Copy A Column and Paste Values Below Another Column in R?-CodePudding

I am looking to copy all the values of one column and paste them beneath an existing column within R. The thing is that these columns are on different sheets and it can't include adding a column, rather adding to the bottom of the existing column. There are values below those shown below and there are numerous other columns on both Sheets mentioned. It would replicate this:

Current:

SheetA$NumberOfClients         SheetB$ClientTotal
20                             16
10                             19
23                             12
24                             27
25                             31
26                             ...
19
17
...

Need to Get To:

SheetA$NumberOfClients
20
10
23
24
25
26
19
17
16
19
12
27
31
...

I have attempted to use rbind as well as the following code, but have received an "Error in libxlsxwriter: 'Worksheet row or column index out of range.'" when attempting to write the data frame as an xlsx file (note it is NOT too large of a file):

SheetA$NumberOfClients <- data.frame(SheetA$NumberOfClients, SheetB$ClientTotal)

CodePudding user response：

To append these to the column NumberOfClients in SheetA you will need more rows than SheetA currently has. This structure is weird, but if for whatever reason you wanted to do this, you could do:

SheetA[(nrow(SheetA) 1):(nrow(SheetA) nrow(SheetB)), "NumberOfClients"] <- SheetB$ClientTotal

Since there is no sample data, I cant test but in the example:

a <- data.frame(col1 = 1:20,
                col2 = LETTERS[1:20])
b <- data.frame(col1 = 1:10,
               col2 = LETTERS[1:10])

a[(nrow(a) 1):(nrow(a) nrow(b)), "col1"] <- b$col1

CodePudding user response：

Since it appears that your frames have other columns not included here, I think your easiest option is to use dplyr::bind_rows or data.table::rbindlist, since they are much more user-friendly when it comes to additional columns and/or different order of columns.

Reproducible data, with an extra column thrown in for good measure.

SheetA <- structure(list(NumberOfClients = c(20L, 10L, 23L, 24L, 25L, 26L, 19L, 17L), other = c(0, 0, 0, 0, 0, 0, 0, 0)), row.names = c(NA, -8L), class = "data.frame")
SheetB <- structure(list(ClientTotal = c(16L, 19L, 12L, 27L, 31L)), class = "data.frame", row.names = c(NA, -5L))

No row-combining function is going to be able to infer how you want to combine columns of different names, so you need to be explicit and change the names so they match:

names(SheetB) <- "NumberOfClients"

From here, using rbind itself, given your comment, is going to fail:

rbind(SheetA, SheetB)
# Error in rbind(deparse.level, ...) : 
#   numbers of columns of arguments do not match

base R

SheetB2 <- cbind(SheetB, subset(SheetA, select = setdiff(colnames(SheetA), colnames(SheetB)))[1,,FALSE][NA,,FALSE])
# Warning in data.frame(..., check.names = FALSE) :
#   row names were found from a short variable and have been discarded
rbind(SheetA, SheetB2)
#    NumberOfClients other
# 1               20     0
# 2               10     0
# 3               23     0
# 4               24     0
# 5               25     0
# 6               26     0
# 7               19     0
# 8               17     0
# 9               16    NA
# 10              19    NA
# 11              12    NA
# 12              27    NA
# 13              31    NA

The subset(.., setdiff(..)) part is to extra columns in SheetA that we are missing in SheetB. The [1,,FALSE] gives us just 1 row of that. The [NA,,FALSE] gives us the appropriate "NA" for each of those columns. In this case, that portion (before the call to cbind) effectively returns data.frame(other=NA_real_). The reason to go through this? R has at least six different types of NA, and this is just a clear/declarative way to make sure that we get the correct one. If we don't, then it's possible one of the new columns will be coerced to a class that is different from the original SheetA. (This might be a bit paranoid, but it is borne from many sessions of troubleshooting.)

dplyr

dplyr::bind_rows(SheetA, SheetB)
#    NumberOfClients other
# 1               20     0
# 2               10     0
# 3               23     0
# 4               24     0
# 5               25     0
# 6               26     0
# 7               19     0
# 8               17     0
# 9               16    NA
# 10              19    NA
# 11              12    NA
# 12              27    NA
# 13              31    NA

data.table

data.table::rbindlist(list(SheetA, SheetB), fill = TRUE)
#     NumberOfClients other
#               <int> <num>
#  1:              20     0
#  2:              10     0
#  3:              23     0
#  4:              24     0
#  5:              25     0
#  6:              26     0
#  7:              19     0
#  8:              17     0
#  9:              16    NA
# 10:              19    NA
# 11:              12    NA
# 12:              27    NA
# 13:              31    NA