add a row to dataframe with value in specific columns in R

Time:09-21

Thanks in advance as I know this is an extremely basic question. Yet I am having trouble finding an answer.

I wanted to add a row to a dataframe where pre-defined columns receive a specific value. For example, if the original dataframe looks like this:

       A B C D
Label1 0 1 4 2
Label2 0 0 0 0
Label3 2 1 0 0

I want to add the value of "20" in a new row called New.Label only to columns C & D such that:

          A B  C  D
Label1    0 1  4  2
Label2    0 0  0  0
Label3    2 1  0  0
New.Label 0 0 20 20

In my actual situation, I have a .txt file containing hundreds of column names that exist in a matrix with many thousands of columns, so I'm looking for a much-needed shortcut :) I am most comfortable in R, so something that works in R would be really appreciated!

Thanks in advance and sorry again for the basic nature of the question!

CodePudding user response:

There are a few ways to do this; rbind with a named argument (the argument name becomes the row name) is probably the easiest.

I'm not exactly sure how this scales to your "actual situation", though. It may help to provide a slightly more realistic example (obviously not with the hundreds and thousands of columns).

mydata <- structure(list(A = c(0L, 0L, 2L), 
                         B = c(1L, 0L, 1L), 
                         C = c(4L, 0L, 0L), 
                         D = c(2L, 0L, 0L)), 
                    class = "data.frame", 
                    row.names = c("Label1", "Label2", "Label3"))

mydata

       A B C D
Label1 0 1 4 2
Label2 0 0 0 0
Label3 2 1 0 0


mydata <- rbind(mydata, New.Label = c(0, 0, 20, 20))

mydata

          A B  C  D
Label1    0 1  4  2
Label2    0 0  0  0
Label3    2 1  0  0
New.Label 0 0 20 20
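
For the larger case described in the question, where the target column names come from a text file, the same rbind call can be fed a vector built from those names. This is only a minimal sketch, assuming the file holds one column name per line; the temporary file and the names cols_file, target_cols and new_row are hypothetical stand-ins, not part of the original answer:

```r
mydata <- data.frame(A = c(0, 0, 2),
                     B = c(1, 0, 1),
                     C = c(4, 0, 0),
                     D = c(2, 0, 0),
                     row.names = c("Label1", "Label2", "Label3"))

# Stand-in for the real .txt file (assumption: one column name per line)
cols_file <- tempfile(fileext = ".txt")
writeLines(c("C", "D"), cols_file)
target_cols <- readLines(cols_file)

# 20 where the column is listed in the file, 0 everywhere else
new_row <- ifelse(colnames(mydata) %in% target_cols, 20, 0)

# The argument name becomes the row name, as in the rbind call above
mydata <- rbind(mydata, New.Label = new_row)
```

The same idea should carry over to a matrix, since rbind also takes row names from argument names there, but I have only tried it on this small data frame.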

CodePudding user response:

You can use the function bind_rows from the package dplyr. Instead of supplying a full-length vector, you build a one-row data frame that contains only the columns you want to set. This helps when you have a large number of columns and want values in only some of them, because the two data frames do not need to have the same columns; any column missing from the new row is filled with NA.

mydata = data.frame(A = c(0, 0, 2), 
                    B = c(1, 0, 1), 
                    C = c(4, 0, 0), 
                    D = c(2, 0, 0),
                    row.names = c("Label1", "Label2", "Label3"))

mydata

       A B C D
Label1 0 1 4 2
Label2 0 0 0 0
Label3 2 1 0 0


library(dplyr)
add_data = data.frame(C = 20, D = 20, row.names = "New.Label")
mydata <- bind_rows(mydata, add_data)

mydata

           A  B  C  D
Label1     0  1  4  2
Label2     0  0  0  0
Label3     2  1  0  0
New.Label NA NA 20 20

If you want to replace the NA values with 0, just add this line:

mydata[is.na(mydata)] <- 0


mydata

          A B  C  D
Label1    0 1  4  2
Label2    0 0  0  0
Label3    2 1  0  0
New.Label 0 0 20 20
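
One thing to watch: mydata[is.na(mydata)] <- 0 overwrites every NA in the data frame, including any that were already present in the original rows. If that matters for your data, the replacement can be limited to the new row. A minimal sketch in base R, with the post-bind_rows data frame rebuilt by hand so it runs without dplyr; the name na_cols is a hypothetical helper:

```r
# State of mydata right after bind_rows(), rebuilt by hand for this sketch
mydata <- data.frame(A = c(0, 0, 2, NA),
                     B = c(1, 0, 1, NA),
                     C = c(4, 0, 0, 20),
                     D = c(2, 0, 0, 20),
                     row.names = c("Label1", "Label2", "Label3", "New.Label"))

# Names of the columns that are NA in the new row only
na_cols <- colnames(mydata)[is.na(unlist(mydata["New.Label", ]))]

# Zero out just those cells; NAs in other rows (if any) are left alone
mydata["New.Label", na_cols] <- 0
```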