Home > Software design >  Copying values from other columns based on the column names given in a single column in R
Copying values from other columns based on the column names given in a single column in R

Time:01-01

I have a big dataset in R, and one of the columns has specific column names (target) for each row, as you can see below. What I want to do is to check every row for the column name, copy the values from that specific column, and paste it back into the relevant row of the "target" column.

target  col1         col2      col3 
col1    green        one       dog
col1    pink         two       cat
col3    blue         three     spider  
col2    black        four      pig
col3    purple       five      elephant
col2    yellow       six       lion

This is my desired outcome:

target  col1         col2      col3 
green   green        one       dog
pink    pink         two       cat
spider  blue         three     spider  
four    black        four      pig
fox    purple       five       fox
six    yellow       six       lion

I tried the code below, but after first a few hundreds row, it copies the wrong values.

df$target[df$target=="[col1]"] <- df$col1

df$target[df$target=="[col2]"] <- df$col2

df$target[df$target=="[col3]"] <- df$col3

df$target[df$target=="[col4]"] <- df$col4

Any suggestions? Thanks in advance

CodePudding user response:

You can use sapply:

df$test <- sapply(1:nrow(df), 
                  function(x) df[x, 1] <- df[x, names(df) == df[x,1]])

Output:

#   target   col1  col2     col3     test
# 1   col1  green   one      dog    green
# 2   col1   pink   two      cat     pink
# 3   col3   blue three   spider   spider
# 4   col2  black  four      pig     four
# 5   col3 purple  five elephant elephant
# 6   col2 yellow   six     lion      six

Note I created a test column to check but you could just overwrite target

Data

df <- read.table(text = "target  col1         col2      col3 
col1    green        one       dog
col1    pink         two       cat
col3    blue         three     spider  
col2    black        four      pig
col3    purple       five      elephant
col2    yellow       six       lion", header = TRUE)

CodePudding user response:

The base vectorized approach: If you use a two-column numeric matrix with "[" you can get values for the rows and columns.

dat <- read.table(text="target  col1         col2      col3 
col1    green        one       dog
col1    pink         two       cat
col3    blue         three     spider  
col2    black        four      pig
col3    purple       five      elephant
col2    yellow       six       lion", head=TRUE)
# extract col number from first column and add one to offset it
# then use that as the column numer for 
dat[ matrix( c( 1:nrow(dat), as.numeric(gsub("col", "", dat[[1]])) 1 ), ncol=2) ]
 [1] "green"    "pink"     "spider"   "four"     "elephant" "six"    

Then just assign it to column "target".

>  dat[, 1] <- dat[ matrix( c( 1:nrow(dat), as.numeric(gsub("col", "", dat[[1]])) 1 ), ncol=2) ]
> dat
    target   col1  col2     col3
1    green  green   one      dog
2     pink   pink   two      cat
3   spider   blue three   spider
4     four  black  four      pig
5 elephant purple  five elephant
6      six yellow   six     lion
  •  Tags:  
  • r
  • Related