R: How do I replace a column with another column from a different dataframe?

I want to replace the 6th column of the ped dataframe with the Phenotype column from the bipolar_ctl dataframe.

My attempt:

dplyr::mutate_at(ped, vars(-one_of("bipolar_ctl$Phenotype")))

CodePudding user response：

Using base R,

ped[,6] = bipolar_ctl$Phenotype

CodePudding user response：

Another option, where sixth_col stands for the actual name of the sixth column:

ped$sixth_col = bipolar_ctl$Phenotype
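
A caveat on the `$` form: it only replaces the sixth column if `sixth_col` really is that column's name. Otherwise R quietly appends a new column. A minimal sketch with made-up toy data (the column names and values are assumptions, not the asker's data):

```r
# Toy stand-ins for the asker's data; names and values are hypothetical.
ped <- data.frame(id1 = 1:3, a = 4:6, b = 7:9, c = 10:12, d = 13:15, old = c(0, 0, 0))
bipolar_ctl <- data.frame(id2 = 1:3, Phenotype = c(1, 2, 1))

# A name that does not exist yet appends a seventh column.
appended <- ped
appended$sixth_col <- bipolar_ctl$Phenotype

# Looking the name up by position overwrites column 6 in place.
replaced <- ped
replaced[[names(ped)[6]]] <- bipolar_ctl$Phenotype

ncol(appended)  # 7
ncol(replaced)  # 6
```

Looking the name up with `names(ped)[6]` avoids having to hard-code a column name that may differ between data sets.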

CodePudding user response：

Beware that the rows of the two data sets may be in different orders, which can easily happen over the course of a long script.

You can account for this with match, which aligns the rows by id:

ped[,6] <- bipolar_ctl[match(ped$id1, bipolar_ctl$id2), ]$Phenotype
ped
#   id1        V2        V3        V4        V5        V6        V7
# 1   1 0.9148060 0.5190959 0.4577418 0.9400145 0.4357716 0.5142118
# 2   2 0.9370754 0.7365883 0.7191123 0.9782264 0.9066014 0.3902035
# 3   3 0.2861395 0.1346666 0.9346722 0.1174874 0.6117786 0.9057381
# 4   4 0.8304476 0.6569923 0.2554288 0.4749971 0.2076590 0.4469696
# 5   5 0.6417455 0.7050648 0.4622928 0.5603327 0.3795592 0.8360043
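
To see why this works: `match(x, table)` returns, for each element of `x`, the position of its first occurrence in `table`, or `NA` if there is none. A short sketch with made-up ids (hypothetical values, not the data used in this answer):

```r
ped_ids <- c(3, 1, 9)          # hypothetical ids in ped
ctl_ids <- c(1, 2, 3)          # hypothetical ids in bipolar_ctl
phenotype <- c("a", "b", "c")  # one value per ctl id

# Position of each ped id within the ctl ids.
idx <- match(ped_ids, ctl_ids)
idx             # 3 1 NA
phenotype[idx]  # "c" "a" NA
```

Ids that are missing from bipolar_ctl therefore yield NA rather than an error, so checking `anyNA(idx)` before assigning may be worthwhile.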

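Or use merge, which joins the two data frames on their id columns. Note that Phenotype then ends up as the last column rather than the sixth.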

merge(ped[-6], bipolar_ctl[c('id2', 'Phenotype')], by.x='id1', by.y='id2')
#   id1        V2        V3        V4        V5        V7 Phenotype
# 1   1 0.9148060 0.5190959 0.4577418 0.9400145 0.5142118 0.4357716
# 2   2 0.9370754 0.7365883 0.7191123 0.9782264 0.3902035 0.9066014
# 3   3 0.2861395 0.1346666 0.9346722 0.1174874 0.9057381 0.6117786
# 4   4 0.8304476 0.6569923 0.2554288 0.4749971 0.4469696 0.2076590
# 5   5 0.6417455 0.7050648 0.4622928 0.5603327 0.8360043 0.3795592
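
One thing to watch with merge: by default it keeps only rows whose id appears in both data frames, so rows of ped without a match are dropped. Passing `all.x = TRUE` keeps them and fills Phenotype with NA. A minimal sketch with hypothetical toy data:

```r
# Toy data; id 2 has no counterpart in bipolar_ctl (values are made up).
ped <- data.frame(id1 = c(1, 2, 3), V2 = c(0.1, 0.2, 0.3))
bipolar_ctl <- data.frame(id2 = c(3, 1), Phenotype = c("case", "control"))

# Default: only ids present in both data frames survive.
inner <- merge(ped, bipolar_ctl, by.x = "id1", by.y = "id2")

# all.x = TRUE keeps every row of ped; unmatched rows get NA.
left <- merge(ped, bipolar_ctl, by.x = "id1", by.y = "id2", all.x = TRUE)

nrow(inner)  # 2
nrow(left)   # 3
```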

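Alternatively, guard the plain assignment with stopifnot, which stops the script if the ids are not identical and in the same order: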

stopifnot(identical(ped$id1, bipolar_ctl$id2))
# Error: identical(ped$id1, bipolar_ctl$id2) is not TRUE
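
One thing to keep in mind with this check: identical also compares storage type, so an integer id vector and a double id vector with the same values in the same order are still not identical. In the sample data below, id1 ends up as double (it passes through a numeric matrix) while id2 is integer, so a value-based comparison may be the safer guard. A sketch with hypothetical vectors:

```r
ids_double <- c(1, 2, 3)  # stored as double
ids_int <- 1:3            # stored as integer

identical(ids_double, ids_int)  # FALSE, although the values agree

# Value-based check: same length and element-wise equal.
same_ids <- length(ids_double) == length(ids_int) && all(ids_double == ids_int)
stopifnot(same_ids)  # passes
```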

Data (with different id order):

set.seed(42)
ped <- cbind(id1=1:5, matrix(runif(30), 5, 6)) |> as.data.frame()
bipolar_ctl <- data.frame(id2=sample(1:5), Phenotype=runif(5))