How to modify a specific column in each dataframe in a list using lapply instead of a for loop

Time:01-12

I have a list of dataframes and am currently using the following for loop:

  for (i in 1:length(genotypeGOI)){
    genotypeGOI[[i]]$SEQSTRAND <- '*'
  }

But I'd really like to learn how to use lapply properly.

I've tried many different options using the mutate function but nothing is giving me what I want. My latest attempt is:

genotypeGOI <- lapply(X = genotypeGOI, FUN = function(x){
  x <- x$SEQSTRAND, '*')
})

But this is giving me an error:

Error: unexpected ',' in:
"genotypeGOI <- lapply(X = genotypeGOI, FUN = function(x){
  x <- x$SEQSTRAND,"

Basically I would like to know how to change the values in a specific column for each dataframe in a list using lapply and don't really care about this specific problem.

I've looked at the other posted questions related to this and the most similar one says to make a function and to call that in lapply but I really don't want to do that for a one-liner.

Thanks

CodePudding user response:

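The function body in your attempt, x <- x$SEQSTRAND, '*'), is not valid R: the comma inside the braces is unexpected, which is exactly what the parse error reports. Something like lapply(genotypeGOI, FUN = function(x) x[,"SEQSTRAND"] <- '*') is closer, but it still does not return what you want, because the function returns the value of the assignment ("*") rather than the modified data frame: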

df <- data.frame(ID = 1:10,
                 X = letters[1:10],
                 SEQSTRAND = NA)

genotypeGOI <- list(df, df, df)

lapply(genotypeGOI, FUN = function(x) x[,"SEQSTRAND"] <- '*')

#[[1]]
#[1] "*"

#[[2]]
#[1] "*"

#[[3]]
#[1] "*" 
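Why "*" comes back can be seen in isolation. The following is a minimal sketch on a made-up data frame (toy and set_strand are illustrative names, not from the question): a function returns the value of its last expression, and the value of an assignment is the value that was assigned. The function also works on a copy, so the original data frame is left untouched.

```r
# Hypothetical toy data, only for illustration
toy <- data.frame(ID = 1:3, SEQSTRAND = NA)

# The only expression in the body is an assignment, so the function
# returns the assigned value (invisibly), not the data frame
set_strand <- function(x) x[, "SEQSTRAND"] <- "*"

res <- set_strand(toy)

print(res)                        # the assigned value
print(is.data.frame(res))         # not a data frame
print(all(is.na(toy$SEQSTRAND)))  # toy itself was not modified
```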

To get back each data frame with SEQSTRAND set to "*", make x the last expression in the function so that it is what gets returned (assign the result back to genotypeGOI if you want to keep it):

lapply(genotypeGOI, function(x) {
  x[,"SEQSTRAND"] <- "*"
  x
})

# [[1]]
#      ID X SEQSTRAND
#   1   1 a         *
#   2   2 b         *
#   3   3 c         *
#   4   4 d         *
#   5   5 e         *
#   6   6 f         *
#   7   7 g         *
#   8   8 h         *
#   9   9 i         *
#   10 10 j         *
#   
#   [[2]]
#      ID X SEQSTRAND
#   1   1 a         *
#   2   2 b         *
#   3   3 c         *
#   4   4 d         *
#   5   5 e         *
#   6   6 f         *
#   7   7 g         *
#   8   8 h         *
#   9   9 i         *
#   10 10 j         *
#   
#   [[3]]
#      ID X SEQSTRAND
#   1   1 a         *
#   2   2 b         *
#   3   3 c         *
#   4   4 d         *
#   5   5 e         *
#   6   6 f         *
#   7   7 g         *
#   8   8 h         *
#   9   9 i         *
#   10 10 j         *
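If a true one-liner is the goal, one option (my suggestion, not part of the code above) is to pass base R's transform() straight to lapply. Any extra arguments given after the function are forwarded to each call, so no anonymous function is needed. A minimal sketch, rebuilding the same example data so it runs on its own:

```r
# Same example data as above
df <- data.frame(ID = 1:10,
                 X = letters[1:10],
                 SEQSTRAND = NA)
genotypeGOI <- list(df, df, df)

# lapply forwards SEQSTRAND = "*" to transform(), which returns a
# modified copy of each data frame
genotypeGOI <- lapply(genotypeGOI, transform, SEQSTRAND = "*")

# Check that every row of SEQSTRAND is "*" in every data frame
print(sapply(genotypeGOI, function(x) all(x$SEQSTRAND == "*")))
```

Because transform() returns the whole data frame, there is no need for a trailing x here.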

CodePudding user response:

If you'd like to stick with mutate, as in the original question, this will work (mutate and %>% come from the dplyr package, which has to be loaded first):

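library(dplyr)  # provides mutate() and %>%

# Two small example data frames
dd <- data.frame(Value = c(1, 5, 7, 9))
ee <- data.frame(Value = c(2, 4, 6, 8))

ff <- list(dd, ee)

# x is one data frame at a time; mutate() returns the modified copy
gg <- lapply(ff, FUN = function(x) {
  x %>%
    mutate(SEQSTRAND = "*")
})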

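We start by building a list of data frames with ff <- list(dd, ee). lapply then calls the anonymous function once per element, so inside the function x is a single data frame from ff, not the list itself. Piping x into mutate(SEQSTRAND = "*") adds the column (or overwrites it if it already exists) and returns the modified data frame, which lapply collects into the new list gg.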
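To make the "one call per element" idea concrete, here is a rough sketch of what lapply does, written out as a loop. It uses base R instead of mutate so that it runs without dplyr, and add_strand is a hypothetical helper standing in for the anonymous function:

```r
dd <- data.frame(Value = c(1, 5, 7, 9))
ee <- data.frame(Value = c(2, 4, 6, 8))
ff <- list(dd, ee)

# Hypothetical helper: takes ONE data frame, returns the modified copy
add_strand <- function(x) {
  x$SEQSTRAND <- "*"
  x
}

via_lapply <- lapply(ff, add_strand)

# The same thing by hand: call the function on each element in turn
# and store each return value in a new list
by_hand <- vector("list", length(ff))
for (i in seq_along(ff)) {
  by_hand[[i]] <- add_strand(ff[[i]])
}

print(identical(via_lapply, by_hand))
```

The point is only that x is never the whole list; lapply handles the looping and the collecting, and the function just has to return what should end up in each slot.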
