Grouping mass spectrometry data in R with a loop

Time:07-07

I want to split a data frame into groups of three consecutive columns, with each group stored separately. I have attached an image of an example data frame; it has 12 columns.

Example data frame 1

The columns have to be grouped according to a pattern. For a 6-column data frame, the user can supply a pattern such as (0,0,0,1,1,1), and I split the data frame as follows.

data = as.matrix(input_data)

id0 = which(class == 0)
id1 = which(class == 1)

data0 <- data[,id0]
data1 <- data[,id1]

The same approach works for a 12-column data frame.

But the data frame is not fixed. It is a mass spectrometry expression dataset and may have any number of columns. All measurements are in triplicate, so the number of columns is always divisible by 3.
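The manual `id0`/`id1` step above can be generalised to any number of distinct labels. The following is a minimal sketch, not part of the original post: the toy matrix and the name `pattern` (standing in for the `class` vector) are assumptions made for illustration.

```r
# Toy 4 x 6 matrix standing in for input_data (hypothetical values)
data <- matrix(1:24, nrow = 4,
               dimnames = list(NULL, paste0("s", 1:6)))
pattern <- c(0, 0, 0, 1, 1, 1)   # user-supplied pattern, one label per column

# Column indices for each distinct label in the pattern
idx <- split(seq_len(ncol(data)), pattern)

# One sub-matrix per group; drop = FALSE keeps single-column groups as matrices
grouped <- lapply(idx, function(j) data[, j, drop = FALSE])
names(grouped) <- paste0("data", names(idx))
```

Here `grouped$data0` and `grouped$data1` play the role of `data0` and `data1`, and a longer pattern simply produces more list elements.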

class = c(0,0,0,1,1,1)
length(class)
x <- length(class)/3 
x

I tried the code above to find the required number of groups.

Could you suggest a loop that solves this problem? The data frame may change with every experiment.
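Going one step further than counting the groups, the pattern itself can be generated from the column count. This is a sketch under the assumption that replicates sit in adjacent columns; `n_cols`, `replicates` and `pattern` are illustrative names, not from the original post.

```r
n_cols <- 12          # e.g. ncol(input_data); 12 matches the example data frame
replicates <- 3       # triplicate measurements

# Fail early if the columns cannot be divided into complete groups
stopifnot(n_cols %% replicates == 0)

n_groups <- n_cols / replicates

# One label per column, repeated for each replicate: 0 0 0 1 1 1 2 2 2 3 3 3
pattern <- rep(seq_len(n_groups) - 1, each = replicates)
```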

CodePudding user response:

Let's write this script using dummy data:

df<-data.frame(rep(data.frame(n1=c(0:6),n2=c(0:6)),3))

stack_script1 <- function(df, replicate_value) {   # df = data frame, replicate_value = number of replicates per group
  # df <- read.csv(file.choose(), header = TRUE)   # optional: uncomment to pick a CSV file interactively instead of using the df argument
  df_col <- ncol(df)                               # number of columns in the data frame
  groups <- rep(0:((df_col / replicate_value) - 1), each = replicate_value)  # group label per column: 0,0,0,1,1,1,...
  id <- list()                                     # empty list for the column indices
  for (i in seq_along(unique(groups))) {
    id[[i]] <- which(groups == unique(groups)[i])  # column indices belonging to group i
  }
  names(id) <- paste0("id", unique(groups))        # names the index list id0, id1, ...
  data <- list()                                   # empty list for the grouped columns
  for (i in seq_along(id)) {
    data[[i]] <- df[, id[[i]], drop = FALSE]       # columns of group i, kept as a data frame
  }
  names(data) <- paste0("data", unique(groups))    # names the result list data0, data1, ...
  return(data)
}

Executing the script:

>stack_script1(df,3)

Note:

  1. This script works for data frames with any number of columns, as long as the column count is a multiple of the replicate value.
  2. The line df<-read.csv(file.choose(),header = TRUE) is optional if the data has already been imported into R. When it is active, it replaces whatever was passed as df with the file you choose. I prefer to use it because it lets me pick the file from the folder where it is saved, without a separate import step.
  3. The output is a list containing the grouped data frames. An R function returns a single object, so the groups are returned together in one list rather than as separate data frames.
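As a possible alternative to the explicit loops, base R's `split.default()` can split the columns of a data frame by a grouping vector. The sketch below uses made-up dummy data and assumes replicates are in adjacent columns. It also shows one way to work with the resulting list; the per-group row means are just an illustration.

```r
# Dummy data: 4 rows, 6 columns named X1..X6 (hypothetical values)
df <- data.frame(matrix(1:24, nrow = 4))
replicate_value <- 3

# Group label for each column: 0 0 0 1 1 1
groups <- rep(seq_len(ncol(df) / replicate_value) - 1, each = replicate_value)

# split.default() treats the data frame as a list of columns,
# so each list element is a data frame holding one group's columns
data <- split.default(df, groups)
names(data) <- paste0("data", names(data))

# Access a single group by name, or apply a function to every group
first_group <- data$data0
group_means <- sapply(data, rowMeans)   # one column of row means per group
```

Because the result is a named list, `lapply()` or `sapply()` can run the same analysis on every group regardless of how many groups an experiment has.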