Converting a set of commands into a function for manipulating a data frame in R

Time:06-03

My data frame (DF1) looks like this:

[Representative, DF1] [1]: https://i.stack.imgur.com/wZjzR.png

Doing the following for row 1 (MED1):

MED1 <- data_frame_merge[1,]
rownames(MED1) <- NULL

MED1 <- t(MED1)
MED1 <- as.data.frame(MED1)
MED1 <- tibble::rownames_to_column(MED1, "Fusion_Type")

MED1$Fusion_Type <- gsub("\\..*", "", MED1$Fusion_Type)
MED1$Fusion_Type <- as.factor(MED1$Fusion_Type)

names(MED1)[names(MED1) == "V1"] <- "TPM"

I get this:

[MED1, DF] [2]: https://i.stack.imgur.com/hhMf6.png

Basically, I am extracting the first row from the first data frame and reshaping it into its own data frame, laid out as in the image above.

My question is: how do I convert what I did with MED1 into a function, so that I can apply this transformation to every row of the first data frame?

I tried using for loops and functions several times but always ran into errors.

I tried the code below, but I am pretty sure it is fraught with errors:

lsEOG<-list()

for (i in 1:nrow(df_main)) {
  rownames(df)[i] <- df[i]
  df[i] <- df[i]
  rownames(df)[i] <- NULL
  df[i] <- t(df[i])
  df[i] <- as.data.frame(df[i])
  df[i] <- tibble::rownames_to_column(df[i], "Fusion_Type")
  df[i]$Fusion_Type <- gsub("\\..*", "", df[i]$Fusion_Type)
  df[i]$Fusion_Type <- as.factor(df[i]$Fusion_Type)
  names(df[i])[names(df[i]) == "V1"] <- "TPM"
  lsEOG(df[i])<- df[i]
}

Error in rownames(df)[i] <- NULL : replacement has length zero (amongst many others)

Could you please help me fix it by turning the code into a function?

That is, turning the code below

MED1 <- data_frame_merge[1,]
rownames(MED1) <- NULL

MED1 <- t(MED1)
MED1 <- as.data.frame(MED1)
MED1 <- tibble::rownames_to_column(MED1, "Fusion_Type")

MED1$Fusion_Type <- gsub("\\..*", "", MED1$Fusion_Type)
MED1$Fusion_Type <- as.factor(MED1$Fusion_Type)

names(MED1)[names(MED1) == "V1"] <- "TPM"

into a function, so that it can be applied to all the rows of the data frame?

CodePudding user response:

Try this:

library(tidyverse)

as_tibble(data_frame_merge, rownames = "MED") %>%
  pivot_longer(cols = -MED, names_to = "Fusion_Type", values_to = "TPM") %>%
  mutate(Fusion_Type = stringr::str_extract(Fusion_Type, "(?<=\\d_).*$"))

Output:

# A tibble: 28 × 3
   MED   Fusion_Type   TPM
   <chr> <chr>       <dbl>
 1 MED1  PML_RARA     57.5
 2 MED1  PML_RARA    178. 
 3 MED1  PML_RARA     20.6
 4 MED1  PML_RARA    139. 
 5 MED10 PML_RARA    158. 
 6 MED10 PML_RARA    110. 
 7 MED10 PML_RARA    180. 
 8 MED10 PML_RARA    128. 
 9 MED11 PML_RARA     81.8
10 MED11 PML_RARA     91.3
# … with 18 more rows
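
If a separate data frame per gene is still needed, as with the lsEOG list in the question, one option is to split a long table like the one above by its MED column. The following is a minimal base-R sketch on a small made-up long table. The names `long` and `per_gene` and the values are illustrative assumptions, not the real data, and the value column is called TPM as in the question:

```r
# Made-up long table, only for illustration: one row per gene and sample.
long <- data.frame(
  MED         = c("MED1", "MED1", "MED10", "MED10"),
  Fusion_Type = "PML_RARA",
  TPM         = c(57.5, 178.5, 157.7, 110.3)
)

# split() returns a named list with one data frame per distinct MED value.
# Only the Fusion_Type and TPM columns are kept in each piece.
per_gene <- split(long[, c("Fusion_Type", "TPM")], long$MED)

# Inspect the piece for a single gene.
per_gene$MED1
```

Each element, for example per_gene$MED1, then holds the Fusion_Type and TPM columns for one gene.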

Input:

data_frame_merge = structure(list(`TCGA-AB-2991_PML_RARA` = c(57.5155040249228, 
157.661027088761, 81.79538436234, 176.603480800986, 188.093456858769, 
9.11129987798631, 105.621097609401), `TCGA-AB-3012_PML_RARA` = c(178.483808878809, 
110.287002893165, 91.3229470606893, 191.366669069976, 90.6668312381953, 
135.514127090573, 114.526680391282), `TCGA-AB-2872_PML_RARA` = c(20.5849365331233, 
179.964994080365, 49.217546870932, 8.41190670616925, 65.5841438565403, 
190.900729829445, 177.907863212749), `TCGA-AB-2999_PML_RARA` = c(138.56068123132, 
128.101362753659, 198.853955324739, 131.141159823164, 141.706093633547, 
108.813204942271, 118.828404089436)), class = "data.frame", row.names = c("MED1", 
"MED10", "MED11", "MED12", "MED12L", "MED13", "MED13L"))

data_frame_merge looks like this:

       TCGA-AB-2991_PML_RARA TCGA-AB-3012_PML_RARA TCGA-AB-2872_PML_RARA TCGA-AB-2999_PML_RARA
MED1                57.51550             178.48381             20.584937              138.5607
MED10              157.66103             110.28700            179.964994              128.1014
MED11               81.79538              91.32295             49.217547              198.8540
MED12              176.60348             191.36667              8.411907              131.1412
MED12L             188.09346              90.66683             65.584144              141.7061
MED13                9.11130             135.51413            190.900730              108.8132
MED13L             105.62110             114.52668            177.907863              118.8284
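
For completeness, the row-by-row transformation from the question can also be wrapped in a function directly. This is only a sketch in base R: row_to_long and rows_to_list are hypothetical helper names, and the toy column names assume the dot-suffixed form (such as PML_RARA.1) that the gsub pattern in the question implies.

```r
# Hypothetical helper (not from the original post): reshape row i of a wide
# data frame into a two-column data frame with Fusion_Type and TPM.
row_to_long <- function(df, i) {
  row <- df[i, , drop = FALSE]            # keep the row as a data frame
  data.frame(
    # Drop everything from the first dot onward, as in the question's gsub().
    Fusion_Type = factor(sub("\\..*", "", names(row))),
    TPM         = as.numeric(unlist(row, use.names = FALSE))
  )
}

# Apply the helper to every row and name each result after its row name.
rows_to_list <- function(df) {
  out <- lapply(seq_len(nrow(df)), function(i) row_to_long(df, i))
  names(out) <- rownames(df)
  out
}

# Made-up input, only for illustration.
toy <- data.frame(
  PML_RARA   = c(57.5, 157.7),
  PML_RARA.1 = c(178.5, 110.3),
  row.names  = c("MED1", "MED10")
)

result <- rows_to_list(toy)
result$MED1
```

Iterating with lapply over seq_len(nrow(df)) avoids the index bookkeeping that made the for loop error-prone, and naming the list by row name keeps each gene retrievable as, for example, result$MED1.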