t function for transpose dataframe is giving wrong values

Time:04-23

I have a .tsv file that I loaded into R with the correct parameters. It is a data frame that I want to transpose (here is the explanation: https://www.dataanalytics.org.uk/rotating-or-transposing-r-objects/). Data frame:

GCA_910579985.1_MGBC115109_genomic_prkk
OG0000000                                      20
OG0000001                                       9
OG0000002                                      13
OG0000003                                       6
OG0000004                                      14
OG0000005                                       4
GCA_910578415.1_MGBC110107_genomic_prkk
OG0000000                                      15
OG0000001                                       6
OG0000002                                       8
OG0000003                                       5
OG0000004                                       3
OG0000005                                       3

I'm using the t() function for this purpose:

tdata<-t(data)

I already tried forcing the variable data into a data frame and into a matrix, just to test. Either way the result is wrong, because every value in the matrix becomes 0, 1 or 2:

     OG0000000       OG0000001       OG0000002       OG0000003       OG0000004       OG0000005
GCA_910579985.1_MGBC115109_genomic_prkk 0       0       0       1       1       0
GCA_910578415.1_MGBC110107_genomic_prkk 1       1       1       0       0       2

This is wrong because the original values have changed. This is my expected result:

     OG0000000       OG0000001       OG0000002       OG0000003       OG0000004       OG0000005
GCA_910579985.1_MGBC115109_genomic_prkk 20       9       13       6       14       4
GCA_910578415.1_MGBC110107_genomic_prkk 15       6       8       5       3       3

So, clearly, the function is changing the original values. Any ideas or comments?

Answer:

First off, please always make your example data easily available to others, e.g. by pasting the output of dput(your_data). Your data, as a data frame df, are:

df <- 
## output of dput
structure(list(ID = c("OG0000000", "OG0000001", "OG0000002", 
"OG0000003", "OG0000004", "OG0000005"), GCA_910579985.1_MGBC115109_genomic_prkk = c(20L, 
9L, 13L, 6L, 14L, 4L), GCA_910578415.1_MGBC110107_genomic_prkk = c(15L, 
6L, 8L, 5L, 3L, 3L)), class = "data.frame", row.names = c(NA, 
6L))

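The type conversion from numeric to character occurs when a data frame is transposed with t(): the data frame is first coerced to a matrix, and a matrix can hold only one type, so a character column such as ID turns every value into character. You can make the steps explicit as follows (df being your data frame):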

library(dplyr) ## provides the pipe operator %>%

df %>% as.matrix %>% t %>% as.data.frame

While the pipe operator %>% is not necessary, I used it to emphasise the order of transformation. It's equivalent to as.data.frame(t(as.matrix(df))).
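If df still contains the character ID column, the transposed result holds character values rather than numbers. A minimal sketch that keeps the counts numeric, assuming the identifiers sit in a regular column named ID as in the dput output above (the names counts and tdf are illustrative, not from the question): move the IDs into the row names first, so that only numeric columns reach t().

```r
# Example data mirroring the question (values from the expected result)
df <- data.frame(
  ID = c("OG0000000", "OG0000001", "OG0000002",
         "OG0000003", "OG0000004", "OG0000005"),
  GCA_910579985.1_MGBC115109_genomic_prkk = c(20L, 9L, 13L, 6L, 14L, 4L),
  GCA_910578415.1_MGBC110107_genomic_prkk = c(15L, 6L, 8L, 5L, 3L, 3L),
  stringsAsFactors = FALSE
)

# Drop the ID column and keep its values as row names instead,
# so only numeric columns remain in the data frame
counts <- df[, -1]
rownames(counts) <- df$ID

# With only integer columns, as.matrix() yields an integer matrix,
# so t() preserves the original values
tdf <- as.data.frame(t(as.matrix(counts)))

# Inspect the column types of the transposed data frame
str(tdf)
```

If the identifiers are already row names (as the printed data frame in the question suggests), the first two steps can be skipped. Running str(data) on the original object before transposing shows whether any column is a character or factor rather than numeric.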
