R - Transpose empty wide dataframe to long no ID vars

Time:04-20

I am very new to R so please excuse any stupid questions.

I have a dataframe which consists of multiple columns; let's say, for argument's sake, A through E: A, B, C, D, E.

I want to transpose these columns to create one new column called Variable, which will hold A, B, C, D, E, one per row, in a new dataframe:

Variable
A
B
C
D
E

Is this possible? Nothing I have tried works, such as melt or t(x).

So I have:

[image: the current wide dataframe]

I need:

[image: the desired long output]

Also, the variables will differ greatly from one file to another and will not always have the same names, so I need something like allvars rather than naming them individually.

Thanks

Colm

CodePudding user response:

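You can use stack() to stack the columns of a data frame into a single column of values, with a second column recording which column each value came from.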

stack(x)
#  values ind
#1      1   A
#2      2   A
#3      3   A
#4      4   B
#5      5   B
#6      6   B
#7      7   C
#8      8   C
#9      9   C

Or do you want something like this?

data.frame(variable = names(x))
#  variable
#1        A
#2        B
#3        C

Data:

x <- data.frame(A = 1:3, B=4:6, C=7:9)
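
If the data frame has columns but no rows, a minimal sketch along the lines of the second option still works, because names() reads the column names and does not depend on the number of rows. The data frame empty below is a made-up example, not data from the question.

```r
# Hypothetical empty data frame: three columns, zero rows
empty <- data.frame(A = character(),
                    B = character(),
                    C = character(),
                    stringsAsFactors = FALSE)

# names() returns the column names regardless of the row count
long <- data.frame(variable = names(empty), stringsAsFactors = FALSE)
print(long)
```

Nothing here refers to A, B or C by name after the data frame is created, so the same two lines should work when the columns change.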

CodePudding user response:

Here is an update using base R. I think I now understand what you want.

  1. Create an empty dataframe df.
  2. Use colnames() to store its column names in a new dataframe with a single column called Variable:

Update:

# Empty dataframe: five character columns, zero rows
df <- data.frame(VARIABLE1 = character(),
                 VARIABLE2 = character(),
                 VARIABLE3 = character(),
                 VARIABLE4 = character(),
                 VARIABLE5 = character(),
                 stringsAsFactors = FALSE)

# One row per column name of df
df_result <- data.frame(Variable = colnames(df))
df_result
   Variable
1 VARIABLE1
2 VARIABLE2
3 VARIABLE3
4 VARIABLE4
5 VARIABLE5
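
Since the column names change from file to file, the two steps above can be wrapped in a small helper. This is only a sketch: column_names_long() and the two example dataframes are hypothetical, standing in for whatever each file contains.

```r
# Hypothetical helper: one row per column name of the dataframe d
column_names_long <- function(d) {
  data.frame(Variable = colnames(d), stringsAsFactors = FALSE)
}

# Two made-up empty dataframes with different column names
file_a <- data.frame(ID = integer(), AGE = integer())
file_b <- data.frame(HEIGHT = numeric(), WEIGHT = numeric(), BMI = numeric())

# Apply the helper to each dataframe without naming any column
results <- lapply(list(a = file_a, b = file_b), column_names_long)
print(results$a)
print(results$b)
```

The helper never mentions a column by name, so it does not need to change when the variables do.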

First answer:

library(tidyverse)

df <- tribble(
  ~A, ~B, ~C, ~D, ~E,
  NA_character_, NA_character_, NA_character_, NA_character_, NA_character_)

df %>% 
  pivot_longer(
    everything()
  ) %>% 
  select(1)
  name 
  <chr>
1 A    
2 B    
3 C    
4 D    
5 E   

CodePudding user response:

If I understand you correctly, you can use the following code. First, create fake data:

df <- data.frame(A = runif(10, 0, 1),
                 B = runif(10, 0, 1),
                 C = runif(10, 0, 1),
                 D = runif(10, 0, 1),
                 E = runif(10, 0, 1))

df

Output:

           A          B         C          D          E
1  0.3711955 0.60529893 0.4139218 0.38474364 0.33032247
2  0.5689305 0.08455396 0.4547466 0.50095761 0.44413510
3  0.3220088 0.42268926 0.3171933 0.30313125 0.09273389
4  0.3734473 0.90371386 0.8634857 0.66372492 0.42984325
5  0.5363005 0.64373088 0.2279042 0.11793754 0.12238873
6  0.2101622 0.64878858 0.0329526 0.01596763 0.26302076
7  0.9438709 0.84500107 0.5446587 0.27072456 0.57295920
8  0.4185947 0.33992763 0.4755329 0.36225772 0.47469022
9  0.7987066 0.84397516 0.3679846 0.87131562 0.95563427
10 0.7866157 0.46601434 0.5662175 0.66513184 0.29728068

Next use melt from the reshape package like this:

library(reshape)
melt(df)

Output:

Using  as id variables
   variable      value
1         A 0.37119552
2         A 0.56893045
3         A 0.32200884
4         A 0.37344734
5         A 0.53630053
6         A 0.21016220
7         A 0.94387092
8         A 0.41859472
9         A 0.79870656
10        A 0.78661568
11        B 0.60529893
12        B 0.08455396
13        B 0.42268926
14        B 0.90371386
15        B 0.64373088
16        B 0.64878858
17        B 0.84500107
18        B 0.33992763
19        B 0.84397516
20        B 0.46601434
21        C 0.41392181
22        C 0.45474660
23        C 0.31719334
24        C 0.86348573
25        C 0.22790422
26        C 0.03295260
27        C 0.54465868
28        C 0.47553287
29        C 0.36798461
30        C 0.56621750
31        D 0.38474364
32        D 0.50095761
33        D 0.30313125
34        D 0.66372492
35        D 0.11793754
36        D 0.01596763
37        D 0.27072456
38        D 0.36225772
39        D 0.87131562
40        D 0.66513184
41        E 0.33032247
42        E 0.44413510
43        E 0.09273389
44        E 0.42984325
45        E 0.12238873
46        E 0.26302076
47        E 0.57295920
48        E 0.47469022
49        E 0.95563427
50        E 0.29728068

As you can see, it returns a column named variable containing the column names (A, B, etc.) and a column named value containing the data.
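
If only the column names are needed rather than all 50 rows, the long result can be collapsed to its unique labels. The sketch below swaps in base R's stack() for melt so that it runs without the reshape package; the same idea should apply to melt's variable column. The seed and the name variables are just for illustration.

```r
set.seed(1)  # arbitrary seed so the fake data is reproducible
df <- data.frame(A = runif(10, 0, 1),
                 B = runif(10, 0, 1),
                 C = runif(10, 0, 1),
                 D = runif(10, 0, 1),
                 E = runif(10, 0, 1))

# stack() returns a long data frame with columns 'values' and 'ind'
long <- stack(df)

# Keep each column label once, in order of first appearance
variables <- data.frame(Variable = unique(as.character(long$ind)),
                        stringsAsFactors = FALSE)
print(variables)
```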

CodePudding user response:

Let's say you have df:

set.seed(123)
df <- data.frame(VARIABLE1 = runif(1),
                VARIABLE2 = runif(1),
                VARIABLE3 = runif(1),
                VARIABLE4 = runif(1),
                VARIABLE5 = runif(1))

df

  VARIABLE1 VARIABLE2 VARIABLE3 VARIABLE4 VARIABLE5
1 0.2875775 0.7883051 0.4089769 0.8830174 0.9404673

Then you can use pivot_longer() from tidyr:

library(tidyr)
pivot_longer(df, everything())

# A tibble: 5 × 2
  name      value
  <chr>     <dbl>
1 VARIABLE1 0.288
2 VARIABLE2 0.788
3 VARIABLE3 0.409
4 VARIABLE4 0.883
5 VARIABLE5 0.940
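
To get the column called Variable as asked, and to cope with more than one row, a possible variation is sketched below. It assumes the dplyr package is also installed, and the three-row df is made up for the example. names_to renames the name column, and distinct() keeps each column name once.

```r
library(tidyr)
library(dplyr)

set.seed(123)
# Made-up data: three numeric columns, three rows each
df <- data.frame(VARIABLE1 = runif(3),
                 VARIABLE2 = runif(3),
                 VARIABLE3 = runif(3))

result <- df %>%
  # One row per cell; the column names go into 'Variable'
  pivot_longer(everything(), names_to = "Variable", values_to = "value") %>%
  # Drop the values and keep each column name once
  distinct(Variable)

print(result)
```

Note that pivot_longer() produces one row per cell, so a dataframe with zero rows gives zero rows back; for a truly empty dataframe the colnames() approach is the safer choice.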