Reshape a dataframe from wide to long with different row lengths

Time:03-20

Here is a dataframe.

     ID  Gene1   Gene2   Gene3   Gene4
2003959  PIK3CA  EPSTI1
2003421  AKT1    BTK     GATA3   MAP3K1
2100019  MAP2K4
2005456  GATA3   RAD50
2003141  KDM6A   SF3B1
2103689  AKT2    EGFR    MAP2    PHKA2

I want to reshape it from wide to long, dropping the empty cells. Here is the expected output.

     ID    Var1
2003959  PIK3CA 
2003959  EPSTI1
2003421    AKT1
2003421     BTK
2003421   GATA3
2003421  MAP3K1
2100019  MAP2K4
...

CodePudding user response:

If the empty cells are NA, pivot_longer() handles this directly. If they are empty strings "", first convert them to NA using na_if() inside across().

library(dplyr)
library(tidyr)

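# Case 1: empty cells are already NA

df %>% 
  pivot_longer(
    -ID
  ) %>% 
  na.omit() %>%      # drop the rows that came from empty cells
  select(-name)      # drop the column holding the old column names

# Case 2: empty cells are empty strings "", so convert them to NA first

df %>% 
  mutate(across(-ID, ~na_if(., ""))) %>% 
  pivot_longer(
    -ID
  ) %>% 
  na.omit() %>% 
  select(-name)

Both cases return:
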
        ID value 
     <int> <chr> 
 1 2003959 PIK3CA
 2 2003959 EPSTI1
 3 2003421 AKT1  
 4 2003421 BTK   
 5 2003421 GATA3 
 6 2003421 MAP3K1
 7 2100019 MAP2K4
 8 2005456 GATA3 
 9 2005456 RAD50 
10 2003141 KDM6A 
11 2003141 SF3B1 
12 2103689 AKT2  
13 2103689 EGFR  
14 2103689 MAP2  
15 2103689 PHKA2 
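
For anyone who wants to run this end to end, here is a minimal reproducible sketch. It assumes the blank cells in the question are empty strings (case 2) and that ID is an integer column; the object names `df` and `long` are only illustrative. It also uses the `values_to` and `values_drop_na` arguments of pivot_longer() as an alternative to na.omit(), so the value column comes out named Var1 as in the expected output.

```r
library(dplyr)
library(tidyr)

# Rebuild the question's data; blank cells are assumed to be empty strings
df <- data.frame(
  ID    = c(2003959L, 2003421L, 2100019L, 2005456L, 2003141L, 2103689L),
  Gene1 = c("PIK3CA", "AKT1", "MAP2K4", "GATA3", "KDM6A", "AKT2"),
  Gene2 = c("EPSTI1", "BTK", "", "RAD50", "SF3B1", "EGFR"),
  Gene3 = c("", "GATA3", "", "", "", "MAP2"),
  Gene4 = c("", "MAP3K1", "", "", "", "PHKA2"),
  stringsAsFactors = FALSE
)

long <- df %>%
  # turn empty strings into NA in every column except ID
  mutate(across(-ID, ~ na_if(.x, ""))) %>%
  # stack the gene columns and drop the NA rows in the same step
  pivot_longer(-ID, values_to = "Var1", values_drop_na = TRUE) %>%
  # the old column names (Gene1, Gene2, ...) are not needed
  select(-name)

print(long)
```

Unlike na.omit(), `values_drop_na = TRUE` only removes rows where the pivoted value is NA, so a missing ID would not silently drop a row.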