Home > Software design >  Splitting merged data into columns
Splitting merged data into columns

Time:12-15

I have a dataframe containing a column of the following format, where I have Vars separated by commas ordered by count, with their respective counts after the colon

df1 <- data.frame(A = c('Var1:1', 'Var1:2,Var2:1', 'Var3:2,Var2:1'))

A
Var1:1
Var1:2,Var2:1
Var2:1,Var3:1

I've separated these out with:

df1_split <- data.frame(dplyr::str_split(df1, A, pattern = ":|,", simplify = TRUE))

Which yields:

X1 X2 X3 X4
Var1 1
Var1 2 Var2 1
Var3 2 Var2 1

As you can see, sometimes variables are in column X1, but sometimes they appear in column X3. So Vars are always in odd columns, and their counts from the : are always in the proceeding even column#

My end goal is to have a dataframe with columns in the format:

Var1 Var 2 Var 3
1
2 1
1 2

I've tried using a function, for loop and case when for this, as I have 19 unique variable names in my original dataframe but they don't always appear in the same order.

fVARCOUNT <- function(x, y){`
  # x is df_split, y are the unique Var nanmes e.g. Var 1, Var 2
  df_mutated <- df_split
  # Loop over each unique variable
  for(i in y){
    df_mutated <- df_mutated %>%
    # Check if that variable is in one of X1, X3, X5 and get the value from the proceeding column if it is
    dplyr::mutate('{{i}}' := dplyr::case_when(X1 == i ~ X2,
                                X3 == i ~ X4))
  }
}
  

However, the result I'm getting is as follows:

Var1 Var2 Var 3
1
2
2

It's as though case_when is not checking beyond X1.

CodePudding user response:

Data

df1 <- data.frame(A = c('Var1:1', 'Var1:2,Var2:1', 'Var3:2,Var2:1'))

Code

library(tidyr)
library(dplyr)

df1 %>% 
  mutate(id = row_number()) %>% 
  separate_rows(A,sep = ",") %>% 
  separate(col = A,into = c("name","value")) %>% 
  pivot_wider(names_from = name,values_from = value)

Output

# A tibble: 3 x 4
     id Var1  Var2  Var3 
  <int> <chr> <chr> <chr>
1     1 1     NA    NA   
2     2 2     1     NA   
3     3 NA    1     2  
  • Related