Home > Enterprise >  Bring scattered variable values into clean single-line wide format
Bring scattered variable values into clean single-line wide format

Time:01-06

I am not sure on how to name my problem and therefore was unable to search for it properly.

The problem I have is, that I converted a json-txt-file in R into a dataframe but the values of my variables a scattered all about resulting in 37200 observations with lots of empty cells (NA) where there should only be 600.

It looks something like this:

   Var1    Var2   Var3  Var4   Var5   Var6   ID
1   4       NA     NA    NA     NA     NA    1
2   NA      3      2     NA     NA     NA    1
3   NA      NA     NA    1      8      NA    1
4   NA      NA     NA    NA     NA     7     1

What I want however, is the following:

    Var1    Var2   Var3  Var4   Var5   Var6   ID
1   4       3      2     1       8     7      1

So I don‘t know if this is a classic long-wide format problem or if its different (I actually have all the variables/columns I want already) Searching the former didn’t really bring me a propper solution. I had a somewhat similar problem once that I solved with creating lagged variables - however, in this dataset I have 146 variables, so I was wondering if there’s an easier solution?

This question : NA values and extra rows when spreading repeated measures of multiple variables into wide format? goes a bit in the direction of my problem, however with 146 variables I still hope there might be a better solution.

CodePudding user response:

library(tidyverse)

df %>% 
  pivot_longer(-ID) %>% 
  drop_na() %>%  
  pivot_wider(names_from = name, 
              values_from = value)

# A tibble: 1 × 7
     ID  Var1  Var2  Var3  Var4  Var5  Var6
  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1     1     4     3     2     1     8     7
  • Related