I am not sure on how to name my problem and therefore was unable to search for it properly.
The problem I have is, that I converted a json-txt-file in R into a dataframe but the values of my variables a scattered all about resulting in 37200 observations with lots of empty cells (NA) where there should only be 600.
It looks something like this:
Var1 Var2 Var3 Var4 Var5 Var6 ID
1 4 NA NA NA NA NA 1
2 NA 3 2 NA NA NA 1
3 NA NA NA 1 8 NA 1
4 NA NA NA NA NA 7 1
What I want however, is the following:
Var1 Var2 Var3 Var4 Var5 Var6 ID
1 4 3 2 1 8 7 1
So I don‘t know if this is a classic long-wide format problem or if its different (I actually have all the variables/columns I want already) Searching the former didn’t really bring me a propper solution. I had a somewhat similar problem once that I solved with creating lagged variables - however, in this dataset I have 146 variables, so I was wondering if there’s an easier solution?
This question : NA values and extra rows when spreading repeated measures of multiple variables into wide format? goes a bit in the direction of my problem, however with 146 variables I still hope there might be a better solution.
CodePudding user response:
library(tidyverse)
df %>%
pivot_longer(-ID) %>%
drop_na() %>%
pivot_wider(names_from = name,
values_from = value)
# A tibble: 1 × 7
ID Var1 Var2 Var3 Var4 Var5 Var6
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 4 3 2 1 8 7