Home > Software design >  use pivot_longer and pivot_wider in combination
use pivot_longer and pivot_wider in combination

Time:10-05

I have a dataframe with lots of NaNin various columns.

df <- data.frame(
  Data1 = c(3,2,1,NaN, NaN, NaN),
  Data2  = c(NaN, NaN, NaN, 3,5,3),
  Data3 = c(NaN, NaN, 7,5,1, NaN)
)

I'm trying to get rid of the NaN values by using pivot_longer, filtering on the NaNvalues and using pivot_wider to put the positive numbers back again into their original columns, which, however, fails:

library(tidyr)
df %>%
  pivot_longer(c("Data1","Data2","Data3")) %>%
  filter(!is.na(value)) %>%
  pivot_wider(names_from = name,
              values_from = value)
# A tibble: 1 x 3
  Data1     Data3     Data2    
  <list>    <list>    <list>   
1 <dbl [3]> <dbl [3]> <dbl [3]>
Warning message:
Values are not uniquely identified; output will contain list-cols.
* Use `values_fn = list` to suppress this warning.
* Use `values_fn = length` to identify where the duplicates arise
* Use `values_fn = {summary_fun}` to summarise duplicates 

What's wrong with the code and how can this output be achieved?

  Data1 Data2 Data3
      3     3     7
      2     5     5
      1     3     1

CodePudding user response:

This particular problem could be neatly solved using purrr:

map_dfr(df, na.omit)

  Data1 Data2 Data3
  <dbl> <dbl> <dbl>
1     3     3     7
2     2     5     5
3     1     3     1

CodePudding user response:

The code doesn't necessarily fail but return a warning since you have more than one value in each cell. If the number of values in each column are going to be the same you can unnest the list output.

library(dplyr)
library(tidyr)

df %>%
  pivot_longer(starts_with('Data'), values_drop_na = TRUE) %>%
  arrange(name) %>%
  pivot_wider(names_from = name,values_from = value, values_fn = list) %>%
  unnest()

#  Data1 Data2 Data3
#  <dbl> <dbl> <dbl>
#1     3     3     7
#2     2     5     5
#3     1     3     1
  • Related