I have a dataframe with lots of NaN
in various columns.
df <- data.frame(
Data1 = c(3,2,1,NaN, NaN, NaN),
Data2 = c(NaN, NaN, NaN, 3,5,3),
Data3 = c(NaN, NaN, 7,5,1, NaN)
)
I'm trying to get rid of the NaN
values by using pivot_longer
, filter
ing on the NaN
values and using pivot_wider
to put the positive numbers back again into their original columns, which, however, fails:
library(tidyr)
df %>%
pivot_longer(c("Data1","Data2","Data3")) %>%
filter(!is.na(value)) %>%
pivot_wider(names_from = name,
values_from = value)
# A tibble: 1 x 3
Data1 Data3 Data2
<list> <list> <list>
1 <dbl [3]> <dbl [3]> <dbl [3]>
Warning message:
Values are not uniquely identified; output will contain list-cols.
* Use `values_fn = list` to suppress this warning.
* Use `values_fn = length` to identify where the duplicates arise
* Use `values_fn = {summary_fun}` to summarise duplicates
What's wrong with the code and how can this output be achieved?
Data1 Data2 Data3
3 3 7
2 5 5
1 3 1
CodePudding user response:
This particular problem could be neatly solved using purrr
:
map_dfr(df, na.omit)
Data1 Data2 Data3
<dbl> <dbl> <dbl>
1 3 3 7
2 2 5 5
3 1 3 1
CodePudding user response:
The code doesn't necessarily fail but return a warning since you have more than one value in each cell. If the number of values in each column are going to be the same you can unnest
the list output.
library(dplyr)
library(tidyr)
df %>%
pivot_longer(starts_with('Data'), values_drop_na = TRUE) %>%
arrange(name) %>%
pivot_wider(names_from = name,values_from = value, values_fn = list) %>%
unnest()
# Data1 Data2 Data3
# <dbl> <dbl> <dbl>
#1 3 3 7
#2 2 5 5
#3 1 3 1