I have the following data set:
Data1 Data2 Data3
3 NAN NAN
2 NAN NAN
3 NAN NAN
NAN 3 NAN
NAN 5 NAN
NAN 3 NAN
NAN NAN 7
NAN NAN 5
NAN NAN 1
I'd like to summarise it into this:
Data1 Data2 Data3
3 3 7
2 5 5
3 3 1
I've trying creating a row number, grouping by row number and applying summarise
, but it's just filling the NaNs with 0.
data = data %>%
mutate(row = row_number()) %>%
dplyr::group_by(row) %>%
dplyr::summarise(across(c("Data1","Data2","Data3"), ~sum(., na.rm=T))) %>%
distinct(.)
CodePudding user response:
If you have same number of NaN
's in each column as shown in the example you can use na.omit
to drop those values.
library(dplyr)
df %>% summarise(across(.fns = na.omit))
#If in your data values are string 'NAN' then use the below
#df %>% summarise(across(.fns = ~.x[.x!= 'NAN']))
# Data1 Data2 Data3
#1 3 3 7
#2 2 5 5
#3 3 3 1
In base R -
as.data.frame(sapply(df, na.omit))
data
It is easier to help if you provide data in a reproducible format
df <- structure(list(Data1 = c(3, 2, 3, NaN, NaN, NaN, NaN, NaN, NaN
), Data2 = c(NaN, NaN, NaN, 3, 5, 3, NaN, NaN, NaN), Data3 = c(NaN,
NaN, NaN, NaN, NaN, NaN, 7, 5, 1)), row.names = c(NA, -9L), class = "data.frame")
CodePudding user response:
If you just want to compute the sum
s then this will do:
data %>%
summarise(across(c("Data1","Data2","Data3"), ~sum(., na.rm=T)))
EDIT:
You can ge rid of the NaN
values by using a combination of pivot_longer
and pivot_wider
:
data %>%
pivot_longer(starts_with('Data'), values_drop_na = TRUE) %>%
arrange(name) %>%
pivot_wider(names_from = name, values_from = value, values_fn = list) %>%
unnest()
# A tibble: 3 x 3
Data1 Data2 Data3
<dbl> <dbl> <dbl>
1 3 3 7
2 2 5 5
3 1 3 1
Or even more nicely:
library(purrr)
map_dfr(df, na.omit)