How to merge and summarise rows based on 2 columns of a data frame in R

Time:05-31

I have the data frame below, df1:

    Date        id  Age  B   R       S
1   00/01/16    223 55  7.9  5.65   138
2   00/01/16    223 55  NA   NA      NA
3   00/01/16    223 55  NA   NA      NA
4   00/01/17    223 55  NA   NA      NA
5   00/01/17    223 55  9.6  5.71   135
6   00/01/17    223 55  NA   NA      NA
7   00/01/18    223 55  NA   NA      NA
8   00/01/18    223 55  NA   NA      NA
9   00/01/18    223 55 11.5  6.11   135
10  00/01/17    223 55  NA   NA      NA
11  00/01/05    102 60  NA   NA     135
12  00/01/05    102 60  19.7 5.5     NA
13  00/01/05    102 60  NA   NA      NA
14  00/01/05    102 60  NA   NA      NA
15  00/01/06    102 60  18.5 5.34   144
16  00/01/06    102 60  NA   NA      NA
17  00/01/06    102 60  NA   NA      NA

First I need to merge rows based on "id" and then merge rows based on "Date". My problem is not simply omitting rows with NA. For example, in rows 11 and 12, I have to select between 135 and NA for the "S" column. Finally, my output should look like the data frame below (df2):

      Date       id  Age     B     R     S
1   00/01/16    223  55     7.9   5.65  138
2   00/01/17    223  55     9.6   5.71  135
3   00/01/18    223  55     11.5  6.11  135
4   00/01/05    102  60     19.7  5.5   135
5   00/01/06    102  60     18.5  5.34  144

I wrote the code below:

df2 <- df1 %>% 
  group_by(Date,id) %>% 
  summarise_all(funs(na.omit))

but I got the error below:

Error: Problem with `summarise()` column `S`.
i `S = na.omit(S)`.
x `S` must be size 6 or 1, not 0.
i An earlier column had size 6.
i The error occurred in group 1: Request_Date = "00/01/05", Patient.Code = 223

I would appreciate it if anybody could share their comments with me.

Best regards

CodePudding user response:

It seems that you're only deleting rows with NAs:

# complete.cases() returns a logical vector, so use it to subset the rows
df1[complete.cases(df1), ]

CodePudding user response:

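Turning the data into long format, dropping the NA values, and then going back to wide format should give something close to what you want, I think. Try this:

library(dplyr)  # filter() and %>%
library(tidyr)  # pivot_longer() and pivot_wider()

df2 <- df1 %>% 
  # stack B, R and S into name/value pairs
  pivot_longer(cols = c(B, R, S)) %>% 
  # keep only the measurements that are present
  filter(!is.na(value)) %>% 
  # spread back to one row per Date/id/Age
  pivot_wider(names_from = name, values_from = value)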
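As an alternative that stays close to the code in the question, here is a minimal sketch that takes the first non-missing value per group. The original attempt can fail because na.omit() may return a vector of length 0 (or longer than 1) for a group, while summarise() expects one value per column. The helper first_non_na is a hypothetical name, not part of dplyr, and the data frame is only a small subset of df1 retyped from the question. It assumes dplyr 1.0.0 or later for across().

```r
library(dplyr)

# Illustrative subset of df1 from the question (not the full data).
df1 <- data.frame(
  Date = c("00/01/16", "00/01/16", "00/01/05", "00/01/05", "00/01/05"),
  id   = c(223, 223, 102, 102, 102),
  Age  = c(55, 55, 60, 60, 60),
  B    = c(7.9, NA, NA, 19.7, NA),
  R    = c(5.65, NA, NA, 5.5, NA),
  S    = c(138, NA, 135, NA, NA)
)

# Hypothetical helper: first non-missing value, or NA if everything is missing.
first_non_na <- function(x) x[!is.na(x)][1]

df2 <- df1 %>%
  group_by(Date, id, Age) %>%
  # collapse each group to a single row, one value per measurement column
  summarise(across(c(B, R, S), first_non_na), .groups = "drop")

print(df2)
```

Note that the result is ordered by the grouping columns rather than by the original row order, and that if a group holds several different non-missing values in one column, only the first is kept.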