How to copy part of rows based on group by 'id' in R?

Time:07-03

I have a data frame like the one below:

id        Date     Age  Sex   PP   Duration  cd      nh     W_B   R_B
583     99/07/19    51  2     NA      1       0       0     6.2   4.26
583     99/07/23    51  2     NA     NA       NA      NA     7    4.35
3024    99/10/30    42  2      4      6       NA      1     6.2   5.28
3024    99/11/01    42  2     NA     NA       NA      NA    5.2   5.47
3024    99/11/02    42  2     NA     NA       NA      NA    7.1   5.54

I need to copy the values of the columns 'PP' through 'nh' to the other rows that share the same 'id'. My target data frame is as follows:

    id        Date     Age  Sex   PP   Duration  cd      nh     W_B   R_B
    583     99/07/19    51  2     NA      1       0       0     6.2   4.26
    583     99/07/23    51  2     NA      1       0       0     7     4.35
    3024    99/10/30    42  2      4      6       NA      1     6.2   5.28
    3024    99/11/01    42  2      4      6       NA      1     5.2   5.47
    3024    99/11/02    42  2      4      6       NA      1     7.1   5.54

I would appreciate it if anybody could share their comments with me.

Best Regards

CodePudding user response:

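You can group by id and use fill() from tidyr to fill the missing values in columns PP through nh from the other rows of the same group:

library(tidyverse)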
df <- read_table("id        Date     Age  Sex   PP   Duration  cd      nh     W_B   R_B
583     99/07/19    51  2     NA      1       0       0     6.2   4.26
583     99/07/23    51  2     NA     NA       NA      NA     7    4.35
3024    99/10/30    42  2      4      6       NA      1     6.2   5.28
3024    99/11/01    42  2     NA     NA       NA      NA    5.2   5.47
3024    99/11/02    42  2     NA     NA       NA      NA    7.1   5.54") 

df %>% 
  group_by(id) %>% 
  fill(PP:nh, .direction = 'updown')
#> # A tibble: 5 × 10
#> # Groups:   id [2]
#>      id Date       Age   Sex    PP Duration    cd    nh   W_B   R_B
#>   <dbl> <chr>    <dbl> <dbl> <dbl>    <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1   583 99/07/19    51     2    NA        1     0     0   6.2  4.26
#> 2   583 99/07/23    51     2    NA        1     0     0   7    4.35
#> 3  3024 99/10/30    42     2     4        6    NA     1   6.2  5.28
#> 4  3024 99/11/01    42     2     4        6    NA     1   5.2  5.47
#> 5  3024 99/11/02    42     2     4        6    NA     1   7.1  5.54

Created on 2022-07-02 by the reprex package (v2.0.1)
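
As a side note on the .direction argument used above: in the question's data every non-missing value sits in the first row of its group, so "down" alone would be enough there. The sketch below uses a made-up toy tibble (the names toy and x are only for illustration) to show where "updown" makes a difference, namely when the first row of a group is missing.

```r
library(dplyr)
library(tidyr)

# Toy data (hypothetical): the only known value is in the middle row.
toy <- tibble(id = c(1, 1, 1), x = c(NA, 5, NA))

# "down" only carries values forward, so the leading NA stays.
down <- toy %>%
  group_by(id) %>%
  fill(x, .direction = "down") %>%
  ungroup()

# "updown" fills forward first, then backward, so the leading NA is filled too.
updown <- toy %>%
  group_by(id) %>%
  fill(x, .direction = "updown") %>%
  ungroup()

down$x    # NA  5  5
updown$x  #  5  5  5
```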

CodePudding user response:

Another option is na.locf() from the zoo package, which carries the last non-missing value forward within each id:

df <- read.table(text="id        Date     Age  Sex   PP   Duration  cd      nh     W_B   R_B
583     99/07/19    51  2     NA      1       0       0     6.2   4.26
583     99/07/23    51  2     NA     NA       NA      NA     7    4.35
3024    99/10/30    42  2      4      6       NA      1     6.2   5.28
3024    99/11/01    42  2     NA     NA       NA      NA    5.2   5.47
3024    99/11/02    42  2     NA     NA       NA      NA    7.1   5.54", header=TRUE)

library(dplyr)
library(zoo)
df %>%
  group_by(id) %>%
  # mutate() keeps one output row per input row; summarise() returning
  # several rows per group is deprecated as of dplyr 1.1.0
  mutate(across(everything(), ~na.locf(., na.rm = FALSE, fromLast = FALSE)))
#> # A tibble: 5 × 10
#> # Groups:   id [2]
#>      id Date       Age   Sex    PP Duration    cd    nh   W_B   R_B
#>   <int> <chr>    <int> <int> <int>    <int> <int> <int> <dbl> <dbl>
#> 1   583 99/07/19    51     2    NA        1     0     0   6.2  4.26
#> 2   583 99/07/23    51     2    NA        1     0     0   7    4.35
#> 3  3024 99/10/30    42     2     4        6    NA     1   6.2  5.28
#> 4  3024 99/11/01    42     2     4        6    NA     1   5.2  5.47
#> 5  3024 99/11/02    42     2     4        6    NA     1   7.1  5.54

Created on 2022-07-02 by the reprex package (v2.0.1)
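
If adding a package is not an option, the same forward fill can probably be done in base R as well. The following is only a minimal sketch: fill_down is a hypothetical helper written for this example (it is not part of any package), and the data frame is rebuilt with just id and the four columns to be filled.

```r
# Example data from the question, reduced to id and the columns to fill.
df <- data.frame(
  id       = c(583, 583, 3024, 3024, 3024),
  PP       = c(NA, NA, 4, NA, NA),
  Duration = c(1, NA, 6, NA, NA),
  cd       = c(0, NA, NA, NA, NA),
  nh       = c(0, NA, 1, NA, NA)
)

# Hypothetical helper: carry the last non-NA value forward.
# Leading NAs (nothing to carry yet) stay NA.
fill_down <- function(x) {
  idx <- cumsum(!is.na(x))       # position of the most recent non-NA value, 0 if none yet
  c(NA, x[!is.na(x)])[idx + 1]   # prepend NA so that position 0 maps to NA
}

cols <- c("PP", "Duration", "cd", "nh")

# ave() applies fill_down separately within each id and keeps the row order.
df[cols] <- lapply(df[cols], function(col) ave(col, df$id, FUN = fill_down))

print(df)
```

Like na.locf() with fromLast = FALSE, this only fills downward, so a group whose first row is NA keeps that NA (as PP does for id 583 and cd does for id 3024).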
