I have a data frame like the one below:
id Date Age Sex PP Duration cd nh W_B R_B
583 99/07/19 51 2 NA 1 0 0 6.2 4.26
583 99/07/23 51 2 NA NA NA NA 7 4.35
3024 99/10/30 42 2 4 6 NA 1 6.2 5.28
3024 99/11/01 42 2 NA NA NA NA 5.2 5.47
3024 99/11/02 42 2 NA NA NA NA 7.1 5.54
Within each 'id', I need to copy the known values of the columns 'PP' through 'nh' into the other rows of that 'id' where they are missing. My target data frame is as below:
id Date Age Sex PP Duration cd nh W_B R_B
583 99/07/19 51 2 NA 1 0 0 6.2 4.26
583 99/07/23 51 2 NA 1 0 0 7 4.35
3024 99/10/30 42 2 4 6 NA 1 6.2 5.28
3024 99/11/01 42 2 4 6 NA 1 5.2 5.47
3024 99/11/02 42 2 4 6 NA 1 7.1 5.54
I would appreciate it if anybody could share a suggestion.
Best Regards
CodePudding user response:
library(tidyverse)
df <- read_table("id Date Age Sex PP Duration cd nh W_B R_B
583 99/07/19 51 2 NA 1 0 0 6.2 4.26
583 99/07/23 51 2 NA NA NA NA 7 4.35
3024 99/10/30 42 2 4 6 NA 1 6.2 5.28
3024 99/11/01 42 2 NA NA NA NA 5.2 5.47
3024 99/11/02 42 2 NA NA NA NA 7.1 5.54")
df %>%
  group_by(id) %>%
  fill(PP:nh, .direction = 'updown')
#> # A tibble: 5 × 10
#> # Groups: id [2]
#> id Date Age Sex PP Duration cd nh W_B R_B
#> <dbl> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 583 99/07/19 51 2 NA 1 0 0 6.2 4.26
#> 2 583 99/07/23 51 2 NA 1 0 0 7 4.35
#> 3 3024 99/10/30 42 2 4 6 NA 1 6.2 5.28
#> 4 3024 99/11/01 42 2 4 6 NA 1 5.2 5.47
#> 5 3024 99/11/02 42 2 4 6 NA 1 7.1 5.54
Created on 2022-07-02 by the reprex package (v2.0.1)
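For readers who would rather avoid extra packages, the same per-id fill can be sketched in base R. This is only a sketch under one assumption: each id carries at most one distinct known value per column, so copying the first non-NA value is enough. The helper name first_known is made up for illustration.

```r
# Same sample data as in the question
df <- read.table(text = "id Date Age Sex PP Duration cd nh W_B R_B
583 99/07/19 51 2 NA 1 0 0 6.2 4.26
583 99/07/23 51 2 NA NA NA NA 7 4.35
3024 99/10/30 42 2 4 6 NA 1 6.2 5.28
3024 99/11/01 42 2 NA NA NA NA 5.2 5.47
3024 99/11/02 42 2 NA NA NA NA 7.1 5.54", header = TRUE)

# Columns to fill within each id
cols <- c("PP", "Duration", "cd", "nh")

# Hypothetical helper: repeat the first non-NA value of x (NA if there is none)
first_known <- function(x) rep(x[!is.na(x)][1], length(x))

# ave() applies the helper within each id and keeps the original row order
df[cols] <- lapply(df[cols], function(col) ave(col, df$id, FUN = first_known))

df
```

Unlike fill(), this overwrites every row of a group with the first known value, so it is not suitable if a column legitimately changes within an id.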
CodePudding user response:
Another option using na.locf from the zoo package:
df <- read.table(text="id Date Age Sex PP Duration cd nh W_B R_B
583 99/07/19 51 2 NA 1 0 0 6.2 4.26
583 99/07/23 51 2 NA NA NA NA 7 4.35
3024 99/10/30 42 2 4 6 NA 1 6.2 5.28
3024 99/11/01 42 2 NA NA NA NA 5.2 5.47
3024 99/11/02 42 2 NA NA NA NA 7.1 5.54", header=TRUE)
library(dplyr)
library(zoo)
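df %>%
  group_by(id) %>%
  # carry the last non-NA value forward within each id; dplyr 1.1.0+ warns when
  # summarise() returns several rows per group and suggests reframe() instead
  summarise(across(everything(), ~na.locf(., na.rm = FALSE, fromLast = FALSE)))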
#> `summarise()` has grouped output by 'id'. You can override using the `.groups`
#> argument.
#> # A tibble: 5 × 10
#> # Groups: id [2]
#> id Date Age Sex PP Duration cd nh W_B R_B
#> <int> <chr> <int> <int> <int> <int> <int> <int> <dbl> <dbl>
#> 1 583 99/07/19 51 2 NA 1 0 0 6.2 4.26
#> 2 583 99/07/23 51 2 NA 1 0 0 7 4.35
#> 3 3024 99/10/30 42 2 4 6 NA 1 6.2 5.28
#> 4 3024 99/11/01 42 2 4 6 NA 1 5.2 5.47
#> 5 3024 99/11/02 42 2 4 6 NA 1 7.1 5.54
Created on 2022-07-02 by the reprex package (v2.0.1)
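On dplyr 1.1.0 and later, summarise() warns when it returns more than one row per group, so a mutate()-based variant may be preferable. The sketch below also fills in both directions, on the assumption that a known value might appear after the missing ones in some groups. The helper name fill_both is made up for illustration.

```r
library(dplyr)
library(zoo)

# Same sample data as in the question
df <- read.table(text = "id Date Age Sex PP Duration cd nh W_B R_B
583 99/07/19 51 2 NA 1 0 0 6.2 4.26
583 99/07/23 51 2 NA NA NA NA 7 4.35
3024 99/10/30 42 2 4 6 NA 1 6.2 5.28
3024 99/11/01 42 2 NA NA NA NA 5.2 5.47
3024 99/11/02 42 2 NA NA NA NA 7.1 5.54", header = TRUE)

# Hypothetical helper: carry values forward, then backward for leading NAs
fill_both <- function(x) {
  x <- na.locf(x, na.rm = FALSE)
  na.locf(x, na.rm = FALSE, fromLast = TRUE)
}

filled <- df %>%
  group_by(id) %>%
  mutate(across(PP:nh, fill_both)) %>%  # one output row per input row
  ungroup()

filled
```

Restricting across() to PP:nh leaves the other columns untouched, which mirrors the fill(PP:nh, ...) call in the first answer.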