Adding values from one column to values in another

Time:10-01

I'm working on calculating the new diameters of trees I've measured by taking their initial diameter (measured in 2018) and adding the centimeters of diameter growth (diam_growth) from 2020 and 2021. The result would fill in the NAs in the column "dbh". Here's some sample data to help explain:

tag <- c(1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4)
diam_growth <- c(0.4, 0.5, NA, 0.7, 0.8, NA, 0.9, 1.0, NA, 0.1, 0.2, NA)
dbh <- c(NA, NA, 10, NA, NA, 15, NA, NA, 7, NA, NA, 12)
year <- c(2020, 2021, 2018, 2020, 2021, 2018, 2020, 2021, 2018, 2020, 2021, 2018)

tree_growth <- data.frame(tag, diam_growth, dbh, year)

   tag diam_growth dbh year
1    1         0.4  NA 2020
2    1         0.5  NA 2021
3    1          NA  10 2018
4    2         0.7  NA 2020
5    2         0.8  NA 2021
6    2          NA  15 2018
7    3         0.9  NA 2020
8    3         1.0  NA 2021
9    3          NA   7 2018
10   4         0.1  NA 2020
11   4         0.2  NA 2021
12   4          NA  12 2018

So for example, for tag 1, the code would take the 2018 dbh (10) and add 0.4 for 2020 and 0.5 for 2021. Then for tag 2, it would add 0.7 and 0.8 to the initial DBH of 15 for 2020 and 2021 respectively and so on for each tag ID.

I'm happy to clarify further if this isn't super clear!

Any help or suggestions would be greatly appreciated!!

CodePudding user response:

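Grouped by 'tag', replace the 'diam_growth' values in the rows where 'dbh' is NA with the sum of those values and the group's single non-NA 'dbh' value (the 2018 measurement). Note that this writes the new diameters into 'diam_growth' and leaves 'dbh' unchanged: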

library(dplyr)
tree_growth %>% 
   group_by(tag) %>%
   # within each tag, add the 2018 dbh to the growth values of the rows where dbh is NA
   mutate(diam_growth = replace(diam_growth, is.na(dbh), 
        diam_growth[is.na(dbh)] + dbh[!is.na(dbh)])) %>%
   ungroup()

-output

# A tibble: 12 × 4
     tag diam_growth   dbh  year
   <dbl>       <dbl> <dbl> <dbl>
 1     1        10.4    NA  2020
 2     1        10.5    NA  2021
 3     1        NA      10  2018
 4     2        15.7    NA  2020
 5     2        15.8    NA  2021
 6     2        NA      15  2018
 7     3         7.9    NA  2020
 8     3         8      NA  2021
 9     3        NA       7  2018
10     4        12.1    NA  2020
11     4        12.2    NA  2021
12     4        NA      12  2018
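The question asks for the NAs in 'dbh' to be filled, whereas the code above writes the sums into 'diam_growth'. A minimal sketch of a variant that fills 'dbh' directly is given here. It assumes, as the output above does, that each growth value is added to the 2018 baseline rather than accumulated year over year, and that every tag has exactly one 2018 row. The name `filled` is only illustrative.

```r
library(dplyr)

# Sample data from the question
tree_growth <- data.frame(
  tag = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4),
  diam_growth = c(0.4, 0.5, NA, 0.7, 0.8, NA, 0.9, 1.0, NA, 0.1, 0.2, NA),
  dbh = c(NA, NA, 10, NA, NA, 15, NA, NA, 7, NA, NA, 12),
  year = c(2020, 2021, 2018, 2020, 2021, 2018, 2020, 2021, 2018, 2020, 2021, 2018)
)

filled <- tree_growth %>%
  group_by(tag) %>%
  # Assumption: the 2018 row holds the baseline dbh for its tag.
  # Where dbh is NA, use baseline + growth; otherwise keep the measured dbh.
  mutate(dbh = if_else(is.na(dbh), dbh[year == 2018][1] + diam_growth, dbh)) %>%
  ungroup()

filled
```

This leaves 'diam_growth' untouched, so the original growth measurements stay available next to the computed diameters.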

Or use case_when

tree_growth %>% 
   group_by(tag) %>% 
   # rows with NA dbh get growth + the group's non-NA dbh; other rows are unchanged
   mutate(diam_growth = case_when(is.na(dbh) ~ diam_growth + 
         dbh[!is.na(dbh)], TRUE ~ diam_growth)) %>% 
   ungroup()
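If dplyr is not available, the same grouped addition could be sketched in base R with `ave()`. This is an alternative, not part of the answer above. It fills 'dbh' rather than 'diam_growth', and assumes each tag has exactly one non-NA 'dbh' value. The names `baseline` and `needs_fill` are illustrative.

```r
# Sample data from the question
tree_growth <- data.frame(
  tag = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4),
  diam_growth = c(0.4, 0.5, NA, 0.7, 0.8, NA, 0.9, 1.0, NA, 0.1, 0.2, NA),
  dbh = c(NA, NA, 10, NA, NA, 15, NA, NA, 7, NA, NA, 12),
  year = c(2020, 2021, 2018, 2020, 2021, 2018, 2020, 2021, 2018, 2020, 2021, 2018)
)

# Per-tag baseline: the first non-NA dbh in each group, repeated for every row of that tag
baseline <- ave(tree_growth$dbh, tree_growth$tag,
                FUN = function(x) x[!is.na(x)][1])

# Fill only the rows where dbh is missing
needs_fill <- is.na(tree_growth$dbh)
tree_growth$dbh[needs_fill] <- baseline[needs_fill] + tree_growth$diam_growth[needs_fill]

tree_growth
```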