Divide Value from one Dataframe by Value of another Dataframe-CodePudding

these are my two dataframes:

    structure(list(Full.Name = c("A. Patrick Beharelle", "A. Patrick Beharelle", 
"Aaron P. Graft", "Aaron P. Graft", "Aaron P. Jagdfeld"), year = c(2019, 
2020, 2019, 2020, 2019), counter = c(5541L, 3269L, 165L, 200L, 
4L)), class = c("grouped_df", "tbl_df", "tbl", "data.frame"), row.names = c(NA, 
-5L), groups = structure(list(Full.Name = c("A. Patrick Beharelle", 
"Aaron P. Graft", "Aaron P. Jagdfeld"), .rows = structure(list(
    1:2, 3:4, 5L), ptype = integer(0), class = c("vctrs_list_of", 
"vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -3L), .drop = TRUE))

and

structure(list(authority_dic = c("accomplished", "accomplished", 
"accomplished", "accomplished", "accomplished"), Full.Name = c("A. Patrick Beharelle", "A. Patrick Beharelle", 
"Aaron P. Graft", "Aaron P. Graft", "Aaron P. Jagdfeld"), Entity = c("WERNER ENTERPRISES INC", "MONDELEZ INTERNATIONAL INC", 
"AEROJET ROCKETDYNE HOLDINGS", "T-MOBILE US INC", "SOUTHWEST AIRLINES"
), `2019` = c(1L, 0L, 1L, 0L, 0L), `2020` = c(0L, 1L, 0L, 3L, 
1L)), class = c("grouped_df", "tbl_df", "tbl", "data.frame"), row.names = c(NA, 
-5L), groups = structure(list(authority_dic = c("accomplished", 
"accomplished", "accomplished", "accomplished", "accomplished"
), Full.Name = c("Derek J. Leathers", "Dirk Van de Put", "Eileen P. Drake", 
"G. Michael Sievert", "Gary C. Kelly"), .rows = structure(list(
    1L, 2L, 3L, 4L, 5L), ptype = integer(0), class = c("vctrs_list_of", 
"vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -5L), .drop = TRUE))

Now, I want to divide each value of the "2019" column by the "counter" value of the other dataframe and add the result as another column. The complexity comes in place, because I only want to divide through the "counter" value of 2019 and of (for example) Aaron P. Graft. I want to do this for every row of the dataframe containing the name "Aaron P. Graft" and therefore take the value of "counter" from the other dataframe in the row, which contains "Aaron P. Graft".

I cannot figure it out on my own. Maybe I need to transpose the year and counter columns in the first dataframe, but I don't know.

Thats what I want to achieve:

authority_dic	Full.name	2019	2020	2019_freq	2020_freq
example word	Aaron P. Jagdfeld	10	20	10/counter(of 2019)	20/counter(of 2020)

If there are any questions, don't mind asking me. Thanks in advance!!!

CodePudding user response：

Let the structures be s1 and s2, this should work:

library(tidyr)
mutate(
      full_join(
         summarise(
            group_by(s2, authority_dic, Full.Name),
            `2019`=sum(`2019`),
            `2020`=sum(`2020`)),
         s1 %>% spread(year,counter),
         by=c("Full.Name")),
      `2019_freq`=`2019.x`/`2019.y`,
      `2020_freq`=`2020.x`/`2020.y`)
# A tibble: 3 × 8
# Groups:   authority_dic [1]
  authority_dic Full.Name            `2019.x` `2020.x` `2019.y` `2020.y` `2019_freq` `2020_freq`
  <chr>         <chr>                   <int>    <int>    <int>    <int>       <dbl>       <dbl>
1 accomplished  A. Patrick Beharelle        1        1     5541     3269    0.000180    0.000306
2 accomplished  Aaron P. Graft              1        3      165      200    0.00606     0.015   
3 accomplished  Aaron P. Jagdfeld           0        1        4       NA    0          NA

Good practice is to avoid naming columns with values, e.g. 2019.... Use year instead. Your model needs refactoring into normal form (see database normalization topic for more information).