If I have two groups in my data which partially match for a second categorical variable, is there a

Time:05-17

For example, I have two datasets (A and B), each with a score column, a location column (England or Wales), and a month column. If dataset A only covers January through October, while dataset B only covers April through November, is there a way to filter my data to include only the months both datasets share (April through October)? This is for pairing data in statistical tests.

My actual dataset has over a hundred categorical variables, and maybe half of them don't match between groups, so doing this manually isn't practical.

CodePudding user response:

Does this reproducible example capture what you want to do?

library(tidyverse)

dfa <- tribble(~location, ~month, ~a_score,
        "England", 1, 1,
        "England", 2, 1,
        "England", 3, 1,
        "Wales", 1, 1,
        "Wales", 2, 1,
        "Wales", 3, 1
        )

dfb <- tribble(~location, ~month, ~b_score,
        "England", 2, 2,
        "England", 3, 2,
        "England", 4, 2,
        "Wales", 2, 2,
        "Wales", 3, 2,
        "Wales", 4, 2
)

# Keep only the location/month combinations present in both data frames
dfa |> inner_join(dfb, by = c("location", "month"))
#> # A tibble: 4 × 4
#>   location month a_score b_score
#>   <chr>    <dbl>   <dbl>   <dbl>
#> 1 England      2       1       2
#> 2 England      3       1       2
#> 3 Wales        2       1       2
#> 4 Wales        3       1       2

Created on 2022-05-16 by the reprex package (v2.0.1)
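
If the two datasets need to stay separate (for example, to pass them to a paired test as two vectors), a filtering join may be a closer fit than merging them into one table. The following is a minimal sketch, assuming the key columns are called location and month as in the example above; the toy values and the names keys, dfa_matched and dfb_matched are made up for illustration. dplyr's semi_join() keeps the rows of its first argument that have a match in the second, without adding any columns.

```r
library(dplyr)

# Toy data in the shape described in the question (values are made up)
dfa <- data.frame(
  location = rep(c("England", "Wales"), each = 3),
  month    = rep(1:3, times = 2),
  score    = 1
)
dfb <- data.frame(
  location = rep(c("England", "Wales"), each = 3),
  month    = rep(2:4, times = 2),
  score    = 2
)

# Columns that must match between the two groups (assumed names)
keys <- c("location", "month")

# Keep only the rows of each dataset whose key combination also occurs in the other
dfa_matched <- semi_join(dfa, dfb, by = keys)
dfb_matched <- semi_join(dfb, dfa, by = keys)
```

Because the keys are held in a character vector, the same two calls should work with more matching columns by extending keys, rather than filtering each variable by hand.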
