Create Matched Samples Based on Age and Gender in R

Time:10-06

I have a data frame with 20 diagnosed and 60 undiagnosed participants. I want to create samples matched on age and gender. In other words, I want R to choose 20 participants from the undiagnosed group who are similar to the diagnosed group in age and gender.

My diagnosed group is largely female, but that is not the case for the undiagnosed group. This is what I tried with the "MatchIt" package:

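library(MatchIt)

# 1:1 nearest-neighbor matching on a propensity score estimated by logistic regression
m.out <- matchit(as.factor(diagnosis) ~ age + gender, data = mydf,
                 method = "nearest", distance = "glm")
matched_samples <- match.data(m.out)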

Thankfully, it gave me two groups based on diagnosis, but it did something odd with gender:

#number of participants per group
    diagnosis  n freq 
      <chr> <int> <chr>
    yes       20 50%  
    no        20 50% 

#gender distribution across the diagnosis

diagnosis   gender  n 
  <chr>    <chr>  <int> 
1 yes      male     6 
2 yes      female  14
3 no       male    20

It chose only males for the undiagnosed group, although I have enough female participants. Why might this be the case?

Thank you so much!

Answer:

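By default, matchit() does 1:1 nearest-neighbor matching on the propensity score, which pairs units with similar scores but does not guarantee balance on any individual covariate such as gender. What you probably want is exact matching on gender and nearest-neighbor matching on age. Try this: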

m.out <- matchit(as.factor(diagnosis) ~ age, data = mydf,
                 method = "nearest", exact = ~gender,
                 distance = "euclidean")
matched_samples <- match.data(m.out)

This will give you pairs of units with identical genders who are closely matched on age.
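
To check that the matched groups really do have the same gender split, something like the following sketch may help. It runs on simulated data standing in for mydf, so the 0/1 column `diagnosed`, the group sizes, and the age range are all assumptions made for illustration, not the asker's data. It also assumes a MatchIt version that accepts distance = "euclidean".

```r
library(MatchIt)

set.seed(1)
# Simulated stand-in for mydf (hypothetical): 20 diagnosed (14 female, 6 male)
# and 60 undiagnosed (25 female, 35 male), with ages drawn at random
mydf <- data.frame(
  diagnosed = rep(c(1, 0), times = c(20, 60)),
  gender    = c(rep(c("female", "male"), times = c(14, 6)),
                rep(c("female", "male"), times = c(25, 35))),
  age       = round(runif(80, min = 18, max = 65))
)

# Exact matching on gender, nearest-neighbor matching on age
m.out <- matchit(diagnosed ~ age, data = mydf,
                 method = "nearest", exact = ~gender,
                 distance = "euclidean")

# Balance summary before and after matching
summary(m.out)

# Gender counts per group in the matched sample
matched_samples <- match.data(m.out)
gender_counts <- table(matched_samples$diagnosed, matched_samples$gender)
print(gender_counts)
```

If exact matching succeeded for every diagnosed participant, the two rows of gender_counts should be identical.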

If exact matching is not possible, then you have to accept some imbalance in gender. You can try other matching methods (e.g., optimal matching, cardinality matching) and different distance measures (e.g., Mahalanobis distance, robust Mahalanobis distance) to find the one that gives you the balance you want.
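
One way to compare distance measures is to rerun the match with each one and look at the resulting gender counts. The sketch below does this for "glm" and "mahalanobis" on simulated data. The data, the 0/1 column `diagnosed`, and the helper `females_matched` are hypothetical and only illustrate the idea. Other options named above, such as method = "optimal", can be swapped in the same way but may need additional packages installed.

```r
library(MatchIt)

set.seed(1)
# Simulated stand-in for mydf (hypothetical data, not the asker's)
mydf <- data.frame(
  diagnosed = rep(c(1, 0), times = c(20, 60)),
  gender    = factor(c(rep(c("female", "male"), times = c(14, 6)),
                       rep(c("female", "male"), times = c(25, 35)))),
  age       = round(runif(80, min = 18, max = 65))
)

# Hypothetical helper: number of matched females per group
# for a given distance measure (no exact matching here)
females_matched <- function(distance) {
  m <- matchit(diagnosed ~ age + gender, data = mydf,
               method = "nearest", distance = distance)
  md <- match.data(m)
  tapply(md$gender == "female", md$diagnosed, sum)
}

# Rows: group (0 = undiagnosed, 1 = diagnosed); columns: distance measure
balance <- sapply(c("glm", "mahalanobis"), females_matched)
print(balance)
```

The closer the two rows are within a column, the better that distance measure balanced gender on this particular data set; summary() on each matchit object gives a fuller picture that includes age.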
