I have a dataset in R in which a single disease is coded under several different code numbers.
For example, take the following data:
set.seed(1)
df <- data.frame(Gender = sample(c("Male", "Female", "Other"), 20, TRUE),
                 Disease = sample(c("709", "908", "1515", "698", "890", "20"), 20, TRUE))
The codes 709, 1515, and 20 all refer to a single disease, Gonorrhoea.
Now I need to generate a table of gender for Gonorrhoea only. How can this be done? Any suggestions?
I will be very thankful for any kind of helpful input.
If the question is unclear, let me know :)
Wish you all a lovely day.
CodePudding user response:
Either of the first two options tabulates the required subset of the data. The third option converts that table into a data.frame holding the same information in long format, with one row per Gender/Disease combination.
set.seed(1)
df <- data.frame(Gender = sample(c("Male", "Female", "Other"), 20, TRUE),
                 Disease = sample(c("709", "908", "1515", "698", "890", "20"), 20, TRUE))
table(subset(df, Disease %in% c(709, 1515, 20)))
#>         Disease
#> Gender   1515 20 709
#>   Female    1  1   1
#>   Male      0  1   2
#>   Other     0  2   1
xtabs(~ Gender + Disease, df, subset = Disease %in% c(709, 1515, 20))
#>         Disease
#> Gender   1515 20 709
#>   Female    1  1   1
#>   Male      0  1   2
#>   Other     0  2   1
tbl <- table(subset(df, Disease %in% c(709, 1515, 20)))
as.data.frame(tbl)
#>   Gender Disease Freq
#> 1 Female    1515    1
#> 2   Male    1515    0
#> 3  Other    1515    0
#> 4 Female      20    1
#> 5   Male      20    1
#> 6  Other      20    2
#> 7 Female     709    1
#> 8   Male     709    2
#> 9  Other     709    1
Created on 2022-08-21 by the reprex package (v2.0.1)
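If what is wanted is a single Gonorrhoea count per gender rather than one column per code, the three codes can be pooled before tabulating. The following is only a sketch under that assumption; the names `gonorrhoea_codes` and `gono_by_gender` are made up for illustration and do not appear in the question.

```r
set.seed(1)
df <- data.frame(Gender = sample(c("Male", "Female", "Other"), 20, TRUE),
                 Disease = sample(c("709", "908", "1515", "698", "890", "20"), 20, TRUE))

# Codes that, according to the question, all mean Gonorrhoea
# (hypothetical helper vector, stored as strings to match the Disease column)
gonorrhoea_codes <- c("709", "1515", "20")

# Keep only the Gonorrhoea rows, then count them by gender;
# the three codes are pooled into a single count per gender
gono_by_gender <- table(Gender = df$Gender[df$Disease %in% gonorrhoea_codes])
gono_by_gender

# Equivalent check: row sums of the per-code table give the same counts
per_code <- table(subset(df, Disease %in% gonorrhoea_codes))
rowSums(per_code)
```

Both results should agree, since the pooled table is just the per-code table summed across its Disease columns.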