I have a large dataset with over 10,000 records. The data looks like this:
Id Ap Ak Al Aj
602 1 0 1 1
603 0 1 1 1
603 1 1 1 0
Some of the Ids appear more than once. How do I get the proportion for each row, i.e. the share of all rows that have that row's Id?
I have tried prop.table(table(rep$id), margin = 1) * 100,
but this just returns 1s under each id.
CodePudding user response:
If we want the proportions for 'Id', compute the table of proportions once and then index it by name with the 'Id' column (converted to character), so that each row receives the proportion for its own Id:
df1$proportion <- 100 * proportions(table(df1$Id))[as.character(df1$Id)]
-output
> df1
Id Ap Ak Al Aj proportion
1 602 1 0 1 1 33.33333
2 603 0 1 1 1 66.66667
3 603 1 1 1 0 66.66667
data
df1 <- structure(list(Id = c(602L, 603L, 603L), Ap = c(1L, 0L, 1L),
Ak = c(0L, 1L, 1L), Al = c(1L, 1L, 1L), Aj = c(1L, 1L, 0L
)), class = "data.frame", row.names = c(NA, -3L))
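As for why the original attempt did not work: on a one-dimensional table, margin = 1 divides each count by its own marginal total, i.e. by itself, so every proportion comes out as 1. A minimal sketch on the same three Ids (the names counts, by_margin and overall are only illustrative):

```r
# Same Ids as in df1 above; only this column matters here
df1 <- data.frame(Id = c(602L, 603L, 603L))

counts <- table(df1$Id)

# margin = 1 divides each cell by its own margin, so every entry is 1
by_margin <- prop.table(counts, margin = 1)

# without margin, each count is divided by the grand total instead
overall <- prop.table(counts)

print(by_margin)
print(overall)
```

Dropping the margin argument gives the overall proportions, which is what the indexing approach above relies on.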
CodePudding user response:
Using dplyr
library(dplyr)
df1 %>%
  add_count(Id) %>%
  mutate(prop = n / nrow(.),
         .keep = "unused")
Output
Id Ap Ak Al Aj prop
1 602 1 0 1 1 0.3333333
2 603 0 1 1 1 0.6666667
3 603 1 1 1 0 0.6666667
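For comparison, the same per-row proportion can be sketched in base R with ave(), assuming the same df1 as in the first answer. This is just one alternative, not necessarily faster. Multiply by 100 if percentages are wanted.

```r
df1 <- data.frame(Id = c(602L, 603L, 603L),
                  Ap = c(1L, 0L, 1L),
                  Ak = c(0L, 1L, 1L),
                  Al = c(1L, 1L, 1L),
                  Aj = c(1L, 1L, 0L))

# ave() applies FUN within each Id group and returns one value per row,
# here the number of rows sharing that row's Id
n_per_id <- ave(df1$Id, df1$Id, FUN = length)

# divide the group count by the total number of rows
df1$prop <- n_per_id / nrow(df1)

print(df1)
```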