How do I count the occurrences of a data point in a data frame?

Time:03-11

I have this table and I want to keep and count only the id in which the strings A and D are most represented. For example, A and D are more represented in the id "abc" than in the id "hil".

string id start end
A abc 0 1
A abc 2 3
B efg 1 3
A hil 5 6
A abc 6 7
D abc 7 8
D abc 1 2
D hil 3 4

How can I obtain the id in which those strings are most represented?

CodePudding user response:

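You can filter for the two strings, count the rows per id, and keep the id with the highest count:

library(dplyr)

df %>%
  filter(string %in% c("A", "D")) %>%  # keep only rows for A and D
  count(id) %>%                        # rows per id, stored in column n
  arrange(desc(n)) %>%                 # highest count first
  slice(1)                             # keep the top id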

Output:

# A tibble: 1 × 2
  id        n
  <chr> <int>
1 abc       5
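
If two ids could share the highest count, slice(1) would silently keep only one of them. A minimal sketch of a tie-aware variant, assuming dplyr 1.0.0 or later for slice_max(); the data frame is rebuilt from the question's table (start and end omitted), and the names df and top are only illustrative:

```r
library(dplyr)

# Sample data copied from the question's table (assumed to be called df).
df <- data.frame(
  string = c("A", "A", "B", "A", "A", "D", "D", "D"),
  id     = c("abc", "abc", "efg", "hil", "abc", "abc", "abc", "hil"),
  stringsAsFactors = FALSE
)

top <- df %>%
  filter(string %in% c("A", "D")) %>%     # keep only rows for A and D
  count(id) %>%                           # rows per id, stored in column n
  slice_max(n, n = 1, with_ties = TRUE)   # keep every id tied for the top count

top
```

With the sample data there is no tie, so this still returns the single row for "abc" with n = 5.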

CodePudding user response:

In base R, you can get the most common id for each string like this:

tab <- table(df$id, df$string)  # ids in rows, strings in columns
apply(tab, 2, function(x) rownames(tab)[which.max(x)])
#>     A     B     D 
#>  "abc" "efg" "abc" 
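
The result above is per string. If you want a single id for A and D combined, as the question asks, one possible base R sketch is to subset first and tabulate the ids. The data frame is rebuilt from the question's table, and the names target, counts and top_id are only illustrative:

```r
# Sample data copied from the question's table (assumed to be called df).
df <- data.frame(
  string = c("A", "A", "B", "A", "A", "D", "D", "D"),
  id     = c("abc", "abc", "efg", "hil", "abc", "abc", "abc", "hil"),
  stringsAsFactors = FALSE
)

target <- c("A", "D")

# Count how many A or D rows each id has.
counts <- table(df$id[df$string %in% target])

# which.max() returns the first maximum, so ties are not reported here.
top_id <- names(counts)[which.max(counts)]

top_id
#> [1] "abc"
```

Here counts holds 5 for "abc" and 2 for "hil", so "abc" is returned.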