Part of my data looks as follows:
> q[,c(1,3)]
Year Language
1 1 C
2 1 C
3 1 C
4 1 C
5 1 C
6 1 JavaScript
7 1 C
8 2 C
9 2 inny
10 2 C
11 2 Java
12 3 Java
13 3 Java
14 3 JavaScript
15 3 Java
16 3 JavaScript
17 3 .NET
18 3 inny
19 3 R
20 3 Python
21 3 .NET
22 3 Python
23 3 Java
24 3 Java
25 3 Java
26 3 Java
27 3 Java
28 3 Java
29 3 C#
30 3 C
31 3 JavaScript
32 3 C
33 3 JavaScript
34 3 Java
35 3 Java
36 3 Python
37 3 C#
38 4 R
39 4 C
40 4 Java
41 4 Python
42 4 C
43 4 .NET
44 4 C#
45 5 inny
46 5 JavaScript
47 5 C#
48 5 Python
49 5 R
50 2 C
The full dataset, named q, also has other columns that are not relevant here.
For each year, I want to find the language (or languages) that occurred most often.
Sometimes several languages tie for the highest count within a year. In that case I want to list every one of them.
Expected output:
Year Language
1 1 C
2 2 C
3 3 Java
4 4 .NET
5 4 C
6 4 C#
7 4 C++
8 4 Java
9 4 Python
10 4 R
11 5 C#
12 5 inny
13 5 JavaScript
14 5 Python
15 5 R
CodePudding user response:
I included an "amount" column that shows how often each language occurs in each year, in case that is needed.
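library(tidyverse)
# df is the data frame from the question (called q there)
df %>%
  count(Year, Language, name = "amount") %>%  # occurrences per Year/Language pair
  group_by(Year) %>%
  slice_max(amount)                           # keeps all tied rows by default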
# A tibble: 15 × 3
# Groups: Year [5]
Year Language amount
<dbl> <chr> <int>
1 1 C 4
2 2 C 2
3 3 Java 11
4 4 .NET 1
5 4 C 1
6 4 C# 1
7 4 C++ 1
8 4 Java 1
9 4 Python 1
10 4 R 1
11 5 C# 1
12 5 inny 1
13 5 JavaScript 1
14 5 Python 1
15 5 R 1
>
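slice_max() returns every row tied for the maximum because its with_ties argument defaults to TRUE, which is why years 4 and 5 produce several rows. A minimal sketch on made-up data (the toy data frame and the names top and top_one are hypothetical, not taken from the question) shows the difference from with_ties = FALSE:

```r
library(dplyr)

# Hypothetical toy data: year 1 has a clear winner, year 2 has a two-way tie
toy <- data.frame(
  Year     = c(1, 1, 1, 2, 2),
  Language = c("C", "C", "Java", "R", "Python")
)

counts <- toy %>%
  count(Year, Language, name = "amount") %>%
  group_by(Year)

# Default behaviour, spelled out: all languages tied for the top count are kept
top <- counts %>%
  slice_max(amount, with_ties = TRUE) %>%
  ungroup()

# With ties disabled, exactly one row per year is returned
top_one <- counts %>%
  slice_max(amount, with_ties = FALSE) %>%
  ungroup()

print(top)
print(top_one)
```

If only a single language per year is wanted, with_ties = FALSE does that, but which of the tied languages is kept should then be treated as arbitrary.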
CodePudding user response:
Using dplyr:
q %>%
  group_by(Year) %>%
  summarise(language = names(which(table(Language) == max(table(Language)))))
output:
Year language
<int> <chr>
1 1 C
2 2 C
3 3 Java
4 4 .NET
5 4 C
6 4 C#
7 4 C++
8 4 Java
9 4 Python
10 4 R
11 5 C#
12 5 inny
13 5 JavaScript
14 5 Python
15 5 R
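Since dplyr 1.1.0, returning more than one row per group from summarise() is deprecated in favour of reframe(). A sketch of the same idea with reframe(), assuming dplyr 1.1.0 or later; the helper most_frequent() and the toy data are hypothetical names made up for illustration:

```r
library(dplyr)

# Hypothetical toy data: year 1 has a clear winner, year 2 has a two-way tie
toy <- data.frame(
  Year     = c(1, 1, 1, 2, 2),
  Language = c("C", "C", "Java", "R", "Python")
)

# Hypothetical helper: return every value that reaches the highest count
most_frequent <- function(x) {
  tab <- table(x)               # counts per distinct value, computed once
  names(tab)[tab == max(tab)]   # all values tied for the maximum
}

res <- toy %>%
  group_by(Year) %>%
  reframe(Language = most_frequent(Language))  # may return several rows per year

print(res)
```

reframe() always returns an ungrouped result, so no trailing ungroup() is needed.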
CodePudding user response:
Here is a base R variation:
# Requires R >= 4.1 for the native pipe |> and the \(x) lambda shorthand
apply(table(df$Language, df$Year), 2,
      \(x) names(which(x == max(x)))) |>
  stack() |>
  `colnames<-`(c("Language", "Year"))
#> Language Year
#> 1 C 1
#> 2 C 2
#> 3 Java 3
#> 4 .NET 4
#> 5 C 4
#> 6 C# 4
#> 7 C 4
#> 8 Java 4
#> 9 Python 4
#> 10 R 4
#> 11 C# 5
#> 12 inny 5
#> 13 JavaScript 5
#> 14 Python 5
#> 15 R 5
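One caveat with apply(): when every year has the same number of tied languages (more than one), apply() simplifies its result to a matrix, so stack() no longer receives a named list. A base R sketch that avoids this by comparing each count with its per-year maximum via ave(); the toy data and variable names are hypothetical, for illustration only:

```r
# Hypothetical toy data: year 1 has a clear winner, year 2 has a two-way tie
toy <- data.frame(
  Year     = c(1, 1, 1, 2, 2),
  Language = c("C", "C", "Java", "R", "Python")
)

# Long-format counts: one row per Year/Language combination (zeros included)
counts <- as.data.frame(table(Year = toy$Year, Language = toy$Language),
                        stringsAsFactors = FALSE)

# Keep the rows whose count equals the maximum count within the same year
is_top <- counts$Freq == ave(counts$Freq, counts$Year, FUN = max)
result <- counts[is_top, c("Year", "Language")]

# Sort for readability; note that Year is a character column at this point
result <- result[order(result$Year, result$Language), ]
rownames(result) <- NULL

print(result)
```

Because table() turns Year into labels, the column comes back as character; wrap it in as.integer() if a numeric year is needed.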