I wanted to make a table that shows least popular and most popular movie. But I don't know how to do it. I tried to find a solution, but I failed. Basically it has to look like this:
ID | Least popular movie | Most popular movie |
---|---|---|
1 | xyz | rty |
2 | zxc | qwz |
Here's what I have tried
movies%>%
group_by(ID) %>%
summarise(
Least = min(popularity),
Most= max(popularity))
CodePudding user response:
Since you did not provide the actual data to test the code, i tested this code on iris
data where i tried to get the min
and max
of Sepal.Length
by group species
and generated the final df
please check this if it works for your data
df <- iris %>% group_by(Species) %>%
filter(Sepal.Length==min(Sepal.Length) | Sepal.Length==max(Sepal.Length)) %>%
mutate(name=ifelse(row_number()==1, 'min', 'max')) %>%
pivot_wider(Species, values_from = Sepal.Length, names_from = name)
Created on 2023-01-21 with reprex v2.0.2
# A tibble: 3 × 3
# Groups: Species [3]
Species min max
<fct> <dbl> <dbl>
1 setosa 4.3 5.8
2 versicolor 7 4.9
3 virginica 4.9 7.9
CodePudding user response:
Edited after feedback
library(PogromcyDanych)
library(tidyverse)
simdb <- PogromcyDanych::serialeIMDB
simdb |>
group_by(serial) |>
summarise(
avg=mean(ocena),
lnum = min(ocena),
hnum= max(ocena)) |> left_join(
simdb |> select(serial,ocena,"Least"=nazwa),
by=c("serial"="serial",
"lnum"="ocena")
) |> group_by(serial) |> slice_head(n=1) |> left_join(
simdb |> select(serial,ocena,"Highest"=nazwa),
by=c("serial"="serial",
"hnum"="ocena")
) |> group_by(serial) |> slice_head(n=1) |> select(-lnum,-hnum)
# A tibble: 198 × 4
# Groups: serial [198]
serial avg Least Highest
<fct> <dbl> <fct> <fct>
1 Breaking Bad 8.48 Fly Ozymandias
2 Cosmos: A Space-Time Odyssey 9.12 Deeper, Deeper, Deeper Still Unafraid of the Dark
3 Planet Earth 9.23 Mountains From Pole to Pole
4 Game of Thrones 8.63 The Night Lands The Rains of Castamere
5 True Detective 9.2 The Long Bright Dark Who Goes There
6 The Wire 8.72 The Target The Cost
7 Sherlock 8.97 The Blind Banker The Reichenbach Fall
8 Cosmos 8.79 Blues for a Red Planet One Voice in the Cosmic Fugue
9 The Sopranos 8.57 Whitecaps Denial, Anger, Acceptance
10 Leyla ile Mecnun 8.09 Episode #2.25 Pilot
Note that because ratings are unique there can be duplicates; this solution just keeps the first entry
CodePudding user response:
Please check the below code, here the least
and highest
variable have data of episode name
i.e., nazwa
code
library(PogromcyDanych)
library(tidyverse)
serialeIMDB %>% group_by(serial) %>% mutate(avg=mean(ocena)) %>% arrange(nazwa) %>%
filter(ocena ==min(ocena) | ocena ==max(ocena )) %>%
mutate(least=case_when(row_number()==1 ~ nazwa),
highest= case_when(row_number()==max(row_number()) ~nazwa)) %>%
fill(least, highest) %>% slice_tail(n=1) %>% select(serial, avg, least, highest)
Created on 2023-01-21 with reprex v2.0.2
output
# A tibble: 198 × 4
# Groups: serial [198]
serial avg least highest
<fct> <dbl> <fct> <fct>
1 Breaking Bad 8.48 Fly Ozymandias
2 Cosmos: A Space-Time Odyssey 9.12 Deeper, Deeper, Deeper Still Unafraid of the Dark
3 Planet Earth 9.23 From Pole to Pole Ocean Deep
4 Game of Thrones 8.63 The Night Lands The Rains of Castamere
5 True Detective 9.2 The Long Bright Dark Who Goes There
6 The Wire 8.72 The Target -30-
7 Sherlock 8.97 The Blind Banker The Reichenbach Fall
8 Cosmos 8.79 One Voice in the Cosmic Fugue Journeys in Space and Time
9 The Sopranos 8.57 Denial, Anger, Acceptance Whitecaps
10 Leyla ile Mecnun 8.09 Pilot Episode #2.25
# … with 188 more rows
# ℹ Use `print(n = ...)` to see more rows