I am trying to filter a data frame on a specific column, but the criterion is not the same for every row.
I am looking at a large NBA dataset in which some players have been on multiple teams. Those players have one observation for each team they played for, plus a total (TOT) row. I would like to keep the TOT observation for players who have one. A player who only played for one team has no TOT in the Team column, so for them I would like to keep their single observation.
The example data might make this easier to understand. I know how to filter on a column, but I am not sure how to handle players who do not have a TOT row.
library ("dplyr")
# declaring a dataframe
data_frame = data.frame(Player = c("Luka","Steph","Anderson","Anderson","Anderson") ,
Games= c(60, 59, 42, 30, 12),
Team= c('Dallas', 'Warriors', 'TOT', 'CLE', 'IND'))
print ("Original dataframe")
print (data_frame)
# keep only the rows where Team is 'TOT'
data_frame_mod <- filter(data_frame, Team == 'TOT')
print ("Modified dataframe")
print (data_frame_mod)
This code produces:
However, I would also like to include the Luka and Steph rows because they only played for one team. Below is the expected output:
CodePudding user response:
You could use slice_max, since the TOT row holds the sum of all the games played and therefore has the largest Games value for that player:
library(dplyr)
data_frame %>%
group_by(Player) %>%
slice_max(Games)
output
Player Games Team
<chr> <dbl> <chr>
1 Anderson 42 TOT
2 Luka 60 Dallas
3 Steph 59 Warriors
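One caveat worth checking on the real dataset: slice_max keeps ties by default, so if a traded player has a team row with the same Games value as the TOT row, both rows come back. The sketch below uses a made-up player ("Smith", not in the original data) to show the effect and a quick way to detect it. The names tied, res_ties and dup are only illustrative.

```r
library(dplyr)

# Hypothetical data: Smith was traded but played 0 games for IND,
# so his CLE row ties with his TOT row on Games.
tied <- data.frame(
  Player = c("Smith", "Smith", "Smith"),
  Games  = c(10, 10, 0),
  Team   = c("TOT", "CLE", "IND")
)

# slice_max() keeps every row that shares the maximum (with_ties = TRUE)
res_ties <- tied %>%
  group_by(Player) %>%
  slice_max(Games) %>%
  ungroup()

# Players that still have more than one row after slice_max()
dup <- res_ties %>%
  count(Player) %>%
  filter(n > 1)

print(dup)
```

If dup is empty on the full dataset, the slice_max approach is safe for it. Otherwise a filter on Team is likely the more robust choice.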
CodePudding user response:
We could group by 'Player' and keep a row if its group has only one row (n() == 1) or if its Team is 'TOT':
library(dplyr)
data_frame %>%
group_by(Player) %>%
filter(n() == 1 | Team == 'TOT') %>%
ungroup()
-output
# A tibble: 3 × 3
Player Games Team
<chr> <dbl> <chr>
1 Luka 60 Dallas
2 Steph 59 Warriors
3 Anderson 42 TOT
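For readers who would rather not load dplyr, the same condition can probably be written in base R. This is a minimal sketch, assuming the example data from the question. rows_per_player and result are names introduced here for illustration.

```r
# Same example data as in the question
data_frame <- data.frame(
  Player = c("Luka", "Steph", "Anderson", "Anderson", "Anderson"),
  Games  = c(60, 59, 42, 30, 12),
  Team   = c("Dallas", "Warriors", "TOT", "CLE", "IND")
)

# Number of rows each player has, repeated on every row of that player
rows_per_player <- ave(seq_along(data_frame$Player), data_frame$Player, FUN = length)

# Keep single-team players, plus the TOT row of multi-team players
result <- data_frame[rows_per_player == 1 | data_frame$Team == "TOT", ]

print(result)
```

ave() plays the role of n() inside group_by here: it returns the group size aligned with the original rows, so the logical condition can be used directly for subsetting.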
CodePudding user response:
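A slight variation using group_by, arrange and slice. Sorting each group so that any TOT row comes first means slice(1) does not depend on the original row order:
library(dplyr)
data_frame %>%
group_by(Player) %>%
# within each player, put the TOT row (if any) first
arrange(desc(Team == 'TOT'), .by_group = TRUE) %>%
slice(1)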
Player Games Team
<chr> <dbl> <chr>
1 Anderson 42 TOT
2 Luka 60 Dallas
3 Steph 59 Warriors