Creating list based on pandas data frame conditions

I have a table:

Player	Team	GS
Jack	A	NaN
John	B	1
Mike	A	1
James	A	1

And would like to make 2 separate lists (TeamA & TeamB) so that they players are split by team and also filters so that the players that have a '1' in GS are only part of the list. The final lists would look like:

TeamA = Mike, James
TeamB = John

In this case, Jack was excluded from the TeamA list because he did not have a 1 value in the GS column.

Any direction would help. Thanks!

CodePudding user response：

You can use:

out = (df.loc[df['GS'].eq(1)]                  # filter rows
         .groupby('Team')['Player'].agg(list)  # aggregate as lists
         .to_dict()                            # convert to dict
      )

output:

{'A': ['Mike', 'James'], 'B': ['John']}

CodePudding user response：

Think of this as "filtering" the data based on certain conditions.

dataframe = pd.DataFrame(...)

# Create a mask that will be used to select team A rows
mask_team_a = dataframe['Team'] == 'A'
# Create a second mask for the GS filter
mask_gs = dataframe['GS'] == 1

# Use the .loc accessor to get the rows by combining masks with '&'
team_a_df = dataframe.loc[mask_team_a & mask_gs, :]
# You can use the same masks, but use the '~' to say 'not team A'
team_b_df = dataframe.loc[(~mask_team_a) & mask_gs, :]

team_a_list = list(team_a_df['Player'])
team_b_list = list(team_b_df['Player'])

This might be a bit verbose, but it allows for the most flexibility in the future if you need to tweak your selections.