Loop to match dictionary keys from a list and append associated data from other columns

I want to loop through my data and populate my dictionaries with each 'event' value and its corresponding 'xCordAdjusted' and 'yCordAdjusted' values.

Dataframe:

season  period  teamCode    event   goal    xCord   xCordAdjusted   yCord   yCordAdjusted   shotType    playerPositionThatDidEvent  playerNumThatDidEvent   shooterPlayerId shooterName shooterLeftRight    
2014    1            MTL    MISS    0           61             61     29              29  WRIST                             C                     51    8471976.0   David Desharnais    L  
2014    1            TOR    SHOT    0          -54             54     29             -29  BACK                              C                     42    8475098.0   Tyler Bozak         R
2014    1            TOR    SHOT    0          -40             40     32             -32  WRIST                             D                     46    8471392.0   Roman Polak         R

My work:

league_data = {};
league_data['SHOT'] = {};
league_data['SHOT']['x'] = [];
league_data['SHOT']['y'] = [];
league_data['GOAL'] = {};
league_data['GOAL']['x'] = [];
league_data['GOAL']['y'] = [];
league_data['MISS'] = {};
league_data['MISS']['x'] = [];
league_data['MISS']['y'] = [];
event_types = ['SHOT','GOAL','MISS']

for data in season_df:
    for event in event_types:
        if data in event_types:
             if 'x' in range(0,100):
                league_data[event]['x'].append(['xCordAdjusted'])
                league_data[event]['y'].append(['yCordAdjusted'])
league_data

Output:

{'SHOT': {'x': [], 'y': []},
 'GOAL': {'x': [], 'y': []},
 'MISS': {'x': [], 'y': []}}

CodePudding user response：

You can extract the desired information directly from the DataFrame in a vectorized fashion, instead of looping over it repeatedly:

league_data = {
    'SHOT': {},
    'GOAL': {},
    'MISS': {},
}

for event in event_types:
    mask = (season_df['event'] == event) & season_df['xCord'].between(0, 100)
    x_adjusted = season_df.loc[mask, 'xCordAdjusted'].tolist()
    y_adjusted = season_df.loc[mask, 'yCordAdjusted'].tolist()
    league_data[event]['x'] = x_adjusted
    league_data[event]['y'] = y_adjusted

gives

{'GOAL': {'x': [], 'y': []},
 'MISS': {'x': [61], 'y': [29]},
 'SHOT': {'x': [], 'y': []}
}

Note that I changed the range condition: your original check, if 'x' in range(0,100), is always False because it tests whether the string 'x' is one of the integers 0 to 99, and it never references your DataFrame.
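
To make the point about the range condition concrete, here is a minimal sketch. The small DataFrame is only a stand-in built from the sample rows in the question, not the real season_df:

```python
import pandas as pd

# The string 'x' is never an element of a range of integers,
# so the original condition is always False.
print('x' in range(0, 100))  # False

# Stand-in for season_df (values copied from the question's sample rows).
season_df = pd.DataFrame({
    'event': ['MISS', 'SHOT', 'SHOT'],
    'xCord': [61, -54, -40],
    'xCordAdjusted': [61, 54, 40],
})

# A range check against the actual column returns one boolean per row.
# Series.between is inclusive on both ends by default.
in_range = season_df['xCord'].between(0, 100)
print(in_range.tolist())  # [True, False, False]
```

Only the first row passes because the other two have a negative xCord. If the check is meant for xCordAdjusted instead, swap the column name in the between call.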

CodePudding user response：

for data in season_df: iterates over the column names, not the rows.
Instead, use for index, row in season_df.iterrows().
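
A short sketch of the difference, using a hypothetical two-row frame in place of the real season_df:

```python
import pandas as pd

# Hypothetical stand-in for season_df with only the relevant columns.
season_df = pd.DataFrame({
    'event': ['MISS', 'SHOT'],
    'xCordAdjusted': [61, 54],
    'yCordAdjusted': [29, -29],
})

# Iterating the DataFrame directly yields the column labels.
columns_seen = [data for data in season_df]
print(columns_seen)  # ['event', 'xCordAdjusted', 'yCordAdjusted']

# iterrows() yields (index, row) pairs, where each row is a Series.
events_seen = [row['event'] for index, row in season_df.iterrows()]
print(events_seen)  # ['MISS', 'SHOT']
```

This is why the original loop never matched anything: data held strings such as 'season' or 'event', none of which are in event_types.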

However, iterating over rows is slow, so if your data is large, use vectorized operations instead.

Also, parts of your code do not do what you expect, such as if 'x' in range(0, 100). I rewrote it based on my assumption of what you meant; try this:

for event in event_types:
    matched_df = season_df[season_df['event'] == event]
    x_matched_list = matched_df[(0 <= matched_df['xCordAdjusted']) & (matched_df['xCordAdjusted'] <= 100)]['xCordAdjusted'].tolist()
    league_data[event]['x'] = x_matched_list # or extend
    
    y_matched_list = matched_df[(0 <= matched_df['yCordAdjusted']) & (matched_df['yCordAdjusted'] <= 100)]['yCordAdjusted'].tolist()
    league_data[event]['y'] = y_matched_list # or extend

Be careful, though: because 'xCordAdjusted' and 'yCordAdjusted' are filtered independently, the 'x' and 'y' lists can end up with different lengths.
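
One way to avoid that mismatch is to build a single row mask and apply it to both columns, so every x stays paired with its y. The sketch below assumes the range check is meant for 'xCordAdjusted' only, and it uses a stand-in frame built from the question's sample rows:

```python
import pandas as pd

# Stand-in for season_df (values copied from the question's sample rows).
season_df = pd.DataFrame({
    'event': ['MISS', 'SHOT', 'SHOT'],
    'xCordAdjusted': [61, 54, 40],
    'yCordAdjusted': [29, -29, -32],
})
event_types = ['SHOT', 'GOAL', 'MISS']

league_data = {}
for event in event_types:
    # One mask selects whole rows, so x and y always have equal length.
    mask = (season_df['event'] == event) & season_df['xCordAdjusted'].between(0, 100)
    selected = season_df.loc[mask, ['xCordAdjusted', 'yCordAdjusted']]
    league_data[event] = {
        'x': selected['xCordAdjusted'].tolist(),
        'y': selected['yCordAdjusted'].tolist(),
    }

print(league_data)
```

If y also needs its own range limit, add that condition to the same mask rather than filtering y separately.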