Python - How to loop over every row in a DataFrame to change the values in a column?

I need to iterate over a DataFrame to assign values to new columns.
For instance, it should loop over every row and do this:
if HomeTeam == 'Burnley':
    HV = 50
elif HomeTeam == 'Crystal Palace':
    HV = 65
and so on for the whole DataFrame (I have the HV values for each team in a separate file). Like HV, I want to assign values to the other columns that currently show NaN in the DataFrame. I tried using iterrows(), but it treats every row as a tuple, which is immutable.

CodePudding user response：

You can use a dictionary and map to do this:

hv_dict = {'Burnley': 50, 'Crystal Palace': 65}
df['HV'] = df['HomeTeam'].map(hv_dict)

This should be both faster and neater than iterating.
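A minimal sketch of the dictionary approach, showing what happens to a team that is missing from the dictionary. The sample rows and the 'Everton' entry are made up for illustration; only the Burnley and Crystal Palace values come from the question.

```python
import pandas as pd

# Illustrative data; 'Everton' is deliberately absent from the dictionary.
df = pd.DataFrame({'HomeTeam': ['Burnley', 'Crystal Palace', 'Everton', 'Burnley']})
hv_dict = {'Burnley': 50, 'Crystal Palace': 65}

# map() looks each team up in the dictionary; unknown teams become NaN.
df['HV'] = df['HomeTeam'].map(hv_dict)

# List the teams that still need a value before relying on the column.
missing_teams = df.loc[df['HV'].isna(), 'HomeTeam'].unique().tolist()
print(df)
print(missing_teams)
```

Checking for leftover NaN values right after the map is a cheap way to catch misspelled team names.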

As a side note, if you do want to use iterrows() (not recommended), you can do it like this:

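import numpy as np
import pandas as pd

df = pd.DataFrame({'HomeTeam': ['a', 'b'], 'HV': np.nan})
df

  HomeTeam  HV
0        a NaN
1        b NaN

# DataFrame.append was removed in pandas 2.0, and assigning to `row` only
# changes a copy, so write into a copy of the frame with .at instead.
df_new = df.copy()
for index, row in df.iterrows():
    if row['HomeTeam'] == 'a':
        df_new.at[index, 'HV'] = 65
    elif row['HomeTeam'] == 'b':
        df_new.at[index, 'HV'] = 55

df_new

  HomeTeam    HV
0        a  65.0
1        b  55.0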
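On the "tuple" point from the question, here is a small sketch of what iterrows() actually yields, followed by a loop-free version of the same if/elif logic. The two-row frame is the toy data from this answer; the boolean-mask alternative is a suggestion, not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'HomeTeam': ['a', 'b'], 'HV': np.nan})

for index, row in df.iterrows():
    # Each item is an (index, Series) pair; the Series itself is mutable.
    row['HV'] = 65  # changes only this temporary Series

# The original frame is untouched, so HV is still all NaN.
still_nan = bool(df['HV'].isna().all())

# A vectorised alternative to the if/elif chain: boolean masks with .loc.
df.loc[df['HomeTeam'] == 'a', 'HV'] = 65
df.loc[df['HomeTeam'] == 'b', 'HV'] = 55
```

The tuple is only the wrapper around the index and the row. The row can be edited, but for a frame with mixed column types it is a copy, so the edits never reach the DataFrame.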

CodePudding user response：

If the HV for each team is always the same, you can first build a list of the teams in your dataframe:

teams = df['HomeTeam'].unique()

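Next, with another dataframe that holds the HV values (let's call it df2, assumed here to have one row per team with HomeTeam and HV columns), you can iterate through the list and assign each team's value:

# Assumes df2 has 'HomeTeam' and 'HV' columns and contains every team in df.
for team in teams:
    hv = df2.loc[df2['HomeTeam'] == team, 'HV'].iloc[0]
    df.loc[df['HomeTeam'] == team, 'HV'] = hv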
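If df2 really does hold one row per team, the per-team loop can probably be replaced by a single lookup. This is only a sketch: the layout of df2 (a HomeTeam column and an HV column) is an assumption, since the answer does not show it.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'HomeTeam': ['Burnley', 'Crystal Palace', 'Burnley'],
                   'HV': np.nan})
# Hypothetical layout for df2: one row per team.
df2 = pd.DataFrame({'HomeTeam': ['Burnley', 'Crystal Palace'], 'HV': [50, 65]})

# Turn df2 into a Series indexed by team name (team names must be unique),
# then look up every row of df in one call.
hv_by_team = df2.set_index('HomeTeam')['HV']
df['HV'] = df['HomeTeam'].map(hv_by_team)
```

Teams that do not appear in df2 would come out as NaN rather than raising an error.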

CodePudding user response：

There are two options, depending on how many unique team/value pairs there are.

Few unique pairs: `.map()`

Manually key the pairs into a dictionary, then use .map() with it:

# The name hv_dict avoids shadowing the built-in dict; ... stands for the remaining teams.
hv_dict = {'Burnley': 50, 'Crystal Palace': 65, ...}

df['HV'] = df['HomeTeam'].map(hv_dict)

Many unique pairs

Read the separate file in as a DataFrame and merge it, rather than keying the pairs in manually. Assuming the separate file is in .csv format:

hv_hometeam_df = pd.read_csv('PATH/to/csv')

# 'COLUMN' stands for the team-name column in the CSV file.
merge_df = df.merge(hv_hometeam_df,
                    left_on='HomeTeam',
                    right_on='COLUMN')
# Keep a single team column, named 'HomeTeam'.
merge_df = merge_df\
              .drop(labels=['HomeTeam'], axis=1)\
              .rename(columns={'COLUMN': 'HomeTeam'})
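For a runnable version of the merge idea, here is a sketch that uses an in-memory CSV in place of the real file. The 'Team' column name, the file contents, and the 'Everton' row are all made up for illustration. It uses a left join, which is a deliberate change from the default inner join so that matches without an HV value are kept instead of dropped.

```python
import io
import pandas as pd

df = pd.DataFrame({'HomeTeam': ['Burnley', 'Crystal Palace', 'Everton']})

# Stand-in for the separate file; the 'Team' column name is hypothetical.
csv_text = "Team,HV\nBurnley,50\nCrystal Palace,65\n"
hv_df = pd.read_csv(io.StringIO(csv_text))

merged = (
    df.merge(hv_df, how='left', left_on='HomeTeam', right_on='Team',
             validate='many_to_one')  # raise if a team appears twice in the file
      .drop(columns='Team')           # keep only the original team column
)
print(merged)
```

If df already contains a placeholder HV column full of NaN, dropping it before the merge avoids ending up with suffixed HV_x and HV_y columns.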
