How to select column based on row value in a different column

Time:05-27

I have the below dataframe and I would like to select the column based on the value in the 'AGE' column and perform a math calculation based on the value in that column for that row.

[Image: the dataframe; the columns mentioned in the text include AGE, SALARY, MONTH_5 and MONTH_6]

For example, for row 0 the AGE value is 5, so I need to pick the value from the column 'MONTH_5' (which is 0.288), multiply it by the SALARY for that row (which is 1174.88), and put the result (1174.88 * 0.288) in a new column named 'Final_Value'. Similarly, for row 1 the AGE value is 6, so I need to pick the value from the column 'MONTH_6' for that row (which is 0.025), multiply it by the SALARY for that row (which is 3665.87), and put the result (3665.87 * 0.025) in 'Final_Value'. I need to do this for every row in the dataframe.

I tried to apply the logic using a lambda, but I can't figure out how to select the column based on the value in the AGE column for each row in the dataframe.

def calculate_sal(age, salary):
    return_val = 0
    month_col_name = 'MONTH_' + str(age)
    return_val = my_df[month_col_name]
    return return_val * salary 

my_df['Final_Value'] = my_df.apply(lambda row: calculate_sal(row['AGE'],row['salary']), axis=1)

When I run the above code, I get the following error:

"ValueError: Wrong number of items passed 235, placement implies 1"

How can I achieve this? Please advise.

Thank you!

CodePudding user response:

def calculate_sal(df):
    # Visit each row and build the name of the MONTH_<AGE> column it points to.
    for index, row in df.iterrows():
        month_col_name = "MONTH_" + str(int(row["AGE"]))
        # Write to df itself; assigning to `row` would only change a copy.
        df.loc[index, "Final_Value"] = row["SALARY"] * row[month_col_name]
    return df
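
As a side note on the error in the question: `my_df[month_col_name]` returns the whole column, so each call hands back a full Series instead of one number. A minimal sketch of a repaired `apply` version follows, assuming the columns are named AGE, SALARY and MONTH_<n>. The sample frame is hypothetical: only the AGE, SALARY, 0.288 and 0.025 values come from the question, and the other two MONTH_ values are made-up placeholders.

```python
import pandas as pd

# Hypothetical sample data built from the two rows described in the question.
# The 0.100 and 0.300 entries are placeholders, not values from the original.
my_df = pd.DataFrame({
    "AGE": [5, 6],
    "SALARY": [1174.88, 3665.87],
    "MONTH_5": [0.288, 0.100],
    "MONTH_6": [0.300, 0.025],
})

def calculate_sal(row):
    # Read the factor from this row only; my_df[col] would return the whole column.
    # int() guards against AGE being upcast to float (5.0) inside the row Series.
    month_col_name = "MONTH_" + str(int(row["AGE"]))
    return row["SALARY"] * row[month_col_name]

my_df["Final_Value"] = my_df.apply(calculate_sal, axis=1)
print(my_df[["AGE", "SALARY", "Final_Value"]])
```

Row-wise `apply` is easy to read but runs Python code per row, so it may be slow on large frames.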

CodePudding user response:

IIUC, you can also use NumPy integer indexing with np.arange, in the style of the pandas "Indexing and selecting data" guide:

idx, cols = pd.factorize('MONTH_' + df['AGE'].astype(str))
df['col'] = df.reindex(cols, axis=1).to_numpy()[np.arange(len(df)), idx]
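
To show how the two lines above fit the original question end to end, here is a small self-contained sketch. It assumes the same column names (AGE, SALARY, MONTH_<n>) and uses a hypothetical two-row frame. The 0.100 and 0.300 entries are placeholders, and the final multiplication by SALARY is an added step to produce 'Final_Value'.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only AGE, SALARY, 0.288 and 0.025 come from the question.
df = pd.DataFrame({
    "AGE": [5, 6],
    "SALARY": [1174.88, 3665.87],
    "MONTH_5": [0.288, 0.100],
    "MONTH_6": [0.300, 0.025],
})

# Build each row's target column name, then encode the names as integer codes.
# idx holds one code per row; cols holds the unique column names in code order.
idx, cols = pd.factorize("MONTH_" + df["AGE"].astype(str))

# Keep only the needed columns (in code order), then pick one value per row:
# row i takes the entry in column idx[i].
picked = df.reindex(cols, axis=1).to_numpy()[np.arange(len(df)), idx]

df["Final_Value"] = df["SALARY"] * picked
print(df[["AGE", "SALARY", "Final_Value"]])
```

If an AGE value has no matching MONTH_ column, `reindex` fills that column with NaN, so the result for those rows would be NaN rather than an error.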