I have a dataframe as below.
month fe_month_OCT re_month_APR fe_month_MAY
0 OCT 1 1 2
1 APR 4 2 2
2 MAY 1 4 3
I'm trying to create a new column that takes its value from one of the fe_month_ or re_month_ columns, depending on which month the row corresponds to. For any given month there is only ever one such column (i.e. we will never see both fe_month_APR and re_month_APR in the same df; it will be either fe or re).
Output example: for the first row, I want this new column to take its value from fe_month_OCT, because month=OCT; for the second row, the value should come from re_month_APR, and so on.
Expected output:
month fe_month_OCT re_month_APR fe_month_MAY d_month
0 OCT 1 1 2 1
1 APR 4 2 2 2
2 MAY 1 4 3 3
Code to create input dataframe:
import pandas as pd
data = {'month': ['OCT', 'APR', 'MAY'], 'fe_month_OCT': [1, 4, 1], 're_month_APR': [1, 2, 4], 'fe_month_MAY': [2, 2, 3]}
db = pd.DataFrame(data)
CodePudding user response:
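Assuming every column name is either "fe_month_" or "re_month_" plus the string in db["month"], you can use apply() and look up whichever of the two columns exists for that row.
# Series.get returns its default when the label is missing, so this falls back to the "re_month_" column
get_value = lambda row: row.get("fe_month_" + row["month"], row.get("re_month_" + row["month"]))
db["d_month"] = db.apply(get_value, axis=1)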
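If you would rather not hard-code the prefix order, one option is to build a month-to-column mapping once and then look each row up in it. This is only a sketch using the sample data from the question; the names month_to_col and pick_month_value are made up for illustration, and it assumes every month in db["month"] has exactly one matching fe_month_ or re_month_ column.

```python
import pandas as pd

# Sample data from the question.
data = {
    "month": ["OCT", "APR", "MAY"],
    "fe_month_OCT": [1, 4, 1],
    "re_month_APR": [1, 2, 4],
    "fe_month_MAY": [2, 2, 3],
}
db = pd.DataFrame(data)

# Map each month suffix (e.g. "APR") to its column name, whatever the prefix.
# Assumption: at most one column per month, as stated in the question.
month_to_col = {
    col.rsplit("_", 1)[1]: col
    for col in db.columns
    if col.startswith(("fe_month_", "re_month_"))
}

def pick_month_value(row):
    # Look up the column that belongs to this row's month and return its value.
    return row[month_to_col[row["month"]]]

db["d_month"] = db.apply(pick_month_value, axis=1)
print(db)
```

With the sample data, d_month comes out as 1, 2, 3, matching the expected output in the question. A month with no matching column would raise a KeyError here, which may or may not be what you want.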