I have an input dataframe as below: Input df:
PET City Cost Expense
0 Dog MH 1500.0 NaN
1 Dog BLR 1000.0 NaN
2 Dog DL 2000.0 NaN
3 Cat MH NaN 500.0
4 Cat BLR NaN 900.0
5 Cat DL NaN 2500.0
6 Bird MH NaN NaN
7 Bird BLR NaN NaN
8 Bird DL NaN NaN
9 Others MH 100.0 NaN
10 Others BLR 300.0 NaN
11 Others DL 700.0 NaN
Expected Output:
PET City Cost Expense
0 Dog MH 1500.0 NaN
1 Dog BLR 1000.0 NaN
2 Dog DL 2000.0 NaN
3 Cat MH NaN 500.0
4 Cat BLR NaN 900.0
5 Cat DL NaN 2500.0
6 Bird MH 500.0 NaN
7 Bird BLR 900.0 NaN
8 Bird DL 2500.0 NaN
9 Others MH 100.0 NaN
10 Others BLR 300.0 NaN
11 Others DL 700.0 NaN
Difference in output from input is: For PET having Bird, Cost will be the Expense value of Cat
I have written as below, but getting error.
df_subset = df.loc[(df['PET'] == 'Cat'), ['Expense']]
df['Cost'] = np.select(df["PET"] == 'Bird', [i for i in df_subset['Expense']], df['Cost'])
CodePudding user response:
You can try
df.loc[df['PET'].eq('Bird'), 'Cost'] = df.loc[df['PET'].eq('Bird'), 'City'].map(df.set_index('City').loc[lambda x: x['PET'].eq('Cat'), 'Expense'])
print(df)
PET City Cost Expense
0 Dog MH 1500.0 NaN
1 Dog BLR 1000.0 NaN
2 Dog DL 2000.0 NaN
3 Cat MH NaN 500.0
4 Cat BLR NaN 900.0
5 Cat DL NaN 2500.0
6 Bird MH 500.0 NaN
7 Bird BLR 900.0 NaN
8 Bird DL 2500.0 NaN
9 Others MH 100.0 NaN
10 Others BLR 300.0 NaN
11 Others DL 700.0 NaN