I have a Dataframe with the below column names, and I want to create new columns(n_1, n_2, n_3 n_4, n_5, n_6, n_7, n_8) off the original Dataframe based on a given condition. The condition is to create new columns for each unique EVENT_ID in the Dataframe. check for rows in the SELECTION_TRAP column that has 1 as a value, take the BSP value that corresponds to the row, and insert it into an n_1, if SELECTION_TRAP value is 2, insert into n_2 the BSP that corresponds to that row. same applies to n_3,n_4, n_5, n_6, n_7, n_8.
EVENT_ID BSP SELECTION_TRAP
174331755 34.00 1
174331755 5.18 2
174331755 3.30 7
174331755 70.00 4
174331755 19.10 3
174331755 13.00 6
174331755 2.96 8
174329130 14.00 5
174329130 8.20 7
174329130 3.65 1
174329130 3.15 4
174329130 7.20 2
174329130 13.50 8
174339802 160.00 3
174339802 47.57 7
174339802 88.01 8
174339802 1.58 5
174339802 268.86 2
174339802 19.14 4
174339802 4.66 1
174339802 17.77 6
Expected Output:
EVENT_ID n_1 n_2 n_3 n_4 n_5 n_6 n_7 n_8
174331755 34.00 5.18 19.10 70.00 0.00 13.00 3.30 2.96
174329130 3.65 7.20 0.00 3.15 14.00 0.00 8.20 13.50
174339802 4.66 268.86 160.00 19.14 1.58 17.77 47.57 88.01
I tried the below code but the output is way off from my expected output. I can figure a pandas function I can use to achieve this asap. please I need you help on this. thanks
for j,n in enumerate(new.index):
if df['SELECTION_TRAP'].values[j] == 1:
df['1'] = df['BSP'].values[j]
elif df['SELECTION_TRAP'].values[j] == 2:
df['2'] = df['BSP'].values[j]
elif df['SELECTION_TRAP'].values[j] == 3:
df['3'] = df['BSP'].values[j]
elif df['SELECTION_TRAP'].values[j] == 4:
df['4'] = df['BSP'].values[j]
elif df['SELECTION_TRAP'].values[j] == 5:
df['5'] = df['BSP'].values[j]
elif df['SELECTION_TRAP'].values[j] == 6:
df['6'] = df['BSP'].values[j]
elif df['SELECTION_TRAP'].values[j] == 7:
df['7'] = df['BSP'].values[j]
elif df['SELECTION_TRAP'].values[j] == 8:
df['8'] = df['BSP'].values[j]
else:
df[['1','2','3','4','5','6','7','8']] = df[['1','2','3','4','5','6','7','8']].apply(0,axis = 1)
EVENT_ID BSP S_TRAP 3 7 8 5 2 4 1 6
174339802 160.00 3 160.00 47.57 88.01 1.58 268.86 19.14 4.66 17.77
174339802 47.57 7 160.00 47.57 88.01 1.58 268.86 19.14 4.66 17.77
174339802 88.01 8 160.00 47.57 88.01 1.58 268.86 19.14 4.66 17.77
174339802 1.58 5 160.00 47.57 88.01 1.58 268.86 19.14 4.66 17.77
174339802 268.86 2 160.00 47.57 88.01 1.58 268.86 19.14 4.66 17.77
174339802 19.14 4 160.00 47.57 88.01 1.58 268.86 19.14 4.66 17.77
174339802 4.66 1 160.00 47.57 88.01 1.58 268.86 19.14 4.66 17.77
174339802 17.77 6 160.00 47.57 88.01 1.58 268.86 19.14 4.66 17.77
CodePudding user response:
You can use a pivot table and add prefixes once that's done.
df.pivot_table(index='EVENT_ID',columns='SELECTION_TRAP',values='BSP').add_prefix('n_')
Output
SELECTION_TRAP n_1 n_2 n_3 n_4 n_5 n_6 n_7 n_8
EVENT_ID
174329130 3.65 7.20 NaN 3.15 14.00 NaN 8.20 13.50
174331755 34.00 5.18 19.1 70.00 NaN 13.00 3.30 2.96
174339802 4.66 268.86 160.0 19.14 1.58 17.77 47.57 88.01
CodePudding user response:
Isn't this a pivot_table()
output?
df.pivot_table(index="EVENT_ID",columns="SELECTION_TRAP",values="BSP",fill_value=0).add_prefix('n_').reset_index()
SELECTION_TRAP EVENT_ID n_1 n_2 n_3 ... n_5 n_6 n_7 n_8
0 174329130 3.65 7.20 0.0 ... 14.00 0.00 8.20 13.50
1 174331755 34.00 5.18 19.1 ... 0.00 13.00 3.30 2.96
2 174339802 4.66 268.86 160.0 ... 1.58 17.77 47.57 88.01