Is there code for Append rows to df2 and, when appended, change value of that row?
I want to append (or really cut) row/s which meet a few different criteria from df_Clean to df_VOIDS and then update the 'Drop_Reason' col to a new value; ie indicating the criteria it was moved.
df_Clean=
Encoder Index Void Phase Max Force Max Force Avg THD Max
125545 NaN 0 54 19 6
125583 NaN 12 79 54 16
df_VOIDS=
Encoder_Index Void Phase_Max Force_Max Force_Avg THD_Max Drop_Reason
125544 Void 5 59 39 14 Voided
125657 Void 5 80 44 23 Voided
125901 Void 6 80 43 19 Voided
125963 Void 6 78 54 14 Voided
If Phase Max > 10, then ideally the last row of df_Clean would be removed and added to df_VOIDS with "above phase max" added to 'Drop_Reason'.
I'm doing several criteria checks ie Force Max, Force Avg etc in case that changes the answer.
CodePudding user response:
Firstly, you shouldn't use append
any more as it is deprecated. Instead, use concat
. There are a couple of optimisations you can make to your code, and you can resolve the worry about replacing all NaN
values in the Drop_Reason
column by adding the required value to the rows you transfer:
Phase_Max_Value = 10
xfer = df_Clean.Phase_Max > Phase_Max_Value
df_VOIDS = pd.concat([df_VOIDS, df_Clean[xfer].assign(Drop_Reason='GrtrT Phase_Max')])
df_Clean = df_Clean[~xfer]
Resultant dataframes:
# df_VOIDS
Encoder_Index Void Phase_Max Force_Max Force_Avg THD_Max Drop_Reason
0 125544 Void 5 59 39 14 Voided
1 125657 Void 5 80 44 23 Voided
2 125901 Void 6 80 43 19 Voided
3 125963 Void 6 78 54 14 Voided
1 125583 NaN 12 79 54 16 GrtrT Phase_Max
# df_Clean
Encoder_Index Void Phase_Max Force_Max Force_Avg THD_Max
0 125545 NaN 0 54 19 6
CodePudding user response:
I'm posting my current answer but definitely welcome other suggestions.
Here's working code one of several criteria:
Phase_Max_Value = 10
df_VOIDS = df_VOIDS.append(df_Clean[df_Clean.Phase_Max > Phase_Max_Value])
#if Drop_Reason equals NaN than Drop reason = 'Phase Max'
df_VOIDS.loc[df_VOIDS['Drop_Reason'].isna(),'Drop_Reason']='GrtrT Phase_Max'
df_Clean = df_Clean[df_Clean.Phase_Max <= Phase_Max]
Once the Appended row comes across it fills Drop_Reason with NaN. Then I replace all NaN's in the column with a reason, but it feels a little dangerous blindly overwriting NaN's. This gives:
df_Clean=
Encoder Index Void Phase Max Force Max Force Avg THD Max
125545 NaN 0 54 19 6
df_VOIDS =
Encoder_Index Void Phase_Max Force_Max Force_Avg THD_Max Drop_Reason
125544 Void 5 59 39 14 Voided
125657 Void 5 80 44 23 Voided
125901 Void 6 80 43 19 Voided
125963 Void 6 78 54 14 Voided
125583 NaN 12 79 54 16 GrtrT Phase_Max
Thanks for your time.