I'm starting to lose my mind a bit. I have:
df = pd.DataFrame(bunch_of_stuff)
df2 = df.loc[bunch_of_conditions].copy()
def transform_df2(df2):
    df2['new_col'] = [rand()]*len(df2)
    df2['existing_column_1'] = [list of new values]
    return df2
df2 = transform_df2(df2)
I now want to re-insert df2 into df, such that it overwrites all of its previous records.
What would the best way to do this be? df.loc[df2.index] = df2 ?
This doesn't bring over any of the new columns in df2 though.
CodePudding user response:
You have the right method with pd.concat. However, you can optimize a little by using a boolean mask, which avoids recomputing the index difference:
m = bunch_of_conditions
df2 = df[m].copy()
df = pd.concat([df[~m], df2]).sort_index()
Why do you want to make a copy of your dataframe? Isn't it simpler to use the dataframe itself?
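For reference, here is a minimal runnable sketch of the mask approach. The sample data, the even-number condition, and the values assigned to df2 are all made up for illustration, since the question does not show bunch_of_stuff or bunch_of_conditions:

```python
import pandas as pd

# Hypothetical sample data standing in for bunch_of_stuff
df = pd.DataFrame({"existing_column_1": [1, 2, 3, 4],
                   "other": ["a", "b", "c", "d"]})

# Hypothetical condition standing in for bunch_of_conditions
m = df["existing_column_1"] % 2 == 0

# Work on a copy of the selected rows
df2 = df[m].copy()
df2["new_col"] = 0.5                  # column that df does not have
df2["existing_column_1"] = [20, 40]   # overwrite existing values

# Keep the rows outside the mask, append the transformed rows,
# then restore the original row order
result = pd.concat([df[~m], df2]).sort_index()
print(result)
```

Rows that were not selected by the mask get NaN in new_col, because pd.concat takes the union of the columns by default.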
CodePudding user response:
One way I did it was:
df = pd.concat([df.loc[~df.index.isin(df2.index)], df2])
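As a rough sketch, the same idea can be wrapped in a small helper. The name reinsert and the sample frames are hypothetical, and this assumes the index of df is unique; the sort_index call is an addition of mine to put the rows back in their original order:

```python
import pandas as pd

def reinsert(df, df2):
    """Hypothetical helper: replace the rows of df whose index appears in df2."""
    # Rows of df that df2 does not replace
    untouched = df.loc[~df.index.isin(df2.index)]
    # Columns that only exist in df2 are added; untouched rows get NaN there
    return pd.concat([untouched, df2]).sort_index()

# Made-up example data
df = pd.DataFrame({"a": [1, 2, 3]}, index=[10, 11, 12])
df2 = df.loc[[11]].copy()
df2["a"] = 99
df2["b"] = "new"

out = reinsert(df, df2)
print(out)
```

Unlike the boolean mask version, this only needs df2 itself, so it also works when the original conditions are no longer at hand.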