Overwrite portion of dataframe

Time:07-19

I'm starting to lose my mind a bit. I have:

df = pd.DataFrame(bunch_of_stuff)
df2 = df.loc[bunch_of_conditions].copy()

def transform_df2(df2):
    df2['new_col'] = [rand()]*len(df2)  # a single random value, repeated on every row
    df2['existing_column_1'] = list_of_new_values  # placeholder: one new value per row
    return df2

df2 = transform_df2(df2)

I now want to re-insert df2 into df, such that it overwrites all its previous records.

What would be the best way to do this? df.loc[df2.index] = df2? That doesn't bring over any of the new columns in df2, though.

CodePudding user response:

You have the right method with pd.concat. However, you can optimize it a little by using a boolean mask, which avoids recomputing the index difference:

m = bunch_of_conditions
df2 = df[m].copy()
df = pd.concat([df[~m], df2]).sort_index()
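
For anyone who wants to try this end to end, here is a minimal sketch of the mask approach. The sample data, the condition, and the replacement values are all made up for illustration; they stand in for bunch_of_stuff, bunch_of_conditions, and whatever transform_df2 really does.

```python
import pandas as pd

# Hypothetical sample data standing in for the original df
df = pd.DataFrame({"existing_column_1": [1, 2, 3, 4], "other": list("abcd")})

# Hypothetical condition standing in for bunch_of_conditions
m = df["existing_column_1"] > 2

# Work on a copy of the selected rows
df2 = df[m].copy()
df2["new_col"] = 0.5                 # new column, exists only on the subset
df2["existing_column_1"] = [30, 40]  # overwrite existing values on the subset

# Rows outside the mask come from df, rows inside come from df2;
# sort_index() puts the rows back in their original order
df = pd.concat([df[~m], df2]).sort_index()
print(df)
```

Rows that were not selected get NaN in new_col, since that column only exists in df2.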

Why do you want to make a copy of your dataframe? Isn't it simpler to use the dataframe itself?
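
If working on the dataframe itself is an option, one possible sketch is to assign through .loc with the mask, so that no copy and no concat are needed. The data, condition, and values below are hypothetical, and this assumes the transformation can be written as column assignments.

```python
import pandas as pd

# Hypothetical sample data and condition, for illustration only
df = pd.DataFrame({"existing_column_1": [1, 2, 3, 4], "other": list("abcd")})
m = df["existing_column_1"] > 2

# Assign directly on the masked rows of the original dataframe
df.loc[m, "new_col"] = 0.5                 # creates the column; unselected rows get NaN
df.loc[m, "existing_column_1"] = [30, 40]  # needs exactly one value per selected row
print(df)
```

The mask is computed once, before the assignments, so changing existing_column_1 afterwards does not change which rows are selected.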

CodePudding user response:

One way I did it was:

df = pd.concat([df.loc[~df.index.isin(df2.index)], df2])
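
A small sketch of how this behaves, with made-up data and values. It assumes the index labels are unique, since otherwise isin would also drop rows that merely share a label with df2. Note that concat appends the df2 rows at the end, so sort_index() is needed if the original row order matters.

```python
import pandas as pd

# Hypothetical sample data; the selected rows are deliberately not contiguous
df = pd.DataFrame({"existing_column_1": [1, 2, 3, 4], "other": list("abcd")})
df2 = df.loc[df["other"].isin(["a", "c"])].copy()
df2["new_col"] = 0.5
df2["existing_column_1"] = [10, 30]

# Keep only the rows of df whose index label is not in df2, then append df2
untouched = df.loc[~df.index.isin(df2.index)]
combined = pd.concat([untouched, df2])

# The df2 rows end up at the bottom; sorting by index restores the order
restored = combined.sort_index()
print(restored)
```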