python - drop duplicated index in place in a pandas dataframe


I have a list of dataframes:

all_df = [df1, df2, df3]

I would like to remove rows with duplicated indices in all dataframes in the list, such that the changes are reflected in the original dataframes df1, df2 and df3. I tried to do

for df in all_df:
    df = df[~df.index.duplicated()]

But the assignment only rebinds the loop variable `df`; the original dataframes are left unchanged.

Essentially, I want to avoid doing the following:

df1 = df1[~df1.index.duplicated()]
df2 = df2[~df2.index.duplicated()]
df3 = df3[~df3.index.duplicated()]
all_df = [df1,df2,df3]

CodePudding user response:

You need to recreate the list of DataFrames:

all_df = [df[~df.index.duplicated()] for df in all_df]

Or:

for i, df in enumerate(all_df):
    all_df[i] = df[~df.index.duplicated()]

print (all_df[0])
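For example, with small DataFrames that have repeated index labels (the data below is illustrative), `index.duplicated()` marks every occurrence after the first as `True` (the `keep='first'` default), so the negated mask keeps only the first row for each label:

```python
import pandas as pd

# Two DataFrames with duplicated index labels (illustrative data)
df1 = pd.DataFrame({'a': [1, 2, 3]}, index=['x', 'x', 'y'])
df2 = pd.DataFrame({'b': [4, 5, 6]}, index=['p', 'q', 'q'])
all_df = [df1, df2]

# Rebind each list element to a new, deduplicated DataFrame
for i, df in enumerate(all_df):
    all_df[i] = df[~df.index.duplicated()]

print(all_df[0])
#    a
# x  1
# y  3
```

Note that `all_df[0]` is now a new object; the name `df1` still refers to the original three-row DataFrame.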

EDIT: If names matter, use a dictionary of DataFrames instead. This still does not modify df1 and df2 in place; select the results by dictionary key:

d = {'price': df1, 'volumes': df2}

d = {k: df[~df.index.duplicated()] for k, df in d.items()}

print (d['price'])
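A minimal runnable sketch of the dictionary approach, with sample data assumed for df1 and df2:

```python
import pandas as pd

# Sample DataFrames with duplicated index labels (illustrative data)
df1 = pd.DataFrame({'price': [10, 11, 12]}, index=['a', 'a', 'b'])
df2 = pd.DataFrame({'volume': [100, 200, 300]}, index=['a', 'b', 'b'])

d = {'price': df1, 'volumes': df2}

# Rebuild the dict, keeping only the first row for each index label
d = {k: df[~df.index.duplicated()] for k, df in d.items()}

print(d['price'])
#    price
# a     10
# b     12
```

Each value in `d` is a new DataFrame, so you access the deduplicated data through the dictionary keys rather than through df1 and df2.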