I am trying to concatenate a list df_l of ~200 DataFrames, which all have the same number of columns and the same column names.
When I try to run:
df = pd.concat(df_l, axis=0)
it throws the error:
pandas.errors.InvalidIndexError: Reindexing only valid with uniquely valued Index objects
Following this post, I tried resetting the index of each DataFrame, but I still get the same error.
new_l = [df.reset_index(drop=True) for df in df_l]
pd.concat(new_l, axis=0)
Also, pd.concat arguments like ignore_index=True did not help in any combination. Any advice?
Running on Python 3.8 and pandas 1.4.2.
CodePudding user response:
I think the problem is duplicated column names. Here is a solution that deduplicates them, applied with DataFrame.pipe:
# https://stackoverflow.com/a/44957247/2901002
def df_column_uniquify(df):
    df_columns = df.columns
    new_columns = []
    for item in df_columns:
        counter = 0
        newitem = item
        # Keep increasing the suffix until the name is not already taken.
        while newitem in new_columns:
            counter += 1
            newitem = "{}_{}".format(item, counter)
        new_columns.append(newitem)
    df.columns = new_columns
    return df

new_l = [df.pipe(df_column_uniquify).reset_index(drop=True) for df in df_l]
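
To check whether duplicated column names really are the cause before renaming anything, you could list which frames in df_l have non-unique columns. This is only a minimal sketch: the two small frames below are made up and stand in for the real df_l, and the name bad is just for illustration.

```python
import pandas as pd

# Hypothetical sample data: the second frame repeats the column name "a".
df_ok = pd.DataFrame([[1, 2]], columns=["a", "b"])
df_dup = pd.DataFrame([[3, 4, 5]], columns=["a", "a", "b"])
df_l = [df_ok, df_dup]

# Map the position of each frame with non-unique columns
# to the labels that occur more than once.
bad = {
    i: df.columns[df.columns.duplicated()].tolist()
    for i, df in enumerate(df_l)
    if not df.columns.is_unique
}
print(bad)
```

If bad comes back empty for your real list, the duplicated-columns explanation probably does not apply and the error has another source.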