Having:
combine_by = ['aaa', 'ccc']
dfs_list = [df_aaa_1, df_aaa_2, df_ccc_1, df_ccc_2, df_ggg_1]
How can I combine the dataframes whose df.name contains the same string from the combine_by list, to get:
result = [df_aaa_1+2, df_ccc_1+2, df_ggg_1]
CodePudding user response:
First, the dataframes in dfs_list need names that can be compared with the strings in combine_by, because a variable name is not available as a string at runtime. You can either keep the names by hand alongside dfs_list, or set a name on each dataframe using df.name, which is the more suitable option.
So:
df_aaa_1.name = "df_aaa_1"
df_aaa_2.name = "df_aaa_2"
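To name every dataframe in the list rather than two by hand, one option is a loop over a name-to-dataframe mapping. The sketch below uses small made-up dataframes in place of the real ones (their contents are an assumption). It also checks one thing worth knowing: df.name here is a plain Python attribute, not a pandas feature, so the result of pd.concat does not inherit it.

```python
import pandas as pd

# Hypothetical stand-ins for the question's dataframes
df_aaa_1 = pd.DataFrame({"value": [1]})
df_aaa_2 = pd.DataFrame({"value": [2]})
df_ggg_1 = pd.DataFrame({"value": [3]})

# Map each name to its dataframe, then attach the name in one pass
named = {"df_aaa_1": df_aaa_1, "df_aaa_2": df_aaa_2, "df_ggg_1": df_ggg_1}
for name, df in named.items():
    df.name = name  # plain attribute set on the object

dfs_list = list(named.values())
names = [df.name for df in dfs_list]
print(names)

# The attribute lives on the original objects only; check the concatenated one
combined = pd.concat([df_aaa_1, df_aaa_2])
combined_has_name = hasattr(combined, "name")
print(combined_has_name)
```

If the combined dataframe needs a name, set it again after concatenating.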
Then compare each dataframe's name with the strings in combine_by using a simple for loop with if statements:
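import pandas as pd

def merge_dfs(frames):
    # Collect the dataframes whose name contains each string
    combine_by_aaa = []
    combine_by_ccc = []
    for df in frames:
        if "aaa" in df.name:
            combine_by_aaa.append(df)
        if "ccc" in df.name:
            combine_by_ccc.append(df)
    # pd.concat raises ValueError on an empty list, so fall back to an empty dataframe
    df_aaa = pd.concat(combine_by_aaa) if combine_by_aaa else pd.DataFrame()
    df_ccc = pd.concat(combine_by_ccc) if combine_by_ccc else pd.DataFrame()
    # Return the results; assigning inside the function would not update outer variables
    return df_aaa, df_ccc

df_aaa, df_ccc = merge_dfs(dfs_list)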
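The same idea can be generalized so that it loops over combine_by instead of hard-coding "aaa" and "ccc", and also passes through dataframes that match nothing (like df_ggg_1 in the question). This is only a sketch: combine_by_name and make are hypothetical helpers, the toy dataframes are made up, and it assumes every dataframe already has a string in df.name.

```python
import pandas as pd

def combine_by_name(frames, keys):
    """Concatenate frames whose .name contains the same key; keep the rest as they are."""
    groups = {key: [] for key in keys}
    leftovers = []
    for df in frames:
        # The first key found in the name wins
        key = next((k for k in keys if k in df.name), None)
        if key is None:
            leftovers.append(df)
        else:
            groups[key].append(df)

    result = []
    for key, members in groups.items():
        if members:  # pd.concat raises ValueError on an empty list
            combined = pd.concat(members, ignore_index=True)
            combined.name = key  # set the name again on the new object
            result.append(combined)
    return result + leftovers

def make(name, value):
    # Hypothetical helper: a one-row dataframe carrying a name
    df = pd.DataFrame({"value": [value]})
    df.name = name
    return df

combine_by = ["aaa", "ccc"]
dfs_list = [make("df_aaa_1", 1), make("df_aaa_2", 2),
            make("df_ccc_1", 3), make("df_ccc_2", 4), make("df_ggg_1", 5)]

result = combine_by_name(dfs_list, combine_by)
print([df.name for df in result])
```

Here the combined dataframes are named after the key only; change the line that sets combined.name if you want a different format.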
CodePudding user response:
You could also try something like this, though it looks a bit convoluted.
import pandas as pd

df1 = pd.DataFrame([['a', 1], ['b', 2]],
                   columns=['letter', 'number'])
df1.name = 'aaa1'
df2 = pd.DataFrame([['c', 3], ['d', 4]],
                   columns=['letter', 'number'])
df2.name = 'aaa3'
df3 = pd.DataFrame([['e', 3], ['f', 4]],
                   columns=['letter', 'number'])
df3.name = 'bbb2'
df_list = [df1, df2, df3]

def combine_dfs(df_list, string):
    # Keep only the dataframes whose name contains the given string
    filtered_df = [df for df in df_list if string in df.name]
    df = pd.concat(filtered_df)
    # Name the result: the string plus the last character of each source name.
    # Edit this to return whatever name format you want.
    df.name = string + " ".join([f.name[-1:] for f in filtered_df])
    return df

df5 = combine_dfs(df_list, 'aaa')
print(df5.name)
This gives a combined dataframe df5 whose df5.name is 'aaa1 3'.
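If you would rather have names in the style of the question, such as df_aaa_1+2, the name-building step could look something like the sketch below. combined_name is a hypothetical helper, the "+" separator is an assumption about the wanted format, and it assumes each name ends in an underscore followed by a suffix.

```python
def combined_name(names, sep="+"):
    """Build e.g. 'df_aaa_1+2' from ['df_aaa_1', 'df_aaa_2']."""
    # Split each name at its last underscore: 'df_aaa_1' -> ['df_aaa', '1']
    parts = [name.rsplit("_", 1) for name in names]
    prefix = parts[0][0]                 # shared prefix, taken from the first name
    suffixes = [p[-1] for p in parts]    # trailing piece of every name
    return prefix + "_" + sep.join(suffixes)

pair_name = combined_name(["df_aaa_1", "df_aaa_2"])
single_name = combined_name(["df_ggg_1"])
print(pair_name, single_name)
```

Inside combine_dfs you could then set df.name from the names of the filtered dataframes instead of joining their last characters.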