Concat dataframes whose names contain the same string

Time:05-17

Having:

combine_by = ['aaa', 'ccc']
dfs_list = [df_aaa_1, df_aaa_2, df_ccc_1, df_ccc_2, df_ggg_1]

How can I combine the dataframes whose df.name contains the same string from the combine_by list, and get:

result = [df_aaa_1 2, df_ccc_1 2, df_ggg_1]

CodePudding user response:

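First, you need a string for each dataframe in dfs_list so that you can compare it against the combine_by list; the variable names themselves are not available once the dataframes are in a list. You can either keep a matching list of name strings by hand or attach a name to each dataframe as df.name, which is the more convenient option.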

So:

df_aaa_1.name = "df_aaa_1"
df_aaa_2.name = "df_aaa_2"
# ...and likewise for the remaining dataframes in dfs_list

Then compare each dataframe's name with the strings in combine_by using a simple for loop with if statements:

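import pandas as pd

def merge_dfs(frames):
    combine_by_aaa = []
    combine_by_ccc = []

    # Sort each frame into a bucket based on its name
    for df in frames:
        if "aaa" in df.name:
            combine_by_aaa.append(df)
        if "ccc" in df.name:
            combine_by_ccc.append(df)

    # pd.concat raises ValueError on an empty list, so fall back to an empty frame
    df_aaa = pd.concat(combine_by_aaa) if combine_by_aaa else pd.DataFrame()
    df_ccc = pd.concat(combine_by_ccc) if combine_by_ccc else pd.DataFrame()
    return df_aaa, df_ccc


# Assigning inside the function would only create local variables, so use the return values
df_aaa, df_ccc = merge_dfs(dfs_list)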
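
The same idea can be generalized so that the search strings come from combine_by instead of being hard-coded. The following is only a sketch under some assumptions: the helper names combine_by_substring and make are made up for illustration, every frame is assumed to already carry a .name string, and the combined frames are given names like df_aaa, which is just one possible choice. Frames that match no string (df_ggg_1 in the question) are passed through unchanged.

```python
import pandas as pd


def combine_by_substring(frames, keys):
    """Concatenate frames whose .name contains the same key; keep the rest as-is."""
    groups = {key: [] for key in keys}
    leftovers = []
    for frame in frames:
        # The first key found in the name decides the group
        match = next((key for key in keys if key in frame.name), None)
        if match is None:
            leftovers.append(frame)
        else:
            groups[match].append(frame)

    result = []
    for key, members in groups.items():
        if members:  # skip keys that matched nothing
            combined = pd.concat(members, ignore_index=True)
            combined.name = f"df_{key}"  # hypothetical naming scheme
            result.append(combined)
    return result + leftovers


def make(name, values):
    """Hypothetical helper: build a tiny named frame for the demo."""
    frame = pd.DataFrame({"value": values})
    frame.name = name
    return frame


combine_by = ["aaa", "ccc"]
dfs_list = [
    make("df_aaa_1", [1]),
    make("df_aaa_2", [2]),
    make("df_ccc_1", [3]),
    make("df_ccc_2", [4]),
    make("df_ggg_1", [5]),
]

result = combine_by_substring(dfs_list, combine_by)
print([frame.name for frame in result])  # ['df_aaa', 'df_ccc', 'df_ggg_1']
```

Note that .name here is an ordinary Python attribute, not something pandas tracks, so it has to be set again on the frame returned by pd.concat.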

CodePudding user response:

You could also try something like this, though it looks rather convoluted.

import pandas as pd

df1 = pd.DataFrame([['a', 1], ['b', 2]],
                   columns=['letter', 'number'])
df1.name = 'aaa1'
df2 = pd.DataFrame([['c', 3], ['d', 4]],
                   columns=['letter', 'number'])
df2.name = 'aaa3'
df3 = pd.DataFrame([['e', 3], ['f', 4]],
                   columns=['letter', 'number'])
df3.name = 'bbb2'

df_list = [df1, df2, df3]

def combine_dfs(df_list, string):
    # Keep only the frames whose name contains the search string
    filtered_df = [df for df in df_list if string in df.name]
    df = pd.concat(filtered_df)
    # New name: the search string followed by the last character of each source name
    df.name = string + " ".join([f.name[-1:] for f in filtered_df])  # edit this to return whatever name format you want
    return df

df5 = combine_dfs(df_list, 'aaa')
print(df5.name)

This gives a result df5 whose df5.name is aaa1 3.
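
If you would rather not set a custom attribute on each dataframe, one alternative is to keep the frames in a dict keyed by name. This is a sketch of that swapped-in technique, not part of the answer above; frames_by_name and combine_named are hypothetical names. It relies on pd.concat accepting a mapping, in which case the dict keys become the outer level of the resulting index, so each row stays labeled with the frame it came from.

```python
import pandas as pd

# Hypothetical container: name -> dataframe, instead of df.name attributes
frames_by_name = {
    "aaa1": pd.DataFrame([["a", 1], ["b", 2]], columns=["letter", "number"]),
    "aaa3": pd.DataFrame([["c", 3], ["d", 4]], columns=["letter", "number"]),
    "bbb2": pd.DataFrame([["e", 3], ["f", 4]], columns=["letter", "number"]),
}


def combine_named(frames, string):
    """Concatenate all frames whose dict key contains `string`."""
    matching = {name: frame for name, frame in frames.items() if string in name}
    if not matching:
        raise ValueError(f"no dataframe name contains {string!r}")
    # Passing a mapping makes the keys the outer index level of the result
    return pd.concat(matching)


combined = combine_named(frames_by_name, "aaa")
print(combined)
```

The source names can then be read back from the index with combined.index.get_level_values(0).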
