Iterate over pairs of variables in for loop in Python to reduce code redundancy

Time:10-13

I have 3 separate pandas DataFrames: df1, df2 and df3, all with the same column names. I am trying to merge each unique pair of DataFrames, count the number of rows in each merged DataFrame, and save each count in a dict. How can I do this in a for loop?

final_dict = {}

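import pandas as pd

# Placeholders: each one is a DataFrame with the same three columns
df1 = pd.DataFrame(columns=["column_name1", "column_name2", "column_name3"])
df2 = pd.DataFrame(columns=["column_name1", "column_name2", "column_name3"])
df3 = pd.DataFrame(columns=["column_name1", "column_name2", "column_name3"])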

df1_df2 = df1.merge(df2, on=["column_name1", "column_name2", "column_name3"])
df1_df3 = df1.merge(df3, on=["column_name1", "column_name2", "column_name3"])
df2_df3 = df2.merge(df3, on=["column_name1", "column_name2", "column_name3"])

length1 = len(df1_df2)
length2 = len(df1_df3)
length3 = len(df2_df3)

I'd like final_dict to contain these key-value pairs:

'df1_df2': length1
'df1_df3': length2
'df2_df3': length3

Since I'm doing the same merge and length operations on different pairs of DataFrames, can I do this in a for loop to reduce code redundancy?

CodePudding user response:

from itertools import combinations

dict_ls = {'df1': df1, 'df2': df2, 'df3': df3}
cols = ["column_name1", "column_name2", "column_name3"]

final_dict = {}

# combinations() over a dict yields each unique pair of its keys
for l, r in combinations(dict_ls, 2):
    df_merged = dict_ls[l].merge(dict_ls[r], on=cols)
    final_dict[l + '_' + r] = len(df_merged)
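
To try this approach end to end, here is a minimal self-contained sketch. The sample rows are made up purely for illustration, and the name `frames` is my own; only the column names and the `df1`/`df2`/`df3` keys come from the question. It builds the same dict with a comprehension instead of an explicit loop.

```python
from itertools import combinations

import pandas as pd

cols = ["column_name1", "column_name2", "column_name3"]

# Made-up sample data, for illustration only
df1 = pd.DataFrame({"column_name1": [1, 2, 3],
                    "column_name2": ["a", "b", "c"],
                    "column_name3": [10, 20, 30]})
df2 = pd.DataFrame({"column_name1": [1, 2, 4],
                    "column_name2": ["a", "b", "d"],
                    "column_name3": [10, 20, 40]})
df3 = pd.DataFrame({"column_name1": [1, 5],
                    "column_name2": ["a", "e"],
                    "column_name3": [10, 50]})

# Map each name to its DataFrame so the keys can be built from the names
frames = {"df1": df1, "df2": df2, "df3": df3}

# One entry per unique pair of names; merge() defaults to an inner join
final_dict = {
    f"{left}_{right}": len(frames[left].merge(frames[right], on=cols))
    for left, right in combinations(frames, 2)
}

print(final_dict)  # {'df1_df2': 2, 'df1_df3': 1, 'df2_df3': 1}
```

Since dicts keep insertion order, the pairs come out as df1/df2, df1/df3, df2/df3, matching the keys asked for in the question.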

CodePudding user response:

final_dict = {}

dfs = [df1, df2, df3]
for i in range(len(dfs) - 1):
    # j starts after i, so each unique pair is visited once
    for j in range(i + 1, len(dfs)):
        key = f'df{i + 1}_df{j + 1}'
        merged = dfs[i].merge(dfs[j], on=['column_name1', 'column_name2', 'column_name3'])
        final_dict[key] = len(merged)
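
If you need this for more than three DataFrames, the index-based version can be wrapped in a small function. This is only a sketch: the function name `pairwise_merge_lengths`, the single `k` column and the sample values are my own assumptions, not something from the question. It pairs `enumerate` with `itertools.combinations` in place of the nested `range` loops.

```python
from itertools import combinations

import pandas as pd


def pairwise_merge_lengths(dfs, on):
    """Return {'df<i>_df<j>': rows in the merge} for each unique pair in dfs."""
    result = {}
    # enumerate(start=1) gives the 1-based position used in the key names
    for (i, left), (j, right) in combinations(enumerate(dfs, start=1), 2):
        result[f"df{i}_df{j}"] = len(left.merge(right, on=on))
    return result


# Made-up sample data with one hypothetical key column
dfs = [
    pd.DataFrame({"k": [1, 2, 3]}),
    pd.DataFrame({"k": [2, 3, 4]}),
    pd.DataFrame({"k": [3, 4, 5]}),
]

final_dict = pairwise_merge_lengths(dfs, on=["k"])
print(final_dict)  # {'df1_df2': 2, 'df1_df3': 1, 'df2_df3': 2}
```

Keep in mind that `len` counts the rows of an inner join here, so duplicated key values in either DataFrame will produce more rows than the number of distinct matching keys.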