I have 3 separate pandas DataFrames: df1, df2 and df3, all with the same column names. I am trying to merge each unique pair of DataFrames, count the number of rows in each merged result, and save each count in a dict. How can I do this in a for loop?
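import pandas as pd
final_dict = {}
# Placeholders: in practice each DataFrame holds data with the shared columns
df1 = pd.DataFrame()
df2 = pd.DataFrame()
df3 = pd.DataFrame()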
df1_df2 = df1.merge(df2, on=["column_name1", "column_name2", "column_name3"])
df1_df3 = df1.merge(df3, on=["column_name1", "column_name2", "column_name3"])
df2_df3 = df2.merge(df3, on=["column_name1", "column_name2", "column_name3"])
length1 = len(df1_df2)
length2 = len(df1_df3)
length3 = len(df2_df3)
I'd like final_dict to contain these key-value pairs:
'df1_df2': length1
'df1_df3': length2
'df2_df3': length3
Since I'm doing the same merge and length operations on different pairs of DataFrames, can I do this in a for loop to reduce code redundancy?
CodePudding user response:
from itertools import combinations
dict_ls = {'df1': df1, 'df2': df2, 'df3': df3}
cols = ["column_name1", "column_name2", "column_name3"]
final_dict = {}
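# combinations() over a dict iterates its keys, yielding each unique pair of names once
for left, right in combinations(dict_ls, 2):
    df_merged = dict_ls[left].merge(dict_ls[right], on=cols)
    final_dict[left + '_' + right] = len(df_merged)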
CodePudding user response:
dfs = [df1, df2, df3]
final_dict = {}
for i in range(len(dfs) - 1):
    # start j at i + 1 so each pair is visited exactly once
    for j in range(i + 1, len(dfs)):
        key = f'df{i + 1}_df{j + 1}'
        merged = dfs[i].merge(dfs[j], on=['column_name1', 'column_name2', 'column_name3'])
        final_dict[key] = len(merged)
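For anyone who wants to try this end to end, here is a minimal self-contained sketch of the pairwise-merge idea from the answers above. The sample values and the name `frames` are made up for illustration; only the column names and the `df1`/`df2`/`df3` labels come from the question.

```python
from itertools import combinations

import pandas as pd

cols = ["column_name1", "column_name2", "column_name3"]

# Made-up sample data; each frame has the same three columns
df1 = pd.DataFrame({"column_name1": [1, 2, 3],
                    "column_name2": ["a", "b", "c"],
                    "column_name3": [10, 20, 30]})
df2 = pd.DataFrame({"column_name1": [1, 2, 4],
                    "column_name2": ["a", "b", "d"],
                    "column_name3": [10, 20, 40]})
df3 = pd.DataFrame({"column_name1": [1, 5, 3],
                    "column_name2": ["a", "e", "c"],
                    "column_name3": [10, 50, 30]})

# Hypothetical name-to-DataFrame mapping, used to build the result keys
frames = {"df1": df1, "df2": df2, "df3": df3}

# One entry per unique pair: key "<left>_<right>", value = rows in the merge
final_dict = {
    f"{left}_{right}": len(frames[left].merge(frames[right], on=cols))
    for left, right in combinations(frames, 2)
}

print(final_dict)  # {'df1_df2': 2, 'df1_df3': 2, 'df2_df3': 1}
```

Note that `merge` defaults to an inner join, so each count is the number of rows whose three key columns match in both frames. If the key columns contain duplicate rows, the merged length can be larger than either input.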