I have the following dataframes:
df_a1 = pd.DataFrame([['ab1', 10], ['ab2', 15], ['ab3', 14]], columns=['listing_id', 'number_a1'])
df_a2 = pd.DataFrame([['aabb1', 5], ['aabb2', 6], ['aabb3', 7]], columns=['listing_id', 'number_b1'])
df_b1 = pd.DataFrame([['ab1', 20], ['ab3', 25], ['ab4', 20]], columns=['id', 'number_b1'])
df_b2 = pd.DataFrame([['aabb3', 13], ['aabb2', 12], ['aabb4', 11]], columns=['id', 'number_b2'])
I need to left outer merge df_a1 with df_b1 and df_a2 with df_b2. In reality there are too many dataframes to merge by hand, so I need a loop.
I have tried this:
df_full_aa = pd.DataFrame()
df_full_bb = pd.DataFrame()
df_list = [df_full_aa, df_full_aa]
df_aa = [df_a1, df_a2]
df_bb = [df_b1, df_b2]
for i in range(len(df_list)):
    df_list[i] = pd.merge(df_aa[i], df_bb[i], how="left", left_on='listing_id', right_on='id')
    print([str(i)])
    print(df_list[i].head())
The print inside the loop looks like it works, but if I print df_full_aa and df_full_bb outside the loop, the results are missing.
After a lot of searching I also tried this:
for i in enumerate(df_list):
    df_list[i] = pd.merge(df_aa[i], df_bb[i], how="left", left_on='listing_id', right_on='id')
but it raises the following error:
TypeError: list indices must be integers or slices, not tuple
I have already tried to understand the question "Python loops are missing results", but I can't adapt it to my case.
Thank you for your time!!!
CodePudding user response:
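df_full_aa and df_full_bb are empty because these variables still point to the same empty dataframes that you created at the beginning. Assigning to df_list[i] only replaces the element stored in the list; it does not change what the names df_full_aa and df_full_bb refer to. The new dataframes are stored in your df_list, so you can try this after the loop: df_full_aa = df_list[0] and df_full_bb = df_list[1].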
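To illustrate why the original variables stay empty, here is a minimal sketch. The column name x and its values are made up for the demonstration and are not part of your data.

```python
import pandas as pd

df_full_aa = pd.DataFrame()
df_list = [df_full_aa]

# Assigning to df_list[0] replaces the list element only;
# the name df_full_aa still refers to the original empty DataFrame.
df_list[0] = pd.DataFrame({'x': [1, 2, 3]})  # hypothetical data

print(df_full_aa.empty)   # True
print(df_list[0].empty)   # False
```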
You don't actually need to create the empty df_full_aa and df_full_bb before the loop. Here is a better version:
df_list = []
df_aa = [df_a1, df_a2]
df_bb = [df_b1, df_b2]
for i in range(len(df_aa)):
    df_list.append(pd.merge(df_aa[i], df_bb[i], how="left", left_on='listing_id', right_on='id'))
    print([str(i)])
    print(df_list[i].head())
df_full_aa, df_full_bb = df_list
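As for the TypeError in your second attempt: enumerate yields (index, item) tuples, so in "for i in enumerate(df_list)" the name i is a tuple, and using it as a list index fails. A possible sketch, using your example dataframes, that either unpacks the tuple or avoids indices altogether with zip (the names merged_enum, merged_zip, left and right are just illustrative):

```python
import pandas as pd

df_a1 = pd.DataFrame([['ab1', 10], ['ab2', 15], ['ab3', 14]], columns=['listing_id', 'number_a1'])
df_a2 = pd.DataFrame([['aabb1', 5], ['aabb2', 6], ['aabb3', 7]], columns=['listing_id', 'number_b1'])
df_b1 = pd.DataFrame([['ab1', 20], ['ab3', 25], ['ab4', 20]], columns=['id', 'number_b1'])
df_b2 = pd.DataFrame([['aabb3', 13], ['aabb2', 12], ['aabb4', 11]], columns=['id', 'number_b2'])

df_aa = [df_a1, df_a2]
df_bb = [df_b1, df_b2]

# Option 1: unpack the (index, item) tuple that enumerate yields.
merged_enum = [None] * len(df_aa)
for i, left in enumerate(df_aa):
    merged_enum[i] = pd.merge(left, df_bb[i], how="left",
                              left_on='listing_id', right_on='id')

# Option 2: pair the two lists with zip, no index needed.
merged_zip = [
    pd.merge(left, right, how="left", left_on='listing_id', right_on='id')
    for left, right in zip(df_aa, df_bb)
]

df_full_aa, df_full_bb = merged_zip
print(df_full_aa)
print(df_full_bb)
```

Both options should give the same merged dataframes; the zip version is usually easier to read when the two lists are meant to be walked in parallel.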