Below is the script I am working with. For practice, I created two sets of dataframes: df1, df2, and df3 in one set, and dv1, dv2, and dv3 in the other. I then put each set in a list, test and test2, and combined the two lists into zipped_list. Now I am trying to write a loop that does the following: 1. Set the index and concatenate each pair with keys '2022' and '2021'. 2. Swap the column levels so the matching columns sit next to each other. The loop runs, but it is only applied to the first pair of dataframes. Without calling each dataframe one by one, how can I apply it to every pair of dataframes in zipped_list?
import pandas as pd
#Creating a set of dataframes
data = {'product_name': ['laptop', 'printer', 'tablet', 'desk', 'chair'],'item_name': ['hp', 'logitech', 'samsung', 'lg', 'lenovo'],
'price': [1200, 150, 300, 450, 200]}
df1 = pd.DataFrame(data)
data2 = {'product_name': ['laptop', 'printer', 'tablet', 'desk', 'chair'],'item_name': ['hp', 'mac', 'fujitsu', 'lg', 'asus'],
'price': [2200, 200, 300, 450, 200]}
df2 = pd.DataFrame(data2)
data3 = {'product_name': ['laptop', 'printer', 'tablet', 'desk', 'chair'],'item_name': ['microsoft', 'logitech', 'samsung', 'lg', 'asus'],
'price': [1500, 100, 200, 350, 400]}
df3 = pd.DataFrame(data3)
#Creating another set of dataframes
data = {'product_name': ['laptop', 'printer', 'tablet', 'desk', 'chair'],'item_name': ['hp', 'logitech', 'samsung', 'lg', 'lenovo'],
'price': [10, 20, 30, 40, 50]}
dv1 = pd.DataFrame(data)
data2 = {'product_name': ['laptop', 'printer', 'tablet', 'desk', 'chair'],'item_name': ['hp', 'mac', 'fujitsu', 'lg', 'asus'],
'price': [10, 20, 30, 50, 50]}
dv2 = pd.DataFrame(data2)
data3 = {'product_name': ['laptop', 'printer', 'tablet', 'desk', 'chair'],'item_name': ['microsoft', 'logitech', 'samsung', 'lg', 'asus'],
'price': [1, 2, 3, 4, 5]}
dv3 = pd.DataFrame(data3)
#creating a list for dataframe
test=[df1,df2,df3]
test2=[dv1,dv2,dv3]
#combining two lists
zipped = zip(test, test2)
zipped_list = list(zipped)
#Looping through the zipped_list
for x,y in zipped_list:
    z=pd.concat([zipped_list[0][0].set_index(['product_name','item_name']), zipped_list[0][1].set_index(['product_name','item_name'])],
                axis='columns', keys=['2022', '2021'])
    z=z.swaplevel(axis='columns')[zipped_list[0][0].columns[2:]]
print(z)
This prints only one dataframe, built from df1 and dv1. There should be two more, one for each remaining pair.
CodePudding user response:
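The loop body always reads zipped_list[0], so every iteration rebuilds the result for the first pair, regardless of what x and y are. To apply the loop to all the dataframes in zipped_list, you can use enumerate to get the position of each pair in the list and index with it inside the loop. Here's an example: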
#Looping through the zipped_list
for i, (x, y) in enumerate(zipped_list):
    z = pd.concat([zipped_list[i][0].set_index(['product_name','item_name']), zipped_list[i][1].set_index(['product_name','item_name'])],
                  axis='columns', keys=['2022', '2021'])
    z = z.swaplevel(axis='columns')[zipped_list[i][0].columns[2:]]
    print(z)
This should apply the operations to every pair in zipped_list. The i variable is what selects the correct pair on each iteration, and print(z) has to sit inside the loop body so that each result is shown. Since x and y already hold the two dataframes of the current pair, zipped_list[i][0] and zipped_list[i][1] can also be replaced with x and y directly.
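As a variation on the loop above, here is a minimal sketch that uses the unpacked x and y instead of indexing back into the list, and collects the results so they can be reused later. The helper name combine_pair, its index_cols parameter, and the two-row sample frames are made up for illustration; they are trimmed from the question's data, not the full set.

```python
import pandas as pd

def combine_pair(current, previous, index_cols=("product_name", "item_name")):
    """Put the value columns of two frames side by side under year keys.

    combine_pair is a hypothetical helper, not part of pandas.
    """
    idx = list(index_cols)
    combined = pd.concat(
        [current.set_index(idx), previous.set_index(idx)],
        axis="columns",
        keys=["2022", "2021"],
    )
    # Every column that is not part of the index is a value column.
    value_cols = [c for c in current.columns if c not in idx]
    # After the swap the column name is the outer level, the year the inner one.
    return combined.swaplevel(axis="columns")[value_cols]

# Small sample frames (assumed data, trimmed from the question).
df1 = pd.DataFrame({"product_name": ["laptop", "printer"],
                    "item_name": ["hp", "logitech"],
                    "price": [1200, 150]})
dv1 = pd.DataFrame({"product_name": ["laptop", "printer"],
                    "item_name": ["hp", "logitech"],
                    "price": [10, 20]})
df2 = pd.DataFrame({"product_name": ["laptop", "printer"],
                    "item_name": ["hp", "mac"],
                    "price": [2200, 200]})
dv2 = pd.DataFrame({"product_name": ["laptop", "printer"],
                    "item_name": ["hp", "mac"],
                    "price": [10, 20]})

zipped_list = list(zip([df1, df2], [dv1, dv2]))

# x and y are the two frames of each pair, so no positional index is needed.
results = [combine_pair(x, y) for x, y in zipped_list]

for z in results:
    print(z)
```

Keeping the results in a list means each combined dataframe stays available after the loop, for example as results[0], rather than being overwritten by the next iteration.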