Function not working when looping through a list of dataframes

Time:12-12

I have a list of dataframes, all in the same format, and I want to loop through them and perform the same operations on each one. I wrote a function and a loop, as shown in the code below, but it seems that the only change that gets applied is the renaming of the columns. Am I missing something here?

def changes(df):
    df = df[["A","B","C"]]
    df = df/1000000
    df["A"] = df["A"]*1000000
    df.rename(columns={'A': 'A1', 'B': 'B1','C': 'C1'}, inplace=True)
    df["A"] = df["A"].astype(int)
    df = df.transpose()
    return df

dfs = [df1,df2,df3]

for i in dfs:
    i = changes(i)

CodePudding user response:

Use enumerate in your loop so that you can assign each result back into the list by index:

# Setup
import pandas as pd

df1 = pd.DataFrame({'A': [10, 20, 30], 'B': [11, 21, 31], 'C': [12, 22, 32]})
df2 = pd.DataFrame({'A': [110, 120, 130], 'B': [111, 121, 131], 'C': [112, 122, 132]})
df3 = pd.DataFrame({'A': [210, 220, 230], 'B': [211, 221, 231], 'C': [212, 222, 232]})
dfs = [df1, df2, df3]

def changes(df):
    df = df[["A","B","C"]]
    df = df/1000000
    df["A"] = df["A"]*1000000
    df = df.rename(columns={'A': 'A1', 'B': 'B1','C': 'C1'})  # <- Don't use inplace
    df["A1"] = df["A1"].astype(int)  # <- A does not exist anymore
    df = df.transpose()
    return df
    
for i, df in enumerate(dfs):
    dfs[i] = changes(df)

Output:

>>> dfs
[            0          1          2
 A1  10.000000  20.000000  30.000000
 B1   0.000011   0.000021   0.000031
 C1   0.000012   0.000022   0.000032,
              0           1           2
 A1  110.000000  120.000000  130.000000
 B1    0.000111    0.000121    0.000131
 C1    0.000112    0.000122    0.000132,
              0           1           2
 A1  210.000000  220.000000  230.000000
 B1    0.000211    0.000221    0.000231
 C1    0.000212    0.000222    0.000232]
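To see why the index assignment matters, here is a minimal sketch that contrasts the two loops. The `double` function is a hypothetical stand-in for `changes` (it only needs to return a new dataframe), and the small frames are made up for illustration.

```python
import pandas as pd

def double(df):
    # Hypothetical stand-in for changes(): returns a NEW dataframe
    return df * 2

dfs = [pd.DataFrame({"A": [1, 2]}), pd.DataFrame({"A": [3, 4]})]

# Rebinding the loop variable: the list still holds the original objects
for i in dfs:
    i = double(i)
after_rebinding = [df["A"].tolist() for df in dfs]

# Assigning by index: each list slot is replaced by the returned dataframe
for idx, df in enumerate(dfs):
    dfs[idx] = double(df)
after_index_assignment = [df["A"].tolist() for df in dfs]

print(after_rebinding)         # [[1, 2], [3, 4]]
print(after_index_assignment)  # [[2, 4], [6, 8]]
```

The first loop only points the name `i` at a new object on each pass; the second one writes the result into the list itself.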

CodePudding user response:

The problem is that you assign the modified dataframe to i, which is only the loop variable. Rebinding it does not change the list, so the result is never stored anywhere. You can solve this by building a new list of dataframes with a list comprehension instead of a for loop. For example:

dfs = [df1,df2,df3]
new_dfs = [changes(i) for i in dfs]

Edit:

If you want the original variable names to refer to the transformed dataframes, you can simply reassign them by unpacking the result:

df1,df2,df3 = [changes(i) for i in dfs]
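One thing to keep in mind with the unpacking approach: it rebinds the names, but a list created earlier still references the old objects. The sketch below illustrates this, assuming a simplified, hypothetical `changes` that only transposes, and two made-up frames.

```python
import pandas as pd

def changes(df):
    # Hypothetical simplified stand-in: returns a new, transposed dataframe
    return df.transpose()

df1 = pd.DataFrame({"A": [1, 2], "B": [3, 4]})
df2 = pd.DataFrame({"A": [5, 6], "B": [7, 8]})
dfs = [df1, df2]

df1, df2 = [changes(d) for d in dfs]

# The names now refer to the transposed frames ...
names_updated = list(df1.index) == ["A", "B"]
# ... but the list still references the original, untransposed objects
list_unchanged = list(dfs[0].columns) == ["A", "B"] and dfs[0] is not df1

# Rebuild the list if it is used again later
dfs = [df1, df2]
```

If the rest of your code works with `dfs` rather than the individual names, assigning the comprehension back to `dfs` is probably the simpler option.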

CodePudding user response:

You could also use an object-oriented approach (OOP):

class Changes:
    def __init__(self, df):
        self.df = df

    def transform(self):
        # Work on a new dataframe so the original is left untouched
        df = self.df[["A", "B", "C"]]
        df = df / 1000000
        df["A"] = df["A"] * 1000000
        df = df.rename(columns={'A': 'A1', 'B': 'B1', 'C': 'C1'})
        df["A1"] = df["A1"].astype(int)  # column A is called A1 after the rename
        df = df.transpose()
        return df

obj = Changes(df1)
df1 = obj.transform()

Now you can iterate through your list of dataframes and apply the same transformation to each one.
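As a rough sketch of that iteration, the class can be applied to every dataframe in a list comprehension. The `transform` method here is deliberately shortened (divide, rename, transpose) and the two input frames are made up for illustration, so treat it as an outline rather than the full function from the question.

```python
import pandas as pd

class Changes:
    def __init__(self, df):
        self.df = df

    def transform(self):
        # Shortened version of the transformation: scale, rename, transpose
        df = self.df[["A", "B", "C"]] / 1000000
        df = df.rename(columns={"A": "A1", "B": "B1", "C": "C1"})
        return df.transpose()

# Illustrative input data (not from the original post)
dfs = [
    pd.DataFrame({"A": [10, 20], "B": [11, 21], "C": [12, 22]}),
    pd.DataFrame({"A": [110, 120], "B": [111, 121], "C": [112, 122]}),
]

# Replace the list with the transformed dataframes
dfs = [Changes(df).transform() for df in dfs]
```

As with the plain function, the results have to be stored somewhere (here, back in `dfs`); wrapping the logic in a class does not change that.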
