loop through pandas columns inside function

Time:09-30

I have the following function:

def match_function(column):
    df_1 = df[column].str.split(',', expand=True)
    df_11=df_1.apply(lambda s: s.value_counts(), axis=1).fillna(0)
    match = df_11.iloc[:, 0][0]/df_11.sum(axis=1)*100
    df[column] = match
    return match

This function only works if I pass it a specific column name.

How can I change it so that, if I pass it a dataframe, it loops through all of the columns automatically and I don't have to enter each column separately?

P.S. I know the function itself is written poorly, but I'm fairly new to coding, sorry.

CodePudding user response:

You need to wrap the function so that it runs over every column in turn.

Adding the wrapper below to your code iterates over the columns and returns the match results as a list, one entry per column, in column order. Note that match_function still reads the global df, so the dataframe you pass in should be that same df.

def match_over_dataframe_columns(dataframe):
    return [match_function(column) for column in dataframe.columns]

results = match_over_dataframe_columns(df)

CodePudding user response:

Instead of passing a column name to your function, pass the entire dataframe. Then convert the dataframe's columns to a list and loop over them, performing your analysis on each column. For example:

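def match_function(df):

    columns = df.columns.tolist()
    matches = {}

    for column in columns:
        # per-column analysis, copied from the question
        df_1 = df[column].str.split(',', expand=True)
        df_11 = df_1.apply(lambda s: s.value_counts(), axis=1).fillna(0)
        # instead of returning match, store it under the column name
        matches[column] = df_11.iloc[:, 0][0] / df_11.sum(axis=1) * 100

    return matches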

This returns a dictionary whose keys are your column names and whose values are the corresponding match results.
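
As a rough sketch of how such a dictionary might be used afterwards: the `matches` dictionary below is a hypothetical stand-in with made-up numbers, not output from the question's data. It only illustrates looking up one column's result by name and, assuming all the Series share the same index, combining them back into a single DataFrame.

```python
import pandas as pd

# Hypothetical stand-in for the dictionary returned by match_function(df):
# one Series of percentages per column name (values are invented).
matches = {
    "a": pd.Series([50.0, 100.0]),
    "b": pd.Series([25.0, 75.0]),
}

# A single column's result is looked up by its name.
print(matches["a"])

# Series that share an index can be combined into one DataFrame,
# with the dictionary keys becoming the column names.
match_df = pd.DataFrame(matches)
print(match_df)
```

Keeping the results keyed by column name avoids having to remember which position in a list belongs to which column.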

CodePudding user response:

Just loop through the columns inside the function:

def match_function(df):
    l_match = []
    for column in df.columns:
        df_1 = df[column].str.split(',', expand=True)
        df_11 = df_1.apply(lambda s: s.value_counts(), axis=1).fillna(0)
        match = df_11.iloc[:, 0][0] / df_11.sum(axis=1) * 100
        # note: this overwrites the original column in df
        df[column] = match
        l_match.append(match)
    return l_match
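
The loop above overwrites each column of df as it goes. If that is not wanted, one possible alternative is to let `DataFrame.apply` do the looping, since by default it calls a function once per column. The sketch below keeps the calculation from the question unchanged; the name `match_series` and the sample data are made up for illustration, and `.iloc[0]` is used for the first row on the assumption that the dataframe has a default integer index.

```python
import pandas as pd


def match_series(col):
    # Same per-column calculation as in the question, for one Series.
    parts = col.str.split(',', expand=True)
    counts = parts.apply(lambda s: s.value_counts(), axis=1).fillna(0)
    # First count column, first row, divided by each row's total.
    return counts.iloc[:, 0].iloc[0] / counts.sum(axis=1) * 100


# Hypothetical sample data: comma-separated strings in every column.
df = pd.DataFrame({"a": ["x,x,y", "x,y,y"], "b": ["p,p,q", "p,q,q"]})

# apply() passes each column to match_series and collects the results
# in a new DataFrame; df itself is left untouched.
match_df = df.apply(match_series)
print(match_df)
```

The result has the same column names and index as df, so the original strings and the computed percentages can be compared side by side.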