I have the following function:
def match_function(df):
    columns = df.columns.tolist()
    matches = {}
    for column in columns:
        df = df.astype(str)
        df_new = df.dropna()
        df_new = df[column].str.split(',', expand=True)
        df_new = df_new.apply(lambda s: s.value_counts(), axis=1).fillna(0)
        match = df_new.iloc[:, 0][0] / df_new.sum(axis=1) * 100
        match = round(match, 2)
        df[column] = match
        matches[column] = match
    return matches
I want this function to run completely separately for each row of the dataframe: it should process the first row, stop, then run again for the second row, and so on.
Because I'm new to Python, the function is written in a complex and unprofessional way, and the result is wrong when I pass in a whole dataframe and it processes all rows at once. Alternatively, maybe the function itself could be changed so that it works only row by row?
CodePudding user response:
Consider the following df:
           a          b
0   1.000000   0.000000
1  -2.000000   1.000000
2   1.000000   0.000000
3   3.000000  -4.000000
And the following function, named func:
def func(x):
    return x['a'] + x['b']
You can apply that function row by row with:
df.apply(func, axis=1)
That yields:
0 1.000000
1 -1.000000
2 1.000000
3 -1.000000
So basically, we applied the function func() to every row, and for each row it returns x['a'] + x['b'].
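For reference, here is a minimal, self-contained sketch of the example above. The DataFrame is rebuilt by hand from the values shown, and the explicit iterrows() loop is only an illustrative alternative (it is not part of the original answer) for stepping through rows one at a time.

```python
import pandas as pd

# Same sample data as in the answer above, rebuilt by hand.
df = pd.DataFrame({"a": [1.0, -2.0, 1.0, 3.0], "b": [0.0, 1.0, 0.0, -4.0]})

def func(row):
    # `row` is a Series holding a single row; columns are accessed by name.
    return row["a"] + row["b"]

# axis=1 calls func once per row and collects the results in a Series.
result = df.apply(func, axis=1)

# Illustrative alternative: an explicit loop over the rows,
# which can be easier to debug one row at a time.
looped = [func(row) for _, row in df.iterrows()]

print(result)
```

The same pattern should carry over to match_function, provided it is rewritten to take a single row (a Series) instead of the whole dataframe.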