How to apply a function to each row of a DataFrame and get the results back

This is my DataFrame:

            3        4        5        6       97       98       99      100
0         1.0      2.0      3.0      4.0     95.0     96.0     97.0     98.0
1     50699.0  16302.0  50700.0  16294.0  50735.0  16334.0  50737.0  16335.0
2     57530.0  33436.0  57531.0  33438.0      NaN      NaN      NaN      NaN
3     24014.0  24015.0  34630.0  24016.0      NaN      NaN      NaN      NaN
4     44933.0   2611.0  44936.0   2612.0  44982.0   2631.0  44972.0   2633.0
1792  46712.0  35340.0  46713.0  35341.0  46759.0  35387.0  46760.0  35388.0
1793  61283.0  40276.0  61284.0  40277.0  61330.0  40323.0  61331.0  40324.0
1794      0.0      0.0      0.0      0.0      0.0      0.0      0.0      0.0
1795      0.0      0.0      0.0      0.0      0.0      0.0      0.0      0.0
1796  27156.0  48331.0  27157.0  48332.0      NaN      NaN      NaN      NaN

How do I apply the function below to every row and get the results back in one run? values is the array of values in each row, and N is 100.

import math
import scipy.stats

# Assumes the five counters are already defined at module level.
def entropy_s(values, N):
    global CONSTANT_COUNT, RANDOM_COUNT, LOCAL_COUNT, GLOBAL_COUNT, ODD_COUNT
    a = scipy.stats.entropy(values, base=2)
    a = round(a, 2)
    if math.isnan(a):
        a = 0.0
    if a == 0.0:
        CONSTANT_COUNT += 1
    elif a < round(math.log2(N), 2):
        LOCAL_COUNT += 1
        RANDOM_COUNT += 1
    elif a == round(math.log2(N), 2):
        RANDOM_COUNT += 1
        GLOBAL_COUNT += 1
        LOCAL_COUNT += 1
    else:
        ODD_COUNT += 1

CodePudding user response：

import functools
series = df.apply(functools.partial(entropy_s, N=100), axis=1)
# or 
series = df.apply(lambda x: entropy_s(x, N=100), axis=1)

With axis=1, apply passes each row of your df (as a pd.Series) as the first argument to your function.

You will get a pd.Series of None values, though, because your function doesn't return anything.

I strongly suggest avoiding globals in your function.
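One way to act on both points is to have the function return a value per row and do the counting afterwards. The sketch below is only an illustration under assumptions: the name entropy_label, the label strings, and the toy 4-column data are hypothetical, and each label stands in for one branch of the original function (for example, "below_max" is the branch that touched LOCAL_COUNT and RANDOM_COUNT).

```python
import math

import pandas as pd
from scipy.stats import entropy


def entropy_label(values, n):
    """Return a label for one row instead of updating global counters."""
    h = round(float(entropy(values.to_numpy(dtype=float), base=2)), 2)
    if math.isnan(h):  # e.g. rows that contain NaN
        h = 0.0
    h_max = round(math.log2(n), 2)
    if h == 0.0:
        return "constant"
    if h < h_max:
        return "below_max"
    if h == h_max:
        return "at_max"
    return "odd"


# Hypothetical toy data: 4 columns, so the maximum entropy is log2(4) = 2.
df = pd.DataFrame(
    [
        [1.0, 1.0, 1.0, 1.0],                    # uniform row
        [5.0, 0.0, 0.0, 0.0],                    # all mass in one column
        [1.0, 1.0, 0.0, 0.0],                    # two equal non-zero values
        [3.0, 4.0, float("nan"), float("nan")],  # row with missing values
    ]
)

labels = df.apply(entropy_label, axis=1, args=(4,))  # one label per row
counts = labels.value_counts()                       # tallies replace the globals
```

Here labels is a pd.Series aligned with the index of df, so the result for each row stays attached to that row, and counts gives the totals that the global counters were meant to track.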

CodePudding user response：

I assume that values is supposed to be a row? In that case I suggest the following: each row is passed to the function, and you can read a column's value from the row with row[column_name] (or row.column_name when the label is a valid Python identifier).

def func(N=100):
    # Returns a one-argument function with N fixed, suitable for df.apply.
    def entropy_s(values):
        global CONSTANT_COUNT, RANDOM_COUNT, LOCAL_COUNT, GLOBAL_COUNT, ODD_COUNT
        a = scipy.stats.entropy(values, base=2)
        a = round(a, 2)
        if math.isnan(a):
            a = 0.0
        if a == 0.0:
            CONSTANT_COUNT += 1
        elif a < round(math.log2(N), 2):
            LOCAL_COUNT += 1
            RANDOM_COUNT += 1
        elif a == round(math.log2(N), 2):
            RANDOM_COUNT += 1
            GLOBAL_COUNT += 1
            LOCAL_COUNT += 1
        else:
            ODD_COUNT += 1
    return entropy_s


df.apply(func(100), axis=1)

If you want each row as a list, you can do this:

df.apply(lambda x: func(100)(list(x)), axis=1)
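If a plain array is enough, another option is the raw=True flag of apply, which hands each row to the function as a NumPy array instead of a pd.Series. A minimal sketch, where the toy data and the row_sum function are hypothetical and only serve to show what the function receives:

```python
import pandas as pd

# Hypothetical toy data.
df = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0]})

seen_types = []  # records what kind of object the function was given


def row_sum(row):
    seen_types.append(type(row).__name__)
    return float(row.sum())


as_series = df.apply(row_sum, axis=1)            # each row arrives as a pd.Series
as_array = df.apply(row_sum, axis=1, raw=True)   # each row arrives as a np.ndarray
```

Both calls return one value per row. The raw=True variant skips building a Series for every row, which can matter on large frames, but the function then loses access to the column labels.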