Apply own function to every item in DataFrame

Time:02-19

I have written a function that returns a rank based on the value in each cell of the table below. The table is named "ranked":

Date        MMM     AOS     ABT
2016-01-31  55.0    411.0   102.0
2016-02-29  44.0    425.0   96.0
2016-03-31  29.0    410.0   70.0
2016-04-30  29.0    425.0   87.0
2016-05-31  46.0    409.0   52.0

Function:

def get_rank(x):
    if 1 <= x < 96:
        return 1
    elif 96 <= x < 193:
        return 2
    elif 193 <= x < 289:
        return 3
    elif 289 <= x <= 385:
        return 4
    elif x > 385:
        return 5

I have tried to apply the function using lambda:

ranked.apply(lambda x: get_rank(x))

However, it gives me this error message:

The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

The end goal is to have a 1 for all values in the table that are below 96, a 2 for all values from 96 up to 192, a 3 for all values from 193 up to 288, and so on up to 5.

Could you please give me a hint how I can easily apply this function to the table? Appreciate your help!

CodePudding user response:

Use applymap instead:

>>> ranked[['MMM','AOS','ABT']].applymap(get_rank)

This returns a new DataFrame containing only the columns MMM, AOS and ABT, with your get_rank() function applied to each individual value.
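
For context, here is a minimal sketch of why apply fails and the element-wise call works. DataFrame.apply passes each whole column to the function as a Series, so the chained comparison `1 <= x < 96` tries to turn a Series into a single True/False value. The table is rebuilt by hand from the question, and keeping Date as an ordinary string column is an assumption. The sketch also assumes that newer pandas (2.1 and later) deprecates applymap in favor of DataFrame.map, so it picks whichever method is available.

```python
import pandas as pd


def get_rank(x):
    # Same function as in the question.
    if 1 <= x < 96:
        return 1
    elif 96 <= x < 193:
        return 2
    elif 193 <= x < 289:
        return 3
    elif 289 <= x <= 385:
        return 4
    elif x > 385:
        return 5


# Table from the question, rebuilt by hand (Date as a plain string column is an assumption).
ranked = pd.DataFrame({
    "Date": ["2016-01-31", "2016-02-29", "2016-03-31", "2016-04-30", "2016-05-31"],
    "MMM": [55.0, 44.0, 29.0, 29.0, 46.0],
    "AOS": [411.0, 425.0, 410.0, 425.0, 409.0],
    "ABT": [102.0, 96.0, 70.0, 87.0, 52.0],
})

values = ranked[["MMM", "AOS", "ABT"]]

# apply hands get_rank a whole column (a Series), so the comparison is ambiguous.
try:
    values.apply(get_rank)
except ValueError as exc:
    print("apply failed:", exc)

# Element-wise call: DataFrame.map on pandas >= 2.1, applymap on older versions.
elementwise = values.map if hasattr(values, "map") else values.applymap
result = elementwise(get_rank)
print(result)
```

The original `ranked` table is left unchanged here; `result` is a separate DataFrame holding only the ranks.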

CodePudding user response:

I would just add ranked columns to the data frame:

ranked['Ranked MMM'] = [get_rank(i) for i in ranked['MMM']]

Also add a default return at the end of the function, such as return 0, so that a value matching none of the conditions does not silently return None.
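
A minimal sketch of the function with such a fallback is shown below. The choice of 0 as the fallback rank is only the value suggested above, not something the question specifies. With the original conditions, the fallback is reached by values below 1 and by NaN, since every comparison against NaN is False.

```python
def get_rank(x):
    if 1 <= x < 96:
        return 1
    elif 96 <= x < 193:
        return 2
    elif 193 <= x < 289:
        return 3
    elif 289 <= x <= 385:
        return 4
    elif x > 385:
        return 5
    # Fallback for anything not matched above (values below 1, NaN).
    return 0


print(get_rank(55.0))          # matches the first branch
print(get_rank(0))             # below 1, falls through to the fallback
print(get_rank(float("nan")))  # NaN fails every comparison, falls through
```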

CodePudding user response:

You can use .applymap to do this. You don't have to wrap your function in a lambda, since it already takes a single value as its argument. Because applying your function to the Date column would raise an error, you have to specify which columns you want to map over and then assign the result back in place of the original values.

apply_on = ["MMM", "AOS", "ABT"]
ranked[apply_on] = ranked[apply_on].applymap(get_rank)
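
As a self-contained sketch of the same idea, the snippet below rebuilds the first three rows of the table by hand and writes the ranks back while leaving Date untouched. Storing Date as a string column is an assumption, as is the version check: on pandas 2.1 and later applymap is deprecated in favor of DataFrame.map, so the sketch uses whichever method exists.

```python
import pandas as pd


def get_rank(x):
    # Same function as in the question.
    if 1 <= x < 96:
        return 1
    elif 96 <= x < 193:
        return 2
    elif 193 <= x < 289:
        return 3
    elif 289 <= x <= 385:
        return 4
    elif x > 385:
        return 5


# First three rows of the question's table (Date as strings is an assumption).
ranked = pd.DataFrame({
    "Date": ["2016-01-31", "2016-02-29", "2016-03-31"],
    "MMM": [55.0, 44.0, 29.0],
    "AOS": [411.0, 425.0, 410.0],
    "ABT": [102.0, 96.0, 70.0],
})

apply_on = ["MMM", "AOS", "ABT"]
subset = ranked[apply_on]

# DataFrame.map on pandas >= 2.1, applymap on older versions.
elementwise = subset.map if hasattr(subset, "map") else subset.applymap

# Overwrite only the numeric columns; Date is never passed to get_rank.
ranked[apply_on] = elementwise(get_rank)
print(ranked)
```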
