Multiply all and only numeric values of dataframe using lambda function

Time:04-17

The dataframe stu_alcol looks like the following:

   school  sex  age  address  famsize  Pstatus  Medu  Fedu  Mjob     Fjob      reason  guardian
0  GP      F    18   U        GT3      A        4     4     at_home  teacher   course  mother
1  GP      F    17   U        GT3      T        1     1     at_home  other     course  father
2  GP      F    15   U        LE3      T        1     1     at_home  other     other   mother
3  GP      F    15   U        GT3      T        4     2     health   services  home    mother
4  GP      F    16   U        GT3      T        3     3     other    other     home    father

The goal is to multiply all integer values by 10 (just playing with the data).

However, this code throws an 'invalid syntax' error:

stu_alcol.transform(lambda x: x*10 if isinstance(x, int))

Can anyone help? Please understand that I am aware of other possible solutions. I just want to understand what could be wrong here.

CodePudding user response:

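You can convert the entire df to numeric and let errors='coerce' turn the non-numeric values into NaN. Multiply the result by 10 and update the original df with it. Because update() skips NaN values, the non-numeric cells are left unchanged.

This should allow you to handle mixed-type columns properly as well.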

df.update(df.apply(pd.to_numeric, errors='coerce').mul(10))
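To make the one-liner above easier to follow, here is a minimal sketch that splits it into steps. The small frame is a hypothetical miniature of stu_alcol (only three columns are reproduced), and the name scaled is introduced purely for illustration.

```python
import pandas as pd

# Hypothetical miniature of stu_alcol; only a few columns are reproduced.
df = pd.DataFrame({
    "school": ["GP", "GP", "GP"],
    "age": [18, 17, 15],
    "Medu": [4, 1, 1],
})

# Non-numeric cells become NaN; numeric cells are multiplied by 10.
scaled = df.apply(pd.to_numeric, errors="coerce").mul(10)

# update() skips the NaN values in `scaled`, so the text column is left alone.
df.update(scaled)
print(df)
```

Note that update() modifies df in place and returns None, so its result should not be assigned back to df.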

CodePudding user response:

You can select the columns by name and multiply them by a value.

stu_alcol[['age', 'Medu', 'Fedu']] *= 10
# stu_alcol[['age', 'Medu', 'Fedu']] = stu_alcol[['age', 'Medu', 'Fedu']]*10
# stu_alcol[['age', 'Medu', 'Fedu']] = stu_alcol[['age', 'Medu', 'Fedu']].multiply(10)

All three lines give the same result; they only differ in notation.
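If you would rather not type the column names, one possible variation (not part of the answer above) is to let pandas pick the numeric columns with select_dtypes. This is only a sketch: the frame stu is a hypothetical stand-in for stu_alcol, and numeric_cols is a name chosen for illustration.

```python
import pandas as pd

# Hypothetical stand-in for stu_alcol with one text and two integer columns.
stu = pd.DataFrame({
    "school": ["GP", "GP"],
    "age": [18, 17],
    "Medu": [4, 1],
})

# Labels of all columns with a numeric dtype.
numeric_cols = stu.select_dtypes(include="number").columns

# Multiply only those columns; the text column is not touched.
stu[numeric_cols] = stu[numeric_cols] * 10
print(stu)
```

Keep in mind that include="number" also matches float columns, so this is slightly broader than "integers only".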

Comment

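You can also use apply(), which passes each column to the lambda, and check every value of the column in a list comprehension, like below:

stu_alcol = stu_alcol.apply(lambda x: [xx*10 if isinstance(xx,int) else xx for xx in x])

but this is not easy to read, and looping over every value in Python can be slow on large dataframes.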

CodePudding user response:

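The reason this isn't working is that the body of a lambda function must be a single expression, and a conditional expression always needs both branches: value_if_true if condition else value_if_false. Your if has no else, so the expression is incomplete, hence the 'invalid syntax' error.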
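As a small illustration in plain Python, without pandas, the sketch below shows a lambda with a complete if/else expression and confirms that the version without else does not compile. The names times_ten_if_int, results and missing_else_is_valid are made up for this example.

```python
# A conditional expression needs both branches: <a> if <cond> else <b>.
times_ten_if_int = lambda x: x * 10 if isinstance(x, int) else x

# The int is multiplied; the string is returned unchanged.
results = [times_ten_if_int(4), times_ten_if_int("GT3")]

# Without the else branch, the source does not even compile.
try:
    compile("lambda x: x * 10 if isinstance(x, int)", "<example>", "eval")
    missing_else_is_valid = True
except SyntaxError:
    missing_else_is_valid = False

print(results, missing_else_is_valid)
```

Be aware that adding else x only fixes the syntax. With stu_alcol.transform(...), x is a whole column (a pandas Series), so isinstance(x, int) is False and nothing would be multiplied.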

To correct the error, the lambda body has to be one complete expression, which you can get by adding an else branch or by moving the logic into its own function (also note that the type you probably want to check for is numpy.int64, not int).

As an example, the following will work (although the mult_ints_by_10 function is just some example code to make the point, and certainly isn't optimised!):

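import numpy

def mult_ints_by_10(data_series):
    # Work on a copy so the original column is not modified.
    return_series = data_series.copy()
    for loop in range(len(data_series)):
        # .iloc indexes by position, so this works with any index labels.
        element = data_series.iloc[loop]
        return_series.iloc[loop] = element * 10 if isinstance(element, numpy.int64) else element
    return return_series

# transform() calls the lambda once per column and returns a new dataframe.
stu_alcol.transform(lambda x: mult_ints_by_10(x))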
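Since transform() hands the lambda a whole column at a time, another option is to test the column's dtype instead of each element. The following is a rough sketch of that idea rather than the approach from the answer above; stu is a hypothetical stand-in for stu_alcol, and is_integer_dtype comes from pandas.api.types.

```python
import pandas as pd
from pandas.api.types import is_integer_dtype

# Hypothetical stand-in for stu_alcol with one text and two integer columns.
stu = pd.DataFrame({
    "sex": ["F", "F"],
    "age": [18, 17],
    "Fedu": [4, 1],
})

# The lambda receives one column (a Series) per call, so the check is on
# the column's dtype; integer columns are multiplied as a whole.
result = stu.transform(lambda col: col * 10 if is_integer_dtype(col) else col)
print(result)
```

This keeps the lambda from the question, but it treats a column as all-or-nothing: an object column that mixes strings and integers would be left untouched.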