I would like to calculate a column based on the values of the mean and stdev columns that were calculated in the previous step. I am unable to use the lambda function correctly.
#Import necessary modules
import pandas as pd
data = {
'A':[1, 2, 3],
'B':[4, 5, 6],
'C':[7, 8, 9] }
# Convert the dictionary into DataFrame
df = pd.DataFrame(data)
data_mean = df.mean(axis=1)
data_stdev = df.std(axis=1)
#Calculate LV column for data df
df['LV'] = df.apply(
    lambda row : 0
    if data_mean < 55.5:
        LV = (55.5-data_mean) + (3.1*data_stdev)
    elif data_mean > 57.5:
        LV = (data_mean-57.5) + (3.1*data_stdev)
    else:
        LV = (3.1*data_stdev),
    axis = 1)
display(df)
CodePudding user response:
Another approach you could try is to move the logic into a named function and pass it to apply. It is a similar speed to the other answer (if not slightly faster):
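#Import necessary modules
import pandas as pd

def calculate_lv(x):
    # x is one row of the DataFrame, passed in by df.apply(..., axis=1)
    if x['MEAN'] < 55.5:
        return (55.5 - x['MEAN']) + (3.1 * x['STDEV'])
    elif x['MEAN'] > 57.5:
        return (x['MEAN'] - 57.5) + (3.1 * x['STDEV'])
    else:
        return x['STDEV'] * 3.1

data = {
    'A':[1, 2, 3],
    'B':[4, 5, 6],
    'C':[7, 8, 9] }
# Convert the dictionary into DataFrame
df = pd.DataFrame(data)
# Compute both statistics from the original columns only, so the new
# MEAN column is not included in the STDEV calculation
value_cols = ['A', 'B', 'C']
df['MEAN'] = df[value_cols].mean(axis=1)
df['STDEV'] = df[value_cols].std(axis=1)
df['LV'] = df.apply(calculate_lv, axis=1)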
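If you would rather keep everything inline as a lambda, as the question attempts, note that a lambda body can only be a single expression, so the if/elif/else statements have to be written as chained conditional expressions. This is only a sketch: it assumes the two terms in each formula are added together, and the `value_cols` list is a helper name introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# Hypothetical helper list: the columns the statistics are computed from
value_cols = ['A', 'B', 'C']
df['MEAN'] = df[value_cols].mean(axis=1)
df['STDEV'] = df[value_cols].std(axis=1)

# A lambda cannot contain if/elif/else statements or assignments,
# so the branches are chained conditional expressions instead.
# Assumption: the two terms in each branch are summed.
df['LV'] = df.apply(
    lambda row: (55.5 - row['MEAN']) + 3.1 * row['STDEV'] if row['MEAN'] < 55.5
    else (row['MEAN'] - 57.5) + 3.1 * row['STDEV'] if row['MEAN'] > 57.5
    else 3.1 * row['STDEV'],
    axis=1,
)
```

The named function is probably easier to read once there are more than two branches, but the lambda form shows why the original attempt raised a syntax error.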
CodePudding user response:
I would suggest using a vectorized approach, as it avoids calling a Python function for every row and should run faster:
#Import necessary modules
import pandas as pd
data = {
'A':[1, 2, 3],
'B':[4, 5, 6],
'C':[7, 8, 9] }
# Convert the dictionary into DataFrame
df = pd.DataFrame(data)
data_mean = df.mean(axis=1)
data_stdev = df.std(axis=1)
#Calculate LV column for data df
# base value
df['LV'] = 3.1 * data_stdev
# different values
df.loc[data_mean < 55.5, 'LV'] = (55.5 - data_mean) + (3.1 * data_stdev)
df.loc[data_mean > 57.5, 'LV'] = (data_mean - 57.5) + (3.1 * data_stdev)
display(df)
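The same vectorized logic can also be written in one step with numpy.select, which picks a value per row from a list of conditions and falls back to a default. This is a minimal sketch, again assuming the two terms in each formula are added; the names `base`, `conditions` and `choices` are introduced here for illustration only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})
data_mean = df.mean(axis=1)
data_stdev = df.std(axis=1)

# Term shared by all three branches
base = 3.1 * data_stdev

# Conditions are checked in order; the first one that is True wins
conditions = [data_mean < 55.5, data_mean > 57.5]
choices = [(55.5 - data_mean) + base, (data_mean - 57.5) + base]

# Rows matching neither condition get the default value
df['LV'] = np.select(conditions, choices, default=base)
```

Compared with two separate df.loc assignments, this keeps all branches in one place, which may be easier to extend if more ranges are added later.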