I would like to create a function which could convert some concentrations into nanomolar concentration. So I wrote this function:
def convert_molars(df, field_values, field_units):
    if (df[field_units].str.contains('uM')): # 1 uM = 1000 nM
        df[field_values] *= 1000
    elif (df[field_units].str.contains('M')): # 1 M = 1000000000 nM
        df[field_values] *= 1000000000
    else: # 1 mM = 1000000 nM
        df[field_values] *= 1000000
    return df
And I called it like this:
standard_units = convert_molars(IC50_nonan_units, 'Standard Value', 'Standard Units')
standard_units.to_csv("standard_units.csv")
standard_units.head()
But I got this error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
I tried one more option like this:
def convert_molars_bool(df, field_values, field_units):
    if (df[field_units].str == 'uM').bool(): # 1 uM = 1000 nM
        df[field_values] *= 1000
    elif (df[field_units].str == 'M').bool(): # 1 M = 1000000000 nM
        df[field_values] *= 1000000000
    else: # 1 mM = 1000000 nM
        df[field_values] *= 1000000
    return df
But I got this:
AttributeError: 'bool' object has no attribute 'bool'
Could someone please explain what I did wrong?
CodePudding user response:
The == operator returns a bool value. In your if and elif conditions you are trying to call the bool() method on a bool.
def convert_molars_bool(df, field_values, field_units):
    if (df[field_units].str == 'uM'): # 1 uM = 1000 nM
        df[field_values] *= 1000
    elif (df[field_units].str == 'M'): # 1 M = 1000000000 nM
        df[field_values] *= 1000000000
    else: # 1 mM = 1000000 nM
        df[field_values] *= 1000000
    return df
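This runs without raising an error, but it does not do the conversion you want: df[field_units].str == 'uM' compares the .str accessor object itself with a string, which is always False, so every row falls through to the else branch and is multiplied by 1000000.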
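A quick check makes the behaviour of the comparison concrete. This is only a sketch on a made-up three-row DataFrame (the column contents are assumptions, not the asker's data): comparing the .str accessor with a string yields one plain bool, while comparing the column itself yields one bool per row.

```python
import pandas as pd

# Hypothetical sample data, not the asker's real DataFrame.
df = pd.DataFrame({'Standard Units': ['uM', 'M', 'mM']})

# Comparing the .str accessor object itself to a string is an ordinary
# Python comparison between unrelated objects, so it gives a single bool.
accessor_result = df['Standard Units'].str == 'uM'
print(type(accessor_result), accessor_result)

# Comparing the Series compares element by element and gives a Series of bools.
series_result = df['Standard Units'] == 'uM'
print(series_result.tolist())
```

The second form is the one that can be used as a row mask, for example with df.loc.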
CodePudding user response:
# 'uM' also contains 'M', so exclude the uM rows from the M mask
is_uM = df[field_units].str.contains('uM')
is_M = df[field_units].str.contains('M') & ~is_uM
df.loc[is_uM, field_values] *= 1000
df.loc[is_M, field_values] *= 1000000000
df.loc[~is_uM & ~is_M, field_values] *= 1000000
CodePudding user response:
Take a look at what df[field_units].str.contains('uM') gives you.
It should be a pandas Series of bools.
import pandas as pd
dct = {
    'Standard Units': ['uM', 'M', 'D'],
    'Standard Value': [23.34, 245.6, 102]
}
df = pd.DataFrame(dct)
print(df['Standard Units'].str.contains('uM'))
#0 True
#1 False
#2 False
#Name: Standard Units, dtype: bool
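This is why the if statement fails: Python needs a single True or False, and a Series with several values does not have one. A minimal sketch of the error, plus the two reductions the error message suggests, using the same sample units as above (the variable names are only for illustration):

```python
import pandas as pd

units = pd.Series(['uM', 'M', 'D'], name='Standard Units')
mask = units.str.contains('uM')

try:
    if mask:  # a multi-element Series has no single truth value
        pass
    raised = False
except ValueError as err:
    raised = True
    print(err)

# Reducing the Series to one bool is unambiguous.
print(mask.any())  # True if at least one row matches
print(mask.all())  # True only if every row matches
```

Note that any() and all() only make the if statement legal; they still answer a question about the whole column, not about each row.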
Even if the if statement did not raise an error, df[field_values] *= 1000 would multiply EVERY row by 1000. The multiplication knows nothing about the condition on the previous line.
What you are trying to do is apply this logic to each row individually rather than to the DataFrame as a whole.
For this you could use a for loop or, better, df.apply().
df.apply takes a function and applies it to each column or row; passing axis=1 applies it to each row.
For example:
import pandas as pd
dct = {
    'Standard Units': ['uM', 'M', 'D'],
    'Standard Value': [23.34, 245.6, 102]
}
df = pd.DataFrame(dct)
def func(row):
    if 'uM' in row['Standard Units']:
        return row['Standard Value'] * 1000
    elif 'M' in row['Standard Units']:
        return row['Standard Value'] * 1000000000
    else:
        return row['Standard Value'] * 1000000
df['Standard Value'] = df.apply(func, axis=1)
print(df)
#  Standard Units  Standard Value
#0             uM    2.334000e+04
#1              M    2.456000e+11
#2              D    1.020000e+08
Or, if you want to keep the same function signature you were using:
import pandas as pd
dct = {
    'Standard Units': ['uM', 'M', 'D'],
    'Standard Value': [23.34, 245.6, 102]
}
df = pd.DataFrame(dct)
def convert_molars(df, field_values, field_units):
    # create a function to apply
    def func(row):
        if 'uM' in row[field_units]:
            return row[field_values] * 1000
        elif 'M' in row[field_units]:
            return row[field_values] * 1000000000
        else:
            return row[field_values] * 1000000
    # apply the function that was just created
    df[field_values] = df.apply(func, axis=1)
    # return the DataFrame
    return df
df = convert_molars(df, 'Standard Value', 'Standard Units')
print(df)
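One caveat with the substring tests: 'M' in 'mM' is True, so a row labelled mM would be treated as M. If the unit strings are exact labels, a lookup table avoids that and also avoids the row-by-row apply. The following is only a sketch under that assumption; the TO_NANOMOLAR table and the convert_molars_mapped name are made up for illustration, and unlike the code above it leaves unknown units as NaN instead of assuming mM.

```python
import pandas as pd

# Hypothetical lookup table: factor that converts each unit to nM.
TO_NANOMOLAR = {'nM': 1, 'uM': 1_000, 'mM': 1_000_000, 'M': 1_000_000_000}

def convert_molars_mapped(df, field_values, field_units):
    # Look up one factor per row; units missing from the table become NaN.
    factors = df[field_units].map(TO_NANOMOLAR)
    out = df.copy()  # leave the caller's DataFrame untouched
    out[field_values] = out[field_values] * factors
    return out

df = pd.DataFrame({
    'Standard Units': ['uM', 'M', 'mM', 'D'],
    'Standard Value': [23.34, 245.6, 0.5, 102],
})
result = convert_molars_mapped(df, 'Standard Value', 'Standard Units')
print(result)
```

Rows whose value comes back as NaN are the ones with a unit that is not in the table, which makes unexpected labels easy to spot before writing the CSV.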