Data Time,PM2.5
1/1/2014,9
2/1/2014,10
import pandas as pd

df = pd.read_csv('xx.csv')
data = pd.DataFrame(df)

def calculation(y):
    if 0 < y and y < 12:
        bello = data.assign(API=(50/12)*y)
    elif 12.1 <= y and y <= 50.4:
        bello = data.assign(API=(((100-51)/(50.4-12.1))*(y-12.1)) + 51)
    elif 50.5 <= y and y <= 55.4:
        bello = data.assign(API=(((150-101)/(55.4-50.5))*(y-50.5)) + 101)
    elif 55.5 <= y and y <= 150.4:
        bello = data.assign(API=(((200-151)/(150.4-55.5))*(y-55.5)) + 151)
    elif 150.5 <= y and y <= 250.4:
        bello = data.assign(API=(((300-201)/(250.4-150.5))*(y-150.5)) + 201)
    elif 250.5 <= y and y <= 350.4:
        bello = data.assign(API=(((400-301)/(350.4-250.5))*(y-250.5)) + 301)
    else:
        bello = data.assign(API=(((500-401)/(500.4-350.5))*(y-350.5)) + 401)
    return bello

y = data['PM2.5']
print(calculation(y))
Hi everyone,
I want to convert the PM2.5 air quality readings into API values, using the conditions and equations in the code above.
I received the error "ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().".
I hope someone can tell me what the problem is.
Thanks in advance.
CodePudding user response:
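The error occurs because y is an entire Series, so a comparison such as 0 < y produces a Series of booleans, and an if statement cannot reduce that to a single True or False. For example, you can rewrite the function to handle a single value and apply it element-wise with y.apply(...):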
import pandas as pd

data = pd.DataFrame({'PM2.5': [15, 50, 1000]})

def calculation(y):
    if 0 < y < 12:
        return (50/12)*y
    elif 12.1 <= y <= 50.4:
        return (((100-51)/(50.4-12.1))*(y-12.1)) + 51
    ## ....
    else:
        return (((500-401)/(500.4-350.5))*(y-350.5)) + 401

y = data['PM2.5']
print(y.apply(calculation))
This stays close to your code; however, a faster solution may exist by vectorizing.
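A vectorized version could look like the sketch below. It is not part of the original answer: the names calculation_vectorized and BANDS are hypothetical, the band limits are copied from the question's code, and np.select stands in for the if/elif chain. It assumes the column holds numeric values.

```python
import numpy as np
import pandas as pd

# (pm_low, pm_high, api_low, api_high) for each band, copied from the question.
# BANDS is a hypothetical helper table, not something pandas provides.
BANDS = [
    (12.1, 50.4, 51, 100),
    (50.5, 55.4, 101, 150),
    (55.5, 150.4, 151, 200),
    (150.5, 250.4, 201, 300),
    (250.5, 350.4, 301, 400),
]

def calculation_vectorized(pm):
    """Apply the piecewise formulas to a whole Series at once."""
    v = pm.to_numpy(dtype=float)

    # First band: 0 < y < 12
    conditions = [(v > 0) & (v < 12)]
    choices = [(50 / 12) * v]

    # Middle bands: linear interpolation inside each range
    for lo, hi, api_lo, api_hi in BANDS:
        conditions.append((v >= lo) & (v <= hi))
        choices.append((api_hi - api_lo) / (hi - lo) * (v - lo) + api_lo)

    # Catch-all that mirrors the original else branch
    conditions.append(np.ones_like(v, dtype=bool))
    choices.append((500 - 401) / (500.4 - 350.5) * (v - 350.5) + 401)

    # np.select takes, per element, the choice of the first condition that is True
    return pd.Series(np.select(conditions, choices), index=pm.index, name="API")

data = pd.DataFrame({'PM2.5': [9, 10, 15]})
data = data.assign(API=calculation_vectorized(data['PM2.5']))
print(data)
```

As in the original if/elif chain, any value that matches none of the listed ranges (for example 12.05, or a value of 0 or below) falls through to the last formula, so you may want to handle those cases explicitly.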