I was trying to replicate some SQL code:
proc sql;
update __crsp_sf1
set vol =
case
when date < '01FEB2001'd then vol / 2
when date <= '31DEC2001'd then vol / 1.8
when date < '31DEC2003'd then vol / 1.6
else vol
end
where exchcd = 3;
quit;
and I used
crsp_sf1.loc[(crsp_sf1['exchcd'] == 3)&(crsp_sf1['date']<datetime.datetime(2001, 2, 1)),'vol']/=2
crsp_sf1.loc[(crsp_sf1['exchcd'] == 3)&(crsp_sf1['date']>datetime.datetime(2001, 2, 1)&(crsp_sf1['date']<=datetime.datetime(2001, 12, 31))),'vol']/=1.8
crsp_sf1.loc[(crsp_sf1['exchcd'] == 3)&(crsp_sf1['date']>datetime.datetime(2001, 12, 31)&(crsp_sf1['date']<datetime.datetime(2003, 12, 31))),'vol']/=1.6
But I got a warning saying "TypeError: unsupported operand type(s) for &: 'datetime.datetime' and 'bool'". At the same time, I checked my dataframe to find some values of 'vol' column were updated with 1/2 of the previous value. But I did not know if every row was updated or the meaning of this warning. How should I did this SQL code in python correctly?
CodePudding user response:
I'm violating policy here by posting an answer to what is a typo, but because the lines are complicated, it might not be clear where the problem is. Here is your code, spaced out vertically to be easier to read (and, hence, easier to spot problems). This works.
import pandas as pd
import datetime
data = [
[ 3, '2001-01-15', 100 ],
[ 3, '2001-04-15', 100 ],
[ 3, '2001-09-15', 100 ],
[ 3, '2002-01-15', 100 ],
[ 3, '2002-04-15', 100 ],
[ 3, '2002-09-15', 100 ]
]
crsp_sf1 = pd.DataFrame(data, columns=['exchcd','date','vol'])
crsp_sf1['date'] = pd.to_datetime(crsp_sf1['date'])
print(crsp_sf1)
crsp_sf1.loc[
(crsp_sf1['exchcd'] == 3)
&
(crsp_sf1['date']<datetime.datetime(2001, 2, 1))
,
'vol'] /= 2
print(crsp_sf1)
crsp_sf1.loc[
(crsp_sf1['exchcd'] == 3)
&
(crsp_sf1['date']>datetime.datetime(2001, 2, 1))
&
(crsp_sf1['date']<=datetime.datetime(2001, 12, 31))
,
'vol'] /= 1.8
print(crsp_sf1)
crsp_sf1.loc[
(crsp_sf1['exchcd'] == 3)
&
(crsp_sf1['date']>datetime.datetime(2001, 12, 31))
&
(crsp_sf1['date']<datetime.datetime(2003, 12, 31))
,
'vol'] /= 1.6
print(crsp_sf1)
CodePudding user response:
It's simple a parentheses mix-up, as @TimRoberts pointed out:
crsp_sf1.loc[(crsp_sf1['exchcd'] == 3)&(crsp_sf1['date']<datetime.datetime(2001, 2, 1)),'vol']/=2
crsp_sf1.loc[(crsp_sf1['exchcd'] == 3)&(crsp_sf1['date']>datetime.datetime(2001, 2, 1))&(crsp_sf1['date']<=datetime.datetime(2001, 12, 31))),'vol']/=1.8
^ add this ^ remove this
crsp_sf1.loc[(crsp_sf1['exchcd'] == 3)&(crsp_sf1['date']>datetime.datetime(2001, 12, 31))&(crsp_sf1['date']<datetime.datetime(2003, 12, 31))),'vol']/=1.6
^ add this ^ remove this