I am trying to do some calculations based on joined data sets. My aim is to calculate the revenue in prices of the previous year. The code below works for the revenue with current prices and sales.
data work.price;
input date date. car $ price;
format date date9. ;
datalines;
01Jan19 Model1 7000
01Jan19 Model2 4000
01Jan19 Model3 5000
01Jan20 Model1 7500
01Jan20 Model2 4800
01Jan20 Model3 4500
01Jan21 Model1 8000
01Jan21 Model2 5200
01Jan21 Model3 4000
run;
data work.sales;
input date date. type $ sales;
format date date9. ;
datalines;
01Jan19 A 10
01Jan19 B 4
01Jan19 C 50
01Jan20 A 18
01Jan20 B 10
01Jan20 C 16
01Jan21 A 22
01Jan21 B 8
01Jan21 C 13
run;
data work.assignment;
input car $6. type $7.;
datalines;
Model1 A
Model2 B
Model3 C
run;
proc sql ;
create table want as
select Date format date9., *,price*sales as return
from sales
natural join price
natural join assignment
;
quit;
My solution so far, was to shift the times series of the prices prior to joining. But I wonder if this step can be done more efficiently in the proc sql statement.
data work.price;
set work.price;
Date = intnx('month',date, 12);
run;
Thanks a lot for your help!
CodePudding user response:
You can do that by using an inner join and specifying the join conditions. For you this would be something like
proc sql;
create table want as
select
p.date format date9., s.type, s.sales, p.car, p.price, p.price * s.sales as return
from
sales as s
inner join
prices as p
on
intnx("year", s.date, 1) = p.date
inner join
assignment as a
on
s.type = a.type and
p.car = a.car;
quit;
Me personally, I always do explicit inner joins and do not rely on natural joins. If something is wrong with your input data, you get an error instead of a wrong result. Also, the needed columns need to be explicitly selected instead of just selecting *.