I would like to calculate the sum for the type of plans based on the customer_no, plan and plan_total. However it gives me the total sum based on the plan_total of the current row instead of the column. May I know how do I group it by customer_no, plan, plan_total to get the output as shown in the second picture?
data cust;
set cust_det;
by customer_no;
if type_a = "1"
then sum_a = sum(plan_total);
else sum_a = "0";
if type_b = "1"
then sum_b = sum(plan_total);
else sum_b = "0";
if type_c = "1"
then sum_c = sum(plan_total);
else sum_c = "0";
run;
CodePudding user response:
In proc sql
, you can aggregate and join the results back in:
proc sql;
select c.*,
(case when c.type_a = 1 then cc.sum_a else 0 end) as sum_a,
(case when c.type_b = 1 then cc.sum_b else 0 end) as sum_b,
(case when c.type_c = 1 then cc.sum_c else 0 end) as sum_c
from cust c join
(select customer_no,
sum(plan_total * type_a) as sum_a,
sum(plan_total * type_b) as sum_b,
sum(plan_total * type_c) as sum_c
from cust
group by customer_no
) cc
on cc.customer_no = c.customer_no;
run;
CodePudding user response:
A value calculated by processing each row of a group can not be applied to each row of the group while the calculation is being done.
For future questions, post the actual data instead of a snipped image.
Example:
data have;
input customer_no plan $ plan_total;
type_a = ( plan eq 'aa' );
type_b = ( plan eq 'bb' );
type_c = ( plan eq 'cc' );
row_number 1;
format _numeric_ 6.;
datalines;
1001 aa 2000
1001 aa 20000
1001 bb 1000
1001 bb 4000
1001 aa 2100
1001 aa 3000
1002 cc 2100
1002 cc 3000
1002 aa 1200
1002 aa 1210
;
There are several ways to get the result set desired. Here are a two
Way 1
Use SQL grouping and SUM
aggregate function and the Proc SQL
automatic remerging feature. The 0/1 nature of the type_<letter>
variables can be leveraged in the following way, 0*value
means the value will be excluded from the aggregate and 1*value
will be included.
* SUM aggregate using 0/1 flag for inclusion;
* TYPE_<letter> * SUM is for aligning SUM to type;
Proc SQL;
create table want as
select
customer_no, plan, plan_total, type_a, type_b, type_c
, type_a * sum (type_a * plan_total) as sum_a format=6.
, type_b * sum (type_b * plan_total) as sum_b format=6.
, type_c * sum (type_c * plan_total) as sum_c format=6.
from
have
group by
customer_no
order by
row_number
;
Way 2
A dual DOW loop technique can compute a group total and then apply the computation to each row of the group.
data want;
* compute group sum;
do _n_ = 1 by 1 until (last.customer_no);
set have;
by customer_no;
group_sum_a = sum(group_sum_a, type_a * plan_total);
group_sum_b = sum(group_sum_b, type_b * plan_total);
group_sum_c = sum(group_sum_c, type_c * plan_total);
end;
* apply group_sum;
do _n_ = 1 to _n_;
set have;
sum_a = type_a * group_sum_a;
sum_b = type_b * group_sum_b;
sum_c = type_c * group_sum_c;
OUTPUT;
end;
drop group:;
run;
Other techniques include
- Transpose Aggregate Untranspose
- MEANS MERGE
- Arrays