Home > Net >  Calculate the sum of a field based on customer_no and plan
Calculate the sum of a field based on customer_no and plan

Time:09-26

I would like to calculate the sum for the type of plans based on the customer_no, plan and plan_total. However it gives me the total sum based on the plan_total of the current row instead of the column. May I know how do I group it by customer_no, plan, plan_total to get the output as shown in the second picture?

data cust;
set cust_det;
by customer_no;
if type_a = "1"
then sum_a = sum(plan_total);
else sum_a = "0";

if type_b = "1"
then sum_b = sum(plan_total);
else sum_b = "0";

if type_c = "1"
then sum_c = sum(plan_total);
else sum_c = "0";
run;

ouput of the codes

expected result

CodePudding user response:

In proc sql, you can aggregate and join the results back in:

proc sql;
    select c.*,
           (case when c.type_a = 1 then cc.sum_a else 0 end) as sum_a,
           (case when c.type_b = 1 then cc.sum_b else 0 end) as sum_b,
           (case when c.type_c = 1 then cc.sum_c else 0 end) as sum_c
    from cust c join
         (select customer_no,
                 sum(plan_total * type_a) as sum_a,
                 sum(plan_total * type_b) as sum_b,
                 sum(plan_total * type_c) as sum_c
          from cust
          group by customer_no
         ) cc
         on cc.customer_no = c.customer_no;

run;

CodePudding user response:

A value calculated by processing each row of a group can not be applied to each row of the group while the calculation is being done.

For future questions, post the actual data instead of a snipped image.

Example:

data have;
input customer_no plan $ plan_total;
type_a = ( plan eq 'aa' );
type_b = ( plan eq 'bb' );
type_c = ( plan eq 'cc' );
row_number   1;
format _numeric_ 6.;
datalines;
1001 aa  2000
1001 aa 20000
1001 bb  1000
1001 bb  4000
1001 aa  2100
1001 aa  3000
1002 cc  2100
1002 cc  3000
1002 aa  1200
1002 aa  1210
;

There are several ways to get the result set desired. Here are a two

Way 1

Use SQL grouping and SUM aggregate function and the Proc SQL automatic remerging feature. The 0/1 nature of the type_<letter> variables can be leveraged in the following way, 0*value means the value will be excluded from the aggregate and 1*value will be included.

* SUM aggregate using 0/1 flag for inclusion;
* TYPE_<letter> * SUM is for aligning SUM to type;

Proc SQL;
  create table want as
  select 
    customer_no, plan, plan_total, type_a, type_b, type_c
  , type_a * sum (type_a * plan_total) as sum_a format=6.
  , type_b * sum (type_b * plan_total) as sum_b format=6.
  , type_c * sum (type_c * plan_total) as sum_c format=6.
  from
    have
  group by
    customer_no
  order by
    row_number
  ;

Way 2

A dual DOW loop technique can compute a group total and then apply the computation to each row of the group.

data want;
  * compute group sum;
  do _n_ = 1 by 1 until (last.customer_no);
    set have;
    by customer_no;
    group_sum_a = sum(group_sum_a, type_a * plan_total);
    group_sum_b = sum(group_sum_b, type_b * plan_total);
    group_sum_c = sum(group_sum_c, type_c * plan_total);
  end;
  * apply group_sum;
  do _n_ = 1 to _n_;
    set have;
    sum_a = type_a * group_sum_a;
    sum_b = type_b * group_sum_b;
    sum_c = type_c * group_sum_c;
    OUTPUT;
  end;
  drop group:;
run;

Other techniques include

  • Transpose Aggregate Untranspose
  • MEANS MERGE
  • Arrays
  • Related