Home > Mobile >  PROC SQL query for a list of a variable with duplicate records
PROC SQL query for a list of a variable with duplicate records

Time:07-09

I have a dataset in SAS which has 13 variables including REPORTING_DATE & COLLATERAL_ID

I need a list of collateral_id with their counts that have duplicate records (all 13 variables same) using REPORTING_DATE =31stDEC2021 & COLLATERAL_ID

code I have tried till now

proc sql;
create table TB_IFRS9_COLLATERAL_31dec21 as
select 'TB_IFRS9_COLLATERAL' as tablename,REPORTING_DATE, COLLATERAL_ID ,COLLATERAL_TYPE_CODE,COLLATERAL_SUB_TYPE_CODE , count(*) as cnt
from PRCR.TB_IFRS9_COLLATERAL
where REPORTING_DATE="31DEC2021"d
group by COLLATERAL_ID, COLLATERAL_TYPE_CODE,COLLATERAL_SUB_TYPE_CODE
order by COLLATERAL_ID, COLLATERAL_TYPE_CODE, COLLATERAL_SUB_TYPE_CODE;
quit;

proc Sql;
create table TB_IFRS9_COLLATERAL_31dec21_1 as
select * from TB_IFRS9_COLLATERAL_31dec21
where cnt>1;
quit;

the above is still throwing up records that are not completely duplicate i.e., even though values for REPORTING_DATE, COLLATERAL_ID, COLLATERAL_TYPE_CODE,COLLATERAL_SUB_TYPE_CODE are same , certain variables still have different information

What is the easiest route/ procedure to gather a list of collateral_id and their counts that will throw up only completely duplicate records

Thanks

CodePudding user response:

proc sql;
create table TB_IFRS9_COLLATERAL_31dec21 as
 select 'TB_IFRS9_COLLATERAL' as tablename,* , count(*) as cnt
 from PRCR.TB_IFRS9_COLLATERAL
 where REPORTING_DATE="31DEC2021"d
 group by 2,3,4,5,6,78,9,10,11,12,13,14
 having cnt>1
 order by COLLATERAL_ID, COLLATERAL_TYPE_CODE, COLLATERAL_SUB_TYPE_CODE;
quit;

CodePudding user response:

Why not just use proc sort with nodup and keep only the variables of interest?

data have;
    input id var1$ var2$ var3$;
    datalines;
1 A A A
1 A A A
2 B B B
2 B B B
3 C D E
4 F G H
;
run;

proc sort data   = have(keep=id var1-var3)
          out    = have_nodupes 
          dupout = dupes
          nodup
          ;

    by id;
run;
  • Related