I have table called generalledger_mgmt
that contains many columns. I wrote query to perform some operations and select some of them as below.
My problem is some of those rows comes duplicated (some times different signs, and some time the same sign but regardless, for my report I consider them duplicated) and I would like to retrieve only one of the rows (could be the last one or first one it doesn't really matter to me as long as I am getting one row per 'serial
') - how can I ajust my code to achieve that?
table:
rowsortid | COMPANYID | serial | TRANSDATEID | AMOUNTLC | CURRENCYCODE |
---|---|---|---|---|---|
20211109-ICJ-565 | SSE | 222 | 20211101 | 500 | SEK |
20211109-ICJ-564 | SSE | 222 | 20211105 | -500 | SEK |
20211109-ICT-562 | SSE | 202 | 20211001 | 600 | SEK |
20211109-ICK-568 | SSE | 210 | 20211201 | 200 | SEK |
in my code I am grouping by rowsortid cause it is the only one different from row till another. My main interest is to get one row per serial
, so in this case the output should be:
rowsortid | COMPANYID | serial | TRANSDATEID | AMOUNTLC | CURRENCYCODE |
---|---|---|---|---|---|
20211109-ICJ-564 | SSE | 222 | 20211105 | -500 | SEK |
20211109-ICT-562 | SSE | 202 | 20211001 | 600 | SEK |
20211109-ICK-568 | SSE | 210 | 20211201 | 200 | SEK |
My code:
SELECT rowsortid
,COMPANYID
,REGEXP_SUBSTR(INDIVIDUALCODE, '[^| ] ', 1, 1) as serial
,max(TRANSDATEID)
,sum(AMOUNTLC)
,max(CURRENCYCODE)
FROM generalledger_mgmt
WHERE 1 = 1
AND companyid = 'SSE'
AND transdateid >= 20211101
AND transdateid < 20220101
AND accountno LIKE '3311%'
AND HIERARCHYID LIKE '3311C%'
GROUP BY rowsortid
,COMPANYID
,REGEXP_SUBSTR(INDIVIDUALCODE, '[^| ] ', 1, 1)
ORDER BY REGEXP_SUBSTR(individualcode, '[^| ] ', 1, 1);
CodePudding user response:
There are some columns in your query which are not in your table definition so I have removed them. The function abs
should be quicker that a regex for saying that positive and negative serial numbers are the same.
SELECT rowsortid ,COMPANYID ,ABS(serial) as serial ,max(TRANSDATEID) ,sum(AMOUNTLC) ,max(CURRENCYCODE) FROM generalledger_mgmt WHERE 1 = 1 AND companyid = 'SSE' AND transdateid >= 20211101 AND transdateid < 20220101 GROUP BY rowsortid ,COMPANYID ,ABS(serial) ORDER BY ABS(serial);
rowsortid | COMPANYID | serial | max(TRANSDATEID) | sum(AMOUNTLC) | max(CURRENCYCODE) :--------------- | :-------- | -----: | :--------------- | ------------: | :---------------- 20211109-ICK-568 | SSE | 210 | 20211201 | 200 | SEK 20211109-ICJ-565 | SSE | 222 | 20211101 | 500 | SEK 20211109-ICJ-564 | SSE | 222 | 20211105 | -500 | SEK
db<>fiddle here
CodePudding user response:
If you only want one row per id it must be the only column in the group by
create table generalledger_mgmt ( rowsortid varchar(25), COMPANYID varchar(5), serial int, TRANSDATEID varchar(25), AMOUNTLC int, CURRENCYCODE varchar(5) ); insert into generalledger_mgmt values ('20211109-ICJ-565','SSE',222,'20211101', 500,'SEK'), ('20211109-ICJ-564','SSE',222,'20211105',-500,'SEK'), ('20211109-ICT-562','SSE',202,'20211001', 600,'SEK'), ('20211109-ICK-568','SSE',210,'20211201', 200,'SEK');
✓ ✓
SELECT MAX(rowsortid) ,COMPANYID ,MAX(REGEXP_SUBSTR(serial, '[^| ] ', 1, 1)) as serial ,max(TRANSDATEID) ,sum(AMOUNTLC) ,max(CURRENCYCODE) FROM generalledger_mgmt WHERE 1 = 1 AND companyid = 'SSE' AND transdateid >= 20211101 AND transdateid < 20220101 GROUP BY COMPANYID ORDER BY MAX(REGEXP_SUBSTR (serial, '[^| ] ', 1, 1));
MAX(rowsortid) | COMPANYID | serial | max(TRANSDATEID) | sum(AMOUNTLC) | max(CURRENCYCODE) :--------------- | :-------- | :----- | :--------------- | ------------: | :---------------- 20211109-ICK-568 | SSE | 222 | 20211201 | 200 | SEK
db<>fiddle here