- select caseid, indcase from casemt where indcase like '663-01%'
- select caseid, indcase from casemt where indcase like '663%'
Query1 returns 0 records. while query2 returns multiple records where one of the record's indcase is equal to 19663-01
so the question is why query1 is returning 0 records despite the fact that indcase = 19663-01 exist in oracle table
CodePudding user response:
You can debug your data:
SELECT caseid,
indcase,
DUMP(indcase) AS actual_binary,
DUMP('19663-01') AS expected_binary
FROM casemt
WHERE indcase like '663_01%'
Which, for the sample data:
CREATE TABLE casemt (caseid, indcase) AS
SELECT 1, '19663-01' FROM DUAL UNION ALL
SELECT 2, CAST(UNISTR('19663\201001') AS VARCHAR2(10)) FROM DUAL UNION ALL
SELECT 3, CAST(UNISTR('19663\201201') AS VARCHAR2(10)) FROM DUAL UNION ALL
SELECT 4, CAST(UNISTR('19663\201301') AS VARCHAR2(10)) FROM DUAL UNION ALL
SELECT 5, CAST(UNISTR('19663\201401') AS VARCHAR2(10)) FROM DUAL UNION ALL
SELECT 6, CAST(UNISTR('19663\204301') AS VARCHAR2(10)) FROM DUAL UNION ALL
SELECT 7, CAST(UNISTR('19663\FE6301') AS VARCHAR2(10)) FROM DUAL UNION ALL
SELECT 8, CAST(UNISTR('19663\FF0D01') AS VARCHAR2(10)) FROM DUAL UNION ALL
SELECT 9, CAST(UNISTR('19663\180601') AS VARCHAR2(10)) FROM DUAL;
Outputs:
CASEID INDCASE ACTUAL_BINARY EXPECTED_BINARY 1 19663-01 Typ=1 Len=8: 49,57,54,54,51,45,48,49 Typ=96 Len=8: 49,57,54,54,51,45,48,49 2 19663‐01 Typ=1 Len=10: 49,57,54,54,51,226,128,144,48,49 Typ=96 Len=8: 49,57,54,54,51,45,48,49 3 19663‒01 Typ=1 Len=10: 49,57,54,54,51,226,128,146,48,49 Typ=96 Len=8: 49,57,54,54,51,45,48,49 4 19663–01 Typ=1 Len=10: 49,57,54,54,51,226,128,147,48,49 Typ=96 Len=8: 49,57,54,54,51,45,48,49 5 19663—01 Typ=1 Len=10: 49,57,54,54,51,226,128,148,48,49 Typ=96 Len=8: 49,57,54,54,51,45,48,49 6 19663⁃01 Typ=1 Len=10: 49,57,54,54,51,226,129,131,48,49 Typ=96 Len=8: 49,57,54,54,51,45,48,49 7 19663﹣01 Typ=1 Len=10: 49,57,54,54,51,239,185,163,48,49 Typ=96 Len=8: 49,57,54,54,51,45,48,49 8 19663-01 Typ=1 Len=10: 49,57,54,54,51,239,188,141,48,49 Typ=96 Len=8: 49,57,54,54,51,45,48,49 9 19663᠆01 Typ=1 Len=10: 49,57,54,54,51,225,160,134,48,49 Typ=96 Len=8: 49,57,54,54,51,45,48,49
Which, shows that there are multiple different hyphen characters:
- the ACSII character with the character code 45; and
- the characters in the extended character sets with the UTF-8 (hex) codes of:
- U 1806 (Mongolian Todo Soft Hyphen)
- U 2010 (Hyphen)
- U 2012 (Figure Dash)
- U 2013 (En Dash)
- U 2014 (Em Dash)
- U 2043 (Hyphen Bullet)
- U FE63 (Small Hyphen-Minus)
- U FF0D (Fullwidth Hyphen-Minus)
And your data could be using any of them for a hyphen character.
If you want to match them all then you can use:
SELECT caseid,
indcase
FROM casemt
WHERE TRANSLATE(
indcase,
UNISTR('\1806\2010\2012\2013\2014\2043\FE63\FF0D'),
'--------'
) like '663-01%'
Or you can update your data to try to normalise it to a single hyphen type:
UPDATE casemt
SET indcase = TRANSLATE(
indcase,
UNISTR('\1806\2010\2012\2013\2014\2043\FE63\FF0D'),
'--------'
)
WHERE indcase <> TRANSLATE(
indcase,
UNISTR('\1806\2010\2012\2013\2014\2043\FE63\FF0D'),
'--------'
);
and then your hyphens will be the ASCII hyphen and not a hyphen from the extended character set and then you can use:
SELECT caseid,
indcase
FROM casemt
WHERE indcase LIKE '663-01%'
db<>fiddle here
CodePudding user response:
Can you copy the '663-01%' exactly as it is in the query and check the result of the following:
SELECT DUMP('663-01%',16)
FROM dual;
It should return this (considering your base is UTF8 encoded):
Typ=96 Len=10: 25,31,39,36,36,33,2d,30,31,25
If the result is different you just have copied something with other encoding that transforms into some character that has no visual representation in your encoding.