I am trying to compare the same strings having space and not having space but all three comparison operators ('>, <, =') are giving false output.
I first had assumption oracle string comparison is nonpadded. please can any one explain am I missing on something?.
select case when 'a ' > 'a' then 't' else 'f' end from dual;
select case when 'a ' < 'a' then 't' else 'f' end from dual;
select case when 'a ' = 'a' then 't' else 'f' end from dual;
CodePudding user response:
[TL;DR] To get it to work as expected, use CAST
to give the literals an explicit data type. Something like either: CAST('a ' AS VARCHAR2(10))
or CAST('a ' AS CHAR(2))
.
There is something happening with the default literal data type and it appears to be a bug.
If you define a table:
CREATE TABLE data (lhs CHAR(2), rhs CHAR(1));
INSERT INTO data VALUES ('a ', 'a');
and then run:
select case when lhs > rhs then 't' else 'f' end AS gt_ch,
case when lhs < rhs then 't' else 'f' end AS lt_ch,
case when lhs = rhs then 't' else 'f' end AS eq_ch,
case
when lhs > rhs then 'gt'
when lhs < rhs then 'lt'
when lhs = rhs then 'eq'
else NULL
end AS comp,
case
when CAST(lhs AS VARCHAR2(10)) > CAST(rhs AS VARCHAR2(10))
then 't'
else 'f'
end AS gt_vc,
case
when CAST(lhs AS VARCHAR2(10)) < CAST(rhs AS VARCHAR2(10))
then 't'
else 'f'
end AS lt_vc,
case
when CAST(lhs AS VARCHAR2(10)) = CAST(rhs AS VARCHAR2(10))
then 't'
else 'f'
end AS eq_vc,
DUMP(lhs) AS dmp_lsh_raw,
DUMP(CAST(lhs AS CHAR(2))) AS dmp_lhs_ch,
DUMP(CAST(lhs AS VARCHAR2(10))) AS dmp_lhs_vc,
DUMP(rhs) AS dmp_rsh_raw,
DUMP(CAST(rhs AS CHAR(2))) AS dmp_rhs_ch,
DUMP(CAST(rhs AS VARCHAR2(10))) AS dmp_rhs_vc
from data;
Then the output is:
GT_CH LT_CH EQ_CH COMP GT_VC LT_VC EQ_VC DMP_LSH_RAW DMP_LHS_CH DMP_LHS_VC DMP_RSH_RAW DMP_RHS_CH DMP_RHS_VC f f t eq t f f Typ=96 Len=2: 97,32 Typ=96 Len=2: 97,32 Typ=1 Len=2: 97,32 Typ=96 Len=1: 97 Typ=96 Len=2: 97,32 Typ=1 Len=1: 97
This works and comparing them as their default data-type of CHAR
, using Blank-Padding Comparison Semantics, then 'a '
and 'a'
are equal but comparing them as VARCHAR2
data-types, using Non-Padded Comparison Semantics, then 'a '
is greater than 'a'
. This is the expected behaviour.
If you do the same query with literals explicitly cast to CHAR
data types generated inside a sub-query factoring clause (a.k.a. CTE):
WITH cte (lhs, rhs) AS (
SELECT CAST('a ' AS CHAR(2)),
CAST('a' AS CHAR(1))
FROM DUAL
)
select case when lhs > rhs then 't' else 'f' end AS gt_ch,
case when lhs < rhs then 't' else 'f' end AS lt_ch,
case when lhs = rhs then 't' else 'f' end AS eq_ch,
case
when lhs > rhs then 'gt'
when lhs < rhs then 'lt'
when lhs = rhs then 'eq'
else NULL
end AS comp,
case
when CAST(lhs AS VARCHAR2(10)) > CAST(rhs AS VARCHAR2(10))
then 't'
else 'f'
end AS gt_vc,
case
when CAST(lhs AS VARCHAR2(10)) < CAST(rhs AS VARCHAR2(10))
then 't'
else 'f'
end AS lt_vc,
case
when CAST(lhs AS VARCHAR2(10)) = CAST(rhs AS VARCHAR2(10))
then 't'
else 'f'
end AS eq_vc,
DUMP(lhs) AS dmp_lsh_raw,
DUMP(CAST(lhs AS CHAR(2))) AS dmp_lhs_ch,
DUMP(CAST(lhs AS VARCHAR2(10))) AS dmp_lhs_vc,
DUMP(rhs) AS dmp_rsh_raw,
DUMP(CAST(rhs AS CHAR(2))) AS dmp_rhs_ch,
DUMP(CAST(rhs AS VARCHAR2(10))) AS dmp_rhs_vc
from cte;
Then the output is identical to when the data is stored in a table and, again, it is the expected behaviour.
However, if you remove the explicit casts and use the default data types for the literals:
WITH cte_default_data_type (lhs, rhs) AS (
SELECT 'a ',
'a'
FROM DUAL
)
select case when lhs > rhs then 't' else 'f' end AS gt_ch,
case when lhs < rhs then 't' else 'f' end AS lt_ch,
case when lhs = rhs then 't' else 'f' end AS eq_ch,
case
when lhs > rhs then 'gt'
when lhs < rhs then 'lt'
when lhs = rhs then 'eq'
else NULL
end AS comp,
case
when CAST(lhs AS VARCHAR2(10)) > CAST(rhs AS VARCHAR2(10))
then 't'
else 'f'
end AS gt_vc,
case
when CAST(lhs AS VARCHAR2(10)) < CAST(rhs AS VARCHAR2(10))
then 't'
else 'f'
end AS lt_vc,
case
when CAST(lhs AS VARCHAR2(10)) = CAST(rhs AS VARCHAR2(10))
then 't'
else 'f'
end AS eq_vc,
DUMP(lhs) AS dmp_lsh_raw,
DUMP(CAST(lhs AS CHAR(2))) AS dmp_lhs_ch,
DUMP(CAST(lhs AS VARCHAR2(10))) AS dmp_lhs_vc,
DUMP(rhs) AS dmp_rsh_raw,
DUMP(CAST(rhs AS CHAR(2))) AS dmp_rhs_ch,
DUMP(CAST(rhs AS VARCHAR2(10))) AS dmp_rhs_vc
from cte_default_data_type;
Then the output is:
GT_CH LT_CH EQ_CH COMP GT_VC LT_VC EQ_VC DMP_LSH_RAW DMP_LHS_CH DMP_LHS_VC DMP_RSH_RAW DMP_RHS_CH DMP_RHS_VC f f f eq t f f Typ=96 Len=2: 97,32 Typ=96 Len=2: 97,32 Typ=1 Len=2: 97,32 Typ=96 Len=1: 97 Typ=96 Len=2: 97,32 Typ=1 Len=1: 97
And none of the first 3 CASE
expressions are true (as found in the question). However, the output from DUMP
is identical to the previous queries.
This should not happen as the values should either be compared as CHAR
data types, in which case Blank-Padding Comparison Semantics should apply, or compared as VARCHAR2
data types, in which case Non-Padded Comparison Semantics should apply, and in this case it appears to be using something else and is not following either rule and is not the expected behaviour.
To work around it, use CAST
to give the literal an explicit data-type as this appears to "fix" the issue.
db<>fiddle here
CodePudding user response:
When character strings are compared, there are two options:
- Blank-Padded Comparison Semantics
If the two values have different lengths, then Oracle first adds blanks to the end of the shorter one so their lengths are equal. Oracle then compares the values character by character up to the first character that differs
- Nonpadded comparison semantics
See documentation.
If all 3 cases you posted are put into the same case
expression, this is the result:
SQL> select case when 'a ' > 'a' then 't1'
2 when 'a ' < 'a' then 'f2'
3 when 'a ' = 'a' then 'f3'
4 else 'x4'
5 end query_1
6 from dual;
QU
--
f3
SQL>
f3
? It says they are equal because of the blank-padded comparison (described above).