Home > Net >  oracle DB Why String having space comparison in gives false always?
oracle DB Why String having space comparison in gives false always?

Time:03-21

I am trying to compare the same strings having space and not having space but all three comparison operators ('>, <, =') are giving false output.

I first had assumption oracle string comparison is nonpadded. please can any one explain am I missing on something?.

select case when 'a ' > 'a' then 't' else 'f' end from dual; 

enter image description here

select case when 'a ' < 'a' then 't' else 'f' end from dual; 

enter image description here

select case when 'a ' = 'a' then 't' else 'f' end from dual; 

enter image description here

CodePudding user response:

[TL;DR] To get it to work as expected, use CAST to give the literals an explicit data type. Something like either: CAST('a ' AS VARCHAR2(10)) or CAST('a ' AS CHAR(2)).


There is something happening with the default literal data type and it appears to be a bug.

If you define a table:

CREATE TABLE data (lhs CHAR(2), rhs CHAR(1));

INSERT INTO data VALUES ('a ', 'a');

and then run:

select case when lhs > rhs then 't' else 'f' end AS gt_ch,
       case when lhs < rhs then 't' else 'f' end AS lt_ch,
       case when lhs = rhs then 't' else 'f' end AS eq_ch,
       case
       when lhs > rhs then 'gt'
       when lhs < rhs then 'lt'
       when lhs = rhs then 'eq'
       else NULL
       end AS comp,
       case
       when CAST(lhs AS VARCHAR2(10)) > CAST(rhs AS VARCHAR2(10))
       then 't'
       else 'f'
       end AS gt_vc,
       case
       when CAST(lhs AS VARCHAR2(10)) < CAST(rhs AS VARCHAR2(10))
       then 't'
       else 'f'
       end AS lt_vc,
       case
       when CAST(lhs AS VARCHAR2(10)) = CAST(rhs AS VARCHAR2(10))
       then 't'
       else 'f'
       end AS eq_vc,
       DUMP(lhs) AS dmp_lsh_raw,
       DUMP(CAST(lhs AS CHAR(2))) AS dmp_lhs_ch,
       DUMP(CAST(lhs AS VARCHAR2(10))) AS dmp_lhs_vc,
       DUMP(rhs) AS dmp_rsh_raw,
       DUMP(CAST(rhs AS CHAR(2))) AS dmp_rhs_ch,
       DUMP(CAST(rhs AS VARCHAR2(10))) AS dmp_rhs_vc
from   data;

Then the output is:

GT_CH LT_CH EQ_CH COMP GT_VC LT_VC EQ_VC DMP_LSH_RAW DMP_LHS_CH DMP_LHS_VC DMP_RSH_RAW DMP_RHS_CH DMP_RHS_VC
f f t eq t f f Typ=96 Len=2: 97,32 Typ=96 Len=2: 97,32 Typ=1 Len=2: 97,32 Typ=96 Len=1: 97 Typ=96 Len=2: 97,32 Typ=1 Len=1: 97

This works and comparing them as their default data-type of CHAR, using Blank-Padding Comparison Semantics, then 'a ' and 'a' are equal but comparing them as VARCHAR2 data-types, using Non-Padded Comparison Semantics, then 'a ' is greater than 'a'. This is the expected behaviour.

If you do the same query with literals explicitly cast to CHAR data types generated inside a sub-query factoring clause (a.k.a. CTE):

WITH cte (lhs, rhs) AS (
  SELECT CAST('a ' AS CHAR(2)),
         CAST('a' AS CHAR(1))
  FROM   DUAL
)
select case when lhs > rhs then 't' else 'f' end AS gt_ch,
       case when lhs < rhs then 't' else 'f' end AS lt_ch,
       case when lhs = rhs then 't' else 'f' end AS eq_ch,
       case
       when lhs > rhs then 'gt'
       when lhs < rhs then 'lt'
       when lhs = rhs then 'eq'
       else NULL
       end AS comp,
       case
       when CAST(lhs AS VARCHAR2(10)) > CAST(rhs AS VARCHAR2(10))
       then 't'
       else 'f'
       end AS gt_vc,
       case
       when CAST(lhs AS VARCHAR2(10)) < CAST(rhs AS VARCHAR2(10))
       then 't'
       else 'f'
       end AS lt_vc,
       case
       when CAST(lhs AS VARCHAR2(10)) = CAST(rhs AS VARCHAR2(10))
       then 't'
       else 'f'
       end AS eq_vc,
       DUMP(lhs) AS dmp_lsh_raw,
       DUMP(CAST(lhs AS CHAR(2))) AS dmp_lhs_ch,
       DUMP(CAST(lhs AS VARCHAR2(10))) AS dmp_lhs_vc,
       DUMP(rhs) AS dmp_rsh_raw,
       DUMP(CAST(rhs AS CHAR(2))) AS dmp_rhs_ch,
       DUMP(CAST(rhs AS VARCHAR2(10))) AS dmp_rhs_vc
from   cte;

Then the output is identical to when the data is stored in a table and, again, it is the expected behaviour.

However, if you remove the explicit casts and use the default data types for the literals:

WITH cte_default_data_type (lhs, rhs) AS (
  SELECT 'a ',
         'a'
  FROM   DUAL
)
select case when lhs > rhs then 't' else 'f' end AS gt_ch,
       case when lhs < rhs then 't' else 'f' end AS lt_ch,
       case when lhs = rhs then 't' else 'f' end AS eq_ch,
       case
       when lhs > rhs then 'gt'
       when lhs < rhs then 'lt'
       when lhs = rhs then 'eq'
       else NULL
       end AS comp,
       case
       when CAST(lhs AS VARCHAR2(10)) > CAST(rhs AS VARCHAR2(10))
       then 't'
       else 'f'
       end AS gt_vc,
       case
       when CAST(lhs AS VARCHAR2(10)) < CAST(rhs AS VARCHAR2(10))
       then 't'
       else 'f'
       end AS lt_vc,
       case
       when CAST(lhs AS VARCHAR2(10)) = CAST(rhs AS VARCHAR2(10))
       then 't'
       else 'f'
       end AS eq_vc,
       DUMP(lhs) AS dmp_lsh_raw,
       DUMP(CAST(lhs AS CHAR(2))) AS dmp_lhs_ch,
       DUMP(CAST(lhs AS VARCHAR2(10))) AS dmp_lhs_vc,
       DUMP(rhs) AS dmp_rsh_raw,
       DUMP(CAST(rhs AS CHAR(2))) AS dmp_rhs_ch,
       DUMP(CAST(rhs AS VARCHAR2(10))) AS dmp_rhs_vc
from   cte_default_data_type;

Then the output is:

GT_CH LT_CH EQ_CH COMP GT_VC LT_VC EQ_VC DMP_LSH_RAW DMP_LHS_CH DMP_LHS_VC DMP_RSH_RAW DMP_RHS_CH DMP_RHS_VC
f f f eq t f f Typ=96 Len=2: 97,32 Typ=96 Len=2: 97,32 Typ=1 Len=2: 97,32 Typ=96 Len=1: 97 Typ=96 Len=2: 97,32 Typ=1 Len=1: 97

And none of the first 3 CASE expressions are true (as found in the question). However, the output from DUMP is identical to the previous queries.

This should not happen as the values should either be compared as CHAR data types, in which case Blank-Padding Comparison Semantics should apply, or compared as VARCHAR2 data types, in which case Non-Padded Comparison Semantics should apply, and in this case it appears to be using something else and is not following either rule and is not the expected behaviour.

To work around it, use CAST to give the literal an explicit data-type as this appears to "fix" the issue.

db<>fiddle here

CodePudding user response:

When character strings are compared, there are two options:

  • Blank-Padded Comparison Semantics

    If the two values have different lengths, then Oracle first adds blanks to the end of the shorter one so their lengths are equal. Oracle then compares the values character by character up to the first character that differs

  • Nonpadded comparison semantics

See documentation.

If all 3 cases you posted are put into the same case expression, this is the result:

SQL> select case when 'a ' > 'a' then 't1'
  2              when 'a ' < 'a' then 'f2'
  3              when 'a ' = 'a' then 'f3'
  4              else 'x4'
  5         end query_1
  6  from dual;

QU
--
f3

SQL>

f3? It says they are equal because of the blank-padded comparison (described above).

  • Related