I need to compare three columns containing string dates in the format 'yyyy-mm-dd' in Hive SQL. Please take into consideration that the table has more than 2 million records.
Given three columns (col1, col2, col3) from table T1, I must guarantee that:
- col1 = col2, and both of them, or at least one of them, differ from col3.
My best regards,
CodePudding user response:
Logically, your condition is redundant.
If col1 = col2,
then col1 != col3 implies col2 != col3, so checking one of them against col3 covers both.
Therefore it's really enough to use:
select * from T1 where col1 = col2 and col1 != col3;
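One caveat worth sketching: in Hive, as in standard SQL, a comparison with NULL evaluates to NULL, so rows where col3 is NULL are dropped by `col1 != col3`. The variant below assumes (this is not stated in the question) that a NULL col3 should count as "different" from col1.

```sql
-- Sketch only. Assumption: a NULL col3 counts as "different from col1".
-- col1 = col2 already excludes rows where col1 or col2 is NULL.
select *
from T1
where col1 = col2
  and (col3 is null or col1 != col3);
```

If NULL dates should be excluded instead, the original query already does that.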
A row-by-row filter like this can be evaluated on the map side, so a plain where
clause is likely good enough.
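To check how Hive actually plans the filter, the query can be prefixed with `explain`. This is only a sketch; the plan output depends on the Hive version and execution engine, so no expected output is shown here.

```sql
-- Sketch: print the execution plan instead of running the query.
-- A filter-only query like this would normally not need a reduce stage.
explain
select *
from T1
where col1 = col2
  and col1 != col3;
```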
If you instead wanted to say that 2 out of the 3 need to match, you could use group by
with having,
so that each distinct combination of values is compared only once.
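One possible reading of the group by / having suggestion, as a sketch. It assumes "2 out of the 3 match" means at least two of the three columns are equal; the `row_count` alias is hypothetical. Note that it returns one row per distinct (col1, col2, col3) combination rather than every original row.

```sql
-- Sketch only. Assumption: "2 out of 3 match" = at least two columns are equal.
-- Grouping first means each distinct date combination is compared once.
select col1, col2, col3, count(*) as row_count
from T1
group by col1, col2, col3
having col1 = col2
    or col1 = col3
    or col2 = col3;
```

If every original row is needed, the same three-way condition can go directly in a where clause without the group by.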