Home > Blockchain >  Count number of times the value differs for all columns and for a same id
Count number of times the value differs for all columns and for a same id

Time:05-24

So here is my problem :

I have tons of doublons in my data, and I would like to count in SQL the number of times a column doesn't have the same value for a same id.

So, for example, if I have the following table :

enter image description here

Here is the output I would like :

Name : 2

Email : 3

Age : 1

State : 5

Could someone please help me?

Keep in mind I have a lot more columns in my real data, so doing it one by one is not the best solution. I would like a code which can count it for all columns directly, but for the moment I didn't find a solution :/

Thank you, and take care of you!

CodePudding user response:

One possible solution for a specific column could look like:

select distinct Count(*) over()
from t
group by id
having Count(distinct name) > 1;

Result: 2

select distinct Count(*) over()
from t
group by id
having Count(distinct age) > 1;

Result: 1

If you can make use of window functions an example of counting multiple columns in a single query would be:

select 
    Count(distinct case when name1 != name2 then id end) NameCount, 
    Count(distinct case when age1 != age2 then id end) AgeCount
from (
    select *,
      Min(name) over(partition by id) name1,
      Max(name) over(partition by id) name2,
      Min(age)  over(partition by id) age1,
      Max(age)  over(partition by id) age2
    from t
)t;
  •  Tags:  
  • sql
  • Related