How can I combine DISTINCT and COUNT(*) statements in a single query?

How can I remove duplicates and find row count in a single query in Hive?

CodePudding user response：

I have never used Hive, so I am not sure whether the syntax is identical, but you can use this as a reference. Replace <column name> with the column that contains the duplicate values.

SELECT COUNT(DISTINCT <column name>) FROM <table>
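The position of DISTINCT matters: written before COUNT it only de-duplicates the single-row result, while written inside COUNT it counts unique values. Below is a minimal sketch of that difference, using an in-memory SQLite database as a stand-in for Hive (an assumption; it has not been run against Hive). The table `visits` and column `city` are hypothetical.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for Hive (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (city TEXT)")  # hypothetical table
conn.executemany(
    "INSERT INTO visits VALUES (?)",
    [("Oslo",), ("Oslo",), ("Rome",), ("Lima",), ("Rome",)],
)

# DISTINCT before COUNT: every non-NULL row is counted first,
# then DISTINCT is applied to the one-row result.
distinct_outside = conn.execute(
    "SELECT DISTINCT COUNT(city) FROM visits"
).fetchone()[0]

# DISTINCT inside COUNT: duplicates are dropped before counting.
distinct_inside = conn.execute(
    "SELECT COUNT(DISTINCT city) FROM visits"
).fetchone()[0]

print(distinct_outside, distinct_inside)  # 5 3
```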

CodePudding user response：

I have never used Hive either, but you can also try: SELECT COUNT(DISTINCT <column name>) FROM <table>

CodePudding user response：

Try this:

SELECT COUNT(DISTINCT <column name>) FROM <table>

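Another option is GROUP BY. Note that this returns one row per distinct value, together with how many times that value occurs, rather than a single total:

SELECT <column name>, COUNT(<column name>) FROM <table> GROUP BY <column name>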
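To show what the GROUP BY form returns, and how to reduce it to a single number, here is a rough sketch run against in-memory SQLite rather than Hive (an assumption). The table `orders` and its columns are hypothetical. The last query counts de-duplicated whole rows by wrapping SELECT DISTINCT in a subquery. The subqueries carry an alias because some engines require one.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for Hive (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, item TEXT)")  # hypothetical
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("ann", "pen"), ("ann", "pen"), ("bob", "ink"), ("ann", "ink")],
)

# GROUP BY yields one row per distinct customer plus its occurrence count.
per_value = conn.execute(
    "SELECT customer, COUNT(customer) FROM orders "
    "GROUP BY customer ORDER BY customer"
).fetchall()

# Counting the groups gives the number of distinct customers.
group_count = conn.execute(
    "SELECT COUNT(*) FROM (SELECT customer FROM orders GROUP BY customer) AS g"
).fetchone()[0]

# Counting de-duplicated whole rows: DISTINCT in a subquery, COUNT(*) outside.
unique_rows = conn.execute(
    "SELECT COUNT(*) FROM (SELECT DISTINCT customer, item FROM orders) AS d"
).fetchone()[0]

print(per_value, group_count, unique_rows)
```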

CodePudding user response：

See this reference (written for MySQL): https://www.w3resource.com/mysql/aggregate-functions-and-grouping/aggregate-functions-and-grouping-count-with-distinct.php

Example

SELECT column_1, COUNT(DISTINCT(column_2)) FROM table_name GROUP BY column_1
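The grouped form above counts unique column_2 values separately for each column_1 value. Here is a small sketch with made-up rows, again using in-memory SQLite as a stand-in for Hive (an assumption):

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for Hive (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (column_1 TEXT, column_2 TEXT)")
conn.executemany(
    "INSERT INTO table_name VALUES (?, ?)",
    # Made-up sample rows: group "a" has two unique values, group "b" has one.
    [("a", "x"), ("a", "x"), ("a", "y"), ("b", "z"), ("b", "z")],
)

# One output row per column_1 value, with the number of unique column_2 values.
result = conn.execute(
    "SELECT column_1, COUNT(DISTINCT column_2) FROM table_name "
    "GROUP BY column_1 ORDER BY column_1"
).fetchall()

print(result)  # [('a', 2), ('b', 1)]
```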