I need to create a new column in my dataset (duplicate_name) that contains TRUE if there are more than one record for someone or FALSE otherwise. I found this code (I am working with sqlite): SELECT *, CASE WHEN ROW NUMBER() OVER (PARTITION BY first_name, last_name) > 1) THEN 'TRUE' ELSE 'FALSE' END AS duplicate_name FROM users;
But when I ran it it gives me something like this (ONLY THE SECOND RECORD IS MARKED AS TRUE): Carlo Thomas male FALSE Carlo Thomas male TRUE Don Scallion male FALSE Tania Lopes female FALSE
What I need is a table like this (both records with the same name are marked as TRUE): Carlo Thomas male TRUE Carlo Thomas male TRUE Don Scallion male FALSE Tania Lopes female FALSE
Can someone help me, please.
Thanks
CodePudding user response:
Instead of ROW_NUMBER()
use COUNT(*)
window function:
SELECT *,
CASE WHEN COUNT(*) OVER (PARTITION BY first_name, last_name) > 1 THEN 'TRUE' ELSE 'FALSE' END AS duplicate_name
FROM users;
or simpler with a 1 for true and 0 for false:
SELECT *,
COUNT(*) OVER (PARTITION BY first_name, last_name) > 1 AS duplicate_name
FROM users;