Home > other >  How to delete duplicate rows and keep the nth occurrence on Excel?
How to delete duplicate rows and keep the nth occurrence on Excel?

Time:09-27

In Excel I want to delete duplicate rows but I want to have some preferences for what duplicate occurrence to keep. As in Excel while removing duplicates it by default keeps the first occurrence. Is there any way to keep the 2nd or 3rd occurrence?

For example:

a     Apple
a     Banana
a     Cherry
b     Apple
b     Banana
b     Melon
c     Apple
c     Cherry
c     Melon

By default if I remove duplicate, it would be like this:

a     Apple
b     Apple
c     Apple

But is there a way to achieve this

a     Banana
b     Banana
c     ?

As the C doesn't have banana, we can keep this blank, or any default value.

CodePudding user response:

To always keep the second occurrence,

  • SORT the data,
  • MATCH the elements among itself to get the first occurrence,
  • Subtract the matched row number from actual row number and if it's not 1, filter it out.

Input:

Duplicates Helper column
a Apple
a Cherry
b Apple
c Apple
c Cherry
b Banana
b Melon
a Banana
c Melon

Formula:

=FILTER(SORT(A1:B9,1,1),ROW(A1:A9)-MATCH(SORT(A1:A9),SORT(A1:A9),0)=1)

Output:

Second occurrence Helper column
a Cherry
b Banana
c Cherry

CodePudding user response:

see:

=FILTER(A1:B, COUNTIFS(A1:A, A1:A, ROW(A1:A), "<="&ROW(A1:A))=2)

enter image description here

for 3rd occurrence change 2 to 3

  • Related