How to remove unsorted duplicate lines in Notepad++ using Regex

Time:11-08

I have a file (the link is in the comments).
A sample of the data:
Yn2STc5A
MBI1irwA
Yn2STc5A
agCGRvWu
KZIcwFII
414PGEBK
MBI1irwA
KZIcwFII
lln5OKRi
Yn2STc5A
6gCsLHJA
Yn2STc5A
MBI1irwA
KZIcwFII
MBI1irwA
22LYWQsX
22LYWQsX
Yn2STc5A
KZIcwFII
agCGRvWu
lln5OKRi

The file has 528 lines, but all of them are repetitions of the same 13 lines, and each of those 13 lines is a code for a team link.
I have searched for and tried many regexes, but only these two came close to what I need:
Find: ^(.{8}\n)([\S\s]*?\1) and also ^(.*)([\S\s]*?\1)
Replace All: $2

But I have to click Replace All repeatedly (at least 47 times) to reach my goal.
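
A likely reason for the repeated clicks is that each match consumes everything from a line up to its next copy, so other duplicates inside that span are not examined in the same pass. A lookahead avoids this, because it checks for a later copy without consuming it. The sketch below shows the idea with Python's `re` module rather than the editor's regex engine, so whether the same pattern behaves identically in the editor is an assumption. The name `drop_earlier_duplicates` is hypothetical. The pattern keeps the last occurrence of each line and assumes `\n` line endings.

```python
import re

# A line plus its newline, matched only if the same whole line
# appears again somewhere later in the text (the lookahead does
# not consume that later copy).
DEDUPE = re.compile(r"^(.*)\n(?=[\s\S]*?^\1$)", re.MULTILINE)


def drop_earlier_duplicates(text: str) -> str:
    """Remove every line that occurs again later, in a single pass."""
    return DEDUPE.sub("", text)


if __name__ == "__main__":
    sample = "Yn2STc5A\nMBI1irwA\nYn2STc5A\nagCGRvWu\nMBI1irwA"
    print(drop_earlier_duplicates(sample))
```

The result is not sorted; the surviving lines stay in the order of their last occurrence.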

My desired output, taken from the complete file, is:
1:22LYWQsX
2:414PGEBK
3:6gCsLHJA
4:C6C8JOnf
5:KZIcwFII
6:MBI1irwA
7:NQid5EnY
8:P68A94uk
9:Yn2STc5A
10:agCGRvWu
11:jbsO5Pzk
12:lln5OKRi
13:vWSvMjaa

Thanks in advance

CodePudding user response:

I recommend using the standard functions of Notepad++ (my version is 8.1.9, 64-bit) if they cover your needs.

  • First, open the sample data file (*.txt) in Notepad++
  • From the main menu, go to Edit > Line Operations > Remove Duplicate Lines
  • Go to Edit > Line Operations > Sort Lines Lexicographically Ascending
  • Format the result as needed.
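
If the same result is needed outside the editor, the menu steps above can be approximated with a short script. This is a minimal sketch, not what the editor does internally: the function name `unique_sorted_numbered` and the `N:code` formatting are assumptions taken from the desired output in the question, and the sample is only the first few lines of the posted data.

```python
SAMPLE = """\
Yn2STc5A
MBI1irwA
Yn2STc5A
agCGRvWu
KZIcwFII
414PGEBK
MBI1irwA
"""


def unique_sorted_numbered(text):
    """Deduplicate lines, sort them ascending, and number them from 1."""
    # Step 1 (remove duplicates): a set keeps one copy of each non-empty line.
    unique = {line.strip() for line in text.splitlines() if line.strip()}
    # Step 2 (sort ascending): Python compares strings by code point,
    # so digits sort before uppercase letters, which sort before lowercase.
    ordered = sorted(unique)
    # Step 3 (format): prefix each code with its 1-based position.
    return [f"{i}:{code}" for i, code in enumerate(ordered, start=1)]


if __name__ == "__main__":
    print("\n".join(unique_sorted_numbered(SAMPLE)))
```

Using a set discards the original line order, which does not matter here because the lines are sorted afterwards.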

