Home > Blockchain >  Excel question: How to find duplicate string of six digits in a cell?
Excel question: How to find duplicate string of six digits in a cell?

Time:12-07

I'm dealing with a spreadsheet containing ranges of Bates numbers for legal discovery. All numbers are six digits. A typical cell might be named something like "Court records 000001-000100" or "Search warrant 000300-000300." Is there are way to identify cells where the SAME six-digit string is repeated, as in the second example, and replace it with only the first instance of that string? E.g., "Search warrant 000300."

I prefer a formula, but VBA would also work. I've found methods for identifying ANY six-digit string, but not one that will specifically look for the same number twice. Thank you so much for any suggestions you may have!

CodePudding user response:

With o365 you can use

=IF(LET(arr,FILTERXML("<t><s>"&SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(TEXT(A1,"@"),"-"," "),".pdf","")," ","</s><s>")&"</s></t>","//s"),AGGREGATE(15,6,arr/ISNUMBER(arr),1)=AGGREGATE(15,6,arr/ISNUMBER(arr),2)),LEFT(A1,SEARCH("-",A1)-1)&".pdf",SUBSTITUTE(A1,".pdf","")&".pdf")

For other versions one needs to replace every arr with the FILTERXML-part. If it is possible that there are other punctuation marks around the numbers, then these must also be substituted.

enter image description here

EDIT 1: Just read, that you want to delete the duplicate.

EDIT 2: This formula would keep the .pdf-file-extension.

CodePudding user response:

If not already in a table, turn it into a table using Insert Table from the data ribbon. This allows you to refer to other columns by name and store column formulas once per *column" rather than have to drag it down the column and store the formula once per cell.

Assuming your doc name column title is "DocName", add a column titled "JustNumbers" with a formula of =RIGHT([@DocName],13)

Then add another column with a formula of

`=IF(LEFT([@JustNumbers],6)=RIGHT([@JustNumbers],6), LEFT([@DocName],LEN([@DocName])-7),[@DocName])`

You can hide the other columns and just show this last column.

  • Related