Home > front end >  Most efficient way of comparing Strings in a single list
Most efficient way of comparing Strings in a single list

Time:01-12

So I have a text file with over 5 million hashes and have been asked to find a pair of hashes with the most common characters. The characters have to be matching at the same index.

For example:

Hash 1 = B79F56435...

Hash 2 = B79123456...

Result = 4 matching characters

*Each hash contains 64 hexadecimal characters in total

I started to solve this in Python, but it obviously took a long time so I moved to java which sped up the process significantly. Here is my current code:

I imported the text file of hashes into ArrayList of Strings:

ArrayList<String> hashes = new ArrayList<>(); 

My main method for comparing each String in the ArrayList

String winner1 = "", winner2 = "";
int count = 0, n = hashes.size();

for(int i = 0; i < n; i  ) 
{
    s1 = hashes.get(i);             
    for(int j = i 1; j < n; j  ) 
    {
        s2 = hashes.get(j);
        int temp = countComparisons(s1, s2);
        if(temp > count) 
        {
            count = temp;
            winner1 = s1; winner2 = s2;
        }
     }
}

And this is my method for counting the comparisons between the hashes:

public static int countComparisons(String s1, String s2)
{
    int x = 0;
    for(int i = 0; i < 64; i  ) 
    {
        if(s1.charAt(i) == s2.charAt(i))
            x  ;
    }       
    return x;
}

Now comparing these hashes all individually with one another will equal over 12 trillion combinations so it's obviously going to take a long time but I just wanted to see if anyone has any improvements to make to this code.

I am completely open to any suggestions on how this could be improved.

It doesn't necessarily have to be in java either and any suggestion on what the best data structure to use would be great. I just chose ArrayList because as it was easy to import from the text file.

I found it difficult to find comparison algorithms for this specific case so if someone point me towards anything related to this that would also be extremely useful.

Thank you

CodePudding user response:

You have to implement your comparison in a multi tread approach to reduce the time. Use ExecutorService for that. I guess this topic helps you:

Multithread loop which comparing two Strings

  •  Tags:  
  • Related