Find the difference between two strings of uneven length in python

Time:09-09

a = 'abcdfjghij'
b = 'abcdfjghi'

Output : j

def diff(a, b):
    string=''
    for val in a:
        if val not in b:
            string=val
    return string

a = 'abcdfjghij'
b = 'abcdfjghi'
print(diff(a,b))

This code returns an empty string. Any solution for this?

CodePudding user response:

collections.Counter from the standard library can be used to model multisets, so it keeps track of repeated elements. It's a subclass of dict, so it is performant, and it extends dict's functionality for counting purposes. To find the differences between two strings, you can mimic a symmetric difference between sets.

from collections import Counter

a = 'abcdfjghij'
b = 'abcdfjghi'

ca = Counter(a)
cb = Counter(b)

diff = (cb - ca) + (ca - cb)  # symmetric difference

print(diff)
#Counter({'j': 1})
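
The question asks for the string j rather than a Counter, so the result can be flattened with Counter.elements(). A minimal sketch, assuming the order of the output does not matter (it is sorted here only to make it predictable); counter_diff is a hypothetical name, not something from the answer above:

```python
from collections import Counter

def counter_diff(a, b):
    """Return the characters that a and b do not share, respecting repeats."""
    ca, cb = Counter(a), Counter(b)
    # Counter subtraction drops zero and negative counts, so adding both
    # directions gives a multiset symmetric difference.
    extra = (ca - cb) + (cb - ca)
    # elements() repeats each character as many times as its count.
    return ''.join(sorted(extra.elements()))

print(counter_diff('abcdfjghij', 'abcdfjghi'))  # j
```

Because the counts are compared rather than plain membership, the second j in a is reported even though b also contains a j.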

CodePudding user response:

It's hard to know exactly what you want based on your question. For example, should

'abc'
'efg'

return 'abc' or 'efg', or is there always just going to be one character added?

Here is a solution that accounts for multiple characters being different but still might not give your exact output.

def diff(a, b):
    string = ''
    
    if(len(a) >= len(b)):
        longString = a
        shortString = b
    else:
        longString = b
        shortString = a
    for i in range(len(longString)):
        if(i >= len(shortString) or longString[i] != shortString[i]):
            string += longString[i]
    return string

a = 'abcdfjghij'
b = 'abcdfjghi'
print(diff(a,b))

If one string just has one character added, you could change

string += longString[i]

to

string = longString[i]

so that only the last differing character is returned.
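
One caveat with comparing position by position: once a character is inserted in the middle, every later position is shifted, so the last mismatch is not necessarily the inserted character. A minimal sketch that returns the first mismatch instead, assuming the strings differ by exactly one inserted character (find_inserted_char is a hypothetical name):

```python
def find_inserted_char(a, b):
    """Return the single extra character in the longer of two strings.

    Assumes the longer string is the shorter one plus exactly one
    character inserted at any position.
    """
    long_s, short_s = (a, b) if len(a) >= len(b) else (b, a)
    if len(long_s) - len(short_s) != 1:
        raise ValueError('strings must differ in length by exactly one')
    for i, ch in enumerate(short_s):
        # The first position where the strings disagree holds the insertion.
        if long_s[i] != ch:
            return long_s[i]
    # No mismatch in the shared prefix, so the extra character is at the end.
    return long_s[-1]

print(find_inserted_char('abcdfjghij', 'abcdfjghi'))  # j
print(find_inserted_char('abXcd', 'abcd'))            # X
```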

CodePudding user response:

In your example, there are 2 differences between the 2 strings: the letters g and j. I tested your code and it returns g, because all the other letters from a are in b:

a = 'abcdfjghij'
b = 'abcdfjhi'

def diff(a, b):
    string=''
    for val in a:
        if val not in b:
            string=val
    return string

print(diff(a,b))
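
A quick way to see why the question's strings give an empty result while these give g: the in operator only tests membership and ignores how many times a character occurs. A small check, using b from the question and b from this answer (the variable names are made up for the illustration):

```python
a = 'abcdfjghij'
b_question = 'abcdfjghi'   # b from the question
b_answer = 'abcdfjhi'      # b from this answer, with g removed

# Every character of a, including both j's, is found somewhere in b_question.
missing_question = [ch for ch in a if ch not in b_question]
# Only g is absent from b_answer.
missing_answer = [ch for ch in a if ch not in b_answer]

print(missing_question)                      # []
print(missing_answer)                        # ['g']
print(a.count('j'), b_question.count('j'))   # 2 1
```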

CodePudding user response:

updated

But you have j twice in a. So the first time it sees j, it looks at b and sees a j, all good. For the second j it looks again and still sees a j, all good. If you want to check whether each letter matches the letter at the same position in the other string, try this:

a = 'abcdfjghij'
b = 'abcdfjghi'

def diff(a, b):
    # Compare the overlapping part of the two strings position by position.
    smallest_len = min(len(a), len(b))
    for index in range(smallest_len):
        if a[index] != b[index]:
            print(f'a value {a[index]} at index {index} does not match b value {b[index]}')
    # Report whatever is left over in the longer string, if any.
    if len(a) > len(b):
        print(f'Extra Values in A Are {a[smallest_len:]}')
    elif len(b) > len(a):
        print(f'Extra Values in B Are {b[smallest_len:]}')

diff(a, b)
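
For strings that differ in more than one place, the standard library's difflib.SequenceMatcher can align them before reporting the differences. This is an alternative technique, not part of the answer above; a minimal sketch, with aligned_diff as a hypothetical helper name:

```python
from difflib import SequenceMatcher

def aligned_diff(a, b):
    """Return (only_in_a, only_in_b) based on SequenceMatcher's alignment."""
    only_in_a, only_in_b = [], []
    for tag, i1, i2, j1, j2 in SequenceMatcher(None, a, b).get_opcodes():
        # 'delete' and 'replace' mark stretches of a that have no match in b.
        if tag in ('delete', 'replace'):
            only_in_a.append(a[i1:i2])
        # 'insert' and 'replace' mark stretches of b that have no match in a.
        if tag in ('insert', 'replace'):
            only_in_b.append(b[j1:j2])
    return ''.join(only_in_a), ''.join(only_in_b)

print(aligned_diff('abcdfjghij', 'abcdfjghi'))  # ('j', '')
```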
