Finding repetitions of a string by length

Time:02-23

I have a string of letters similar to that shown below:

'ABTSOFDNSOHASAPMAPDSNFAKSGMOMAPEPTNSNTROMAPKSDFANSDHASOMAPDODDFG'

I am treating this as cipher text, so I want to find the positions of repeated substrings in order to work out the length of the encryption key. (The example above is random, so no direct answers will come from it.)

For now, I want to write code that finds repetitions of length 3; for example, 'MAP' and 'HAS' are repeated. The code should find these repetitions itself, rather than me having to specify the substring to look for.

Previously I have used:

text.find("MAP")

Using the answer below I have written:

substring = []
for i in range(len(Phrase)-4):
    substring.append(Phrase[i:i+4])

for index, value in freq.iteritems():
    if value > 1:
        for i in range(len(Phrase)-4):
            if index == Phrase[i:i+4]:
                print(index)

This prints each repeated substring once for every time it appears. Ideally, I want just a list of the repeated substrings, each with the positions at which it appears.

CodePudding user response:

Here is what I did:

import pandas as pd

# find the frequency of each length-3 substring
Phrase    = "Maryhadalittlarymbada"
substring = []
for i in range(len(Phrase) - 2):  # - 2 so the final 3-letter window is included
    substring.append(Phrase[i:i + 3])
Frequency = pd.Series(substring).value_counts()

# find the positions of each repeated substring
for index, value in Frequency.items():  # items() replaces iteritems(), removed in pandas 2.0
    if value > 1:
        positions = []
        for i in range(len(Phrase) - 2):
            if index == Phrase[i:i + 3]:
                positions.append(i)
        print(index, ": ", positions)
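
For comparison, the same idea can be sketched with only the standard library, collecting positions in a single pass so the string is not rescanned for every repeated substring. This is a minimal sketch, not part of the answer above; the function name `find_repeats` and its parameter `n` are hypothetical names chosen for illustration.

```python
from collections import defaultdict


def find_repeats(text, n=3):
    """Map each substring of length n that occurs more than once
    to the list of start positions where it occurs."""
    positions = defaultdict(list)
    # Slide a window of width n over the text; the last valid start is len(text) - n.
    for i in range(len(text) - n + 1):
        positions[text[i:i + n]].append(i)
    # Keep only substrings that appear at more than one position.
    return {sub: pos for sub, pos in positions.items() if len(pos) > 1}


if __name__ == "__main__":
    for sub, pos in find_repeats("Maryhadalittlarymbada").items():
        print(sub, ":", pos)
```

For the phrase used above this prints `ary : [1, 13]` and `ada : [5, 18]`. Passing the cipher text from the question as `text` should list 'MAP' and 'HAS' with their positions, and changing `n` looks for repetitions of a different length.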