how to crack a random substitution text cypher?

Time:10-20

I was reading about the Caesar cipher, where the characters are simply shifted by a number, like this:

l=['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']

def shift(l, n):
    # rotate the list: the first n elements move to the end
    res = l[n:] + l[:n]
    return res

We can then call the function with a shift of 2 steps, for example, to get:

l_c2= ['c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', 'a', 'b']

To encrypt the message, one just has to substitute each character in the original text with the shifted one. This method is very easy to break: once you know the mapping of one character, you know all the others. Even if you do not, you can simply try all 26 shifts to find the correct one, which is a small number of tests!
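To make the "try all 26 shifts" point concrete, here is a minimal sketch. The names `caesar` and `all_shifts` are my own for illustration, and it assumes lowercase ASCII text where anything that is not a letter is left unchanged.

```python
import string

ALPHABET = string.ascii_lowercase

def caesar(text, n):
    """Shift each lowercase letter n places forward; leave other characters alone."""
    n %= 26  # also makes negative shifts work, so -n undoes n
    shifted = ALPHABET[n:] + ALPHABET[:n]
    return text.translate(str.maketrans(ALPHABET, shifted))

def all_shifts(ciphertext):
    """Return the 26 candidate plaintexts; index k undoes a shift of k."""
    return [caesar(ciphertext, -k) for k in range(26)]

secret = caesar("hello world", 2)
print(secret)  # jgnnq yqtnf

# An attacker just reads through the 26 candidates and picks the readable one.
for k, candidate in enumerate(all_shifts(secret)):
    print(k, candidate)
```

Only one of the 26 printed lines (here, the one for shift 2) reads as English, so the search can be done by eye.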

So I was thinking: what if I randomly reorder the elements of the list with:

import random

def randomReorder(l):
    return random.sample(l,len(l))

Then I will get a list that looks like this:

l_r = ['f', 'e', 'l', 'r', 'p', 't', 'k', 'v', 'u', 'c', 'd', 'o', 'a', 'x', 'm', 'g', 'b', 'z', 'q', 's', 'h', 'j', 'i', 'n', 'w', 'y']

So if I substitute the letters in the original text with these, then even if someone knows the key for one character, it is hard to predict the others, because they are simply randomly reordered; "hello", for example, becomes "vpoom". Because the cipher list is random, a cracker would have to test many reordered lists to find the one that gives a "more English" result, and there are 26! ≈ 4 × 10^26 possible arrangements. So can this method of encrypting data be powerful, or is there something I'm missing that crackers can use to break the cipher?
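For reference, a small sketch of the substitution using the shuffled list `l_r` shown above as the key. The helper name `substitute` is just illustrative, and characters outside a-z are assumed to pass through unchanged.

```python
import math
import string

# The shuffled alphabet from the example above, used as the key.
l_r = ['f', 'e', 'l', 'r', 'p', 't', 'k', 'v', 'u', 'c', 'd', 'o', 'a',
       'x', 'm', 'g', 'b', 'z', 'q', 's', 'h', 'j', 'i', 'n', 'w', 'y']

# Map each plain letter to the letter at the same position in the key.
key = dict(zip(string.ascii_lowercase, l_r))

def substitute(text, key):
    """Replace every letter found in the key; keep everything else as-is."""
    return "".join(key.get(ch, ch) for ch in text)

print(substitute("hello", key))  # vpoom

# Number of possible keys: every ordering of the 26 letters.
print(math.factorial(26))  # 403291461126605635584000000
```

Note that both "l"s in "hello" turn into the same letter "o", so patterns in the plaintext are still visible in the output.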

CodePudding user response:

Yes, that would be true if your text could be anything, but natural languages have a lot of redundancy and aren't random.

A random string encrypted with a random substitution might indeed be hard to decrypt.

But since the text is English, you can do frequency analysis: for example, the most commonly occurring letter is "e", while letters such as "z", "q", "x" and "j" are among the rarest, and so on. You can also look at character pairs (e.g. "th" occurs a lot), character triplets, and so on.
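A rough sketch of how that counting could start is below. The function names are made up for illustration, and `ENGLISH_ORDER` is only an approximate frequency ranking (an assumption; exact orders differ between corpora). The guessed mapping is a starting point to refine by hand, not a finished decryption.

```python
from collections import Counter
import string

# Approximate English letter order, most to least frequent.
# Assumption: rankings vary by corpus, so treat this as a rough guide.
ENGLISH_ORDER = "etaoinshrdlcumwfgypbvkjxqz"

def letter_counts(ciphertext):
    """Count how often each letter a-z appears in the ciphertext."""
    return Counter(ch for ch in ciphertext.lower()
                   if ch in string.ascii_lowercase)

def bigram_counts(ciphertext):
    """Count adjacent letter pairs, skipping pairs that span a non-letter."""
    text = ciphertext.lower()
    pairs = (text[i:i + 2] for i in range(len(text) - 1))
    return Counter(p for p in pairs
                   if all(c in string.ascii_lowercase for c in p))

def first_guess(ciphertext):
    """Pair cipher letters, ranked by frequency, with ENGLISH_ORDER."""
    ranked = [ch for ch, _ in letter_counts(ciphertext).most_common()]
    return dict(zip(ranked, ENGLISH_ORDER))
```

With a long enough ciphertext, the most frequent cipher letter is a good candidate for "e" and the most frequent pair for "th"; from there you swap guesses until words start to appear. Short messages give unreliable counts, so the first guess will usually be partly wrong.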
