I am trying to scan a text file and find the index of the first character whose four preceding characters are all different. For example:
wxrgh
Here the h would be the marker, since it comes after four different characters, and its index would be 4. I look for the index by converting the text into a list. This works for my test input but not for the actual input. Does anyone know what is wrong?
def Repeat(x):
    _size = len(x)
    repeated = []
    for i in range(_size):
        k = i + 1
        for j in range(k, _size):
            if x[i] == x[j] and x[i] not in repeated:
                repeated.append(x[i])
    return repeated
with open("input4.txt") as f:
    text = f.read()
    test_array = []
    split_array = list(text)
    woah = ""
    for i in split_array:
        first = split_array[split_array.index(i)]
        second = split_array[split_array.index(i) + 1]
        third = split_array[split_array.index(i) + 2]
        fourth = split_array[split_array.index(i) + 3]
        test_array.append(first)
        test_array.append(second)
        test_array.append(third)
        test_array.append(fourth)
        print(test_array)
        if Repeat(test_array) != []:
            test_array = []
        else:
            woah = split_array.index(i)
            print(woah)
    print(woah)
I tried a test document and unit tests, but it still does not work.
Answer:
You can utilise a set to help you with this.
Read the entire file into a string (buffer). Iterate over the buffer starting at offset 4. At each position, build a set from the 4 characters that precede it. If the length of the set is 4 (i.e., they are all different) and the character at the current position is not in the set, then you have found the index you are interested in.
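W = 4  # window size: number of preceding characters that must all differ
with open('input4.txt') as data:
    buffer = data.read()
    for i in range(W, len(buffer)):
        # s holds the W characters just before position i
        if len(s := set(buffer[i-W:i])) == W and buffer[i] not in s:
            print(i)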
Note:
If the input data are split over multiple lines, you may want to remove the newline characters first, since they would otherwise be counted as characters.
You will need Python 3.8 or later to use the assignment expression (walrus operator).
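For reference, here is a minimal sketch of the same idea wrapped in a function. It strips line breaks as suggested in the note and does not use the walrus operator, so it should also run on versions before 3.8. The name `find_marker` and the `width` and `strict` parameters are my own and purely illustrative: with `strict=True` it applies the same check as the loop above, while `strict=False` only requires the preceding characters to differ, which is how the question words it.

```python
def find_marker(text, width=4, strict=True):
    """Return the index of the first character preceded by `width`
    distinct characters, or -1 if there is none (hypothetical helper)."""
    # Drop line breaks so they are not counted as characters.
    data = text.replace("\r", "").replace("\n", "")
    for i in range(width, len(data)):
        window = set(data[i - width:i])  # the `width` characters before position i
        if len(window) != width:
            continue  # at least one character repeats inside the window
        if strict and data[i] in window:
            continue  # current character also appears in the window
        return i
    return -1


print(find_marker("wxrgh"))  # 4
```

To run it on the file, pass the result of `f.read()` to `find_marker`. As for the original code, one likely culprit is `split_array.index(i)`: `list.index` returns the position of the first occurrence of a value, so as soon as a character repeats in the real input the lookup jumps back to an earlier position. Looping over positions with `range`, as above, avoids that.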