I wrote this program to find certain consecutive recurring digits in a string and group them. The string only contains 0 and 1, and I want to shorten the recurring zeros by converting them to a number. Also, to avoid confusion I converted all the 1s to a letter. For example:
item = list("00011101110100010111010001110000")
for i in item:
    if i == "1":
        item[item.index(i)] = "n"
    if i == "0":
        index = item.index(i)
        zeros = 0
        for shft, _ in enumerate(item):
            try:
                if item[index + shft] == "1":
                    break
                if item[index + shft] == "0":
                    item.pop(index + shft)
                    zeros += 1
            except IndexError:
                pass
        item.insert(index, zeros)
print(item)
The expected output for this program I wrote is
[3, 'n', 'n', 'n', 1, 'n', 'n', 'n', 1, 'n', 3, 'n', 1, 'n', 'n', 'n', 1, 'n', 3, 'n', 'n', 'n', 4]
But the output I get is:
[2, 1, 'n', 'n', 'n', 1, 'n', 'n', 'n', 4, 'n', 1, 'n', 'n', 'n', 'n', 3, 'n', 1, 'n', 'n', 'n', 2, 1, 1]
I looked around for something that groups consecutive characters like this, and the closest thing I found was this Java example, but I had trouble implementing it in Python.
I then tried this approach:
item = img[2]
zeros = 0
for idx, i in enumerate(item):
    if i == "0":
        zeros += 1
        item.pop(idx)
    elif i == "1":
        item[idx] = "n"
        if zeros != 0:
            item.insert(idx-1, zeros)
            zeros = 0
    elif i == "x":
        if zeros != 0:
            item.insert(idx-1, zeros)
            zeros = 0
print(item)
But the output was:
['0', 2, '1', 'n', 'n', 1, '1', 'n', 'n', '1', '0', '1', 4, '1', 'n', 'n', '1', '0', 3, '1', 'n', 'n', '0', 2, '0', 'x']
Could anyone please show me a better and faster approach than this and show me where I'm going wrong?
CodePudding user response:
You can use itertools.groupby to group consecutive items that share the same key. Since you only want to group the 0s while leaving the 1s as separate items, a trick I'd use is a key function that returns False for 0s and an incrementing number for 1s, so that the 1s are never grouped together because their keys are always unique. You can use itertools.count to generate such incrementing numbers:
from itertools import groupby, count
item = '00011101110100010111010001110000'
c = count(1)
print([
    'n' if k else sum(1 for _ in g)
    for k, g in groupby(item, lambda i: i == '1' and next(c))
])
This outputs:
[3, 'n', 'n', 'n', 1, 'n', 'n', 'n', 1, 'n', 3, 'n', 1, 'n', 'n', 'n', 1, 'n', 3, 'n', 'n', 'n', 4]
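To see why the 1s never merge, it can help to print the keys that such a key function produces for a short input. This is only an illustrative sketch: `make_key` is a hypothetical helper name, not something from the answer above, and it simply wraps the same lambda with its own counter.

```python
from itertools import count

def make_key():
    # Hypothetical helper: builds a key function with a private counter.
    counter = count(1)
    # '0' -> False; each '1' -> the next integer (1, 2, 3, ...), always truthy.
    return lambda ch: ch == '1' and next(counter)

key = make_key()
keys = [key(ch) for ch in '00110']
print(keys)  # [False, False, 1, 2, False]
```

Adjacent zeros share the key False and so fall into one group, while every 1 gets a different key and ends up in a group of its own.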
CodePudding user response:
You can use itertools.groupby with a nested for-loop that decides whether to emit the number of zeros or to repeat the ones:
>>> import itertools as it
>>> item = '00011101110100010111010001110000'
>>> [x for k, g in it.groupby(item) for x in (('n' for _ in g) if k == '1' else [sum(1 for _ in g)])]
[3, 'n', 'n', 'n', 1, 'n', 'n', 'n', 1, 'n', 3, 'n', 1, 'n', 'n', 'n', 1, 'n', 3, 'n', 'n', 'n', 4]
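If the one-liner is hard to read, the same idea can be unrolled into an ordinary loop. The sketch below assumes the input contains only '0' and '1'; the function name `encode` is made up for illustration.

```python
from itertools import groupby

def encode(bits):
    """Collapse each run of '0' into its length and turn every '1' into 'n'."""
    result = []
    for digit, run in groupby(bits):
        length = sum(1 for _ in run)  # number of items in this run
        if digit == '1':
            result.extend(['n'] * length)  # one 'n' per original '1'
        else:
            result.append(length)  # a single count for the whole run of zeros
    return result

encoded = encode('00011101110100010111010001110000')
print(encoded)
# [3, 'n', 'n', 'n', 1, 'n', 'n', 'n', 1, 'n', 3, 'n', 1, 'n', 'n', 'n', 1, 'n', 3, 'n', 'n', 'n', 4]
```

Wrapping the logic in a function also makes it easy to reuse on several strings.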
CodePudding user response:
Why do you want to write the output into the same item object/variable?
A simple way is to build a separate output list:
item = list("00011101110100010111010001110000")
output = []
count_zeros = 0
for i in item:
    if i == "1":
        if count_zeros != 0:
            output.append(count_zeros)
        output.append("n")
        # Reset the zero counter
        count_zeros = 0
    elif i == "0":
        count_zeros = count_zeros + 1
    else:
        print("This character is not handled {}".format(i))
# Flush a trailing run of zeros, if any
if count_zeros != 0:
    output.append(count_zeros)
print(output)
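As for where the attempts in the question go wrong, two things appear to be at play: list.index always returns the position of the first match rather than the position the loop is currently at, and popping or inserting while iterating over the same list shifts the remaining elements under the loop. A small sketch of both effects, using made-up inputs:

```python
# 1) list.index finds the FIRST match, regardless of where a loop currently is.
letters = list("0101")
first_one = letters.index("1")
print(first_one)  # 1

# 2) Removing elements while iterating makes the loop skip items.
item = list("0011")
seen = []
for ch in item:
    seen.append(ch)
    if ch == "0":
        item.pop(0)  # everything after index 0 shifts one place left
print(seen)  # ['0', '1', '1'] -- the second '0' was never visited
```

Building a new list, as in the answers above, avoids both problems.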