I have a binary list containing only two distinct elements (such as 0 and 1): 1010100010100000000000100000100000101000000000000000100000001000010
How can I do paired transcoding with a custom setting for the number of consecutive occurrences?
This is the encoding rule:
if an element occurs consecutively 3 times or fewer, the encoding is 0,
if an element occurs consecutively 4-7 times, the encoding is 1,
if an element occurs consecutively more than 7 times, the encoding is 2.
Custom occurrence setting:
0-3: 0 (short)
4-7: 1 (medium)
more than 7: 2 (long)
For example:
how can 0100111100011100000000 be transformed into [[0,0],[1,0],[0,0],[1,1],[0,0],[1,0],[0,2]] following the above rule?
Each pair is [a, b]:
a: 0 or 1 (there are only binary values in my list)
b: 0, 1, or 2 (my custom frequency setting)
CodePudding user response:
The solution with only basic statements is:
word = '0100111100011100000000'
# Consecutive counts: collect [character, run length] pairs
count = 1
counts = []
if len(word) > 1:
    for i in range(1, len(word)):
        if word[i-1] == word[i]:
            count += 1  # same character: extend the current run
        else:
            counts.append([word[i-1], count])
            count = 1
    counts.append([word[-1], count])  # close the final run
elif len(word) == 1:
    counts.append([word[0], count])
# Check your conditions
output = []
for l in counts:
    if l[1] <= 3:
        output.append([int(l[0]), 0])
    elif l[1] > 3 and l[1] < 8:
        output.append([int(l[0]), 1])
    else:
        output.append([int(l[0]), 2])
print(output)
Output :
[[0, 0], [1, 0], [0, 0], [1, 1], [0, 0], [1, 0], [0, 2]]
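If you want to reuse this on other strings, the same loop can be wrapped in a function. This is only a sketch: the names classify and encode_runs are made up for illustration, and it assumes the input is a string of '0' and '1' characters. It also returns an empty list for an empty string.

```python
def classify(length):
    """Map a run length to 0 (short, up to 3), 1 (medium, 4-7) or 2 (long, 8+)."""
    if length <= 3:
        return 0
    if length <= 7:
        return 1
    return 2


def encode_runs(word):
    """Return [value, category] pairs, one per run of equal characters."""
    output = []
    if not word:
        return output  # nothing to encode
    current, count = word[0], 1
    for ch in word[1:]:
        if ch == current:
            count += 1  # same character: extend the current run
        else:
            output.append([int(current), classify(count)])
            current, count = ch, 1  # start a new run
    output.append([int(current), classify(count)])  # close the final run
    return output


print(encode_runs('0100111100011100000000'))
# [[0, 0], [1, 0], [0, 0], [1, 1], [0, 0], [1, 0], [0, 2]]
```

Tracking the current character explicitly avoids indexing with word[i-1], so the last run is reported correctly even when the final character differs from the one before it.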
CodePudding user response:
You can define a function to translate the length of each group to a number, then use e.g. itertools.groupby
to separate the different groups of characters and apply that function in a list comprehension.
from itertools import groupby
def f(g):
    n = len(list(g))
    return 0 if n <= 3 else 1 if n <= 7 else 2
s = "0100111100011100000000"
res = [(int(k), f(g)) for k, g in groupby(s)]
# [(0, 0), (1, 0), (0, 0), (1, 1), (0, 0), (1, 0), (0, 2)]
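Since the question asks for a custom setting, one possible variation is to make the thresholds a parameter and look up the category with bisect. This is a sketch, not the only way to do it: the function name encode and the parameter bounds are hypothetical, and bounds is assumed to be a sorted tuple of the largest run length allowed in each category except the last.

```python
from bisect import bisect_left
from itertools import groupby


def encode(seq, bounds=(3, 7)):
    """Pair each run's value with its category index.

    A run of length n falls in category i, where bounds[i] is the first
    bound with n <= bounds[i]; runs longer than every bound fall in the
    last category, len(bounds).
    """
    result = []
    for value, group in groupby(seq):
        length = sum(1 for _ in group)  # run length without building a list
        result.append((int(value), bisect_left(bounds, length)))
    return result


print(encode("0100111100011100000000"))
# [(0, 0), (1, 0), (0, 0), (1, 1), (0, 0), (1, 0), (0, 2)]
```

With the default bounds this gives the same result as f above. Because groupby accepts any iterable, it should also work on a list of integers such as [0, 0, 0, 0, 1], and passing a different bounds tuple changes the short/medium/long cut-offs without touching the function.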