I'm trying to extract sublists that follow one of these rules:
| Events |
| {on, e_1 ... e_n, on} |
| {off, e_1 ... e_n, on} |
| {on, e_1 ... e_n, off} |
| {off, e_1 ... e_n, off} |
Here is what I have so far :
import pandas as pd
import numpy as np

def get_subs(df):
    updated = []
    for index, row in df.iterrows():
        values = row['t']
        new_lst = []
        sub_list = []
        is_on_or_off_set = False
        for value in values:
            if (value == 'ON' or value == 'OFF') and is_on_or_off_set == False:
                # first ON/OFF opens a new sublist
                sub_list.append(value)
                is_on_or_off_set = True
            elif value.startswith('sc') and is_on_or_off_set == True:
                # sc events are collected while a sublist is open
                sub_list.append(value)
            else:
                # anything else closes the current sublist
                is_on_or_off_set = False
                sub_list.append(value)
                new_lst.append(sub_list)
                sub_list = []
        updated.append(new_lst)
    return updated
# dtype=object keeps each event list as a single cell; recent NumPy versions
# raise a ValueError for ragged nested lists without it
array = np.array([
    ['name1', ['ON', 'sc', 'sc', 'ON', 'ON', 'sc', 'sc', 'ON']],
    ['name2', ['OFF', 'sc', 'sc', 'ON', 'OFF', 'sc', 'sc', 'OFF']],
    ['name3', ['ON', 'sc', 'sc', 'OFF', 'ON', 'sc', 'sc', 'OFF']],
    ['name4', ['ON', 'sc1', 'sc2', 'OFF', 'ON']],
    ['name5', ['OFF', 'ON', 'sc', 'OFF', 'ON']],
    ['name6', ['OFF', 'OFF', 'sc1', 'OFF', 'ON']],
    ['name7', ['ON', 'OFF', 'OFF', 'sc1', 'sc2', 'OFF', 'ON', 'ON']],
], dtype=object)

index_values = ['1', '2', '3', '4', '5', '6', '7']
column_values = ['name', 't']
df = pd.DataFrame(data=array,
                  index=index_values,
                  columns=column_values)

subs = get_subs(df)
for s in subs:
    print(s)
which prints :
[['ON', 'sc', 'sc', 'ON'], ['ON', 'sc', 'sc', 'ON']]
[['OFF', 'sc', 'sc', 'ON'], ['OFF', 'sc', 'sc', 'OFF']]
[['ON', 'sc', 'sc', 'OFF'], ['ON', 'sc', 'sc', 'OFF']]
[['ON', 'sc1', 'sc2', 'OFF']]
[['OFF', 'ON'], ['sc'], ['OFF', 'ON']]
[['OFF', 'OFF'], ['sc1'], ['OFF', 'ON']]
[['ON', 'OFF'], ['OFF', 'sc1', 'sc2', 'OFF'], ['ON', 'ON']]
There is an issue when the algorithm encounters ['name6', ['OFF', 'OFF', 'sc1', 'OFF', 'ON']]:
it is transformed to [['OFF', 'OFF'], ['sc1'], ['OFF', 'ON']],
when I expect the result to be [['OFF', 'sc1', 'OFF']].
How can I modify the code so that ['name6', ['OFF', 'OFF', 'sc1', 'OFF', 'ON']]
is transformed to [['OFF', 'sc1', 'OFF']]
while not breaking any of the existing rules:
| Events |
| {on, e_1 ... e_n, on} |
| {off, e_1 ... e_n, on} |
| {on, e_1 ... e_n, off} |
| {off, e_1 ... e_n, off} |
Each group should have at least 1 sc event.
CodePudding user response:
I looked into your approach for a bit, but I couldn't come up with a way to fix its bugs.
I don't know how much you simplified your problem here, but how about joining each list into a string and finding the matching patterns with a regex?
I picked the different possible combinations in your example, joined each list into a string, and put them in a dict for demonstration:
import re

dic = {
    'name1': 'ON sc sc ON ON sc sc ON',
    'name2': 'ON sc1 sc2 OFF ON',
    'name3': 'OFF ON sc OFF ON',
    'name4': 'ON OFF OFF sc1 sc2 OFF ON ON',
    'name5': 'ON OFF OFF sc1 sc2 OFF ON ON OFF sc1 sc2 ON'
}

# ON/OFF, then one or more sc events (each optionally followed by a digit), then ON/OFF
pat = r"(ON|OFF)\s?(sc\d?\s?)+\s?(ON|OFF)"

for key, string in dic.items():
    # finditer yields the non-overlapping matches from left to right
    m = re.finditer(pat, string)
    res = [elem.group().split(' ') for elem in m]
    print(f"{key=}:\t{res=}")
Output:
key='name1': res=[['ON', 'sc', 'sc', 'ON'], ['ON', 'sc', 'sc', 'ON']]
key='name2': res=[['ON', 'sc1', 'sc2', 'OFF']]
key='name3': res=[['ON', 'sc', 'OFF']]
key='name4': res=[['OFF', 'sc1', 'sc2', 'OFF']]
key='name5': res=[['OFF', 'sc1', 'sc2', 'OFF'], ['OFF', 'sc1', 'sc2', 'ON']]
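To use this on the event lists from the question rather than on hand-written strings, the join, match, and split steps can be wrapped in a small helper. This is only a sketch: `PAT` and `extract_groups` are names made up for illustration, and the pattern is assumed to require at least one sc event between two ON/OFF events.

```python
import re

# Assumed pattern: ON/OFF, one or more sc events, ON/OFF.
PAT = re.compile(r"(ON|OFF)\s?(sc\d?\s?)+\s?(ON|OFF)")

def extract_groups(events):
    """Return every ON/OFF, sc..., ON/OFF group found in a list of events."""
    joined = " ".join(events)  # e.g. 'OFF OFF sc1 OFF ON'
    return [m.group().split(" ") for m in PAT.finditer(joined)]

# The row that caused trouble in the question:
print(extract_groups(["OFF", "OFF", "sc1", "OFF", "ON"]))
# [['OFF', 'sc1', 'OFF']]
```

On the DataFrame from the question this could be applied per row with something like df['t'].apply(extract_groups). Note that regex matches do not overlap, so an ON/OFF that closes one group is not reused as the start of the next one.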
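If you would rather avoid building strings, a single pass over each list can apply the same rule. The sketch below is an alternative, not a fix of the original get_subs: the name `scan_groups` is hypothetical, it assumes every event is 'ON', 'OFF', or starts with 'sc', and it treats an ON/OFF that directly follows another ON/OFF as the new start of a group.

```python
def scan_groups(events):
    """Collect [ON/OFF, sc..., ON/OFF] groups that contain at least one sc event."""
    groups = []
    current = []  # group being built; empty means no opening ON/OFF seen yet
    for ev in events:
        if ev in ("ON", "OFF"):
            if len(current) > 1:
                # opener plus at least one sc event: this ON/OFF closes the group
                groups.append(current + [ev])
                current = []
            else:
                # no sc seen since the last opener: restart from this ON/OFF
                current = [ev]
        elif current and ev.startswith("sc"):
            current.append(ev)
        else:
            # sc event without an opener, or an unexpected token: drop the partial group
            current = []
    return groups

print(scan_groups(["OFF", "OFF", "sc1", "OFF", "ON"]))
# [['OFF', 'sc1', 'OFF']]
```

Unlike the regex, this version accepts any event name that starts with 'sc' (for example 'sc10'), which may or may not be what you want.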