Change the first element of each group Python Lists

Time:09-16

Suppose we have the following list:

['O', 'O', 'O', 'O', 'I-INS', 'I-INS', 'I-INS', 'B-PER', 'I-PER']

I want to change this list so that whenever the members of a subgroup (like INS) all start with I-, with no B- member in front of them, the first element of that subgroup changes to B-. For example:

O,I-INS,I-INS,I-INS,B-PER, I-PER => O,B-INS,I-INS,I-INS,B-PER, I-PER

If a subgroup already starts with B-, or with anything other than I-, it should remain unchanged. So far, I have written this code:

temp = []
for i in range(len(iobTags)):
  if iobTags[i].startswith('I'):
    if iobTags[i-1].startswith('I'):
      temp = iobTags[i-1].split('-')
      temp[0] = 'B'
      mem = temp[0] + '-' + temp[1]
      iobTags[i-1] = mem
    else:
      continue

The problem is that this code does not stop after the first element; it keeps changing each I- member it sees to B-, like:

I-INS,I-INS,I-INS => B-INS,B-INS,I-INS

I just want the first element to change, and then the loop should move on to checking the first element of the other subgroups. How can I change this code?

CodePudding user response:

You can use itertools.groupby for the task:

from itertools import groupby

l = ["O", "I-INS", "I-INS", "I-INS", "B-PER", "I-PER"]

out = []
# group consecutive tags by entity type (the part after the last "-")
for _, g in groupby(l, lambda k: k.split("-")[-1]):
    g = list(g)
    # relabel only groups that open with "I-" and contain no "B-" tag
    if g[0].startswith("I-") and not any(t.startswith("B-") for t in g):
        g[0] = g[0].replace("I-", "B-", 1)
    out.extend(g)

print(out)

Prints:

['O', 'B-INS', 'I-INS', 'I-INS', 'B-PER', 'I-PER']
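
To see why this works, it may help to look at what groupby actually yields for the sample list. The sketch below only prints the groups; the helper name entity_type is made up for illustration and is not part of the answer above. Note that groupby only merges consecutive items with the same key, so two separate INS runs would come out as two groups.

```python
from itertools import groupby

tags = ["O", "I-INS", "I-INS", "I-INS", "B-PER", "I-PER"]


def entity_type(tag):
    # hypothetical helper: "I-INS" -> "INS", "B-PER" -> "PER", "O" -> "O"
    return tag.split("-")[-1]


# materialize each group so it can be inspected
groups = [(key, list(members)) for key, members in groupby(tags, key=entity_type)]
print(groups)
# [('O', ['O']), ('INS', ['I-INS', 'I-INS', 'I-INS']), ('PER', ['B-PER', 'I-PER'])]
```

Each group then only needs a check on its first member, which is what the loop in the answer does.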

CodePudding user response:

tags = ['O', 'O', 'O', 'O', 'I-INS', 'I-INS', 'I-INS', 'B-PER', 'I-PER']
output_list = []
for index in range(len(tags)):
    # First case: there is no previous element to compare with
    if index == 0:
        if tags[index][0] == "I":
            output_list.append("B" + tags[index][1:])
        else:
            output_list.append(tags[index])
    else:
        # Relabel an I- tag only when the previous tag is neither a B- tag
        # nor the same tag, and the last output tag does not start with B
        if (tags[index][0] == "I" and tags[index-1][0] != "B"
                and tags[index-1] != tags[index] and output_list[-1][0] != "B"):
            output_list.append("B" + tags[index][1:])
        else:
            output_list.append(tags[index])

print(output_list)

Check this one. It works for the example you provided, and it also works for another random list I created.
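
A possible single-pass variant, in case two different subgroups can sit right next to each other (for example I-INS directly followed by I-PER), is to remember the entity type of the previous tag and compare against that. This is only a sketch: the function name fix_iob_starts is made up here, and it assumes every tag is either "O" or has the form "B-TYPE" / "I-TYPE".

```python
def fix_iob_starts(tags):
    """Return a copy of tags where every group opening with I- opens with B-.

    Assumes tags look like "O", "B-TYPE" or "I-TYPE" (an assumption, not
    something the question guarantees).
    """
    fixed = []
    prev_type = None  # entity type of the previous tag, None after "O" or at the start
    for tag in tags:
        prefix, _, entity = tag.partition("-")
        # an I- tag opens a new group if the previous tag had a different type
        if prefix == "I" and entity and entity != prev_type:
            fixed.append("B-" + entity)
        else:
            fixed.append(tag)
        prev_type = entity if prefix in ("B", "I") and entity else None
    return fixed


print(fix_iob_starts(['O', 'O', 'O', 'O', 'I-INS', 'I-INS', 'I-INS', 'B-PER', 'I-PER']))
# ['O', 'O', 'O', 'O', 'B-INS', 'I-INS', 'I-INS', 'B-PER', 'I-PER']
```

Unlike the original loop in the question, this never rewrites an earlier element; it decides each tag once, based only on what came directly before it.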
