Using itertools groupby, create groups of elements, if ANY key is same in each element

Time:11-22

Given a list of strings, how can I group them so that strings sharing any value (any whitespace-separated token) end up in the same group?

inputList = ['w', 'd', 'c', 'm', 'w d', 'm c', 'd w', 'c m', 'o', 'p']

desiredOutput = [['d w', 'd', 'w', 'w d',], ['c', 'c m', 'm', 'm c'], ['o'], ['p']]

How can I sort the list properly by the first, next, and last items of each string?

My sorting attempt:

groupedList = sorted(inputList, key=lambda ch: [c for c in ch.split()])

Output:

['c', 'c m', 'd', 'd w', 'm', 'm c', 'o', 'p', 'w', 'w d']

Desired output:

['c', 'c m', 'm c', 'm', 'd', 'd w', 'w', 'w d', 'o', 'p']

My grouping attempt:

b = sorted(g, key=lambda elem: [i1[0] for i1 in elem[0].split()]) # sort by all first characters
b = groupby(b, key=lambda elem: [i1[0] in elem[0].split()[:-1] for i1 in elem[0].split()[:-1]])
b = [[item for item in data] for (key, data) in b]

Output:

[[('c winnicott', 3), ('d winnicott', 2)], [('d w winnicott', 2), ('w d winnicott', 1)], [('w winnicott', 1)]]

Desired output:

[[('c winnicott', 3)], [('d winnicott', 2), ('d w winnicott', 2), ('w d winnicott', 1), ('w winnicott', 1)]]
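
For context, itertools.groupby only merges consecutive items whose key values compare equal, so a single key per item cannot express "shares any token". A minimal sketch of that limitation, keying on the first token only (the name by_first is illustrative and not from the post):

```python
from itertools import groupby

input_list = ['w', 'd', 'c', 'm', 'w d', 'm c', 'd w', 'c m', 'o', 'p']

# Sort first so that equal keys become adjacent, then group by the first token.
by_first = [
    list(group)
    for _, group in groupby(sorted(input_list), key=lambda s: s.split()[0])
]

# 'c m' lands in the 'c' group and 'm c' in the 'm' group, even though
# they share both tokens: groupby compares keys, not membership.
print(by_first)
```

Any key that maps each string to one value has this problem, which is why the grouping needs a merge step across keys rather than a different key function.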

CodePudding user response:

I did it with the bubble sort algorithm.

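def bubbleSort(arr):
    n = len(arr)
    swapped = False

    for i in range(n - 1):
        for j in range(0, n - i - 1):
            # Tokens of the first string in each neighbouring entry
            g1 = arr[j][0].split()
            g2 = arr[j + 1][0].split()

            # Note: any() is given lists here; for non-empty g1 and g2 each
            # list is non-empty and therefore truthy, so this test always passes.
            if any([k > l for k in g1] for l in g2):
                swapped = True
                arr[j], arr[j + 1] = arr[j + 1], arr[j]

                # Merge entries that share a token; mark the emptied slot with '-'
                if any(s in g2 for s in g1):
                    arr[j].extend(arr[j + 1])
                    arr[j + 1] = ['-']

        if not swapped:
            return arr

    # Drop the '-' placeholders left behind by merges
    arr = [a for a in arr if a[0] != '-']
    return arr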

inputList = ['w', 'd', 'c', 'm', 'w d', 'm c', 'd w', 'c m', 'o', 'p']
#inputList = ["m", "d", "w d", "m c", "c d"]

inputList = [[n] for n in inputList]

print(bubbleSort(inputList))

Output:

[['p'], ['o'], ['c m', 'm c', 'c', 'm'], ['d w', 'w d', 'w', 'd']]
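
The same grouping can also be framed as finding connected components: two strings belong together if they share a token, directly or through a chain of other strings. Below is a minimal sketch of that idea using a small union-find over tokens. It is an alternative to the bubble sort above, not the answer's method; the names group_by_shared_token, find and union are hypothetical, and it assumes every string contains at least one token.

```python
from collections import defaultdict


def group_by_shared_token(items):
    """Group strings that share any whitespace-separated token."""
    parent = {}

    def find(x):
        # Walk up to the representative token, shortening the path on the way.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Tokens that appear in the same string belong to the same component.
    for item in items:
        tokens = item.split()  # assumption: at least one token per string
        find(tokens[0])
        for token in tokens[1:]:
            union(tokens[0], token)

    # Collect strings by the representative of their first token.
    groups = defaultdict(list)
    for item in items:
        groups[find(item.split()[0])].append(item)
    return [sorted(group) for group in groups.values()]


input_list = ['w', 'd', 'c', 'm', 'w d', 'm c', 'd w', 'c m', 'o', 'p']
result = group_by_shared_token(input_list)
print(result)
```

For the sample list this yields the same four groups as desiredOutput in the question, with members sorted alphabetically inside each group. Unlike the pairwise pass above, it does not depend on the order of the input.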