I have a two-dimensional list of words and their respective definitions. As you can see in the example below, some words appear more than once but with different definitions. I would like to combine the definitions of duplicate words so that each word appears only once.
list_of_lists = [
['absorption', 'a process in which one substance permeates another'],
['absorption', 'when radiated energy is retained on passing through a medium'],
['aerobic', 'depending on free oxygen or air'],
['aerobic', 'enhancing respiratory and circulatory efficiency'],
['chain reaction', 'a self-sustaining nuclear reaction'],
['chain reaction', 'a series of chemical reactions in which the product of one is a reactant in the next']
]
Expected output after some programming magic:
['absorption', 'a process in which one substance permeates another', 'when radiated energy is retained on passing through a medium']
['aerobic', 'depending on free oxygen or air', 'enhancing respiratory and circulatory efficiency']
['chain reaction', 'a self-sustaining nuclear reaction', 'a series of chemical reactions in which the product of one is a reactant in the next']
CodePudding user response:
Assuming this is your data:
list_of_lists = [
['absorption', 'a process in which one substance permeates another'],
['absorption', 'when radiated energy is retained on passing through a medium'],
['aerobic', 'depending on free oxygen or air'],
['aerobic', 'enhancing respiratory and circulatory efficiency'],
['chain reaction', 'a self-sustaining nuclear reaction'],
['chain reaction', 'a series of chemical reactions in which the product of one is a reactant in the next'],
]
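You could use itertools.groupby like so. Note that groupby only merges consecutive items that share a key, so this relies on the list already being ordered by word, as it is here: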
from itertools import groupby
from operator import itemgetter
for key, group in groupby(list_of_lists, itemgetter(0)):
    # key is the word; group yields every [word, definition] pair for it
    print([key] + list(map(itemgetter(1), group)))
Output:
['absorption', 'a process in which one substance permeates another', 'when radiated energy is retained on passing through a medium']
['aerobic', 'depending on free oxygen or air', 'enhancing respiratory and circulatory efficiency']
['chain reaction', 'a self-sustaining nuclear reaction', 'a series of chemical reactions in which the product of one is a reactant in the next']
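If the input is not already ordered by word, groupby would emit the same word more than once. A minimal sketch of one way around that, assuming it is acceptable to reorder the words alphabetically, is to sort by the word first. The names unsorted_pairs and merged are made up for this illustration:

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical input where the two 'aerobic' entries are not adjacent.
unsorted_pairs = [
    ['aerobic', 'depending on free oxygen or air'],
    ['absorption', 'a process in which one substance permeates another'],
    ['aerobic', 'enhancing respiratory and circulatory efficiency'],
]

# Sort by the word only. sorted() is stable, so definitions of the same
# word keep their original relative order.
by_word = sorted(unsorted_pairs, key=itemgetter(0))

# Build one [word, definition, definition, ...] list per distinct word.
merged = [
    [word] + [definition for _, definition in group]
    for word, group in groupby(by_word, key=itemgetter(0))
]

for entry in merged:
    print(entry)
```

The trade-off is that sorting costs O(n log n) and changes the order in which words appear; if the original order of first appearance matters, a dictionary-based approach may be a better fit.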
CodePudding user response:
You can use defaultdict from the standard collections library to make a dictionary of keys -> list of definitions. This by itself might be useful, but it's also easy to transform to a list of tuples:
from collections import defaultdict
l = [
('absorption','a process in which one substance permeates another'),
('absorption', 'when radiated energy is retained on passing through a medium'),
('aerobic', 'depending on free oxygen or air'),
('aerobic', 'enhancing respiratory and circulatory efficiency'),
('chain reaction', 'a self-sustaining nuclear reaction'),
('chain reaction', 'a series of chemical reactions in which the product of one is a reactant in the next')
]
res = defaultdict(list)
for k, v in l:
    res[k].append(v)
# res is a dict so you can look up words:
print(res['aerobic'])
# ['depending on free oxygen or air', 'enhancing respiratory and circulatory efficiency']
# to get back a list of tuples, just pass the dict items to list()
collected_list = list(res.items())
# [('absorption',
# ['a process in which one substance permeates another',
# 'when radiated energy is retained on passing through a medium']),
# ('aerobic',
# ['depending on free oxygen or air',
# 'enhancing respiratory and circulatory efficiency']),
# ('chain reaction',
# ['a self-sustaining nuclear reaction',
# 'a series of chemical reactions in which the product of one is a reactant in the next'])]
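The list above nests the definitions inside each tuple, whereas the question asks for flat lists of the form [word, definition, definition, ...]. A minimal sketch of that last step, assuming Python 3.7+ (where dictionaries keep insertion order) and using the illustrative names pairs, grouped and flat:

```python
from collections import defaultdict

# A shortened copy of the question's data, used only for illustration.
pairs = [
    ('absorption', 'a process in which one substance permeates another'),
    ('absorption', 'when radiated energy is retained on passing through a medium'),
    ('aerobic', 'depending on free oxygen or air'),
    ('aerobic', 'enhancing respiratory and circulatory efficiency'),
]

# Collect every definition under its word.
grouped = defaultdict(list)
for word, definition in pairs:
    grouped[word].append(definition)

# Unpack each list of definitions after its word to get flat lists.
flat = [[word, *definitions] for word, definitions in grouped.items()]

for entry in flat:
    print(entry)
```

Unlike groupby, this does not need the input to be ordered by word, and the words come out in order of first appearance.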