I have a list of lists, where each sublist contains tuples. I want to drop the second value from each tuple while keeping the sublist structure intact. I tried a for loop that pulls out the first value, but it collapses the sublists into a single flat list of every word.
The code I have is:
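from itertools import groupby
from nltk.tokenize import word_tokenize  # assuming NLTK's tokenizer

tokenz = [word_tokenize(i) for i in data_file]
tokenz = [l[:2] for l in tokenz]  # keep only the word and its tag
tuples = [tuple(x) for x in tokenz]
# empty tuples are falsy, so they act as separators between groups
tuples = [list(g) for k, g in groupby(tuples, key=bool) if k]
tuples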
Right now my output is:
[('EU', 'NNP'),
('rejects', 'VBZ'),
('German', 'JJ'),
('call', 'NN'),
('to', 'TO'),
('boycott', 'VB'),
('British', 'JJ'),
('lamb', 'NN'),
('.', '.')],
[('Peter', 'NNP'), ('Blackburn', 'NNP')],
[('BRUSSELS', 'NNP'), ('1996-08-22', 'CD')],
I am trying to get it to look like this:
['EU', 'rejects', 'German', 'call', 'to', 'boycott', 'British', 'lamb', '.'],
['Peter', 'Blackburn'],
['BRUSSELS', '1996-08-22']
CodePudding user response:
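Use a nested list comprehension: the outer comprehension walks over the sublists, and the inner one takes the first element of each tuple.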
tuples = [
    [('EU', 'NNP'),
     ('rejects', 'VBZ'),
     ('German', 'JJ'),
     ('call', 'NN'),
     ('to', 'TO'),
     ('boycott', 'VB'),
     ('British', 'JJ'),
     ('lamb', 'NN'),
     ('.', '.')],
    [('Peter', 'NNP'), ('Blackburn', 'NNP')],
    [('BRUSSELS', 'NNP'), ('1996-08-22', 'CD')]
]
result = [[tup[0] for tup in list_of_tuples] for list_of_tuples in tuples]
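If the one-liner is hard to read, it may help to see it written out as explicit loops. The sketch below is only an illustration: `tagged`, `first_items`, `flat` and `words_only` are hypothetical names, and the sample data is a shortened version of the output in the question. It also shows the flat comprehension that most likely caused the collapse into a single list.

```python
# Hypothetical sample data, shaped like the output shown in the question.
tagged = [
    [("EU", "NNP"), ("rejects", "VBZ"), (".", ".")],
    [("Peter", "NNP"), ("Blackburn", "NNP")],
    [("BRUSSELS", "NNP"), ("1996-08-22", "CD")],
]


def first_items(nested):
    """Return a list of lists holding only the first element of each tuple."""
    result = []
    for sublist in nested:          # one output sublist per input sublist
        words = []
        for word, _tag in sublist:  # unpack each (word, tag) pair
            words.append(word)
        result.append(words)
    return result


# A single-level comprehension flattens everything into one list,
# which is the behaviour the question is trying to avoid.
flat = [word for sublist in tagged for word, _tag in sublist]

# The nested version keeps one inner list per original sublist.
words_only = first_items(tagged)

print(flat)
print(words_only)
```

The key difference is where the inner brackets go: `[[... for tup in sub] for sub in tuples]` builds a new list per sublist, while `[... for sub in tuples for tup in sub]` builds only one list overall.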