I have a parallelized list of tuples in the format:
data = [('Emily', (4, 2)),
        ('Alfred', (1, 12)),
        ('George', (10, 2))]
list = sc.parallelize(data)
I want to multiply the two integers inside each tuple, which should give me this output:
[('Emily', (8)),
('Alfred', (12)),
('George', (20))]
I have tried:
list = list.map(lambda x: (x[0], x[1]*x[2]))
But with no effect.
CodePudding user response:
In your lambda, x[1] is a tuple such as (4, 2), so you need to index into it to get the two values you want to multiply: x[1][0] and x[1][1]. There is no x[2], because each element has only two items (the name and the tuple).
Try this instead:
result = list.map(lambda x: (x[0], x[1][0] * x[1][1]))
print(result.collect())
#[('Emily', 8), ('Alfred', 12), ('George', 20)]
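If it helps to see the difference without starting Spark, the same two lambdas can be run on a plain Python list. This is only a local sketch: the names rows, broken and fixed are mine, and the built-in map stands in for the RDD's map.

```python
# Same shape as the data in the question, but as a plain list (no Spark needed).
rows = [('Emily', (4, 2)), ('Alfred', (1, 12)), ('George', (10, 2))]

# The lambda from the question: each element has only indices 0 and 1.
broken = lambda x: (x[0], x[1] * x[2])

# The corrected lambda: index into the inner tuple instead.
fixed = lambda x: (x[0], x[1][0] * x[1][1])

try:
    broken(rows[0])
    error_name = None
except IndexError as exc:
    # x[2] does not exist on a two-item tuple.
    error_name = type(exc).__name__

products = list(map(fixed, rows))

print(error_name)
print(products)
```

Because the lambda only looks at one element at a time, whatever works here should behave the same way when passed to map on the RDD.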
Another way is to pass the tuple to functools.reduce with operator.mul:
import operator
import functools
list.map(lambda x: (x[0], functools.reduce(operator.mul, x[1], 1)))
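The reduce version is mainly useful when the inner tuples are not always exactly two values long. Here is a small sketch of that on a plain Python list. The helper name multiply_values and the sample rows with three and zero values are made up for illustration, and math.prod assumes Python 3.8 or later.

```python
import functools
import math
import operator

def multiply_values(pair):
    """Return (name, product of all numbers in the inner tuple)."""
    name, numbers = pair
    # The initial value 1 makes an empty tuple produce 1 instead of raising.
    return (name, functools.reduce(operator.mul, numbers, 1))

# Hypothetical rows with inner tuples of different lengths.
rows = [('Emily', (4, 2)), ('Alfred', (1, 12, 3)), ('George', ())]

reduced = [multiply_values(row) for row in rows]

# math.prod computes the same product and also returns 1 for an empty input.
with_prod = [(name, math.prod(numbers)) for name, numbers in rows]

print(reduced)
print(with_prod)
```

A named function like this can be passed to the RDD's map in place of the lambda, for example list.map(multiply_values).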