I have tuples like this:
('id1', ['date', 'type', 'value', '2017-11-11 08:32:46.934', 'no_error', '54.64325', '2017-11-11 08:32:47.356', 'no_error', '76.34553'])
I want to keep only the list elements that are floats. I have only found solutions for the case where the value is a single element rather than a list, along the lines of this:
filter(lambda t: is_float(t[1]) == True)
where is_float is a function I wrote that, as the name says, returns True if the value is a float. How can I do the same when the value is a list?
CodePudding user response:
That's what isinstance() is for. It will return True if the first parameter is an instance of the second one.
>>> isinstance(1, float)
False
>>> isinstance("1.0", float)
False
>>> isinstance(1.0, float)
True
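Note that the elements in the question's list are strings, so an isinstance() check against float would reject all of them until they are converted. A minimal sketch using a few of the sample values from the question (the variable names are only illustrative):

```python
values = ['date', 'type', 'value', '2017-11-11 08:32:46.934', 'no_error', '54.64325']

# Every element is a str, so an isinstance check against float matches nothing.
only_floats = [v for v in values if isinstance(v, float)]
print(only_floats)  # []

# Once a string has been converted with float(), the check succeeds.
converted = float('54.64325')
print(isinstance(converted, float))  # True
```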
CodePudding user response:
You could achieve it with list comprehension with an if-clause:
def is_float(s):
    try:
        float(s)
        return True
    except ValueError:
        return False
rdd.map(lambda kv: (kv[0], [element for element in kv[1] if is_float(element)]))
This will not be very performant on large datasets, though, since is_float runs a Python-level try/except on every element.
Update: I changed the code to incorporate the OP's remark that the list elements are strings.
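To check the filtering logic without a Spark cluster, here is a plain-Python sketch run on the tuple from the question. The helper name keep_floats is hypothetical and only there for illustration, and catching TypeError as well is an added assumption to cover non-string elements such as None:

```python
def is_float(s):
    """Return True if s can be parsed by float(), else False."""
    try:
        float(s)
        return True
    except (ValueError, TypeError):  # TypeError covers inputs such as None
        return False


def keep_floats(record):
    """Hypothetical helper: keep the key and only the elements that parse as floats."""
    key, values = record
    return key, [v for v in values if is_float(v)]


record = ('id1', ['date', 'type', 'value',
                  '2017-11-11 08:32:46.934', 'no_error', '54.64325',
                  '2017-11-11 08:32:47.356', 'no_error', '76.34553'])

result = keep_floats(record)
print(result)  # ('id1', ['54.64325', '76.34553'])
```

The same function could be passed to rdd.map, since it takes a single (key, list) tuple as its argument. Be aware that float() also accepts strings such as 'nan' and 'inf', so those would pass the check too.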