How can I perform an array operation in PySpark without using explode?
I want to take one ArrayType column containing the array [2, 4] and subtract another ArrayType column containing [1, 1]. I just want to subtract the columns element-wise, so the resulting column would be [1, 3].
Thanks, Christie
CodePudding user response:
See this answer for a potential solution: https://stackoverflow.com/a/55832763/14978104. It uses the `udf` decorator to run a list comprehension over the two arrays.