I'd like to implement Kadane's algorithm for the maximum subarray problem in q; here is the Python code for the simplest flavor we can look at on Wikipedia
def max_subarray(numbers):
"""Find the largest sum of any contiguous subarray."""
best_sum = 0
current_sum = 0
for x in numbers:
current_sum = max(0, current_sum x)
best_sum = max(best_sum, current_sum)
return best_sum
Just to simplify, I can run this in two separate loops, one for each line in the for-loop; so let's look at a simplified piece of code:
current_sum = 0
for x in numbers:
current_sum = max(0, current_sum x)
I can implement this in q as:
((|[0]) ( )::)/[0f;x]
My runtime (single-core) for a 9M-length list is about 3.2s; for comparison, Python numba takes 22ms, while only executing in q one of the two composed functions, ( )/[0f;z]
, takes 5 ms.
Is there any way to achieve numba-like speeds in q? If not, I'd move this algorithm to C, but I'd like to know if there's a q-thonic way of speeding it up.
CodePudding user response:
If you are ok with avoiding that exact algorithm we can get the speed up to a lot closer to numba.
First define S to the the cumulative sum of list A, then each segment i to j of A has sum S[j]-S[i-1]. From there we can say that the maximum sum that ends at j is S[j]-min S[i] where i < j. We can calculate this easily in Q by:
{s-mins 0^prev s:sums x}
which we can then find the maximum by:
{max s-mins 0^prev s:sums x}
For me on a single core I am getting around 148ms which is a lot closer but obviously still not perfect.