Some modulo hijinkery-CodePudding

I am solving a question on LeetCode:

You are given two positive integer arrays nums1 and nums2, both of length n.The absolute sum difference of arrays nums1 and nums2 is defined as the sum of |nums1[i] - nums2[i]| for each 0 <= i < n (0-indexed). You can replace at most one element of nums1 with any other element in nums1 to minimize the absolute sum difference. Return the minimum absolute sum difference after replacing at most one element in the array nums1. Since the answer may be large, return it modulo (10^9 7). For Input: nums1 = [1,7,5], nums2 = [2,3,5], Output: 3.

The code that I came up with is as below:

class Solution {
public:
    int minAbsoluteSumDiff(vector<int>& nums1, vector<int>& nums2) {
        if(nums1==nums2) return 0;

        long long MOD=(int)1e9 7;

        set<long long> s(begin(nums1),end(nums1));
        
        long long diff=0ll, res=LLONG_MAX;
        for(int i=0; i<nums1.size(); i  )
            diff=(diff%MOD (abs(nums1[i]-nums2[i]))%MOD)%MOD;
        
        for(int i=0; i<nums2.size(); i  ) {
            long long curr=nums2[i];
            auto lb=s.lower_bound(curr);

            if(lb!=s.begin()) {
                auto p=prev(lb);
                long long prevElement=*p;
                long long currsum=diff;

                currsum=(currsum%MOD-(abs(nums1[i]-nums2[i]))%MOD)%MOD;
                currsum=(currsum%MOD abs(curr-prevElement)%MOD)%MOD;

                res=min(res, currsum);
            }

            if(lb!=s.end()) {
                long long nextElement=*lb;
                long long currsum=diff;

                currsum=(currsum%MOD-(abs(nums1[i]-nums2[i]))%MOD)%MOD;
                currsum=(currsum%MOD (abs(curr-nextElement))%MOD)%MOD;

                res=min(res, currsum);
            }
        }
        
        return res;
    }
};

This works for 50/51 test cases, but on the last one with large values, some modulo hijinkery breaks it. The reason I do:

currsum=(currsum%MOD-(abs(nums1[i]-nums2[i]))%MOD)%MOD;

is because of the distributive property of modulo: (a b) % c = ((a % c) (b % c)) % c.

What am I missing?

CodePudding user response：

The problem is a common mistake with modular arithmetic, and surprisingly, has nothing to do with integer overflow (as is usually the case) but is solely the result of order properties not mixing well with modulus.

You said the distributive property, (a b) % c = ((a % c) (b % c)) % c, should let you take moduli inside equations. This stops being true when results are being compared against each other. Let's look at a smaller example with a simpler problem:

Given two arrays A and B, find the array with the smallest sum,
 and return its sum modulo 9.

A = [2, 8]
B = [1, 1]

Now, the answer should be array B, its sum is 2, which is less than 10. If you only perform modulo at the end, you get the right answer:

sum(A) = 10
sum(B) = 2

sum(B) < sum(A), so return (2 % 9) == 2

but if you perform modulo in the middle:

sum(A) % 9 == 1
sum(B) % 9 == 2

sum(A) % 9 < sum(B) % 9, so you return (1 % 9) == 1

Since the constraints here mean your sums can't overflow a long long int, your problem is fixed by only performing one modulo operation at the very end, inside the return statement.

In general, problems where you want the (minimum sum) modulo p rather than the minimum (sum modulo p) require you to not perform modular division on intermediate results: you have to use integer types large enough to compare the true value (or at least have a way of comparing true values).