Incorrect results of modulo operation between large numbers in R

Time:12-17

To solve a puzzle from HackerRank, I'm trying to apply modulo operations between large numbers in R (v4.2.2). However, I get incorrect results when at least one of the operands is very large. For example, 52504222585724001 %% 10 yields 0 in R, which is incorrect, whereas 52504222585724001 % 10 in Python (v3.9.12) gives the correct result, 1. So I decided to test some other numbers: I downloaded a set of test cases for which my code was failing and computed n*n mod (10^9 + 7) for each value of n.

R code:

summingSeries <- function(n) {
  return(n^2 %% (10^9 + 7))
}

n <- c(229137999, 344936985, 681519110, 494844394, 767088309, 307062702, 306074554, 555026606, 4762607, 231677104)
expected <- c( 218194447, 788019571, 43914042, 559130432, 685508198, 299528290, 950527499, 211497519, 425277675, 142106856 )

result <- rep(0L, length(n))

start <- Sys.time()
for (i in 1:length(n)){
  result[i] <- summingSeries(n[i])
}
print(Sys.time() - start)
df <- data.frame(expected, result, diff = abs(expected - result))
print(df)

Below are the results and their absolute differences from the expected values:

expected    result   diff
-------------------------
218194447 218194446    1
788019571 788019570    1
43914042  43914070    28
559130432 559130428    4
685508198 685508205    7
299528290 299528286    4
950527499 950527495    4
211497519 211497515    4
425277675 425277675    0
142106856 142106856    0

Python3 code:

import numpy as np

def summingSeries(n):
    return(n ** 2 % (10 ** 9 + 7))

n = [229137999,
    344936985,
    681519110,
    494844394,
    767088309,
    307062702,
    306074554,
    555026606,
    4762607,
    231677104]

expected = [218194447,
    788019571,
    43914042,
    559130432,
    685508198,
    299528290,
    950527499,
    211497519,
    425277675,
    142106856]

result = [0] * len(n)
for i in range(0, len(n)):
  result[i] = summingSeries(n[i])

print(np.array(result) -  np.array(expected))

I get the correct results with the Python code above. Can someone kindly explain why the two languages disagree and why R yields the wrong results?

CodePudding user response:

Using the gmp package (see Carl Witthoft's comment).

gmp::mod.bigz(gmp::as.bigz(n)^2, 1e9 + 7) - expected
#> Big Integer ('bigz') object of length 10:
#>  [1] 0 0 0 0 0 0 0 0 0 0
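
For comparison, the rounding that gmp avoids can be reproduced in Python by forcing the operand through float. This is only a sketch, and it assumes that a Python float behaves like an R numeric here (both are IEEE 754 doubles); the variable names are made up for illustration. It uses the first example from the question.

```python
# Assumption: Python's float and R's numeric are both IEEE 754 doubles,
# so converting to float mimics what R does with a plain number literal.
big = 52504222585724001

as_double = float(big)        # nearest representable double
exact_mod = big % 10          # arbitrary-precision integer arithmetic
double_mod = as_double % 10   # the same operation on the rounded value

print(int(as_double))         # 52504222585724000: the trailing 1 is lost
print(exact_mod, double_mod)  # 1 0.0

# Doubles hold every integer exactly only up to 2**53.
print(big > 2**53)            # True
```

The value is already rounded before the modulo is applied, so the modulo itself is not the culprit. Python's int and gmp's bigz never round, which is why both give 1.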

Previous/inferior answer:

library(Rmpfr)

n <- c(229137999, 344936985, 681519110, 494844394, 767088309, 307062702, 306074554, 555026606, 4762607, 231677104)
expected <- c(218194447, 788019571, 43914042, 559130432, 685508198, 299528290, 950527499, 211497519, 425277675, 142106856)
data.frame(
  precision = 53:64, # 53 corresponds to double precision
  sumAbsErr = sapply(53:64, function(p) sum(abs(expected - as.numeric(mpfr(n, p)^2 %% (1e9 + 7)))))
)
#>    precision sumAbsErr
#> 1         53        53
#> 2         54        29
#> 3         55        21
#> 4         56        16
#> 5         57         1
#> 6         58         1
#> 7         59         1
#> 8         60         0
#> 9         61         0
#> 10        62         0
#> 11        63         0
#> 12        64         0

60 bits of precision is just enough for this example.
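
One way to see where the 60 comes from is to count how many significant bits each n^2 needs. The following Python sketch does that; significant_bits is a helper invented for this illustration, not part of any library, and it assumes that trailing zero bits do not cost precision because a floating-point exponent absorbs them.

```python
n = [229137999, 344936985, 681519110, 494844394, 767088309,
     307062702, 306074554, 555026606, 4762607, 231677104]

def significant_bits(x):
    """Bits needed to store positive integer x exactly, ignoring trailing zeros."""
    trailing_zeros = (x & -x).bit_length() - 1  # position of the lowest set bit
    return x.bit_length() - trailing_zeros

bits = [significant_bits(v * v) for v in n]
print(max(bits))   # 60, from 767088309^2
print(bits[-2:])   # [45, 48]: the two values R already got right fit in 53 bits
```

Under that assumption, the largest square needs exactly 60 bits, and the only two squares that fit in a 53-bit double mantissa are the ones with diff 0 in the question's table.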
