I am asked to introduce to my classmates the the Williams' p 1 algorithm to factor integers and I don't think I get it right. As far as I understand, this algorithm takes an integer N that factors into primes p,q (N=pq) where p 1 is B-smooth. I understand why, starting from these premises, the algorithm works (I've written proof), but I don't get how to correctly implement and use it. I think it must be implemented as follows:
- I take a, randomly, in interval [1,N-1]
- I compute x=gcd(a,N). If x !=1, then I return x (I don't get why we don't check first if x is prime because we don't actually know if N really is equal to p*q and x could be composed, right?)
- Normally, x == 1, so I have to compute y = gcd(V_M-2,N) where V_0 = 2, V_1 = a, V_n= aV_(n-1) - V_(n-2) . I have found a way to compute V_n doing matrix power modulus N but I don't know which M should I use (I've copied the Pollard's one, but I don't know if that works and why).
- If y!=1 and y!=N, I return y (again, as with the x, I think we should check that y is prime, am I right?). Else, just try another random a and start again.
So, that is mainly my question about the implementation, overall concerning the building of M, which might be related to the fact of p 1 B-smoothness I guess.
Regarding the usage, I really don't get in which cases I should use this method and which B should I take. I am going to leave my code in Python3 here and a case example that really is turning me crazy to see if you can give me a hand.
import random
from math import floor, log, gcd
def is_prime(n): #funcion que determina si un numero es primo
for d in range(2,n):
if n%d == 0:
return False
return True
def primes_leq(B): #funcion para obtener los primos que son menor o igual que B
l=[]
for i in range(2,B 1):
if is_prime(i):
l.append(i)
return l
def matrix_square(A, mod):
return mat_mult(A,A,mod)
def mat_mult(A,B, mod):
if mod is not None:
return [[(A[0][0]*B[0][0] A[0][1]*B[1][0])%mod, (A[0][0]*B[0][1] A[0][1]*B[1][1])%mod],
[(A[1][0]*B[0][0] A[1][1]*B[1][0])%mod, (A[1][0]*B[0][1] A[1][1]*B[1][1])%mod]]
def matrix_pow(M, power, mod):
#Special definition for power=0:
if power <= 0:
return [[1,0],[0,1]]
powers = list(reversed([True if i=="1" else False for i in bin(power)[2:]])) #Order is 1,2,4,8,16,...
matrices = [None for _ in powers]
matrices[0] = M
for i in range(1,len(powers)):
matrices[i] = matrix_square(matrices[i-1], mod)
result = None
for matrix, power in zip(matrices, powers):
if power:
if result is None:
result = matrix
else:
result = mat_mult(result, matrix, mod)
return result
def williams(N, B):
flag = False
while not flag :
a = random.randint(1,N-1)
print("a : " str(a))
x = gcd(a,N)
print("x : " str(x))
if x != 1:
return x
else :
M = 1
A = [[0,1],[-1,a]]
for p in primes_leq(B):
M *= p **(floor(log(N,p)))
print("voy por aquí")
C = matrix_pow(A,M,N)
V = 2*C[0][0] a*C[0][1]
y = gcd(V-2,N)
print("y : " str(y))
if y != 1 and y != N:
flag = True
return y
Trying to test my implementation, I've tried to follow some examples to check if my factorization is working alright. For example, I've looked at https://members.loria.fr/PZimmermann/records/Pplus1.html and I've tried williams(2**439-1,10**5)
and I'm getting 104110607 but I understand I should be getting 122551752733003055543 (as in the webpage). As far as I understand, both are primes that factor N=2**439-1, but isn't this fact just contradicting the hypothesis of N being a product of two primes p*q?
Thanks for your help, it'll be appreciated
CodePudding user response:
2439−1 has more than two prime factors. If you’re not getting the one you want, you should divide by the one that you got and keep going on the quotient.
CodePudding user response:
I think you're missing the point of this algorithm...
You find a factor p of N (or a trivial divisor) when M is a multiple of p 1.
If p 1 is smooth -- say all the factors of p 1 are <= B -- then it becomes becomes possible to construct an M that is a multiple of all such possible factors, like this:
M=1
for x in all primes <= B:
let y = largest power of x such that y < N
M = M*y
You should check the successive values of M produced by this loop. Alternatively you could just check all successive factorials. The point is that, in each iteration, you add new factors to M, and when all the factors of p 1 are in M, then M will of course be a multiple of p 1.
The tricky part is that M will very get large, and you can't take M mod N. What you can do, though, is calculate all the VM mod N, and instead of actually calculating each M, you just multiply the subscript of VM by the appropriate factor using the addition formula: Va b = VaVb - Va-b