How to factor XYYXF composites?
Home
News
How?

Status
Results
Records
Primes

Contributors
Join us!
Reserve

Links
Contact
  There are many different factoring algorithms. Usually all of them are separated into two groups. Partial factorization algorithms allow us to find some small factors (having up to 50 and more digits) of a given composite number, and their running time is more dependent on the size of the factor than on the size of the composite. Complete factorization algorithms split one composite (having 100 and more digits) into two (or more) parts of roughly the same size, and in this case running time depends on the size of initial composite number. Such methods as trial division, William's P+1, Pollard's Rho and P-1, Lenstra's elliptic curve method (ECM) belong to the first group, while numerous sieves, such as MPQS, SIQS, GNFS, SNFS, belong to the second group of factoring algorithms.

Note that partial factorization algorithm may produce a complete factorization of a number, but only if this composite contain no more than one big prime factor. On the other hand, complete factorization algorithm may find a composite factor, but applying this method again to this much smaller composite quickly completes the factorization.

The main idea of integer factorization is to extract small factors by ECM until cofactor became prime or factorable enough quickly with sieve methods. Usually, when running ECM on many composites at the same time, some smaller ones should be taken out and factored with sieve methods as soon as the average period of extracting ECM factors become twice less than sieve running time for those small composites.

Here is the table of recommended ECM parameters to remove the most of small factors having less than a given number of digits:
 

Digits
Value of B1
Number of curves for
default B2 in GMP-ECM 6.0
15
2000
30
20
11000
80
25
50000
200
30
250000
400
35
1*106
900
40
3*106
2300
45
11*106
4500
50
43*106
7600
55
110*106
18000
60
260*106
42000
65
850*106
70000

Sometimes P-1 method give better results than ECM when factors are known to possess some special properties. Note that very small factors (up to ~8 digits) are usually to be removed yet before ECM (or P-1) by trial division, Rho, etc.

There are many implementations of ECM, one of the best being GMP-ECM by Paul Zimmermann et. al. based on GMP library.

General sieve methods (quadratic sieves) are those for which any composites are suitable. Usually self-initializing quadratic sieve (SIQS) is used for composites having up to 100 digits. Here you can use an excellent Msieve by Jason S. Papadopoulos.

Number field sieves are usually used for larger numbers. GNFS become faster than QS nearly at 100 decimal digits; SNFS is yet faster than GNFS, but tricky to be applied. Note that numbers of the form xy + yx may be factored by SNFS, altough their composite cofactors may appear enough small to make another method faster, especially when a lot of small factors are removed by ECM.

There are few implementations of NFS, the most popular being GGNFS by Chris Monico and Msieve by Jason S. Papadopoulos, though some people use software written by Jens Franke.

Generally, we can work out enough quick and simple factoring strategy for an array of composites of typical size (50-300 digits). There is my variant; I think it can't be notably speeded up without getting too complicated.

Step 1. Algebraic factoring, trial division, rho, quick P-1
Step 2. ECM, B1 from 1k through 100k with step 1k (k means thousand, M means million)
Step 3. SIQS on C75- (i.e. composites having 75 or less digits; more info)
Step 4. ECM, B1 from 100k through 500k with step 1k
Step 5. SIQS on C90-
Step 6. ECM, B1 from 500k through 2M with step 2k
Step 7. SIQS on C105-

If there are thousands of composites, a single machine usually will not be able to go further. So it's better to stop here and distribute following-up ECM work starting at B1 = 3*106, unless you have a powerful cluster or the number of composites is enough small (say, a hundred).

Step 8. ECM, B1 from 2M through 5M with step 2k
Step 9. GNFS on C120-, SNFS if possible
Step 10. ECM, B1 from 5M with step 5k

The last stage is usually merged with NFS methods.

Some papers about methods of integer factorization and their applications are listed in the Links section.

Back to the top

Andrey Kulsha, Belarus