Until the beginning of 2004, the Spaces project-team was a joint project-team of INRIA Lorraine and INRIA Rocquencourt. The main objective of this team was to solve systems of polynomial equations and inequalities. The focus was on algebraic methods, which are more robust and frequently more efficient than purely numerical tools. The members of the team located in Paris were mostly working on the algebraic aspects, whereas the task of the members located in Nancy was to devise arithmetic tools which could enhance the efficiency of the formal methods (mainly real arithmetic, but also more exotic ones, like modular or p-adic arithmetic).
Since the beginning of 2004, the ``Paris subteam'' has decided to create its own project-team. At the same time, due to several arrivals in the project-team, the main interest of the ``Nancy subteam'' has somewhat shifted towards arithmetics, algorithmic number theory and their applications in cryptology. A new project-team named CACAO was created on October 9, 2006. However, from the beginning of 2004, all work done in the Nancy part of the SPACES project should be thought of as being related to the goals of the CACAO project-team. We thus chose to present the objectives of the latter project, and the results obtained with respect to them. The present activity report is thus a ``joint report'' Spaces-Cacao.
The objectives of the project-team have been along the following lines:
Studying the arithmetic of curves of small genus (≥ 1), having in mind applications to cryptology;
Improving the efficiency and the reliability of arithmetics in a very broad sense (i.e., the arithmetics of a wide variety of objects).
These two objectives strongly interplay. On the one hand, arithmetics are, of course, at the core of optimizing algorithms on curves, starting evidently with the arithmetic of curves themselves. On the other hand, curves can sometimes be a tool for some arithmetical problems like integer factorization.
To reach these objectives, we have isolated three key axes of work:
Algebraic Curves: the main issue here is to investigate curves of small genus (≥ 1) over finite fields (base field F_{p^n}, for various p and n), i.e., mainly: to compute in the Jacobian of a given curve, to be able to check that this variety is suitable for cryptography (cardinality, smoothness test), and to solve problems in those structures (discrete logarithm). Applications go from number theory (integer factorization) to cryptography (an alternative to RSA).
Arithmetics: we consider here algorithms working on multiple-precision integers, floating-point numbers, p-adic numbers and finite fields. For such basic data structures, we do not expect new algorithms with better asymptotic behavior to be discovered; however, since those are first-class objects in all our computations, every speedup is most welcome, even by a factor of 2.
Linear Algebra and Lattices: solving large linear systems is a key point of factoring and discrete logarithm algorithms, which we need to investigate if curves are to be applied in cryptology. Lattices are central to the new ideas that have emerged over the last few years for several problems in computer arithmetic or discrete logarithm algorithms.
Starting in Fall 2006, Spaces will lead an ANR research grant on the topic of the Number Field Sieve integer factorization algorithm. This research will be done in collaboration with the teams of LIX (École polytechnique, Palaiseau) and IECN (Nancy). The three research axes set out above in the objectives of Spaces fit well into the perspective of this research project.
Another new direction of research has started since Fall 2006 with the arrival of Marion Videau, who has been hired as an assistant professor at UHP, coming from the CODES project-team (Rocquencourt). This should allow the project-team to start an axis around symmetric primitives for cryptology; this is an interesting complement to the expertise already present regarding asymmetric (and especially curve-based) primitives for cryptology.
Though we are interested in curves by themselves, the applications to cryptology remain a motivation of our research. Therefore, we start by introducing these applications, since they may serve as a guideline to the reader in this somewhat technical section.
The RSA cryptosystem — the de facto standard in public-key cryptography — requires large keys, at least 1024 bits currently. Algebraic curves offer a better level of security for a smaller key size, say 160 bits currently for elliptic curves. They are not specifically used as curves. In practice, a very general construction due to El Gamal associates to any group a cryptosystem, this cryptosystem being secure as soon as the so-called Diffie-Hellman problem (or its decision variant) is difficult:
Given g ∈ G, g^{a} and g^{b} for some integers a and b, compute g^{ab}.
Currently, the only way to attack this problem is to tackle the more difficult discrete logarithm problem:
Given g ∈ G and g^{a}, find a,
which, in the case of the El Gamal system, is equivalent to the so-called attack on the key (given the public part of the key, recover the secret part). We shall only discuss the discrete logarithm problem in this document, since it is widely believed that the two problems are in fact equivalent.
This problem is easy when the underlying group is additive, such as Z or Z/nZ. Classically, multiplicative groups of finite fields are used; however, they can be attacked by algorithms very similar to those existing for factoring, and thus require the same key size to ensure security.
A trend initiated by Koblitz and Miller and followed by many others is to use as ``cryptographic groups'' the group (``Jacobian'') associated by classical arithmetic geometry to a given algebraic curve.
To use such a group for cryptographic applications, the key algorithmic points are the following:
have an explicit description of the group and the group operation, as efficient as possible (the speed of ciphering and deciphering being directly linked to the efficiency of the group operation);
undertake a study, as thorough as possible, of the security offered by those groups.
The second point should again be split into two steps: studying the behavior of the group under ``generic attacks'' (avoiding small cardinality, avoiding cardinalities with no large prime factor), and trying to devise ``ad hoc'' attacks. The first step amounts more or less to being able to compute the cardinality of the group; the second one to trying as hard as possible to find a way to compute discrete logarithms in this group.
This section now proceeds as follows: we introduce the basic objects (curves and Jacobians) and their properties relevant to the following problems: group structure and arithmetic, cardinality, discrete logarithm.
Finally, and in a somewhat independent way, curves and their Jacobians can be used for integer factorization; we shall also review that point.
A central role is played by a certain algebraic variety of higher dimension associated to a given curve C, its Jacobian J(C), which comes with a natural group structure. We shall not define it, but rather state its most important properties:
C embeds as a subvariety of J(C);
J(C) is an abelian variety, i.e., it has an (abelian) group structure such that the group operations (addition, inversion) can be written as rational functions of the coordinates;
if C has genus g, J(C) is a variety of dimension g (note that in full generality one only knows how to embed it in a space of dimension 2^{2g}, i.e., to give many equations in 2^{2g} variables rather than, for instance, one equation in g + 1 variables).
The most important feature of the Jacobian is the fact that it comes with a natural group structure, which is the key point for its uses in applications to primality, factorization, and cryptology.
We intend to focus on the case of finite base fields F_p and their extensions, with subsidiarily a study of the cases where the base field is a number field or a completion of a number field, since those happen to be related to the previous one by reduction/lifting techniques.
In this setting, the situation is rather rigid; the cardinalities of the curve over the first g extension fields of the base field determine the cardinality over all extensions, as well as the cardinality of the Jacobian. We also have ``sharp'' estimates for these cardinalities, namely |#C(F_q) − (q + 1)| ≤ 2g√q and (√q − 1)^{2g} ≤ #J(C)(F_q) ≤ (√q + 1)^{2g} (the so-called Weil bounds).
The cardinalities have several interpretations, which usually yield different strategies for computing them. We shall review them in an informal way in section .
In this part, we generalize the setting slightly, since we shall also discuss later some aspects of discrete logarithms over finite fields. We shall hence assume that G is an abelian group, in which we want to solve the equation
g^{x} = h,
where g, h are given elements of G, and the unknown x is an integer. This is known as the discrete logarithm problem (DL for short). A first remark, due to Nechaev, is that if one uses only operations in the group, one needs at least on the order of √p operations to compute a discrete logarithm, where p is the largest prime factor of the cardinality of G. One of the quests of cryptology is finding a so-called ``Nechaev group'', for which there are provably no algorithms for computing discrete logarithms faster than this √p bound; it currently appears that elliptic curves are the best candidates to be Nechaev groups, hence their interest in cryptology.
On the other hand, two classical algorithms (Pollard's ρ method and Shanks' baby-step giant-step) allow one to compute a discrete logarithm in any group G in time O(√#G). The complexity of the ``general discrete logarithm'' is thus completely known. However, for a family of groups or even a specific group, faster algorithms might exist. We shall discuss some of those algorithms in the sequel.
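To make the generic square-root behavior concrete, Shanks' baby-step giant-step method can be sketched in a few lines. This is a minimal illustration over the multiplicative group of Z/pZ; the prime, generator and target below are arbitrary choices for the example, not taken from this report.

```python
from math import isqrt

def bsgs(g, h, p):
    """Solve g^x = h (mod p) in O(sqrt(p)) group operations, or return None."""
    m = isqrt(p) + 1
    # Baby steps: store g^j for j = 0..m-1 in a hash table.
    table = {pow(g, j, p): j for j in range(m)}
    # Giant steps: look for h * (g^-m)^i in the table; then x = i*m + j.
    gm_inv = pow(g, -m, p)          # modular inverse of g^m (Python >= 3.8)
    gamma = h % p
    for i in range(m):
        if gamma in table:
            return i * m + table[gamma]
        gamma = gamma * gm_inv % p
    return None

# Example: recover a discrete logarithm of 2^77 in (Z/1009Z)^*.
x = bsgs(2, pow(2, 77, 1009), 1009)
assert pow(2, x, 1009) == pow(2, 77, 1009)
```

The time/memory trade-off (m baby steps stored, m giant steps computed) is exactly the √#G cost mentioned above; Pollard's ρ achieves the same time with constant memory.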
As a group, the Jacobian is defined as a quotient of the free group generated by the points of the curve; as with any definition based on a quotient, it is not very tractable for explicit computations. It is necessary to devise a specific representation of elements and specific algorithms to deal with computations in the Jacobian. Though general methods exist, the most interesting methods usually take advantage of the specific curve one is dealing with, or even of the specific model of the curve, to get a more efficient algorithm.
In the case of elliptic curves, the problem is quite easy; the classical chord-and-tangent rule yields, by simple calculations, easy-to-implement formulas. One can still improve somewhat upon those formulas. The situation, however, is quite different as soon as higher genus curves are involved.
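The chord-and-tangent formulas can be sketched as follows, in affine coordinates over a prime field. The curve y² = x³ + 2x + 3 over F_97 and the point (3, 6) are arbitrary illustrative choices.

```python
# Affine chord-and-tangent addition on y^2 = x^3 + a*x + b over F_p.
def ec_add(P, Q, a, p):
    """Add two points of the curve; None stands for the point at infinity."""
    if P is None: return Q
    if Q is None: return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None                                       # P + (-P) = O
    if P == Q:
        lam = (3 * x1 * x1 + a) * pow(2 * y1, -1, p) % p  # tangent slope
    else:
        lam = (y2 - y1) * pow(x2 - x1, -1, p) % p         # chord slope
    x3 = (lam * lam - x1 - x2) % p
    y3 = (lam * (x1 - x3) - y1) % p
    return (x3, y3)

# y^2 = x^3 + 2x + 3 over F_97: (3, 6) lies on the curve (27 + 6 + 3 = 36 = 6^2).
P = (3, 6)
S = ec_add(P, P, 2, 97)    # doubling by the tangent rule
```

Each addition costs one modular inversion and a few multiplications; the formula improvements alluded to above (projective or Jacobian coordinates, for instance) are precisely about trading the inversion for more multiplications.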
In the case of hyperelliptic curves, a now classical algorithm due to Cantor explains how to implement arithmetic in their Jacobians efficiently; numerous improvements have been obtained since, including explicit formulas, which are more efficient in practice than Cantor's algorithm.
Another family of curves has received interest from the cryptology community in recent years, namely the C_{ab} family. In that case, algorithms have been obtained by Arita using Gröbner basis computations; then, for a subfamily, a more efficient method was devised by Galbraith, Paulus and Smart, and a common setting was then found by Harasawa and Suzuki. Since then, more efficient algorithms were obtained by using suitable orderings for the Gröbner basis computation in Arita's method, and explicit formulas were derived in some cases. However, recent work by Diem and Thomé almost completely dismisses non-hyperelliptic C_{ab} curves, as far as cryptology is concerned.
The question of point counting over finite fields is of central importance for applications to cryptography, see Section . Recall that we are given an algebraic curve C of genus g, over a finite field F_q, and we would like to count the number of points of the Jacobian of this curve.
First, for the sake of completeness, we should mention two classical ways to somewhat reverse the problem, i.e., to construct the curve and its number of points at the same time: using Koblitz curves and complex multiplication.
Those two methods are extremely efficient, especially the first one, but the main drawback is that they introduce some unnecessary structure in the curves they construct; in particular, Koblitz's method yields curves with a large ring of automorphisms. This can be used to speed up discrete logarithm computations, and should thus be considered as a weakness from the cryptographic point of view.
Let us now turn to actual point counting algorithms. The Hasse-Weil theorem states that computing a certain polynomial P(t) is enough to obtain the cardinality (in particular, the cardinality of the Jacobian over the base field is exactly P(1)). There are several interpretations of the polynomial P(t), which yield different strategies for computing it:
ℓ-adic characterization: Schoof's algorithm and its improvements and extensions. This approach is especially suitable in large characteristic. It is well understood for elliptic curves; the hyperelliptic case, though already studied, would still benefit from significant improvement.
p-adic characterization: lift the curve to an extension of the p-adic numbers, and compute P(t) over this extension. This is the core of Satoh's method and its AGM variants. This method is the most efficient in small characteristic.
Monsky-Washnitzer characterization and Kedlaya's algorithm. Again, this is suitable for small or medium primes. This method is interesting for its generality.
In the case of Jacobians of curves, at the time being, no other general algorithm is known. This is the key interest of curves for cryptology, and the reason for which rather small keys give the same level of security as much larger keys in the case of RSA.
However, many ad hoc methods, which exploit (or demonstrate) the weakness of certain families of curves, do exist. Let us quote the Pohlig-Hellman method (if the cardinality of the group is smooth — hence the interest in computing the cardinality!), index calculus (for discrete logarithms over finite fields, leading to subexponential complexities), and some weaker instances of curves (trace 0, supersingular, Weil descent, small extension fields). For curves, higher genus has been shown to be weaker than generic groups for g ≥ 5 by Gaudry, and then by further work for g ≥ 3, see Section .
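The Pohlig-Hellman reduction mentioned above can be sketched as follows: when the group order is smooth, a discrete logarithm splits into small subgroup logarithms glued together by the Chinese Remainder Theorem. This is a toy version for (Z/pZ)^* with a brute-force subgroup solver; the prime p = 2311 (with p − 1 = 2·3·5·7·11 squarefree and smooth) is an arbitrary illustrative choice.

```python
def dlog_prime_order(g, h, q, p):
    """Brute-force x in [0, q) with g^x = h mod p, where g has order dividing q."""
    acc = 1
    for x in range(q):
        if acc == h % p:
            return x
        acc = acc * g % p
    raise ValueError("no solution in the subgroup")

def pohlig_hellman(g, h, p, factors):
    """Solve g^x = h mod p, given the (squarefree, for brevity) prime
    factorization of p - 1; prime powers would need an extra lifting loop."""
    n = p - 1
    residues = []
    for q in factors:
        # Project the problem into the subgroup of order q.
        gq = pow(g, n // q, p)
        hq = pow(h, n // q, p)
        residues.append(dlog_prime_order(gq, hq, q, p))
    # Chinese Remainder Theorem: recombine x mod each q into x mod n.
    x = 0
    for q, r in zip(factors, residues):
        m = n // q
        x = (x + r * m * pow(m, -1, q)) % n
    return x

g, p = 3, 2311                       # p - 1 = 2*3*5*7*11 is smooth
h = pow(g, 1234, p)
x = pohlig_hellman(g, h, p, [2, 3, 5, 7, 11])
assert pow(g, x, p) == h
```

The total cost is dominated by the largest prime factor of the group order, which is exactly why a cryptographic group must have a cardinality with a large prime factor.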
Huge linear systems are frequently encountered as the last steps of ``index-calculus'' based algorithms. Those systems correspond to a particular presentation of the underlying group by generators and relations; they are thus always defined over a base ring of integers modulo the exponent of the group — typically Z/2Z in the case of factorization, and Z/(p − 1)Z when trying to solve a discrete logarithm problem over F_p.
Those systems are often extremely sparse, meaning that they have a very small number of nonzero coefficients.
The classical, naive elimination algorithm of Gauss yields a complexity of O(n^{3}), when the matrix considered has size n × n. However, if we assume that we can perform a matrix multiplication in time O(n^{ω}), algorithms exist which lower this complexity to O(n^{ω}). Furthermore, if we make assumptions on our matrix (mainly that it is sparse, meaning that a matrix-vector product can be computed in time O(n^{γ}) for some γ < 2), then specialized algorithms (Lanczós, Wiedemann) relying only on evaluations of matrix-vector products yield a complexity of O(n^{1+γ}), typically O(n^{2}) for the very sparse matrices (γ = 1) that we often encounter.
Many problems described in the other sections, but also numerous problems in computer algebra or algorithmic number theory, involve at some step the solution of a linear problem or the search for a short linear combination of vectors lying in a finite-dimensional Euclidean space. As examples of this, we could cite factoring and discrete logarithm methods for the former, and finding worst cases for the Table Maker's Dilemma in computer arithmetic for the latter (see Section ).
The important problem in that setting is, given a ``bad'' basis of a lattice, to find a ``good'' one. By good, we mean that it consists of short, almost orthogonal vectors. This is a difficult problem in general, since finding the shortest nonzero vector is already NP-hard, under probabilistic reductions.
In 1982, Lenstra, Lenstra, and Lovász defined the notion of an LLL-reduced basis and described an algorithm to compute such a basis in polynomial time, namely O(n^{2} log M) linear algebra steps (of type matrix-vector multiplication), or O(n^{4} log M) operations on coefficients of size at most O(n log M), therefore giving an O(n^{6} log^{3} M) bit complexity if the underlying arithmetic is naive.
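In dimension 2, lattice reduction specializes to the Lagrange-Gauss algorithm, a two-dimensional analogue of the Euclidean algorithm which LLL generalizes to higher dimensions. The following sketch uses an arbitrary illustrative basis, not data from this report.

```python
def gauss_reduce(u, v):
    """Lagrange-Gauss reduction of a 2D integer lattice basis.

    Repeatedly subtracts from the longer vector the nearest integer
    multiple of the shorter one (the rounded Gram-Schmidt coefficient),
    swapping until no progress is made; the output basis is as short
    as possible."""
    def norm2(w):
        return w[0] * w[0] + w[1] * w[1]
    if norm2(u) > norm2(v):
        u, v = v, u
    while True:
        dot, n2 = v[0] * u[0] + v[1] * u[1], norm2(u)
        m = (2 * dot + n2) // (2 * n2)      # integer nearest to dot / n2
        v = (v[0] - m * u[0], v[1] - m * u[1])
        if norm2(v) >= norm2(u):
            return u, v
        u, v = v, u

# A skewed basis of the lattice Z^2 (determinant -1):
u, v = gauss_reduce((1, 2), (3, 5))
```

On this example the algorithm recovers a basis of two vectors of norm 1, i.e., the standard basis up to sign; in higher dimensions, LLL only guarantees approximately short vectors.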
Recall that the systems we are dealing with are usually systems with coefficients in a finite ring, which can be either small (such as Z/2Z) or quite large.
Given a symmetric matrix A and a vector x, Lanczós' method computes, using the Gram-Schmidt process, an orthogonal basis (w_{i}) of the subspace generated by {x, Ax, ..., A^{n}x} for the scalar product [x, y] = (x^{t}Ay). As soon as one finds an isotropic vector w_{i}, i.e., [w_{i}, w_{i}] = 0, one has w_{i}^{t}Aw_{i} = 0. In our situation, we take A = B^{t}B, where we want to find a vector in the kernel of B; we thus have (Bw_{i})^{t}(Bw_{i}) = 0. Over a finite field this does not always imply Bw_{i} = 0, but this remains true with probability close to 1 over a finite ring of large characteristic. This approach works over Z/2Z as well, but with some caution.
Given a matrix A (not necessarily symmetric) and a vector x, Wiedemann's algorithm looks for a vanishing linear combination of the vectors A^{i}x, i ≥ 0. Such a relation can be written P(A)x = 0 for a polynomial P. Now, if P has zero constant term, we may write P(t) = tQ(t); if u = Q(A)x is a nonzero vector, we have Au = 0, and u is a vector of the kernel of A. The linear combination, in turn, is searched for by choosing a random vector y and computing the elements α_{i} = y^{t}A^{i}x. If a relation of the type we are looking for exists, then (α_{i}) is a linear recurring sequence of order at most n. Given 2n elements of the sequence, the Berlekamp-Massey algorithm allows one to compute the coefficients of the recurrence. Thus, with O(n) matrix-vector and O(n) vector-vector products, one hopes to recover a vector of the kernel. The overall complexity is thus, on average, that of O(n) matrix-vector products — typically O(n^{2}) for very sparse matrices — as announced.
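The steps above can be sketched as follows. This toy version uses a dense 3 × 3 matrix over GF(101) for brevity (both the matrix and the prime are illustrative choices); the whole point of the method, of course, is that A is only ever accessed through matrix-vector products, so a sparse representation plugs in directly.

```python
import random

def mat_vec(A, v, p):
    return [sum(a * b for a, b in zip(row, v)) % p for row in A]

def berlekamp_massey(S, p):
    """Connection polynomial C (C[0] = 1, degree L) of the linearly
    recurring sequence S over GF(p): sum_i C[i]*S[n-i] = 0 for n >= L."""
    C, B = [1], [1]
    L, m, b = 0, 1, 1
    for n in range(len(S)):
        d = S[n] % p
        for i in range(1, L + 1):
            d = (d + C[i] * S[n - i]) % p
        if d == 0:
            m += 1
            continue
        T = C[:]
        coef = d * pow(b, -1, p) % p
        while len(C) < len(B) + m:
            C.append(0)
        for i in range(len(B)):
            C[i + m] = (C[i + m] - coef * B[i]) % p
        if 2 * L <= n:
            L, B, b, m = n + 1 - L, T, d, 1
        else:
            m += 1
    return C[:L + 1]

def wiedemann_kernel(A, p, tries=20):
    """Random nonzero kernel vector of a singular matrix A over GF(p)."""
    n = len(A)
    for _ in range(tries):
        x = [random.randrange(p) for _ in range(n)]
        y = [random.randrange(p) for _ in range(n)]
        v, seq = x[:], []
        for _ in range(2 * n):                 # the sequence y^t A^i x
            seq.append(sum(yi * vi for yi, vi in zip(y, v)) % p)
            v = mat_vec(A, v, p)
        Q = berlekamp_massey(seq, p)           # minimal polynomial, whp
        if Q[-1] != 0:                         # nonzero constant term: retry
            continue
        while Q and Q[-1] == 0:                # Q(t) = t^k * R(t)
            Q = Q[:-1]
        w = [0] * n                            # w = R(A) x by Horner's rule
        for c in Q:
            w = [(wi + c * xi) % p for wi, xi in zip(mat_vec(A, w, p), x)]
        for _ in range(n):                     # push w into the kernel
            if any(w) and not any(mat_vec(A, w, p)):
                return w
            if not any(w):
                break
            w = mat_vec(A, w, p)
    return None

w = wiedemann_kernel([[1, 2, 3], [4, 5, 6], [7, 8, 9]], 101)
assert w is not None and not any(mat_vec([[1, 2, 3], [4, 5, 6], [7, 8, 9]], w, 101))
```

The random projection onto y can lose information with small probability, hence the retry loop; the block versions mentioned below replace x and y by blocks of vectors precisely to amortize this and to expose parallelism.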
Algorithms for solving large sparse linear systems have been designed with implementation, parallelism or distribution in mind, or all of these. The Lanczós and Wiedemann algorithms have ``block'' versions, which one can use in order to take advantage of an advanced computing facility, like a massively parallel computer, or a much cheaper resource like a computer cluster, which can be turned into an effective task force. A key problem is therefore the identification of the computational tasks which either can, or cannot, be effectively spread across many processors or machines. In the case of a computer cluster, evaluating the cost of communications between the nodes taking part in the computation is of course very important. In this regard, the different algorithms (block or non-block versions, Lanczós or Wiedemann) do not compare equally. A variety of running times can be obtained depending on the exact characteristics of the input system (size, density, definition field), the number of computing nodes, and the choice of certain parameters of the algorithms (for the block versions).
The block Wiedemann algorithm has been used by Thomé in the course of solving a 500,000 × 500,000 linear system defined over Z/pZ, where p is a prime of 183 decimal digits. This computation was made feasible by an algorithm based on the Fast Fourier Transform (FFT), which permitted broader distribution of the computation.
Today, block versions of the Lanczós and Wiedemann algorithms are a necessity for anyone who wants to solve the linear systems encountered in record-size factoring problems, discrete logarithm problems, or other cases. Yet, a precise account of the positive and negative sides of both block algorithms, and a formulation of their preferred settings, seems to be missing.
We consider here the following arithmetics: integers, rational numbers, integers modulo a fixed modulus n, finite fields, floating-point numbers and p-adic numbers. We can divide those numbers into two classes: exact numbers (integers, rationals, modular computations or finite fields), and inexact numbers (floating-point and p-adic numbers).
Algorithms on integers (respectively floating-point numbers) are very similar to those on polynomials (respectively Taylor or Laurent series). The main objective in this domain is to find new algorithms that make operations on those numbers more efficient. These new algorithms may use an alternate number representation.
The integral types of current processors have a width w of either 32 or 64 bits. This means that, using hardware instructions, one is only able to compute modulo 2^{32} or 2^{64}. An arbitrary-precision integer N is then usually represented in the form N = Σ_{i} n_{i} 2^{wi}, with each n_{i} a machine integer. In algorithmic terms, it means that a multiple-precision integer is an array of machine integers. Naive operations can then be defined by using the classical ``schoolbook methods'' in base 2^{w}, with linear complexity in the case of addition and subtraction, and quadratic complexity in the case of multiplication and division.
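The limb representation and the schoolbook product can be sketched as follows. This is a toy Python version; the 16-bit limb width is an arbitrary choice for illustration (real libraries use the native w = 32 or 64, and hand-tuned assembly).

```python
W = 16                      # toy limb width; hardware words are 32 or 64 bits
BASE = 1 << W

def to_limbs(n):
    """Little-endian array of base-2^W digits (``limbs'') of n >= 0."""
    limbs = []
    while n:
        limbs.append(n & (BASE - 1))
        n >>= W
    return limbs or [0]

def from_limbs(limbs):
    return sum(d << (W * i) for i, d in enumerate(limbs))

def schoolbook_mul(a, b):
    """Quadratic schoolbook product of two limb arrays, with carry handling."""
    res = [0] * (len(a) + len(b))
    for i, ai in enumerate(a):
        carry = 0
        for j, bj in enumerate(b):
            t = res[i + j] + ai * bj + carry
            res[i + j] = t & (BASE - 1)     # low word stays in this limb
            carry = t >> W                  # high word propagates
        res[i + len(b)] += carry
    return res

x, y = 123456789123456789, 987654321987654321
assert from_limbs(schoolbook_mul(to_limbs(x), to_limbs(y))) == x * y
```

The double loop is the quadratic complexity mentioned above; Karatsuba and FFT-based methods, discussed later, reorganize exactly this computation to do better asymptotically.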
Integers modulo n are usually represented by the representative of their class in the interval [0, n − 1], or sometimes ]−n/2, n/2]. Addition, subtraction and multiplication are obtained from the corresponding operations over the integers, after reduction modulo n. This means that after each operation, a reduction modulo n must be performed. This is not very costly in the case of addition and subtraction (where it implies a single test and, half the time, another addition or subtraction), but it implies a division in the case of multiplication.
Modular division is a completely different operation, and amounts to computing a so-called extended gcd of x and n, i.e., a pair (a, b) with ax + bn = 1. This is classically performed by the Euclidean algorithm or one of its variants, and is thus, in practice, by far the most costly operation. Many improvements in low-level algorithms are obtained by choosing suitable representations of objects which avoid divisions modulo n.
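The extended Euclidean algorithm behind modular division can be sketched in a few lines (values in the example are arbitrary):

```python
def ext_gcd(x, n):
    """Return (g, a, b) with a*x + b*n = g = gcd(x, n), by extended Euclid."""
    a0, a1 = 1, 0      # invariant: a0*X + b0*N = x
    b0, b1 = 0, 1      #            a1*X + b1*N = n
    while n:
        q, r = divmod(x, n)
        x, n = n, r
        a0, a1 = a1, a0 - q * a1
        b0, b1 = b1, b0 - q * b1
    return x, a0, b0

def mod_inverse(x, n):
    """Modular inverse of x mod n, i.e., the modular division 1/x."""
    g, a, _ = ext_gcd(x % n, n)
    if g != 1:
        raise ValueError("x is not invertible modulo n")
    return a % n

assert (17 * mod_inverse(17, 101)) % 101 == 1
```

Each loop iteration performs an integer division, which is why this operation dominates in practice and why representations avoiding modular inversions (projective coordinates on curves, for instance) pay off.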
Finite fields can be separated into two types. Prime fields correspond to the integers modulo n for prime n. Extension fields are algebraic extensions of those prime fields, i.e., polynomials over a prime field modulo an irreducible polynomial P(X). Elements of a nonprime finite field are thus often represented as polynomials with coefficients in a prime field. This means that ideas from polynomial arithmetic can, and should, be used.
A difficult case is the one where p^{deg P} (the cardinality of the field) is large whereas neither p nor deg P really is. The case where p is large is indeed a classical case where we have to deal with arithmetic on large integers, and fast algorithms exist in that case. The case where p is small and deg P large corresponds to the realm of fast polynomial arithmetic. However, in the ``middle range'', neither p nor deg P is large enough to justify the use of fast techniques. This is also at the core of some technical theoretical difficulties.
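The polynomial representation of extension-field elements can be sketched as follows. The field GF(5²) with the modulus X² + 2 is an arbitrary illustrative choice (X² + 2 is irreducible over GF(5) since −2 = 3 is not a square mod 5: the squares are {0, 1, 4}).

```python
# Elements of GF(p^n) as coefficient lists (degree-ascending) of
# polynomials over GF(p), reduced modulo a monic irreducible P(X).
p = 5
P = [2, 0, 1]                         # X^2 + 2, lowest-degree coefficient first

def gf_mul(a, b):
    """Multiply two field elements given as coefficient lists."""
    res = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):        # plain polynomial product over GF(p)
        for j, bj in enumerate(b):
            res[i + j] = (res[i + j] + ai * bj) % p
    dP = len(P) - 1
    while len(res) > dP:              # reduce modulo P (monic, so the
        c = res.pop()                 # leading term cancels exactly)
        if c:
            shift = len(res) - dP
            for i in range(dP):
                res[shift + i] = (res[shift + i] - c * P[i]) % p
    return res + [0] * (dP - len(res))

# (1 + X) * (2 + X) = 2 + 3X + X^2 = 3X in GF(25), since X^2 = -2 = 3.
assert gf_mul([1, 1], [2, 1]) == [0, 3]
```

The two phases (plain product, then reduction) are exactly where the ``middle range'' hurts: the product is too small for FFT techniques, yet too large for single-word tricks.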
A p-adic number is defined as the formal limit of a sequence (x_{n}) of integers such that x_{i} = x_{i+1} mod p^{i}. One could think of it as a formal series Σ_{i≥0} a_{i} p^{i}, with 0 ≤ a_{i} < p, though alternative representations are sometimes more efficient for some computations. In particular, a p-adic number given to precision n is simply an element of Z/p^{n}Z.
The p-adic numbers offer the capability of lifting information known in a finite field to a field of characteristic zero, keeping some structural information at the same time. They are extensively used by many algorithms in computer algebra and algorithms related to algebraic curves, together with their extensions.
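A basic instance of this lifting is Hensel's lemma, implemented as a Newton iteration that doubles the p-adic precision at each step. The values a = 2, p = 7 are arbitrary illustrative choices (3² = 9 ≡ 2 mod 7 gives the starting root).

```python
def hensel_sqrt(a, p, root_mod_p, k):
    """Lift root_mod_p (with root^2 = a mod p, p odd, a a unit)
    to a square root of a modulo p^k, by Newton iteration on
    f(x) = x^2 - a: x <- x - f(x)/f'(x), computed mod p^prec."""
    x, prec = root_mod_p, 1
    while prec < k:
        prec = min(2 * prec, k)       # precision doubles at each step
        m = p ** prec
        x = (x - (x * x - a) * pow(2 * x, -1, m)) % m
    return x

r = hensel_sqrt(2, 7, 3, 10)          # a square root of 2 in Z/7^10 Z
assert (r * r - 2) % 7 ** 10 == 0
```

The quadratic convergence of this iteration is the workhorse of the p-adic point counting algorithms (Satoh, Kedlaya) mentioned earlier.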
When we are trying to lift information from a nonprime finite field, say F_q for q = p^{n}, we are led to introduce algebraic extensions of the p-adic numbers; these can be of two types, unramified extensions and ramified extensions. Roughly speaking, ramified extensions contain fractional powers of p.
In practice, we are mostly interested in the case of small p and unramified extensions. Of lesser importance are the p-adic integers for large p, and extensions of these, because the algorithms we have in mind are generally not practical for large p. Yet, this is not necessarily the case for every possible p-adic algorithm, hence this point of view may change. At present, our application realms do not call for p-adic arithmetic requiring computations in ramified extensions, but this may change in the future as well.
When discussing inexact types, one stumbles very quickly upon two critical difficulties:
since approximation is inherent to the manipulation of inexact types, how should the approximation be performed? This amounts to defining the (necessarily finite) set of numbers that can be exactly represented (the format);
even if the two operands of an operation can be exactly represented, in general the result cannot be. How should one define the result of an operation (the rounding)? This is the key to a precise and portable semantics of floating-point computations.
From now on, we shall focus on floating-point numbers, which are the main inexact data type, at least from the practical point of view.
A floating-point format is a quadruple (β, n, E_{min}, E_{max}); a floating-point number in that format is of the form
±(b_{0}.b_{1}...b_{n−1})·β^{e},
where β is the base — usually 2 or 10 —, n is the significand width, e ∈ [E_{min}, E_{max}] is the exponent, and the b_{i} are the digits, 0 ≤ b_{i} < β. The IEEE 754 standard defines four binary floating-point formats (single precision, single-extended, double precision, double-extended), the single-extended format being obsolete:
format           | total width | base | significand width | E_min  | E_max
single           | 32          | 2    | 24                | -126   | +127
double           | 64          | 2    | 53                | -1022  | +1023
double-extended  | 79          | 2    | 64                | -16382 | +16383
The ongoing revision (754r) drops the single-extended and double-extended formats, and defines a new quadruple precision format (binary128). It also defines new decimal formats:
format      | total width | base | significand width | E_min  | E_max
binary128   | 128         | 2    | 113               | -16382 | +16383
decimal32   | 32          | 10   | 7                 | -95    | +96
decimal64   | 64          | 10   | 16                | -383   | +384
decimal128  | 128         | 10   | 34                | -6143  | +6144
The IEEE 754 standard defines four rounding modes: rounding toward zero, toward +∞, toward −∞, and to nearest-even. It requires that each of the four basic arithmetic operations (+, −, ×, ÷), as well as the square root, be correctly rounded, i.e., the rounded value of a ∘ b for ∘ ∈ {+, −, ×, ÷} must be the machine number closest to the exact value (assuming that the inputs are exact) — as if one were using infinite precision — according to the rounding direction. (In case of an exact result lying exactly in the middle of two consecutive machine numbers, the nearest-even mode chooses the one with an even significand, i.e., ending with b_{n−1} = 0 in binary.)
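The tie-breaking rule can be observed directly in any IEEE-754 binary64 environment; Python floats are binary64 with round-to-nearest-even, so the following tiny experiment illustrates it:

```python
# Ties under round-to-nearest-even in binary64 (beta = 2, n = 53).
eps = 2.0 ** -52    # spacing of consecutive doubles in [1, 2), i.e., one ulp

# 1 + eps/2 is exactly halfway between 1 and 1 + eps: the tie goes to the
# neighbor whose significand ends in an (even) 0 bit, here 1.0 itself.
assert 1.0 + eps / 2 == 1.0

# (1 + eps) + eps/2 is halfway between 1 + eps (odd last bit) and
# 1 + 2*eps (even last bit): the tie rounds *up* this time.
assert (1.0 + eps) + eps / 2 == 1.0 + 2 * eps

# A non-tie simply rounds to the nearest machine number.
assert 1.0 + 0.75 * eps == 1.0 + eps
```

Nearest-even is chosen over, say, always rounding ties up because it is statistically unbiased: over many operations the tie-breaking errors cancel instead of accumulating.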
Let f be a mathematical function (for example the exponential, the logarithm, or a trigonometric function), and consider a given floating-point format (β, n, E_{min}, E_{max}). Assume β = 2, i.e., a binary format, for simplicity. Given a floating-point number x in that format, we want to determine the floating-point number y in that format — or in another output format — that is closest to f(x) for a given rounding mode. In that case, we say that y is correctly rounded. The problem here is that we cannot compute an infinite number of bits of f(x). All we can do is compute an approximation z to f(x) on m > n bits, with an error bounded by one ulp (unit in the last place). Consider for example the arctangent function, with the double-precision number x = 4621447055448553·2^{11}, and rounding to nearest. We have in binary:
where the first line of the expansion contains 53 significant bits, and the second one has 45 consecutive zeros. If m ≤ 99, we shall get an approximation which is exactly the middle of two double-precision numbers, and therefore we will not be able to determine the correct rounding of arctan x. We say that x is a worst case for the arctan function and rounding to nearest. Since a given format contains a finite number of numbers — at most 2^{64} for double precision —, the maximal working precision m required for any x in that format is finite. The Table Maker's Dilemma (TMD for short) consists in determining that maximal working precision m_{max}, which depends on f, the format and the rounding mode, and possibly the corresponding worst cases x. Once we know m_{max}, we can design an efficient routine to correctly round f as follows: (i) compute an m_{max}-bit approximation z to f(x), with an error of at most one ulp; (ii) round z.
Most basic algorithms for integers are believed to be optimal, up to constant factors. The main goal here is thus to save on those constant factors. For multiplication, one challenge is to find the best algorithm for each input size; since the thresholds between the different algorithms (naive, Karatsuba, Toom-Cook, FFT) are machine-dependent, there is no theoretical answer to that question. The same holds for the problem of finding which kind of FFT (Mersenne, Fermat, complex, Discrete Weighted Transform or DWT) is the fastest one for a given application or input size.
For division, it is well known that it can be performed — as can any algebraic operation — in a constant times the cost of the corresponding multiplication: for example, an n × n product corresponds to a (2n)/n division. One main challenge is to decrease that constant factor, say d. In the naive (quadratic) range, we have d = 1, but already in the Karatsuba range, the best known implementation has d = 2. (Van der Hoeven gives an algorithm with d = 1; however, its implementation seems tricky, and its memory usage is superlinear.)
Algorithms for floating-point numbers make great use of those for integers. Indeed, a binary floating-point number may be represented as an integer significand multiplied by 2^{e}. Multiplication of two floating-point numbers therefore reduces to the product of their significands; this product is in fact a short product, since only the high part is needed (assuming all numbers have the same precision). Despite some recent theoretical advances, no great practical speedup has been obtained so far for the computation of a short product with respect to the corresponding plain product. The same holds for division, though an extension of the ideas of the middle product to floating-point numbers might allow one to gain somewhat on division.
A special case of integer division is when the divisor n is constant. This happens in particular in modular or finite field computations (discrete logarithm computations, and factorization via ECM, for instance). There are basically two kinds of algorithms in that case: (i) Barrett's division precomputes an approximation to 1/n, which is used to get an approximation to the quotient, which after a second product yields an approximate remainder; (ii) Montgomery's reduction precomputes −1/n mod β^{k} (where the input n has k words in base β), which gives in two products the value of cβ^{−k} mod n, for c having 2k words in base β. Both algorithms perform two products with operands of the size of n. These products are in fact short products but, according to the above remark, the global cost is close to that of two plain products. A speedup can be obtained in the FFT range, where the second product (to obtain the remainder) produces a known high part (resp. low part) in Barrett's division (resp. Montgomery's reduction); using the fact that the FFT computes that product modulo 2^{m} ± 1, one can save a factor of two for that product, with a global gain of 25%. Together with caching the transforms of the input n and of its approximate inverse, one approaches d = 1. These ideas still need to be implemented in common multiple-precision software.
Recently, a large number of new ``p-adic'' algorithms for solving very concrete problems have been designed, notably for counting points on algebraic varieties defined over finite fields. The application of such algorithms to coding theory or cryptology is immediate, as this is a considerable aid for quickly setting up elliptic curve cryptosystems, or for finding good codes. Some of these algorithms have been listed in Section .
In such algorithms, computations are carried out in ``p-adic structures'', but this vague wording covers a relatively wide variety of mathematical structures (not unrelated to the underlying finite field, of course). We are frequently led to compute in the ring of 2-adic integers, which can be regarded as the integers modulo 2^n for some variable precision n. Also, just as extensions of the rationals are very common in computer algebra in general, the ring of integers of unramified extensions of the 2-adic numbers plays an important role.
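A typical computational pattern in such rings is Newton lifting, where the working precision n (in "integers modulo 2^n") doubles at every step. For instance, inverting an odd integer modulo 2^n (a minimal sketch, not the interface of any particular library):

```python
def inv_2adic(a, n):
    """Inverse of the odd integer a modulo 2^n, by the Newton iteration
    x <- x*(2 - a*x): the number of correct 2-adic digits doubles at
    each step, so only about log2(n) iterations are needed."""
    assert a % 2 == 1
    x, prec = 1, 1               # a*1 = 1 mod 2
    while prec < n:
        prec = min(2 * prec, n)
        x = (x * (2 - a * x)) % (1 << prec)
    return x
```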
Some instances of the TMD are easy. For example, for an algebraic function of total degree d, we get an upper bound m_max ≤ dn + O(1), which is attained when d = 2. Another easy case is base conversion, where the TMD reduces to O(E_max − E_min) computations of continued fractions.
However, in general, and especially for non-algebraic functions, the TMD is a difficult problem, because no rigorous upper bound on m is known, or the corresponding upper bound is much too large. A quick-and-dirty statistical analysis shows that for an n-bit input format (including the exponent bits if needed), the worst case is about m = 2n. But to determine a rigorous bound, the only known methods are based on exhaustive search. Basically, they compute a 2n-bit approximation to f(x) for every x in the given format, and see how many consecutive zeros or ones appear after (or from) the round bit. This naive approach has complexity Θ(2^n). Fortunately, faster — but still exponential — methods do exist. The first one is Lefèvre's algorithm. An improved algorithm of lower (though still exponential) complexity has since been given.
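The naive Θ(2^n) search can be sketched as follows for a toy precision (illustrative names; the real programs evaluate f with guaranteed high-precision arithmetic rather than with exact fractions):

```python
from fractions import Fraction

def run_length(y, prec, guard=64):
    """Run of identical bits just after the round bit when the positive
    Fraction y is rounded to a prec-bit significand: the longer the run,
    the harder y is to round correctly (a "bad case" of the TMD)."""
    while y >= (1 << prec):          # normalize y into [2^(prec-1), 2^prec)
        y /= 2
    while y < (1 << (prec - 1)):
        y *= 2
    frac = y - int(y)                # bits below the target precision
    bits = []
    for _ in range(guard + 1):       # round bit, then 'guard' following bits
        frac *= 2
        bits.append(int(frac))
        frac -= int(frac)
    run = 0
    while run < guard and bits[1 + run] == bits[1]:
        run += 1
    return run

def worst_case(f, inputs, prec):
    """Exhaustive search: the input whose image is hardest to round."""
    return max(inputs, key=lambda x: run_length(f(x), prec))
```

(Exactly representable values produce a run equal to the guard length; real searches must filter those out.)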
The main application domain of our project is cryptology. As has been mentioned several times in this document, curves have taken on increasing importance in cryptology over the last ten years. Various works have shown the usability and usefulness of elliptic curves in cryptology, standards and real-world applications.
We collaborate with the TANC project-team from INRIA Futurs and École polytechnique on the study of the suitability of higher-genus curves for cryptography (mainly hyperelliptic curves of genus two or three). This implies work on three concrete objectives, which are of course highly linked with our main theoretical objectives:
improvement of the arithmetic of those curves, so as to guarantee fast enough ciphering and deciphering;
fast key generation. This rests on fast computations in the curve and on the ability to quickly compute its cardinality. Another approach (complex multiplication) is followed by TANC.
study of the security of the algorithmic primitives relying on curves. This implies attempts at solving discrete logarithm problems in Jacobians using the best known techniques, so as to determine the right key size.
We also have connections to cryptology through the study and development of the integer LLL algorithm, which is one of the favourite tools for cryptanalysing public-key cryptosystems. For example, we can mention the cryptanalysis of knapsack-based cryptosystems, the cryptanalyses of some fast variants of RSA, the cryptanalyses of fast variants of signature schemes such as DSA or Elgamal, and the attacks against lattice-based cryptosystems like NTRU. The use of floating-point arithmetic within this algorithm dramatically speeds it up, which makes the aforementioned cryptanalyses more feasible.
We have strong ties with several computational number theory systems, and code written by members of the project-team can be found in the Magma software and in the Pari/GP software.
Magma ( http://magma.maths.usyd.edu.au/magma/) is the leading computational number theory software. It also has some features of computer algebra (algebraic geometry, polynomial system solving), but not all of what is expected of a computer algebra system. It is developed by the team of John Cannon in Sydney, and while it describes itself as a non-commercial system, it is sold to cover the costs of development, porting and maintenance.
In many areas, programs originating from very specialized research works are ported into Magma by their authors, who are invited to Sydney for this purpose. Several members of our project-team have already visited Sydney; there has even been an official collaboration supported by the French embassy in Sydney involving people from three groups in France (Toulouse, Palaiseau, Nancy) in 2000-2002. Gaudry, Thomé, and Zimmermann visited the Magma group in Sydney in 2001 in order to implement within Magma some code they had written for their personal research (on computing the cardinality of Jacobians of hyperelliptic curves, on computing discrete logarithms, and on the ECM factorization algorithm, respectively). Zimmermann visited the Magma group again in April 2005 to help integrate mpfr and libecm into Magma.
The Magma system now uses mpfr (see Section ) for its multiple-precision floating-point arithmetic.
Pari/GP is a computational number theory system which comes with a library that can be used to access Pari functions from a C program. It was originally developed at the University of Bordeaux 1, and is currently maintained (and expanded) by Karim Belabas, of Bordeaux University. It is free (GPL) software. We sometimes use it to validate our algorithms.
Again, some code written by members of the project has been incorporated into Pari.
Another indirect transfer is the usage of mpfr in GCC (the Gnu Compiler Collection), originally for the gfortran compiler.
The mpfr library is also used by the CGAL software, a library for computational geometry developed at INRIA Sophia-Antipolis.
An important part of the research done in the SPACES project is published within software.
MPFR is one of the main pieces of software developed by the SPACES team. MPFR is a library for computing with arbitrary-precision floating-point numbers with well-defined semantics, distributed under the LGPL license. In particular, all arithmetic operations are performed according to a rounding mode provided by the user, and all results are guaranteed correct to the last bit, according to the given rounding mode.
From September 2003 to August 2005, P. Pélissier joined the MPFR team as junior technical staff, to help improve the efficiency of MPFR for small precisions (up to 200 bits, in particular in double, double-extended and quadruple precision). He also greatly improved the portability of the library, and added the use of libtool to enable dynamic libraries. P. Pélissier is now working for Sopra Group — a small company near Toulouse, subcontractor for Airbus Industry — on the validation of A380 commands.
In October 2005, the MPFR team took part in the ``many digits'' friendly competition organized by the group of Henk Barendregt at the University of Nijmegen, Netherlands.
MPFR 2.2.1 was released on November 29, 2006.
Several software systems use MPFR, for example: the KDE calculator Abakus by Michael Pyne; CGAL (Computational Geometry Algorithms Library), developed by the Geometrica team (INRIA Sophia-Antipolis); Gappa, by Guillaume Melquiond (ARENAIRE team); Genius Math Tool and the GEL language, by Jiri Lebl; GCC; Giac/Xcas, a free computer algebra system, by Bernard Parisse; the iRRAM exact arithmetic implementation by Norbert Müller (University of Trier, Germany); the Magma computational algebra system; and the Wcalc calculator by Kyle Wheeler.
Finally, a paper has been written summarizing the objectives, architecture, and features of MPFR. It will appear in 2007.
MPC is a complex floating-point library developed on top of the MPFR library, and distributed under the LGPL license. It is co-written with Andreas Enge (TANC team, INRIA Futurs). A complex floating-point number is represented as x + iy, where x and y are real floating-point numbers represented using the MPFR library. The MPC library currently implements all basic arithmetic operations, as well as the exponential function, all with correct rounding on both the real part x and the imaginary part y of any result.
GMP-ECM is a program to factor integers using the Elliptic Curve Method. Its efficiency comes both from the use of the GNU MP library, and from the implementation of state-of-the-art algorithms. GMP-ECM contains a library (libecm) in addition to the binary program (ecm). The binary program is distributed under the GPL, while the library is distributed under the LGPL, to allow its integration into other non-GPL software. For example, the Magma computational number theory software uses libecm, starting from version V2.12 of Magma.
Since this project moved to gforge.inria.fr in October 2005, and up to September 2006, there were about 1000 downloads. According to the ``table of champions'' maintained by Richard Brent, GMP-ECM is used by many mathematicians and computer scientists to factor integers, either for fun or for real purposes; for example, it can be used to prove the primality of an integer, since several primality tests require factoring a given proportion of a number.
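The principle of stage 1 of ECM can be sketched as follows (a toy model with naive affine arithmetic and trial-division primality testing; GMP-ECM's actual implementation is far more sophisticated, with Montgomery curves, a fast stage 2, and FFT-based arithmetic):

```python
from math import gcd

def ecm_stage1(n, B1, x0=2, y0=3, a=5):
    """One curve of the Elliptic Curve Method, stage 1 only: work on the
    curve y^2 = x^3 + a*x + b mod n through (x0, y0); a failed modular
    inversion during the point arithmetic reveals a factor of n."""
    def add(P, Q):
        if P is None: return Q
        if Q is None: return P
        (x1, y1), (x2, y2) = P, Q
        if x1 == x2 and (y1 + y2) % n == 0:
            return None                       # point at infinity
        num = (3 * x1 * x1 + a) if P == Q else (y2 - y1)
        den = (2 * y1) if P == Q else (x2 - x1)
        g = gcd(den % n, n)
        if g != 1:
            raise ZeroDivisionError(g)        # non-trivial gcd: factor found
        lam = num * pow(den, -1, n) % n
        x3 = (lam * lam - x1 - x2) % n
        return (x3, (lam * (x1 - x3) - y1) % n)

    def mul(k, P):                            # double-and-add scalar multiple
        R = None
        while k:
            if k & 1: R = add(R, P)
            P = add(P, P)
            k >>= 1
        return R

    P = (x0 % n, y0 % n)
    try:
        for p in range(2, B1 + 1):
            if all(p % d for d in range(2, int(p ** 0.5) + 1)):  # p prime
                q = p
                while q * p <= B1:            # largest prime power <= B1
                    q *= p
                P = mul(q, P)
    except ZeroDivisionError as e:
        g = e.args[0]
        return g if g < n else None
    return None                               # this curve found nothing
```

A factor p is found when the order of the starting point on the curve modulo p has all its prime-power factors at most B1, which is why many random curves are tried in practice.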
The programs searching for the worst cases for the correct rounding of mathematical functions (exp, log, sin, cos, etc.) using Lefèvre's algorithm have been further improved. In particular, several steps use Maple to perform multiple-precision interval arithmetic; the Maple interface had to be completely redesigned to work with Maple 9.5, and is now provided by a separate Perl module Maple.pm.
The results are used:
by us, to detect bugs in MPFR and in the GNU C library (glibc);
by the ARENAIRE team, for their implementation of the mathematical functions with correct rounding.
Bacsel is an efficient, still-evolving implementation (10,000 lines of C code) of the SLZ algorithm for finding worst cases of elementary functions. A (no longer up-to-date) version is available at http://www.loria.fr/~stehle. A release should occur before the end of 2006.
fpLLL is a program (10,000 lines of C code) initiated as a proof of concept for the paper . It is an efficient implementation of several variants of the algorithm described in that paper, ranging from a ``fast'' variant where the output basis is not guaranteed to be LLL-reduced (but should be on most inputs) to a completely rigorous variant. The underlying floating-point arithmetic can also be selected by the user (machine double precision, DPE, MPFR). This code is already distributed at http://www.loria.fr/~stehle, and should evolve into a library in the near future. A tailored version of this code is used in Bacsel. This program is far more stable and efficient than its main competitors, NTL and GP/Pari. It is also more stable and efficient than the Magma LLL code, but the latter is being completely rewritten by D. Stehlé.
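For reference, the classical rational-arithmetic LLL that such floating-point variants accelerate can be written compactly (a textbook sketch using exact fractions; correct but much slower than a floating-point Gram-Schmidt, which is precisely fpLLL's point):

```python
from fractions import Fraction

def lll(B, delta=Fraction(3, 4)):
    """Textbook LLL reduction of a basis given as linearly independent
    integer row vectors, with exact rational Gram-Schmidt."""
    B = [list(b) for b in B]
    n, dim = len(B), len(B[0])

    def gso():
        # Gram-Schmidt: orthogonal vectors B* and coefficients mu
        Bs = []
        mu = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
        for i in range(n):
            v = [Fraction(x) for x in B[i]]
            for j in range(i):
                mu[i][j] = (sum(Fraction(B[i][t]) * Bs[j][t] for t in range(dim))
                            / sum(c * c for c in Bs[j]))
                v = [v[t] - mu[i][j] * Bs[j][t] for t in range(dim)]
            Bs.append(v)
        return Bs, mu

    k = 1
    while k < n:
        Bs, mu = gso()
        for j in range(k - 1, -1, -1):          # size-reduce b_k
            q = round(mu[k][j])
            if q:
                B[k] = [x - q * y for x, y in zip(B[k], B[j])]
                for t in range(j + 1):          # keep mu consistent
                    mu[k][t] -= q * mu[j][t]
        # Lovasz condition (B* is unchanged by size reduction)
        if sum(c * c for c in Bs[k]) >= \
           (delta - mu[k][k - 1] ** 2) * sum(c * c for c in Bs[k - 1]):
            k += 1
        else:
            B[k - 1], B[k] = B[k], B[k - 1]     # swap and step back
            k = max(k - 1, 1)
    return B
```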
CRQ is a library for arbitrary-precision numerical integration (quadrature) developed by Laurent Fousse. It is based on the MPFR library and distributed under the LGPL. Its aim is to extend the idea of correct rounding (present in the IEEE 754 standard and in MPFR) to the more complex operation of numerical integration, or at least to bound its error. Its impact is currently limited, as no numerical software uses it yet. Numerical integration is commonplace in symbolic and numerical systems like Maple or Mathematica, but none of them aims at providing a rigorous bound on the error.
Mploc is a C library for computing in p-adic fields and their unramified extensions. The focus is mainly on the fields of p-adic numbers for prime p, and on unramified extensions of the 2-adic numbers. The ability to compute in these structures is important for several applications, for example computing zeta functions of algebraic varieties (an application which encompasses the problem of point counting on algebraic curves). In a similar realm, some algorithms for constructing curves via complex multiplication methods have a p-adic analogue: the Mploc library can be used for this purpose. The Mploc library is already distributed.
Mpfq is (yet another) library for computing in finite fields. The purpose of Mpfq is not to provide a software layer for accessing finite fields determined at runtime within a computer algebra system like Magma, but rather to provide very efficient, optimized code for computing in finite fields precisely known at compile time. Mpfq is not restricted to any particular finite field, and can adapt to finite fields of any characteristic and any extension degree. Cryptology being one of the target applications, however, Mpfq somewhat focuses on prime fields and on fields of characteristic two.
Mpfq's ability to generate specialized code for the desired finite fields differentiates this library from existing software, and the performance achieved is far superior. Mpfq can readily be used, for example, to assess the throughput of an efficient software implementation of a given cryptosystem. Such an evaluation is the purpose of the ``EBats'' benchmarking tool.
The library's purpose being the generation of code rather than its execution, the working core of Mpfq consists of roughly 5,000 lines of Perl code, which generate most of the currently 13,000 lines of C code. Mpfq is currently under active development, and a first release is expected in 2007.
Two problems remain in order to find all worst cases of the standard C99 functions in the double-precision IEEE 754 format:
periodic functions with large arguments, for example sin x for x near 2^1024. The distance between two consecutive floating-point numbers being large — here 2^971 — with respect to the function period — here 2π — the classical methods (Lefèvre's and the SLZ algorithms) cannot be applied.
two-variable functions like x^y. The problem here is that the input set has up to 2^128 elements.
We have obtained a first result for the first problem. Namely, if δ is the distance between two consecutive floating-point numbers, the idea is to search for an integer multiple ε = qδ that is small after reduction by the period τ, using for example the continued fraction expansion of δ/τ. Then we apply the classical methods (Lefèvre's and the SLZ algorithms) to arithmetic progressions of the form x_i = x_0 + iε, instead of x_i = x_0 + iδ in the classical case. The obtained complexity is slightly worse than in the classical case, because ε is not as small as δ = ulp(x_i). We were however able to find some nontrivial bad cases in a few days of computing time.
We expect this first result to lead to new developments. In particular, a complete description of the x_0 + iε that fall in a small interval modulo the period τ might lead to a still better algorithm.
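The continued-fraction step can be sketched as follows (exact rationals for illustration; in the real setting δ = 2^971 and the period is a high-precision value of 2π, here stood in for by a 21-digit rational approximation of π):

```python
from fractions import Fraction

def convergents(x, qmax):
    """Continued-fraction convergents p/q of the positive rational x,
    with q <= qmax.  If x = delta/period, then |q*delta - p*period| is
    smaller than period/q: the multiple q*delta falls exceptionally
    close to a multiple of the period."""
    out = []
    p_prev, q_prev = 1, 0
    p, q = int(x), 1
    frac = x - int(x)
    while q <= qmax:
        out.append((p, q))
        if not frac:
            break
        a = int(1 / frac)                  # next partial quotient
        frac = 1 / frac - a
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
    return out
```

For instance, with δ = 1 and the period ≈ π, the convergent 113/355 recovers the classical fact that 355 is unusually close to a multiple of π (355 − 113π ≈ 3·10^-5).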
Another important topic in computer arithmetic is the search for polynomial approximations to functions. Indeed, instead of implementing a sophisticated algorithm to compute a function, it is often more efficient to precompute good polynomial approximations over subintervals, and to evaluate only these approximations. However, in order to get good control of the evaluation error, one should use polynomials with floating-point coefficients. This problem has been studied jointly with N. Brisebarre (Arénaire project-team, Lyon). We have shown that, when studying approximation in the L^2 sense, i.e., finding a polynomial P making the L^2 norm of f − P minimal for some function f and some measure μ over an interval I, it is possible to find the best polynomial P with floating-point coefficients; this has been shown by reducing the problem to that of finding a closest vector in a lattice. We also show that the reduction can be performed the other way, thus showing that L^2 approximation is NP-hard.
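The real-coefficient version of the problem is just a linear (normal-equations) solve; the difficulty comes entirely from the extra constraint that the coefficients be floating-point numbers. A toy instance with exact rationals (illustrative names, degree 1 only):

```python
from fractions import Fraction

def l2_best_affine(moments, rhs):
    """Best L2 approximation a*x + b of a function f, from the 2x2
    normal equations: Gram matrix [[m0, m1], [m1, m2]] of the basis
    {1, x} for the given measure, right-hand side (<f,1>, <f,x>)."""
    m0, m1, m2 = moments
    r0, r1 = rhs
    det = m0 * m2 - m1 * m1
    b = (r0 * m2 - r1 * m1) / det      # Cramer's rule
    a = (m0 * r1 - m1 * r0) / det
    return a, b

# f(x) = x^2 on [0, 1] with Lebesgue measure: moments 1, 1/2, 1/3 and
# <f,1> = 1/3, <f,x> = 1/4, giving the classical answer P(x) = x - 1/6.
a, b = l2_best_affine((Fraction(1), Fraction(1, 2), Fraction(1, 3)),
                      (Fraction(1, 3), Fraction(1, 4)))
```

Rounding a and b independently to floating-point numbers need not give the best float-coefficient polynomial, which is exactly why the closest-vector (lattice) formulation of the text is needed.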
The joint work with Richard Brent and Colin Percival on the fine error analysis of complex floating-point multiplication was accepted for publication in Mathematics of Computation, and is currently in press.
The paper written with Diem and Thériault on the algorithm developed in 2005 (using a double large prime variation for the discrete logarithm problem, DLP for short, in Jacobians of curves) has been accepted. In the most important cases, genus 3 and genus 4 curves, it brings the following changes to the complexity of the DLP in the Jacobian of a curve over a finite field with q elements:

  g | Index calculus | 1 large prime | our algorithm
  3 |    q^{3/2}     |    q^{10/7}   |    q^{4/3}
  4 |    q^{8/5}     |    q^{14/9}   |    q^{3/2}
Another important contribution of our work was to carry out computational experiments demonstrating that the asymptotically fast algorithms are also the fastest in practice, already for small sizes.
Our work has since been improved by Diem for curves of small degree. In the particular case of non-hyperelliptic curves of genus 3, this has been studied more precisely by Diem and Thomé. The conclusion is that for those curves there exists an attack of complexity O(q), and the algorithm is quite practical. This is a very important result: until recently, curves of genus 3 were studied as potential replacements for elliptic curves in cryptosystems; it is now clear that there is not much hope in that direction for non-hyperelliptic curves.
Another contribution in the context of discrete logarithms has been obtained by Enge and Gaudry. For a general curve of large enough genus g over a finite field with q elements, the complexity of a discrete logarithm computation is in L_{q^g}(1/2), where L(·) is the classical subexponential function (this has recently been proven in a rigorous way by Hess). Enge and Gaudry have shown that for plane curves having a particular shape of degrees in x and y, this complexity can be reduced heuristically to L_{q^g}(1/3), recovering the kind of complexity we have for integer factorization or discrete logarithms in finite fields.
On the point counting side, after a series of lectures given at IHP, Gaudry has written a survey article that takes a snapshot of the current situation and highlights the major difficulties that must be overcome to make further progress, in particular in the non-elliptic, large-characteristic case.
In the case of elliptic curves over large prime fields, Gaudry and Morain have cleaned up the phase called ``Eigenvalue computation'' of the SEA algorithm. Using algorithmic tools from computer algebra, they reduced both the theoretical complexity of this phase and its practical running time.
The alternative approach to point counting is the CM method, which produces a curve together with its number of points. In a collaboration between Gaudry, Houtmann, Weng, Ritzenthaler and Kohel, started at LIX, it is shown that a 2-adic algorithm can be used to speed up the computations in the case of genus 2 curves. This work has recently been accepted.
Gaudry and specialists of protocol design have worked together to improve the efficiency of key exchange when elliptic curves are used as a building block. This resulted in a fast protocol that is provably secure in the standard model (no random oracle needed).
MPQS is a program that factors integers using the Multiple Polynomial Quadratic Sieve, developed by Scott Contini and Paul Zimmermann. It is distributed under GPL from http://www.loria.fr/~zimmerma/free/. A license agreement is under discussion with Waterloo Maple Inc. (WMI), to enable the use of a fixed version of the MPQS software within the Maple computer algebra software.
The team has obtained financial support from the ANR (``programme blanc'') for a project, joint with the TANC project-team and the number theory team of the mathematics laboratory in Nancy, to study the number field sieve algorithm.
The team has obtained financial support from the ANR ("Sécurité et Informatique", SETIN 2006 program) for a research project, joint with the CODES project-team, the XLIM laboratory (Arithmetic, Codes and Cryptography group) of the University of Limoges, and the CITI laboratory (Middleware and Security group) of INSA Lyon. This project is coordinated by Marion Videau.
The research concerns stream ciphers, especially those designed for constrained environments. This is a particularly hot topic, as no such cipher can presently be declared secure. The project also participates in the analysis efforts towards the final evaluation of the proposals submitted to the eSTREAM stream cipher project launched by ECRYPT, the European Network of Excellence in Cryptology.
We have a grant from the French Ministry of Foreign Affairs in the PAI program (Programme d'Actions Intégrées) with Germany. This is an exchange research program with Florian Heß and the ``Algebra und Zahlentheorie'' group in the TU Berlin. The topic fits with our overall objectives, since the goal is to investigate new methods in number theory and geometry with a view towards cryptology.
The members of the project organized the 7th Real Numbers and Computers conference (RNC'7) in July 2006 (see http://rnc7.loria.fr). E. Thomé was publicity chair, L. Fousse and V. Lefèvre organized a ``friendly competition'', C. Simon was in charge of the invited speakers and the conference budget, and G. Hanrot and P. Zimmermann were co-chairs of the program committee and editors of the conference proceedings.
Emmanuel Thomé co-organizes the Journées Nationales de Calcul Formel, to be held in Luminy in 2007.
Emmanuel Thomé serves on the program committee of the C2 workshop ( Codage et Cryptographie), held in Eymoutiers in October 2006.
G. Hanrot and P. Zimmermann have been program cochairs of the RNC'7 conference, that took place in Nancy in July 2006.
G. Hanrot is vice-head of the Project Committee of INRIA Lorraine. He is also an appointed member of the INRIA Commission d'Évaluation, and of the Mathematics ``Commissions de Spécialistes'' of the Universities of Montpellier 2, Henri-Poincaré Nancy 1/Nancy 2/INPL, and Jean-Monnet Saint-Étienne. He was a member of the hiring committee for CR2 at INRIA Futurs and INRIA Sophia-Antipolis in 2006. He is a member of the steering committee of the RNC conference. In 2006, he was one of the reviewers of the PhD theses of R. Dupont (École polytechnique) and G. Melquiond (E.N.S. Lyon), and a member of the committee for M. Abouzaid's PhD thesis (Bordeaux 1).
P. Zimmermann is an elected member of the INRIA Evaluation Committee, and of the Computer Science ``Commission de Spécialistes'' of the University Henri Poincaré Nancy 1. He is also a member of the steering committee of the RNC conference, of the program committee of ARITH'18 (to be held in 2007), and of the editorial board of a special issue of the Journal of Logic and Algebraic Programming on the development of exact real number computation.
P. Gaudry is an appointed member of the Computer Science ``Commissions de Spécialistes'' of the Universities Henri-Poincaré Nancy 1 and Paris 8.
P. Gaudry and G. Hanrot each gave three 3-hour lectures at MPRI (Master Parisien de Recherche en Informatique) about algorithmic number theory, in the Cryptology course.
P. Gaudry and G. Hanrot are members of the jury of ``agrégation externe de mathématiques'', a competitive exam to hire high school teachers.