Section: New Results
Algebraic computing and high-performance kernels
Multiple binomial sums
Multiple binomial sums form a large class of multi-indexed sequences, closed under partial summation, which contains most of the sequences obtained by multiple summation of binomial coefficients and also all the sequences with algebraic generating function. We study the representation of the generating functions of binomial sums by integrals of rational functions. The outcome is twofold. Firstly, we show that a univariate sequence is a multiple binomial sum if and only if its generating function is the diagonal of a rational function. Secondly, we propose algorithms that decide the equality of multiple binomial sums and that compute recurrence relations for them. In conjunction with geometric simplifications of the integral representations, this approach behaves well in practice. The process avoids the computation of certificates and the problem of accurate summation that afflicts discrete creative telescoping, both in theory and in practice [12].
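As a classical illustration (not taken from [12]), the central binomial coefficients show how a binomial sum turns into the diagonal of a rational function:

\[
  \sum_{k=0}^{n} \binom{n}{k}^{2} \;=\; \binom{2n}{n},
  \qquad
  \frac{1}{1-x-y} \;=\; \sum_{i,j\ge 0} \binom{i+j}{i}\, x^{i} y^{j},
\]

so the diagonal coefficients of \(1/(1-x-y)\) are exactly the \(\binom{2n}{n}\), and

\[
  \operatorname{Diag}\,\frac{1}{1-x-y}
  \;=\; \sum_{n\ge 0} \binom{2n}{n}\, t^{n}
  \;=\; \frac{1}{\sqrt{1-4t}} .
\]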
Algebraic Diagonals and Walks: Algorithms, Bounds, Complexity
The diagonal of a multivariate power series F is the univariate power series formed by its diagonal coefficients. When F is the Taylor expansion of a bivariate rational function, this diagonal is an algebraic function; this work studies algorithms, size bounds and complexity estimates for computing a polynomial equation satisfied by such a diagonal, together with applications to the enumeration of lattice walks.
Computing minimal interpolation bases
In [20]
we consider the problem of computing univariate polynomial matrices over a field that represent minimal solution bases for a general interpolation problem, some forms of which are the vector M-Padé approximation problem in [Van Barel and Bultheel, Numerical Algorithms 3, 1992] and the rational interpolation problem in [Beckermann and Labahn, SIAM J. Matrix Anal. Appl. 22, 2000]. Particular instances of this problem include the bivariate interpolation steps of Guruswami-Sudan hard-decision and Kötter-Vardy soft-decision decodings of Reed-Solomon codes, the multivariate interpolation step of list-decoding of folded Reed-Solomon codes, and Hermite-Padé approximation. In the mentioned references, the problem is solved using iterative algorithms based on recurrence relations. Here, we discuss a fast, divide-and-conquer version of this recurrence, taking advantage of fast matrix computations over the scalars and over the polynomials. This new algorithm is deterministic, and for computing shifted minimal bases of relations between m vectors of size σ it uses Õ(m^(ω−1)(σ + |s|)) field operations, where ω is the exponent of matrix multiplication and |s| is the sum of the entries of the input shift s.
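As a toy instance of such an interpolation problem (our illustration, not an example from [20]), Hermite-Padé approximation of the exponential series asks for low-degree polynomials p, q such that p + q f vanishes to a prescribed order:

\[
  f = 1 + x + \tfrac{x^{2}}{2} + \tfrac{x^{3}}{6} + \cdots,
  \qquad
  p = 1 + \tfrac{x}{2}, \quad q = -\bigl(1 - \tfrac{x}{2}\bigr),
\]
\[
  p + q\,f \;=\; \tfrac{x^{3}}{12} + O(x^{4}) \;\equiv\; 0 \pmod{x^{3}},
\]

so the vector (p, q) is a relation of order 3; a minimal interpolation basis gathers such relations with minimal shifted degrees.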
Fast and deterministic computation of the Hermite normal form and determinant of a polynomial matrix
Given a nonsingular n × n matrix of univariate polynomials over a field, we give fast and deterministic algorithms for computing its determinant and its Hermite normal form.
Computing canonical bases of modules of univariate relations
We study in [44] the computation of canonical bases of sets of univariate relations, that is, of vectors of polynomials (p_1, ..., p_m) such that p_1 f_1 + ... + p_m f_m = 0 for given elements f_1, ..., f_m of a finite-dimensional K[x]-module.
Matrices with displacement structure: generalized operators and faster algorithms
For matrices with displacement structure, basic operations like multiplication, inversion, and linear system solving can be expressed in terms of the following task: evaluate the product A·B, where A is a structured n × n matrix of displacement rank α and B is an arbitrary n × α matrix. In this work, the classical displacement operators are generalized and faster algorithms are designed for this product.
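For instance (a standard example, not specific to this work), an n × n Toeplitz matrix T = (t_{i−j}) has displacement rank at most 2: with Z the down-shift matrix,

\[
  \nabla(T) \;=\; T - Z\,T\,Z^{\mathsf T},
  \qquad
  (Z)_{i,j} =
  \begin{cases}
    1 & \text{if } i = j+1,\\
    0 & \text{otherwise},
  \end{cases}
\]

is zero outside its first row and first column, hence has rank at most 2; a length-α generator of the displacement thus represents a structured matrix with O(αn) field elements instead of n².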
Absolute real root separation
While the separation (the minimal nonzero distance) between roots of a polynomial is a classical topic, its absolute counterpart (the minimal nonzero distance between their absolute values) does not seem to have been studied much. We present the general context and give tight bounds for the case of real roots [14].
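A small example (ours, for illustration) shows why the two quantities can behave very differently:

\[
  p(x) = (x - 2)\,(x + 2 + \varepsilon), \qquad 0 < \varepsilon \ll 1,
\]

has real roots 2 and −2 − ε, so the classical separation is 4 + ε while the absolute separation is only ε; the bounds of [14] quantify how small this absolute separation can be for integer polynomials of given degree and height.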
Weighted Lattice Walks and Universality Classes
In this work we consider two different aspects of weighted walks in cones. To begin, we examine a particular weighted model, known as the Gouyou-Beauchamps model. Using the theory of analytic combinatorics in several variables we obtain the asymptotic expansion of the total number of Gouyou-Beauchamps walks confined to the quarter plane. Our formulas are parametrized by weights and starting point, and we identify six different asymptotic regimes (called universality classes) which arise according to the values of the weights. The weights allowed in this model satisfy natural algebraic identities permitting an expression of the weighted generating function in terms of the generating function of unweighted walks on the same steps. The second part of this article explains these identities combinatorially for walks in arbitrary cones and dimensions, and provides a characterization of universality classes for general weighted walks. Furthermore, we describe an infinite set of models with non-D-finite generating functions [15].
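As an illustration of the unweighted model, the short C program below counts quarter-plane walks for the Gouyou-Beauchamps step set {(1,0), (−1,0), (−1,1), (1,−1)} by dynamic programming; it is a sketch written for this report, not code from [15].

#include <stdio.h>
#include <string.h>

#define N 20                          /* maximal walk length */

/* Gouyou-Beauchamps step set: E, W, NW, SE. */
static const int dx[4] = { 1, -1, -1,  1 };
static const int dy[4] = { 0,  0,  1, -1 };

int main(void)
{
    /* cur[x][y] = number of walks of the current length that start at the
       origin, stay in the quarter plane, and end at (x, y). */
    static unsigned long long cur[N + 1][N + 1], nxt[N + 1][N + 1];
    cur[0][0] = 1;

    for (int n = 1; n <= N; n++) {
        memset(nxt, 0, sizeof nxt);
        for (int x = 0; x <= N; x++)
            for (int y = 0; y <= N; y++)
                if (cur[x][y])
                    for (int s = 0; s < 4; s++) {
                        int X = x + dx[s], Y = y + dy[s];
                        if (X >= 0 && Y >= 0 && X <= N && Y <= N)
                            nxt[X][Y] += cur[x][y];
                    }
        memcpy(cur, nxt, sizeof cur);

        unsigned long long total = 0;
        for (int x = 0; x <= N; x++)
            for (int y = 0; y <= N; y++)
                total += cur[x][y];
        printf("GB walks of length %2d confined to the quarter plane: %llu\n",
               n, total);
    }
    return 0;
}

A weighted variant is obtained by attaching a multiplicative weight to each step and summing weights instead of counting walks, which is the setting studied in [15].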
Introduction to the IEEE 1788-2015 Standard for Interval Arithmetic
Interval arithmetic is a tool of choice for numerical software verification, as every result computed using this arithmetic is self-verified: every result is an interval that is guaranteed to contain the exact numerical values, regardless of uncertainty or roundoff errors. From 2008 to 2015, interval arithmetic underwent a standardization effort, resulting in the IEEE 1788-2015 standard. The main features of this standard are developed in [26]: the structure into levels, from the mathematical model to the implementation on computers; the possibility to accommodate different mathematical models, called flavors; the decoration system that keeps track of relevant events during the course of a calculation; the exact dot product for point (as opposed to interval) vectors.
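As a minimal sketch of the containment guarantee (our own illustration in C, not an implementation of the IEEE 1788-2015 library level), interval addition can be realized with the two directed rounding modes of IEEE 754; it assumes the compiler honors run-time rounding-mode changes (no aggressive floating-point optimizations).

#include <fenv.h>
#include <math.h>
#include <stdio.h>

#pragma STDC FENV_ACCESS ON           /* the rounding mode is changed at run time */

typedef struct { double lo, hi; } interval;

/* Enclosure of x + y: lower bound rounded down, upper bound rounded up,
   so the exact sum of any points of x and y lies in the result.        */
static interval add(interval x, interval y)
{
    interval r;
    fesetround(FE_DOWNWARD);  r.lo = x.lo + y.lo;
    fesetround(FE_UPWARD);    r.hi = x.hi + y.hi;
    fesetround(FE_TONEAREST);
    return r;
}

int main(void)
{
    /* A thin interval enclosing the real number 1/10. */
    interval x = { nextafter(0.1, 0.0), nextafter(0.1, 1.0) };
    interval s = { 0.0, 0.0 };
    for (int i = 0; i < 10; i++)
        s = add(s, x);
    /* The exact value 1 = 10 * (1/10) is guaranteed to lie in [s.lo, s.hi]. */
    printf("[%.17g, %.17g]\n", s.lo, s.hi);
    return 0;
}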
Influence of the Condition Number on Interval Computations: Some Examples
The condition number is a quantity that is well-known in “classical” numerical analysis, that is, where numerical computations are performed using floating-point numbers. This quantity appears much less frequently in interval numerical analysis, that is, where the computations are performed on intervals. In [56], two aspects are developed. On the one hand, it is stressed that the notion of condition number already appears in the literature on interval analysis, even if it does not bear that name. On the other hand, three small examples are used to illustrate experimentally the impact of the condition number on interval computations. As expected, problems with a larger condition number are more difficult to solve: this means either that the solution is not very accurate (for moderate condition numbers) or that the method fails to solve the problem, even inaccurately (for larger condition numbers). Different strategies to counteract the impact of the condition number are discussed and evaluated experimentally: the use of higher precision, iterative refinement, and bisection of the input.
Error bounds on complex floating-point multiplication with an FMA
The accuracy analysis of complex floating-point multiplication done by
Brent, Percival, and Zimmermann is extended to the case where a fused multiply-add (FMA) operation is available.
Considering floating-point arithmetic with rounding to nearest and unit roundoff u, we show that their √5·u bound on the normwise relative error of complex multiplication can be reduced to 2u when the FMA is used in the most naive way, and that this bound 2u is asymptotically optimal.
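A sketch of this kind of algorithm (our formulation): each part of the product is evaluated with one multiplication and one FMA.

#include <math.h>

/* z = (a + i b)(c + i d):  re = a*c - b*d,  im = a*d + b*c,
   each part computed with a single FMA.  With rounding to nearest, this
   straightforward use of the FMA already improves the normwise relative
   error bound over the FMA-less evaluation.                             */
static void cmul_fma(double a, double b, double c, double d,
                     double *re, double *im)
{
    *re = fma(a, c, -(b * d));        /* RN(a*c - RN(b*d)) */
    *im = fma(a, d,   b * c );        /* RN(a*d + RN(b*c)) */
}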
Automatic source-to-source error compensation of floating-point programs
Numerical programs with IEEE 754 floating-point computations may suffer from inaccuracies, since finite precision arithmetic is an approximation of real arithmetic. Solutions that reduce the loss of accuracy are available, such as compensated algorithms or double-double precision floating-point arithmetic. Our goal is to automatically improve the numerical quality of a numerical program with the smallest impact on its performance. In [25] we define and implement source code transformations in order to derive automatically compensated programs. We present several experimental results to compare the transformed programs and existing solutions. The transformed programs are as accurate and efficient as the implementations of compensated algorithms when the latter exist. Furthermore, we propose some transformation strategies allowing us to partially improve the accuracy of programs and to tune the impact on execution time. Trade-offs between accuracy and performance are ensured by code synthesis. Experimental results show that, with the help of the tools presented here, user-defined trade-offs are achievable in a reasonable amount of time.
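As a classical example of the kind of compensated code targeted by such transformations (textbook material, not output of the tool of [25]), here is Kahan's compensated summation; it assumes the compiler does not reassociate floating-point operations (no -ffast-math).

#include <stddef.h>

/* Kahan compensated summation: c accumulates the low-order bits lost when
   adding x[i] to the running sum s, and re-injects them at the next step. */
static double kahan_sum(const double *x, size_t n)
{
    double s = 0.0, c = 0.0;
    for (size_t i = 0; i < n; i++) {
        double y = x[i] - c;          /* corrected term                 */
        double t = s + y;             /* new (rounded) running sum      */
        c = (t - s) - y;              /* rounding error of the addition */
        s = t;
    }
    return s;
}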
Formal correctness of comparison algorithms between binary64 and decimal64 floating-point numbers
We present a full Coq formalisation of the correctness of some comparison algorithms between binary64 and decimal64 floating-point numbers [28].
Implementation and performance evaluation of an extended precision floating-point arithmetic library for high-accuracy semidefinite programming
Semidefinite programming (SDP) is widely used in optimization problems with many applications; however, certain SDP instances are ill-posed and need more precision than the standard double precision available. Moreover, these problems are large-scale and could benefit from parallelization on specialized architectures such as GPUs. In this article, we implement and evaluate the performance of a floating-point expansion-based arithmetic library (newFPLib) in the context of such numerically highly accurate SDP solvers. We plugged newFPLib into the state-of-the-art SDPA solver for both CPU and GPU-tuned implementations. We compare and contrast both the numerical accuracy and the performance of SDPA-GMP, SDPA-QD and SDPA-DD, which employ other multiple-precision arithmetic libraries, against SDPA-newFPLib. We show that our newFPLib is a very good trade-off between accuracy and speed when solving ill-conditioned SDP problems [38].
The classical relative error bounds for computing √(a²+b²) and c/√(a²+b²) in binary floating-point arithmetic are asymptotically optimal
We study the accuracy of classical algorithms for evaluating expressions of the form √(a²+b²) and c/√(a²+b²) in binary floating-point arithmetic, assuming that the elementary operations are rounded to nearest, and we show that the relative error bounds obtained from the straightforward error analyses of these algorithms are asymptotically optimal.
On the relative error of computing complex square roots in floating-point arithmetic
We study the accuracy of a classical approach to computing complex square roots in floating-point arithmetic. Our analyses are done in binary floating-point arithmetic in precision p, assuming that the basic arithmetic operations are rounded to nearest.
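A sketch of the classical approach in question (our formulation, restricted to the case x ≥ 0; the use of hypot is one common way to avoid spurious overflow):

#include <math.h>

/* Principal square root of z = x + i*y for x >= 0 and z != 0, using the
   classical formulas  re = sqrt((|z| + x)/2)  and  im = y / (2*re).     */
static void csqrt_classical(double x, double y, double *re, double *im)
{
    double m = hypot(x, y);           /* |z| */
    *re = sqrt(0.5 * (m + x));
    *im = y / (2.0 * *re);
}

For x < 0 one computes the imaginary part first and recovers the real part symmetrically.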
More accurate complex multiplication for embedded processors
In [36] we present some work in progress on the development of fast and accurate support for complex floating-point arithmetic on embedded processors. Focusing on the case of multiplication, we describe algorithms and implementations for computing both the real and imaginary parts with high relative accuracy. We show that, in practice, such accuracy guarantees can be achieved with reasonable overhead compared with conventional algorithms (which are those offered by current implementations and for which the real or imaginary part of a product can have no correct digit at all). For example, the average execution-time overheads when computing an FFT on the ARM Cortex-A53 and -A57 processors range from 1.04x to 1.17x only, while arithmetic costs suggest overheads from 1.5x to 1.8x.
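One well-known way to reach high componentwise accuracy is Kahan's FMA-based algorithm for expressions of the form ab + cd; the sketch below is our illustration of this technique, not the implementation evaluated in [36].

#include <math.h>

/* Kahan's algorithm: a*b + c*d with high relative accuracy, using an FMA
   to recover the rounding error of the product c*d.                      */
static double kahan_dot2(double a, double b, double c, double d)
{
    double w = c * d;
    double e = fma(c, d, -w);         /* exact error of RN(c*d) */
    double f = fma(a, b,  w);         /* RN(a*b + w)            */
    return f + e;
}

/* Componentwise accurate complex product (a + i b)(c + i d). */
static void cmul_accurate(double a, double b, double c, double d,
                          double *re, double *im)
{
    *re = kahan_dot2(a, c, -b, d);    /* a*c - b*d */
    *im = kahan_dot2(a, d,  b, c);    /* a*d + b*c */
}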
Tight and rigorous error bounds for basic building blocks of double-word arithmetic
We analyze several classical basic building blocks of double-word arithmetic (frequently called “double-double arithmetic” in the literature): the addition of a double-word number and a floating-point number, the addition of two double-word numbers, the multiplication of a double-word number by a floating-point number, the multiplication of two double-word numbers, the division of a double-word number by a floating-point number, and the division of two double-word numbers. For multiplication and division we get better relative error bounds than the ones previously published. For addition of two double-word numbers, we show that the previously published bound was incorrect, and we provide a new relative error bound. We introduce new algorithms for division. We also give examples that illustrate the tightness of our bounds [21].
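As an example of one of these building blocks, here is the usual double-word plus floating-point addition, written as a sketch following the standard formulation (the error bounds of [21] apply to algorithms of this kind); it assumes rounding to nearest and no overflow.

/* Error-free transformations. */
static void two_sum(double a, double b, double *s, double *e)
{
    double s0 = a + b;
    double bp = s0 - a;
    double ap = s0 - bp;
    *s = s0;
    *e = (a - ap) + (b - bp);
}

static void fast_two_sum(double a, double b, double *s, double *e)
{
    /* Requires |a| >= |b|. */
    double s0 = a + b;
    *s = s0;
    *e = b - (s0 - a);
}

/* Double-word + floating-point addition: (zh, zl) approximates (xh + xl) + y,
   where (xh, xl) is a double-word number, i.e. xh = RN(xh + xl).            */
static void dw_plus_fp(double xh, double xl, double y, double *zh, double *zl)
{
    double sh, sl, v;
    two_sum(xh, y, &sh, &sl);         /* exact: sh + sl = xh + y        */
    v = xl + sl;                      /* rounded                        */
    fast_two_sum(sh, v, zh, zl);      /* renormalize into a double-word */
}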
On the robustness of the 2Sum and Fast2Sum algorithms
The 2Sum and Fast2Sum algorithms are important building blocks in numerical computing. They are used (implicitly or explicitly) in many compensated algorithms (such as compensated summation or compensated polynomial evaluation). They are also used for manipulating floating-point expansions. We show that these algorithms are much more robust than usually believed: the returned result makes sense even when the rounding function is not round-to-nearest, and they are almost immune to overflow [9].
Formal verification of a floating-point expansion renormalization algorithm
Many numerical problems require a higher computing precision than the one offered by standard floating-point formats. A common way of extending the precision is to use floating-point expansions. As the problems may be critical and as the algorithms used have very complex proofs (with many sub-cases), a formal guarantee of correctness is a wish that can now be fulfilled, using interactive theorem proving. In this article we give a formal proof in Coq of one of the algorithms used as a basic building block when computing with floating-point expansions: the renormalization, which is usually applied after each operation. It is a critical step needed to ensure that the resulting expansion has the same properties as the input one, and is more “compressed”. The formal proof uncovered several gaps in the pen-and-paper proof and provides the algorithm with a very high level of guarantee [30].
Interactive proof protocols
We present in [46] an interactive probabilistic proof protocol that certifies
in
New development on GNU MPFR
Work on the new fast, low-level algorithm to compute the correctly rounded summation of several floating-point numbers in arbitrary precision in radix 2 (each number having its own precision), and its implementation in GNU MPFR (new mpfr_sum function), has been completed [23].
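A minimal usage sketch of the new function (following the interface documented for MPFR 4.0; error handling omitted): the summands may have arbitrary, distinct precisions, and the result is correctly rounded.

#include <stdio.h>
#include <mpfr.h>

int main(void)
{
    mpfr_t a, b, c, s;
    mpfr_init2(a, 12);   mpfr_set_d(a,  1.0,   MPFR_RNDN);
    mpfr_init2(b, 200);  mpfr_set_d(b,  1e-60, MPFR_RNDN);
    mpfr_init2(c, 53);   mpfr_set_d(c, -1.0,   MPFR_RNDN);
    mpfr_init2(s, 64);

    mpfr_ptr tab[3] = { a, b, c };
    mpfr_sum(s, tab, 3, MPFR_RNDN);   /* correctly rounded, despite the cancellation */

    mpfr_printf("sum = %.20Rg\n", s);
    mpfr_clears(a, b, c, s, (mpfr_ptr) 0);
    return 0;
}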
The basic operations of GNU MPFR have also been optimized for small precisions, and faithful rounding (mainly for internal use) is now partly supported [39].
These improvements, among many other ones, will be available in GNU MPFR 4.0.0; a release candidate was distributed in December 2017.