

Section: New Results

Towards ultra-scale Big Optimization

In 2019, we addressed the ultra-scale optimization research line along two main directions: designing efficient optimization algorithms that scale with the number of processing cores and/or with the size of the problem instances tackled. For the first direction, the challenge is to take into account, in addition to the traditional performance objective, productivity awareness, which is highly important for dealing with the increasing complexity of modern supercomputers. With the short-term perspective of establishing a collaboration with one of the major HPC builders (Cray Inc.), we have investigated in [13], [24], [25] the exascale-aware Chapel language. Regarding the second direction, we contributed to solving difficult unsolved flow-shop [15] and Quadratic Assignment Problem (QAP) [22] permutation problem instances using moderate-scale parallelism. The contributions are summarized below.

Towards ultra-scale Branch-and-Bound using a high-productivity language

Participants : Tiago Carneiro Pessoa, Jan Gmys, Nouredine Melab, Daniel Tuyttens [University of Mons, Belgium].

Productivity is crucial for designing ultra-scale algorithms able to harness modern supercomputers, which are increasingly complex and include millions of processing cores and heterogeneous building-block devices. In [13], we investigate the partitioned global address space (PGAS)-based approach using Chapel for the productivity-aware design and implementation of distributed Branch-and-Bound (B&B) for solving large optimization problems. The proposed algorithms are evaluated extensively using the flow-shop scheduling problem as a test case. The Chapel-based implementation is compared to its traditionally used MPI+X-based counterpart in terms of performance, scalability, and productivity. The results show that Chapel is much more expressive and up to 7.8× more productive than MPI+Pthreads. In addition, the Chapel-based search achieves performance equivalent to MPI+Pthreads for its best results on 1024 cores and reaches up to 84% of the linear speedup. However, there are cases where the built-in load balancing provided by Chapel cannot produce a regular load among compute nodes. In such cases, the MPI-based search can be up to 4.2× faster and reaches speedups up to 3× higher than its Chapel-based counterpart. Thorough feedback on the experience is given, pointing out the strengths and limitations of the two opposing approaches (Chapel vs. MPI+X). To the best of our knowledge, the present study is pioneering within the context of exact parallel optimization.

A computationally efficient Branch-and-Bound algorithm for the permutation flow-shop scheduling problem

Participants : Jan Gmys, Nouredine Melab, Mohand Mezmaz [University of Mons, Belgium], Daniel Tuyttens [University of Mons, Belgium].

In [15], we propose an efficient Branch-and-Bound (B&B) algorithm for the permutation flow-shop problem (PFSP) with makespan objective. We present a new node decomposition scheme that combines dynamic branching and lower bound refinement strategies in a computationally efficient way. To alleviate the computational burden of the two-machine bound used in the refinement stage, we propose an online learning-inspired mechanism to predict promising couples of bottleneck machines. The algorithm offers multiple choices for branching and bounding operators and can explore the search tree either sequentially or in parallel on multi-core CPUs. In order to empirically determine the most efficient combination of these components, a series of computational experiments with 600 benchmark instances is performed. A main insight is that the problem size, as well as interactions between branching and bounding operators, substantially modifies the trade-off between the computational requirements of a lower bound and the achieved tree size reduction. Moreover, we demonstrate that parallel tree search is a key ingredient for the resolution of large problem instances, as strong super-linear speedups can be observed. An overall evaluation using two well-known benchmarks indicates that the proposed approach is superior to previously published B&B algorithms. For the first benchmark we report the exact resolution, within less than 20 minutes, of two instances defined by 500 jobs and 20 machines that remained open for more than 25 years, and for the second a total of 89 improved best-known upper bounds, including proofs of optimality for 74 of them.
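
To make the basic ingredients of such an algorithm concrete, the following is a minimal Python sketch of a depth-first B&B for the PFSP with makespan objective. It is not the algorithm of [15]: it assumes static forward branching (jobs are appended to a prefix) and a simple one-machine lower bound, and it omits dynamic branching, the two-machine bound, the learning mechanism and parallel exploration. The function names (`completion_front`, `makespan`, `lower_bound`, `branch_and_bound`) and the toy instance are illustrative assumptions.

```python
from typing import List, Sequence, Tuple

Times = Sequence[Sequence[int]]  # p[j][k]: processing time of job j on machine k


def completion_front(p: Times, perm: Sequence[int]) -> List[int]:
    """Completion time on every machine after scheduling the jobs in perm."""
    m = len(p[0])
    c = [0] * m
    for j in perm:
        c[0] += p[j][0]
        for k in range(1, m):
            # Job j starts on machine k once it has left machine k-1
            # and machine k has finished the previous job.
            c[k] = max(c[k], c[k - 1]) + p[j][k]
    return c


def makespan(p: Times, perm: Sequence[int]) -> int:
    return completion_front(p, perm)[-1]


def lower_bound(p: Times, prefix: Sequence[int], remaining: Sequence[int]) -> int:
    """One-machine bound (assumed here for simplicity): for each machine k,
    front time + remaining load on k + shortest tail after k."""
    m = len(p[0])
    c = completion_front(p, prefix)
    bound = 0
    for k in range(m):
        load = sum(p[j][k] for j in remaining)
        tail = min(sum(p[j][k + 1:]) for j in remaining)
        bound = max(bound, c[k] + load + tail)
    return bound


def branch_and_bound(p: Times) -> Tuple[int, List[int]]:
    """Depth-first B&B; returns (optimal makespan, one optimal permutation)."""
    n = len(p)
    best_perm = list(range(n))
    best_cost = makespan(p, best_perm)  # initial upper bound: identity order
    stack = [([], list(range(n)))]
    while stack:
        prefix, remaining = stack.pop()
        if not remaining:  # leaf: complete schedule
            cost = makespan(p, prefix)
            if cost < best_cost:
                best_cost, best_perm = cost, prefix
            continue
        for j in remaining:
            child = prefix + [j]
            rest = [r for r in remaining if r != j]
            # Prune subtrees that cannot improve on the incumbent.
            if rest and lower_bound(p, child, rest) >= best_cost:
                continue
            stack.append((child, rest))
    return best_cost, best_perm


if __name__ == "__main__":
    toy = [[3, 2], [1, 4], [2, 1]]  # 3 jobs, 2 machines (hypothetical data)
    print(branch_and_bound(toy))
```

On this three-job, two-machine toy instance, the identity order has makespan 10 and the search returns an optimal makespan of 8. The trade-off discussed above shows up even in this sketch: a stronger bound prunes more nodes but costs more per node than the one-machine bound used here.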

A Parallel Tabu Search for the Large-scale Quadratic Assignment Problem

Participants : Omar Abdelkafi, Bilel Derbel, Arnaud Liefooghe.

Parallelization is an important paradigm for solving massive optimization problems. Understanding how to fully benefit from the aggregated computing power and what makes a parallel strategy successful is a difficult issue. In [22], we propose a simple parallel iterative tabu search (PITS) and study its effectiveness with respect to different experimental settings. Using the quadratic assignment problem (QAP) as a case study, we first consider different small- and medium-size instances from the literature and then tackle a large-size instance that was rarely considered due to its inherent solving difficulty. In particular, we show that properly balancing the number of function evaluations each parallel process is allowed to perform before resuming the search is critical to obtaining improved solution quality.
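
As an illustration of the general idea, the sketch below shows a basic swap-based tabu search for the QAP, wrapped in rounds in which several workers each spend a fixed budget before the search resumes from the best solution found. This is a hedged sketch, not the PITS algorithm of [22]: the restart rule, the perturbation, the parameter names (`workers`, `rounds`, `iterations`, `tenure`) and the function names are assumptions, and the workers are simulated sequentially rather than run in parallel.

```python
import random
from typing import List, Sequence, Tuple

Matrix = Sequence[Sequence[int]]


def qap_cost(flow: Matrix, dist: Matrix, perm: Sequence[int]) -> int:
    """Cost of assigning facility i to location perm[i]."""
    n = len(perm)
    return sum(flow[i][j] * dist[perm[i]][perm[j]]
               for i in range(n) for j in range(n))


def tabu_search(flow: Matrix, dist: Matrix, start: Sequence[int],
                iterations: int = 20, tenure: int = 3) -> Tuple[int, List[int]]:
    """Best-improvement tabu search over pairwise swaps."""
    n = len(start)
    perm = list(start)
    best_cost, best_perm = qap_cost(flow, dist, perm), perm[:]
    tabu_until = {}  # (i, j) -> last iteration at which the swap is forbidden
    for it in range(iterations):
        move, move_cost = None, None
        for i in range(n - 1):
            for j in range(i + 1, n):
                perm[i], perm[j] = perm[j], perm[i]
                c = qap_cost(flow, dist, perm)  # full re-evaluation, for clarity
                perm[i], perm[j] = perm[j], perm[i]
                is_tabu = it <= tabu_until.get((i, j), -1)
                # Aspiration: a tabu move is allowed if it beats the best so far.
                if is_tabu and c >= best_cost:
                    continue
                if move_cost is None or c < move_cost:
                    move, move_cost = (i, j), c
        if move is None:  # every move is tabu; wait for entries to expire
            continue
        i, j = move
        perm[i], perm[j] = perm[j], perm[i]
        tabu_until[move] = it + tenure
        if move_cost < best_cost:
            best_cost, best_perm = move_cost, perm[:]
    return best_cost, best_perm


def iterated_rounds(flow: Matrix, dist: Matrix, workers: int = 4, rounds: int = 5,
                    iterations: int = 20, tenure: int = 3,
                    seed: int = 0) -> Tuple[int, List[int]]:
    """Each round: every (simulated) worker restarts from a perturbed copy of
    the current best solution and runs for a fixed iteration budget."""
    rng = random.Random(seed)
    n = len(flow)
    best_perm = list(range(n))
    best_cost = qap_cost(flow, dist, best_perm)
    for _ in range(rounds):
        results = []
        for _ in range(workers):
            start = best_perm[:]
            for _ in range(2):  # two random swaps so that workers diverge
                a, b = rng.sample(range(n), 2)
                start[a], start[b] = start[b], start[a]
            results.append(tabu_search(flow, dist, start, iterations, tenure))
        round_cost, round_perm = min(results, key=lambda r: r[0])
        if round_cost < best_cost:
            best_cost, best_perm = round_cost, round_perm
    return best_cost, best_perm
```

In this sketch, `iterations` plays the role of the per-process budget between two resumption points: for a fixed total effort (`rounds` × `iterations`), a small budget means frequent restarts from the shared best solution, while a large budget lets each worker search longer on its own.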