Bibliography
Major publications by the team in recent years
-
1M. Baboulin, D. Becker, J. Dongarra.
A Parallel Tiled Solver for Dense Symmetric Indefinite Systems on Multicore Architectures, in: Proceedings of IEEE International Parallel & Distributed Processing Symposium (IPDPS 2012), 2012, pp. 14-24. -
2M. Baboulin, S. Donfack, J. Dongarra, L. Grigori, A. Rémy, S. Tomov.
A class of communication-avoiding algorithms for solving general dense linear systems on CPU/GPU parallel machines, in: International Conference on Computational Science (ICCS 2012), Procedia Computer Science, Elsevier, 2012, vol. 9, pp. 17–26. -
3M. Baboulin, J. Dongarra, J. Herrmann, S. Tomov.
Accelerating linear system solutions using randomization techniques, in: ACM Trans. Math. Softw., 2012, vol. 39, no 2. -
4M. Baboulin, S. Gratton.
A contribution to the conditioning of the total least squares problem, in: SIAM J. Matrix Anal. and Appl., 2011, vol. 32, no 3, pp. 685–699. -
5R. Bolze, F. Cappello, E. Caron, M. J. Daydé, F. Desprez, E. Jeannot, Y. Jégou, S. Lanteri, J. Leduc, N. Melab, G. Mornet, R. Namyst, P. Primet, B. Quétier, O. Richard, E.-G. Talbi, I. Touche.
Grid'5000: a large scale and highly reconfigurable experimental Grid testbed, in: International Journal of High Performance Computing Applications, November 2006, vol. 20, no 4, pp. 481-494. -
6A. Bouteiller, T. Hérault, G. Krawezik, P. Lemarinier, F. Cappello.
MPICH-V Project: a Multiprotocol Automatic Fault Tolerant MPI, in: International Journal of High Performance Computing Applications, 2005, vol. 20, no 3, pp. 319–333. -
7F. Cappello, S. Djilali, G. Fedak, T. Hérault, F. Magniette, V. Néri, O. Lodygensky.
Computing on Large Scale Distributed Systems: XtremWeb Architecture, Programming Models, Security, Tests and Convergence with Grid, in: Future Generation Computer Systems (FGCS), 2004. -
8G. Fursin, Y. Kashnikov, A. Memon, Z. Chamski, O. Temam, M. Namolaru, E. Yom-Tov, B. Mendelson, A. Zaks, E. Courtois, F. Bodin, P. Barnard, E. Ashton, E. Bonilla, J. Thomson, C. Williams, M. O'Boyle.
Milepost GCC: Machine Learning Enabled Self-tuning Compiler, in: International Journal of Parallel Programming, 2011, vol. 39, pp. 296-327, 10.1007/s10766-010-0161-2.
http://dx.doi.org/10.1007/s10766-010-0161-2 -
9L. Grigori, J. Demmel, X. S. Li.
Parallel Symbolic Factorization for Sparse LU Factorization with Static Pivoting, in: SIAM Journal on Scientific Computing, 2007, vol. 29, no 3, pp. 1289-1314. -
10L. Grigori, J. Demmel, H. Xiang.
Communication Avoiding Gaussian Elimination, in: Proceedings of the ACM/IEEE SC08 Conference, 2008. -
11L. Grigori, J. Demmel, H. Xiang.
CALU: a communication optimal LU factorization algorithm, in: SIAM Journal on Matrix Analysis and Applications, 2011, vol. 32, pp. 1317-1350. -
12L. Grigori, F. Nataf.
Generalized Filtering Decomposition, May 2011, Session 7.
http://hal.inria.fr/inria-00581744/en -
13L. Grigori, F. Nataf.
Generalized Filtering Decomposition, Inria, March 2011, no RR-7569, 8 p.
http://hal.inria.fr/inria-00576894/en -
14T. Hérault, R. Lassaigne, S. Peyronnet.
APMC 3.0: Approximate Verification of Discrete and Continuous Time Markov Chains, in: Proceedings of the 3rd International Conference on the Quantitative Evaluation of SysTems (QEST'06), California, USA, September 2006. -
15Q. Niu, L. Grigori, P. Kumar, F. Nataf.
Modified tangential frequency filtering decomposition and its Fourier analysis, in: Numerische Mathematik, 2010, vol. 116, no 1, pp. 123-148. -
16S. Tomov, J. Dongarra, M. Baboulin.
Towards dense linear algebra for hybrid GPU accelerated manycore systems, in: Parallel Computing, 2010, vol. 36, no 5&6, pp. 232–240. -
17B. Wei, G. Fedak, F. Cappello.
Scheduling Independent Tasks Sharing Large Data Distributed with BitTorrent, in: IEEE/ACM Grid'2005 workshop Seattle, USA, 2005.
Articles in International Peer-Reviewed Journals
-
18G. Antoniu, J. Bigot, C. Blanchet, L. Bougé, F. Briant, F. Cappello, A. Costan, F. Desprez, G. Fedak, S. Gault, K. Keahey, B. Nicolae, C. Pérez, A. Simonet, F. Suter, B. Tang, R. Terreux.
Towards Scalable Data Management for Map-Reduce-based Data-Intensive Applications on Cloud and Hybrid Infrastructures, in: International Journal of Cloud Computing (IJCC), 2013, vol. 2, no 2/3. [ DOI : 10.1504/IJCC.2013.055265 ]
http://hal.inria.fr/hal-00767029 -
19M. Baboulin, J. Dongarra, J. Herrmann, S. Tomov.
Accelerating linear system solutions using randomization techniques, in: ACM Transactions on Mathematical Software, February 2013, vol. 39, no 2. [ DOI : 10.1145/2427023.2427025 ]
http://hal.inria.fr/hal-00908496 -
20D. Barthou, O. Brand-Foissac, O. Pène, G. Grosdidier, R. Dolbeau, C. Eisenbeis, M. Kruse, K. Petrov, C. Tadonki.
Automated Code Generation for Lattice Quantum Chromodynamics and beyond, in: Journal of Physics: Conference Series, December 2013, LPT-Orsay-13-142.
http://hal.inria.fr/hal-00926513 -
21G. Bosilca, A. Bouteiller, É. Brunet, F. Cappello, J. Dongarra, A. Guermouche, T. Hérault, Y. Robert, F. Vivien, D. Zaidouni.
Unified Model for Assessing Checkpointing Protocols at Extreme-Scale, in: Journal of Concurrency and Computation: Practice and Experience, November 2013. [ DOI : 10.1002/cpe.3173 ]
http://hal.inria.fr/hal-00908447 -
22P. Have, R. Masson, F. Nataf, M. Szydlarski, H. Xiang, T. Zhao.
Algebraic Domain Decomposition Methods for Highly Heterogeneous Problems, in: SIAM Journal on Scientific Computing, 2013, vol. 35, no 3, pp. C284-C302.
http://hal.inria.fr/hal-00611997 -
23B. Nicolae, F. Cappello.
BlobCR: Virtual Disk Based Checkpoint-Restart for HPC Applications on IaaS Clouds, in: Journal of Parallel and Distributed Computing, February 2013, vol. 73, no 5, pp. 698-711. [ DOI : 10.1016/j.jpdc.2013.01.013 ]
http://hal.inria.fr/hal-00857964
International Conferences with Proceedings
-
24A. Bouteiller, F. Cappello, J. Dongarra, A. Guermouche, T. Hérault, Y. Robert.
Multi-criteria checkpointing strategies: response-time versus resource utilization, in: Euro-Par 2013, Aachen, Germany, Springer Verlag, LNCS, 2013, vol. 8097, pp. 420-431. [ DOI : 10.1007/978-3-642-40047-6_43 ]
http://hal.inria.fr/hal-00926606 -
25S. Di, D. Kondo, F. Cappello.
Characterizing Cloud Applications on a Google Data Center, in: 42nd International Conference on Parallel Processing (ICPP'13), 2013, pp. 468-473. [ DOI : 10.1109/ICPP.2013.56 ]
http://hal.inria.fr/hal-00936827 -
26S. Di, Y. Robert, F. Vivien, D. Kondo, C.-L. Wang, F. Cappello.
Optimization of Cloud Task Processing with Checkpoint-Restart Mechanism, in: SC13 - Supercomputing - 2013, Denver, United States, ACM, November 2013. [ DOI : 10.1145/2503210.2503217 ]
http://hal.inria.fr/hal-00847635 -
27M. E. M. Diouri, O. Glück, L. Lefèvre, F. Cappello.
ECOFIT: A Framework to Estimate Energy Consumption of Fault Tolerance protocols during HPC executions, in: 13th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), Delft, Netherlands, May 2013.
http://hal.inria.fr/hal-00806500 -
28M. E. M. Diouri, O. Glück, L. Lefèvre, F. Cappello.
Towards an Energy Estimator for Fault Tolerance Protocols, in: 18th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP), Shenzhen, China, February 2013, pp. 313–314. [ DOI : 10.1145/2442516.2442561 ]
http://hal.inria.fr/hal-00806499 -
29A. W. Memon, G. Fursin.
Crowdtuning: systematizing auto-tuning using predictive modeling and crowdsourcing, in: PARCO mini-symposium on "Application Autotuning for HPC (Architectures)", Munich, Germany, September 2013.
http://hal.inria.fr/hal-00944513 -
30B. Nicolae, F. Cappello.
AI-Ckpt: Leveraging Memory Access Patterns for Adaptive Asynchronous Incremental Checkpointing, in: HPDC '13: 22nd International ACM Symposium on High-Performance Parallel and Distributed Computing, New York, United States, April 2013, pp. 155-166. [ DOI : 10.1145/2462902.2462918 ]
http://hal.inria.fr/hal-00809847 -
31Y. Wang, M. Baboulin, J. Dongarra, J. Falcou, Y. Fraigneau, O. Le Maitre.
A parallel solver for incompressible fluid flows, in: International Conference on Computational Science (ICCS 2013), Barcelona, Spain, June 2013. [ DOI : 10.1016/j.procs.2013.05.207 ]
http://hal.inria.fr/hal-00915356
Conferences without Proceedings
-
32L. Giraud, F. Cappello.
Resilience at extreme scale: system level, algorithmic level or both?, in: SIAM Conference on Computational Science and Engineering - CSE 2013, Boston, United States, SIAM, March 2013.
http://hal.inria.fr/hal-00799309
Internal Reports
-
33D. Barthou, G. Grosdidier, K. Petrov, M. Kruse, C. Eisenbeis, O. Pène, O. Brand-Foissac, C. Tadonki, R. Dolbeau.
Automated Code Generation for Lattice QCD Simulation, Inria, December 2013, no RR-8417, 13 p.
http://hal.inria.fr/hal-00918812 -
34J. Beauquier, P. Blanchard, J. Burman.
Self-stabilizing Leader Election in Population Protocols over Arbitrary Communication Graphs, September 2013.
http://hal.inria.fr/hal-00867287 -
35A. Ferreira Leite, C. Tadonki, C. Eisenbeis, A. C. M. A. De Melo.
A Fine-grained Approach for Power Consumption Analysis and Prediction, Inria, December 2013, no RR-8416, 12 p.
http://hal.inria.fr/hal-00918810 -
36G. Fursin.
Collective Mind: cleaning up the research and experimentation mess in computer engineering using crowdsourcing, big data and machine learning, August 2013.
http://hal.inria.fr/hal-00850880
Other Publications
-
37D. Barthou, O. Brand-Foissac, R. Dolbeau, G. Grosdidier, C. Eisenbeis, M. Kruse, O. Pène, K. Petrov, C. Tadonki.
Automated Code Generation for Lattice Quantum Chromodynamics and beyond, 2014.
http://hal.inria.fr/hal-00930288 -
38G. Fursin.
Keynote at HPSC 2013 at NTU, Taiwan: Systematizing tuning of computer systems using crowdsourcing and statistics, in: HPSC - Conference on Advanced Topics and Auto Tuning in High Performance and Scientific Computing - 2013, Taipei, Taiwan, March 2013.
http://hal.inria.fr/hal-00819000 -
39G. Fursin.
Tutorial at HPSC 2013 at NTU, Taiwan: Collective Mind: novel methodology, framework and repository to crowd-source auto-tuning, in: HPSC - Conference on Advanced Topics and Auto Tuning in High Performance and Scientific Computing - 2013, Taipei, Taiwan, March 2013.
http://hal.inria.fr/hal-00819002 -
40G. Fursin, A. W. Memon, C. Guillon.
Machine Learning for Compilation and Architecture: Myth or Reality?, 2013.
http://hal.inria.fr/hal-00907143
-
41K. Aida, A. Takefusa, H. Nakada, S. Matsuoka, S. Sekiguchi, U. Nagashima.
Performance evaluation model for scheduling in a global computing system, in: International Journal of High Performance Computing Applications, 2000, vol. 14, no 3, pp. 268-279.
http://dx.doi.org/10.1177/109434200001400308 -
42A. D. Alexandrov, M. Ibel, K. E. Schauser, C. J. Scheiman.
SuperWeb: Research Issues in Java-Based Global Computing, in: Concurrency: Practice and Experience, June 1997, vol. 9, no 6, pp. 535–553. -
43L. Alvisi, K. Marzullo.
Message Logging: Pessimistic, Optimistic and Causal, 2001, Proc. 15th Int'l Conf. on Distributed Computing. -
44D. P. Anderson.
BOINC, 2011.
http://boinc.berkeley.edu/ -
45A. Barak, O. La'adan.
The MOSIX multicomputer operating system for high performance cluster computing, in: Future Generation Computer Systems, 1998, vol. 13, no 4–5, pp. 361–372. -
46A. Baratloo, M. Karaul, Z. M. Kedem, P. Wyckoff.
Charlotte: Metacomputing on the Web, in: Proceedings of the 9th International Conference on Parallel and Distributed Computing Systems (PDCS-96), 1996. -
47J. Beauquier, C. Genolini, S. Kutten.
Optimal reactive k-stabilization: the case of mutual exclusion, in: Proceedings of the 18th Annual ACM Symposium on Principles of Distributed Computing, May 1999, pp. 199-208. -
48J. Beauquier, T. Hérault.
Fault-Local Stabilization: the Shortest Path Tree, October 2002, Proceedings of the 21st Symposium on Reliable Distributed Systems. -
49G. Bosilca, A. Bouteiller, F. Cappello, S. Djilali, G. Fedak, C. Germain, T. Hérault, P. Lemarinier, O. Lodygensky, F. Magniette, V. Néri, A. Selikhov.
MPICH-V: Toward a Scalable Fault Tolerant MPI for Volatile Nodes, 2002, in IEEE/ACM SC 2002. -
50A. Bouteiller, F. Cappello, T. Hérault, G. Krawezik, P. Lemarinier, F. Magniette.
MPICH-V2: a Fault Tolerant MPI for Volatile Nodes based on Pessimistic Sender Based Message Logging, November 2003, in IEEE/ACM SC 2003. -
51A. Bouteiller, P. Lemarinier, G. Krawezik, F. Cappello.
Coordinated Checkpoint versus Message Log for fault tolerant MPI, December 2003, in IEEE Cluster. -
52T. Brecht, H. Sandhu, M. Shan, J. Talbot.
ParaWeb: Towards World-Wide Supercomputing, in: Proceedings of the Seventh ACM SIGOPS European Workshop on System Support for Worldwide Applications, 1996. -
53R. Buyya, M. Murshed.
GridSim: A Toolkit for the Modeling and Simulation of Distributed Resource Management and Scheduling for Grid Computing, Wiley Press, May 2002. -
54N. Camiel, S. London, N. Nisan, O. Regev.
The POPCORN Project: Distributed Computation over the Internet in Java, in: Proceedings of the 6th International World Wide Web Conference, April 1997. -
55H. Casanova.
Simgrid: A Toolkit for the Simulation of Application Scheduling, in: Proceedings of the IEEE International Symposium on Cluster Computing and the Grid (CCGrid '01), May 2001, pp. 430–437. -
56K. M. Chandy, L. Lamport.
Distributed Snapshots: Determining Global States of Distributed Systems, 1985, ACM Trans. on Comp. Systems, 3(1):63–75. -
57B. O. Christiansen, P. Cappello, M. F. Ionescu, M. O. Neary, K. E. Schauser, D. Wu.
Javelin: Internet-Based Parallel Computing Using Java, in: Concurrency: Practice and Experience, November 1997, vol. 9, no 11, pp. 1139–1160. -
58J. W. Demmel, L. Grigori, M. Hoemmen, J. Langou.
Communication-optimal parallel and sequential QR and LU factorizations, in: SIAM Journal on Scientific Computing, 2012, short version of technical report UCB/EECS-2008-89 from 2008. -
59S. Dolev.
Self-stabilization, 2000, M.I.T. Press. -
60G. Fedak, C. Germain, V. Néri, F. Cappello.
XtremWeb: A Generic Global Computing System, in: CCGRID'01: Proceedings of the 1st International Symposium on Cluster Computing and the Grid, IEEE Computer Society, 2001, 582 p. -
61I. Foster, A. Iamnitchi.
On Death, Taxes, and the Convergence of Peer-to-Peer and Grid Computing, in: 2nd International Workshop on Peer-to-Peer Systems (IPTPS'03), Berkeley, CA, February 2003. -
62V. K. Garg.
Principles of distributed computing, John Wiley and Sons, May 2002. -
63C. Genolini, S. Tixeuil.
A lower bound on k-stabilization in asynchronous systems, October 2002, Proceedings of the 21st Symposium on Reliable Distributed Systems. -
64D. P. Ghormley, D. Petrou, S. H. Rodrigues, A. M. Vahdat, T. E. Anderson.
GLUnix: A Global Layer Unix for a Network of Workstations, in: Software Practice and Experience, 1998, vol. 28, no 9, pp. 929–961. -
65D. E. Keyes.
A Science-based Case for Large Scale Simulation, Vol. 1, Office of Science, US Department of Energy, Report Editor-in-Chief, July 30 2003. -
66S. Kutten, B. Patt-Shamir.
Stabilizing time-adaptive protocols, in: Theoretical Computer Science, 1999, vol. 220, no 1, pp. 93-111. -
67S. Kutten, D. Peleg.
Fault-local distributed mending, in: Journal of Algorithms, 1999, vol. 30, no 1, pp. 144-165. -
68N. Leibowitz, M. Ripeanu, A. Wierzbicki.
Deconstructing the Kazaa Network, in: Proceedings of the 3rd IEEE Workshop on Internet Applications WIAPP'03, Santa Clara, CA, 2003. -
69M. Litzkow, M. Livny, M. Mutka.
Condor — A Hunter of Idle Workstations, in: Proceedings of the Eighth Conference on Distributed Computing, San Jose, 1988. -
70N. A. Lynch.
Distributed Algorithms, Morgan Kaufmann, 1996. -
71Message Passing Interface Forum.
MPI: A message passing interface standard, June 12 1995, Technical report, University of Tennessee, Knoxville. -
72N. Minar, R. Burkhart, C. Langton, M. Askenazi.
The Swarm Simulation System: A Toolkit for Building Multi-Agent Simulations, 1996. -
73H. Pedroso, L. M. Silva, J. G. Silva.
Web-Based Metacomputing with JET, in: Proceedings of the ACM, 1997. -
74B. Quétier, M. Jan, F. Cappello.
One step further in large-scale evaluations: the V-DS environment, Inria, December 2007, no RR-6365.
http://hal.inria.fr/inria-00189670 -
75S. Ratnasamy, P. Francis, M. Handley, R. Karp, S. Shenker.
A Scalable Content Addressable Network, in: Proceedings of ACM SIGCOMM 2001, 2001. -
76A. Rowstron, P. Druschel.
Pastry: Scalable, Decentralized Object Location, and Routing for Large-Scale Peer-to-Peer Systems, in: IFIP/ACM International Conference on Distributed Systems Platforms (Middleware), 2001, pp. 329–350. -
77L. F. G. Sarmenta, S. Hirano.
Bayanihan: building and studying Web-based volunteer computing systems using Java, in: Future Generation Computer Systems, 1999, vol. 15, no 5–6, pp. 675–686. -
78S. Saroiu, P. K. Gummadi, S. D. Gribble.
A Measurement Study of Peer-to-Peer File Sharing Systems, in: Proceedings of Multimedia Computing and Networking, San Jose, CA, USA, January 2002. -
79J. F. Shoch, J. A. Hupp.
The Worm Programs: Early Experiences with Distributed Systems, in: Communications of the Association for Computing Machinery, March 1982, vol. 25, no 3. -
80I. Stoica, R. Morris, D. Karger, F. Kaashoek, H. Balakrishnan.
Chord: A Scalable Peer-To-Peer Lookup Service for Internet Applications, in: Proceedings of the 2001 ACM SIGCOMM Conference, 2001, pp. 149–160. -
81G. Tel.
Introduction to distributed algorithms, 2000, Cambridge University Press. -
82Y.-M. Wang, W. K. Fuchs.
Optimistic Message Logging for Independent Checkpointing in Message-Passing Systems, 1992, pp. 147-154, Symposium on Reliable Distributed Systems. -
83Y. Yi, T. Park, H. Y. Yeom.
A Causal Logging Scheme for Lazy Release Consistent Distributed Shared Memory Systems, December 1998, In Proc. of the 1998 Int'l Conf. on Parallel and Distributed Systems. -
84B. Y. Zhao, J. D. Kubiatowicz, A. D. Joseph.
Tapestry: An Infrastructure for Fault-tolerant Wide-area Location and Routing, UC Berkeley, April 2001, no UCB/CSD-01-1141.