Bibliography
Major publications by the team in recent years
-
1C. Augonnet, S. Thibault, R. Namyst, P.-A. Wacrenier.
StarPU: A Unified Platform for Task Scheduling on Heterogeneous Multicore Architectures, in: Concurrency and Computation: Practice and Experience, Special Issue: Euro-Par 2009, February 2011, vol. 23, pp. 187–198. [ DOI : 10.1002/cpe.1631 ]
http://hal.inria.fr/inria-00550877 -
2F. Broquedis, J. Clet-Ortega, S. Moreaud, N. Furmento, B. Goglin, G. Mercier, S. Thibault, R. Namyst.
hwloc: a Generic Framework for Managing Hardware Affinities in HPC Applications, in: Proceedings of the 18th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP2010), Pisa, Italy, IEEE Computer Society Press, February 2010, pp. 180–186. [ DOI : 10.1109/PDP.2010.67 ]
http://hal.inria.fr/inria-00429889 -
3F. Broquedis, N. Furmento, B. Goglin, P.-A. Wacrenier, R. Namyst.
ForestGOMP: an efficient OpenMP environment for NUMA architectures, in: International Journal on Parallel Programming, Special Issue on OpenMP; Guest Editors: Matthias S. Müller and Eduard Ayguadé, 2010, vol. 38, no 5, pp. 418-439. [ DOI : 10.1007/s10766-010-0136-3 ]
http://hal.inria.fr/inria-00496295 -
4D. Buntinas, G. Mercier, W. Gropp.
Implementation and Shared-Memory Evaluation of MPICH2 over the Nemesis Communication Subsystem, in: Recent Advances in Parallel Virtual Machine and Message Passing Interface: Proc. 13th European PVM/MPI Users Group Meeting, Bonn, Germany, September 2006. -
5B. Goglin, N. Furmento.
Finding a Tradeoff between Host Interrupt Load and MPI Latency over Ethernet, in: Proceedings of the IEEE International Conference on Cluster Computing, New Orleans, LA, IEEE Computer Society Press, September 2009.
http://hal.inria.fr/inria-00397328 -
6B. Goglin.
High-Performance Message Passing over generic Ethernet Hardware with Open-MX, in: Journal of Parallel Computing, February 2011, vol. 37, no 2, pp. 85-100. [ DOI : 10.1016/j.parco.2010.11.001 ]
http://hal.inria.fr/inria-00533058/en -
7S. Thibault, R. Namyst, P.-A. Wacrenier.
Building Portable Thread Schedulers for Hierarchical Multiprocessors: the BubbleSched Framework, in: Euro-Par, Rennes, France, ACM, August 2007.
http://hal.inria.fr/inria-00154506 -
8F. Trahay, É. Brunet, A. Denis, R. Namyst.
A multithreaded communication engine for multicore architectures, in: CAC 2008: Workshop on Communication Architecture for Clusters, held in conjunction with IPDPS 2008, Miami, FL, IEEE Computer Society Press, April 2008.
http://hal.inria.fr/inria-00224999
Articles in International Peer-Reviewed Journals
-
9D. Barthou, O. Brand-Foissac, O. Pene, G. Grosdidier, R. Dolbeau, C. Eisenbeis, M. Kruse, K. Petrov, C. Tadonki.
Automated Code Generation for Lattice Quantum Chromodynamics and beyond, in: Journal of Physics: Conference Series, December 2013, LPT-Orsay-13-142.
http://hal.inria.fr/hal-00926513 -
10B. Goglin, S. Moreaud.
KNEM: a Generic and Scalable Kernel-Assisted Intra-node MPI Communication Framework, in: Journal of Parallel and Distributed Computing, February 2013, vol. 73, no 2, pp. 176-188. [ DOI : 10.1016/j.jpdc.2012.09.016 ]
http://hal.inria.fr/hal-00731714 -
11E. Jeannot.
Symbolic Mapping and Allocation for the Cholesky Factorization on NUMA machines: Results and Optimizations, in: International Journal of High Performance Computing Applications, 2013, vol. 27, no 3, pp. 283–290.
http://hal.inria.fr/hal-00921611 -
12E. Jeannot, G. Mercier, F. Tessier.
Process Placement in Multicore Clusters: Algorithmic Issues and Practical Techniques, in: IEEE Transactions on Parallel and Distributed Systems, May 2013.
http://hal.inria.fr/hal-00921605
International Conferences with Proceedings
-
13G. Antoniu, T. Boku, C. Calvin, P. Codognet, M. Dayde, N. Emad, Y. Ishikawa, S. Matsuoka, K. Nakajima, H. Nakashima, R. Namyst, S. Petiton, T. Sakurai, M. Sato.
Towards exascale with the ANR-JST Japanese-French project FP3C (Framework and Programming for Post-Petascale Computing), in: 9th International Conference on Computer Science and Information Technologies, Yerevan, Armenia, 2013.
http://hal.inria.fr/hal-00922754 -
14P.-A. Arras, D. Fuin, E. Jeannot, A. Stoutchinin, S. Thibault.
List Scheduling in Embedded Systems under Memory Constraints, in: SBAC-PAD'2013 - 25th International Symposium on Computer Architecture and High-Performance Computing, Porto de Galinhas, Brazil, J. Guerrero (editor), IEEE Computer Society, October 2013.
http://hal.inria.fr/hal-00906117 -
15O. Aumage, D. Barthou, C. Haine, T. Meunier.
Detecting SIMDization Opportunities through Static/Dynamic Dependence Analysis, in: PROPER - 6th Workshop on Productivity and Performance - 2013, Aachen, Germany, September 2013.
http://hal.inria.fr/hal-00858004 -
16A. Charif-Rubial, D. Barthou, C. Valensi, S. Sameer, A. Malony, W. Jalby.
MIL: A language to build program analysis tools through static binary instrumentation, in: High Performance Computing, India, 2013, pp. 206-215.
http://hal.inria.fr/hal-00920875 -
17A. Duchâteau, D. Padua, D. Barthou.
Hydra: Automatic algorithm exploration from linear algebra equations, in: Code Generation and Optimization, Shenzhen, China, 2013, pp. 1-10.
http://hal.inria.fr/hal-00920869 -
18S. Henry.
ViperVM: a Runtime System for Parallel Functional High-Performance Computing on Heterogeneous Architectures, in: 2nd Workshop on Functional High-Performance Computing (FHPC'13), Boston, United States, September 2013.
http://hal.inria.fr/hal-00851122 -
19A.-E. Hugo, A. Guermouche, R. Namyst, P.-A. Wacrenier.
Composing multiple StarPU applications over heterogeneous machines: a supervised approach, in: Third International Workshop on Accelerators and Hybrid Exascale Systems, Boston, United States, May 2013.
http://hal.inria.fr/hal-00824514 -
20A.-E. Hugo.
Le problème de la composition parallèle : une approche supervisée, in: RenPAR - 21e Rencontres Francophones du Parallélisme (2013), Grenoble, France, January 2013.
http://hal.inria.fr/hal-00773610 -
21E. Jeannot, E. Meneses, G. Mercier, F. Tessier, G. Zheng.
Communication and Topology-aware Load Balancing in Charm++ with TreeMatch, in: IEEE Cluster 2013, Indianapolis, United States, IEEE, September 2013.
http://hal.inria.fr/hal-00851148 -
22P. Li, E. Brunet, R. Namyst.
High Performance Code Generation for Stencil Computation on Heterogeneous Multi-device Architectures, in: HPCC - 15th IEEE International Conference on High Performance Computing and Communications, Zhangjiajie, China, IEEE Computer Society, 2013.
http://hal.inria.fr/hal-00925481 -
23A. Mazouz, S.-A.-A. Touati, D. Barthou.
Dynamic Thread Pinning for Phase-Based OpenMP Programs, in: The Euro-Par 2013 conference, Aachen, Germany, F. Wolf, B. Mohr, D. an Mey (editors), Lecture Notes in Computer Science, Springer, August 2013, vol. 8097, pp. 53-64. [ DOI : 10.1007/978-3-642-40047-6_8 ]
http://hal.inria.fr/hal-00847482 -
24T. Odajima, T. Boku, M. Sato, T. Hanawa, Y. Kodama, R. Namyst, S. Thibault, O. Aumage.
Adaptive Task Size Control on High Level Programming for GPU/CPU Work Sharing, in: The 2013 International Symposium on Advances of Distributed and Parallel Computing (ADPC 2013), Vietri sul Mare, Italy, December 2013.
http://hal.inria.fr/hal-00920915 -
25S. Ohshima, S. Katagiri, K. Nakajima, S. Thibault, R. Namyst.
Implementation of FEM Application on GPU with StarPU, in: SIAM CSE13 - SIAM Conference on Computational Science and Engineering 2013, Boston, United States, SIAM, February 2013.
http://hal.inria.fr/hal-00926144 -
26C. Rossignon, H. Pascal, O. Aumage, S. Thibault.
A NUMA-aware fine grain parallelization framework for multi-core architecture, in: PDSEC - 14th IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing - 2013, Boston, United States, May 2013.
http://hal.inria.fr/hal-00858350 -
27E. Saillard, P. Carribault, D. Barthou.
Combining Static and Dynamic Validation of MPI Collective Communication, in: EuroMPI 2013, Madrid, Spain, September 2013, pp. 117-122. [ DOI : 10.1145/2488551.2488555 ]
http://hal.inria.fr/hal-00920901
National Conferences with Proceedings
-
28P.-A. Arras, D. Fuin, E. Jeannot, A. Stoutchinin, S. Thibault.
Ordonnancement de liste dans les systèmes embarqués sous contrainte de mémoire, in: ComPAS'13 / RenPar'21 - 21es Rencontres francophones du Parallélisme, Grenoble, France, Inria Grenoble, January 2013.
http://hal.inria.fr/hal-00772854 -
29E. Jeannot, G. Mercier, F. Tessier.
TreeMatch : Un algorithme de placement de processus sur architectures multicœurs, in: RenPAR - 21e Rencontres Francophones du Parallélisme, Grenoble, France, January 2013.
http://hal.inria.fr/hal-00773254 -
30C. Rossignon.
Optimisation du produit matrice-vecteur creux sur architecture GPU pour un simulateur de reservoir, in: ComPAS'13 / RenPar'21 - 21es Rencontres francophones du Parallélisme, Grenoble, France, Inria Grenoble, 2013.
http://hal.inria.fr/hal-00773571
Scientific Books (or Scientific Book chapters)
-
31T. Hoefler, E. Jeannot, G. Mercier.
An Overview of Process Mapping Techniques and Algorithms in High-Performance Computing, in: High Performance Computing on Complex Environments, E. Jeannot, J. Zilinskas (editors), Wiley, 2014, pp. 65–84, to be published.
http://hal.inria.fr/hal-00921626
Books or Proceedings Editing
-
32E. Jeannot, J. Zilinskas (editors)
High Performance Computing on Complex Environments, Wiley, 2014, 499 p, to be published.
http://hal.inria.fr/hal-00921619
Internal Reports
-
33D. Barthou, G. Grosdidier, K. Petrov, M. Kruse, C. Eisenbeis, O. Pène, O. Brand-Foissac, C. Tadonki, R. Dolbeau.
Automated Code Generation for Lattice QCD Simulation, Inria, December 2013, no RR-8417, 13 p.
http://hal.inria.fr/hal-00918812 -
34L. Courtès.
C Language Extensions for Hybrid CPU/GPU Programming with StarPU, Inria, April 2013, no RR-8278, 25 p.
http://hal.inria.fr/hal-00807033 -
35S. Henry, D. Barthou, A. Denis, R. Namyst, M.-C. Counilh.
SOCL: An OpenCL Implementation with Automatic Multi-Device Adaptation Support, Inria, August 2013, no RR-8346, 18 p.
http://hal.inria.fr/hal-00853423 -
36E. Jeannot, G. Mercier, F. Tessier.
Process Placement in Multicore Clusters: Algorithmic Issues and Practical Techniques, Inria, March 2013, no RR-8269, 32 p.
http://hal.inria.fr/hal-00803548 -
37X. Lacoste, M. Faverge, P. Ramet, S. Thibault, G. Bosilca.
Taking advantage of hybrid systems for sparse direct solvers via task-based runtimes, Inria, January 2014, no RR-8446, 25 p.
http://hal.inria.fr/hal-00925017 -
38A. Rousseau, A. Darnaud, B. Goglin, C. Acharian, C. Leininger, C. Godin, C. Holik, C. Kirchner, D. Rives, E. Darquie, E. Kerrien, F. Neyret, F. Masseglia, F. Dufour, G. Berry, G. Dowek, H. Robak, H. Xypas, I. Illina, I. Gnaedig, J. Jongwane, J. Ehrel, L. Viennot, L. Guion, L. Calderan, L. Kovacic, M. Collin, M.-A. Enard, M.-H. Comte, M. Quinson, M. Olivi, M. Giraud, M. Dorémus, M. Ogouchi, M. Droin, N. Lacaux, N. Rougier, N. Roussel, P. Guitton, P. Peterlongo, R.-M. Cornus, S. Vandermeersch, S. Maheo, S. Lefebvre, S. Boldo, T. Viéville, V. Poirel, A. Chabreuil, A. Fischer, C. Farge, C. Vadel, I. Astic, J.-P. Dumont, L. Féjoz, P. Rambert, P. Paradinas, S. De Quatrebarbes, S. Laurent.
Médiation Scientifique : une facette de nos métiers de la recherche, March 2013, 34 p.
http://hal.inria.fr/hal-00804915
Scientific Popularization
-
39B. Goglin.
Les réseaux pour le calcul haute performance : facteur, livreur ou déménageur ?, in: Interstices, December 2013.
http://hal.inria.fr/hal-00915723 -
40B. Goglin, B. Putigny.
Idée reçue: Comparer la puissance de deux ordinateurs, c'est facile !, in: Interstices, April 2013.
http://hal.inria.fr/hal-00816422
References in notes
-
41P. Balaji, H.-W. Jin, K. Vaidyanathan, D. K. Panda.
Supporting iWARP Compatibility and Features for Regular Network Adapters, in: Proceedings of the Workshop on Remote Direct Memory Access (RDMA): Applications, Implementations, and Technologies (RAIT); held in conjunction with the IEEE International Conference on Cluster Computing, Boston, MA, September 2005. -
42G. Ciaccio, G. Chiola.
GAMMA and MPI/GAMMA on Gigabit Ethernet, in: Proceedings of 7th EuroPVM-MPI conference, Balatonfüred, Hungary, Lecture Notes in Computer Science, Springer Verlag, September 2000, vol. 1908. -
43G. R. Gao, T. Sterling, R. Stevens, M. Hereld, W. Zhu.
Hierarchical multithreading: programming model and system software, in: 20th International Parallel and Distributed Processing Symposium (IPDPS), April 2006.