Bibliography

Major publications by the team in recent years

[1] C. Augonnet, S. Thibault, R. Namyst, P.-A. Wacrenier.

StarPU: A Unified Platform for Task Scheduling on Heterogeneous Multicore Architectures, in: Concurrency and Computation: Practice and Experience, Special Issue: Euro-Par 2009, February 2011, vol. 23, p. 187–198. [ DOI : 10.1002/cpe.1631 ]

http://hal.inria.fr/inria-00550877
[2] F. Broquedis, J. Clet-Ortega, S. Moreaud, N. Furmento, B. Goglin, G. Mercier, S. Thibault, R. Namyst.

hwloc: a Generic Framework for Managing Hardware Affinities in HPC Applications, in: Proceedings of the 18th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP2010), Pisa, Italy, IEEE Computer Society Press, February 2010, p. 180–186. [ DOI : 10.1109/PDP.2010.67 ]

http://hal.inria.fr/inria-00429889
[3] F. Broquedis, N. Furmento, B. Goglin, P.-A. Wacrenier, R. Namyst.

ForestGOMP: an efficient OpenMP environment for NUMA architectures, in: International Journal of Parallel Programming, Special Issue on OpenMP; Guest Editors: Matthias S. Müller and Eduard Ayguadé, 2010, vol. 38, no. 5, p. 418-439. [ DOI : 10.1007/s10766-010-0136-3 ]

http://hal.inria.fr/inria-00496295
[4] D. Buntinas, G. Mercier, W. Gropp.

Implementation and Shared-Memory Evaluation of MPICH2 over the Nemesis Communication Subsystem, in: Recent Advances in Parallel Virtual Machine and Message Passing Interface: Proc. 13th European PVM/MPI Users Group Meeting, Bonn, Germany, September 2006.
[5] B. Goglin, N. Furmento.

Finding a Tradeoff between Host Interrupt Load and MPI Latency over Ethernet, in: Proceedings of the IEEE International Conference on Cluster Computing, New Orleans, LA, IEEE Computer Society Press, September 2009.

http://hal.inria.fr/inria-00397328
[6] B. Goglin.

High-Performance Message Passing over generic Ethernet Hardware with Open-MX, in: Journal of Parallel Computing, February 2011, vol. 37, no. 2, p. 85-100. [ DOI : 10.1016/j.parco.2010.11.001 ]

http://hal.inria.fr/inria-00533058/en
[7] S. Thibault, R. Namyst, P.-A. Wacrenier.

Building Portable Thread Schedulers for Hierarchical Multiprocessors: the BubbleSched Framework, in: EuroPar, Rennes, France, ACM, August 2007.

http://hal.inria.fr/inria-00154506
[8] F. Trahay, É. Brunet, A. Denis, R. Namyst.

A multithreaded communication engine for multicore architectures, in: CAC 2008: Workshop on Communication Architecture for Clusters, held in conjunction with IPDPS 2008, Miami, FL, IEEE Computer Society Press, April 2008.

http://hal.inria.fr/inria-00224999

Publications of the year

Doctoral Dissertations and Habilitation Theses

[9] A. Charif-Rubial.

On code performance analysis and optimization for multicore architectures, Université de Versailles Saint-Quentin, 2012.
[10] J. Clet-Ortega.

Exploitation efficace des architectures parallèles de type grappes de NUMA à l'aide de modèles hybrides de programmation, Université Sciences et Technologies - Bordeaux I, April 2012.

http://hal.inria.fr/tel-00773007
[11] J. Jaeger.

Source-to-source transformations for irregular and multithreaded code optimization, Université de Versailles Saint-Quentin, 2012.

Articles in International Peer-Reviewed Journals

[12] A. Benoit, L.-C. Canon, E. Jeannot, Y. Robert.

Reliability of task graph schedules with transient and fail-stop failures: complexity and algorithms, in: Journal of Scheduling, 2012, vol. 15, no. 5, p. 615-627. [ DOI : 10.1007/s10951-011-0236-y ]

http://hal.inria.fr/hal-00763343
[13] B. Goglin, S. Moreaud.

KNEM: a Generic and Scalable Kernel-Assisted Intra-node MPI Communication Framework, in: Journal of Parallel and Distributed Computing, February 2013, vol. 73, no. 2, p. 176-188. [ DOI : 10.1016/j.jpdc.2012.09.016 ]

http://hal.inria.fr/hal-00731714

Articles in National Peer-Reviewed Journals

[14] S. Mahmoudi, P. Manneback, C. Augonnet, S. Thibault.

Traitements d'Images sur Architectures Parallèles et Hétérogènes, in: Technique et Science Informatiques, 2012.

http://hal.inria.fr/hal-00714858
[15] S. Henry, A. Denis, D. Barthou.

Programmation unifiée multi-accélérateur OpenCL, in: Technique et Science Informatiques, 2012, vol. 31, no. 8-9-10, p. 1233-1249. [ DOI : 10.3166/TSI.31.1233-1249 ]

http://hal.inria.fr/hal-00772742

Invited Conferences

[16] C. Bordage.

Parallelization on Heterogeneous Multicore and Multi-GPU Systems of the Fast Multipole Method for the Helmholtz Equation Using a Runtime System, in: ADVCIMP12, Barcelona, Spain, IARIA, September 2012, p. 90-95.

http://hal.inria.fr/hal-00773114

International Conferences with Proceedings

[17] C. Augonnet, O. Aumage, N. Furmento, R. Namyst, S. Thibault.

StarPU-MPI: Task Programming over Clusters of Machines Enhanced with Accelerators, in: The 19th European MPI Users' Group Meeting (EuroMPI 2012), Vienna, Austria, J. L. Träff, S. Benkner, J. Dongarra (editors), LNCS, Springer, 2012, vol. 7490.

http://hal.inria.fr/hal-00725477
[18] D. Barthou, G. Grosdidier, M. Kruse, O. Pène, C. Tadonki.

QIRAL: A High Level Language for Lattice QCD Code Generation, in: Programming Language Approaches to Concurrency and Communication-centric Software Workshop, 2012, To appear.
[19] D. Barthou, G. Grosdidier, M. Kruse, O. Pène, C. Tadonki.

QIRAL: A High Level Language for Lattice QCD Code Generation, in: European Joint Conferences on Theory and Practice of Software (ETAPS), Tallinn, Estonia, Electronic Proceedings in Theoretical Computer Science, 2012, p. 37-43. [ DOI : 10.4204/EPTCS ]

http://hal.inria.fr/hal-00666885
[20] S. Benkner, E. Bajrovic, E. Marth, M. Sandrieser, R. Namyst, S. Thibault.

High-Level Support for Pipeline Parallelism on Many-Core Architectures, in: Europar - International European Conference on Parallel and Distributed Computing - 2012, Rhodes Island, Greece, August 2012.

http://hal.inria.fr/hal-00697020
[21] A. Denis, F. Trahay, Y. Ishikawa.

High performance checksum computation for fault-tolerant MPI over InfiniBand, in: The 19th European MPI Users' Group Meeting (EuroMPI 2012), Vienna, Austria, J. L. Träff, S. Benkner, J. Dongarra (editors), LNCS, Springer, September 2012, vol. 7490.

http://hal.inria.fr/hal-00716478
[22] A. Duchateau, D. Padua, D. Barthou.

Hydra: Automatic Algorithm Exploration from Linear Algebra Equations, in: ACM/IEEE Intl. Symp. on Code Optimization and Generation, Shenzhen, China, IEEE Computer Society, February 2013, To appear.
[23] A.-E. Hugo.

Le problème de la composition parallèle : une approche supervisée, in: RenPAR - 21e Rencontres Francophones du Parallélisme (2013), Grenoble, France, January 2013.

http://hal.inria.fr/hal-00773610
[24] J. Jaeger, D. Barthou.

Automatic efficient data layout for multithreaded stencil codes on CPUs and GPUs, in: IEEE Intl. High Performance Computing Conference, Pune, India, December 2012, To appear.
[25] E. Jeannot.

Performance Analysis and Optimization of the Tiled Cholesky Factorization on NUMA Machines, in: PAAP 2012 - IEEE International Symposium on Parallel Architectures, Algorithms and Programming, Taipei, Taiwan, Province Of China, IEEE, December 2012.

http://hal.inria.fr/hal-00772790
[26] C. Kessler, U. Dastgeer, S. Thibault, R. Namyst, A. Richards, U. Dolinsky, S. Benkner, J. L. Träff, S. Pllana.

Programmability and Performance Portability Aspects of Heterogeneous Multi-/Manycore Systems, in: DATE-2012 conference on Design, Automation and Test in Europe, Dresden, Germany, IEEE CS Press, March 2012, p. 1403–1408.

National Conferences with Proceedings

[27] P.-A. Arras, D. Fuin, E. Jeannot, A. Stoutchinin, S. Thibault.

Ordonnancement de liste dans les systèmes embarqués sous contrainte de mémoire, in: ComPAS'13 / RenPar'21 - 21es Rencontres francophones du Parallélisme, Grenoble, France, Inria Grenoble, January 2013.

http://hal.inria.fr/hal-00772854
[28] E. Jeannot, G. Mercier, F. Tessier.

TreeMatch : Un algorithme de placement de processus sur architectures multicœurs, in: RenPAR - 21e Rencontres Francophones du Parallélisme, Grenoble, France, January 2013.

http://hal.inria.fr/hal-00773254
[29] C. Rossignon.

Optimisation du produit matrice-vecteur creux sur architecture GPU pour un simulateur de réservoir, in: ComPAS'13 / RenPar'21 - 21es Rencontres francophones du Parallélisme, Grenoble, France, Inria Grenoble (editor), 2013.

http://hal.inria.fr/hal-00773571

Scientific Books (or Scientific Book chapters)

[30] P. de Oliveira Castro, S. Louise, D. Barthou.

DSL Stream Programming on Multicore Architectures, in: Programming Multi-core and Many-core Computing Systems, Parallel and Distributed Computing, Wiley-Blackwell, February 2013, To appear.

Internal Reports

[31] D. Balouek, A. Carpen Amarie, G. Charrier, F. Desprez, E. Jeannot, E. Jeanvoine, A. Lèbre, D. Margery, N. Niclausse, L. Nussbaum, O. Richard, C. Pérez, F. Quesnel, C. Rohr, L. Sarzyniec.

Adding Virtualization Capabilities to Grid'5000, Inria, July 2012, no. RR-8026, 18 p.

http://hal.inria.fr/hal-00720910
[32] F. Desprez, G. Fox, E. Jeannot, K. Keahey, M. Kozuch, D. Margery, P. Neyron, L. Nussbaum, C. Pérez, O. Richard, W. Smith, G. Von Laszewski, J. Vöckler.

Supporting Experimental Computer Science, March 2012, Argonne National Laboratory Technical Memo 326.

http://hal.inria.fr/hal-00720815
[33] F. Desprez, G. Fox, E. Jeannot, K. Keahey, M. Kozuch, D. Margery, P. Neyron, L. Nussbaum, C. Pérez, O. Richard, W. Smith, G. Von Laszewski, J. Vöckler.

Supporting Experimental Computer Science, Inria, July 2012, no. RR-8035, 29 p.

http://hal.inria.fr/hal-00722605

References in notes

[34] P. Balaji, H.-W. Jin, K. Vaidyanathan, D. K. Panda.

Supporting iWARP Compatibility and Features for Regular Network Adapters, in: Proceedings of the Workshop on Remote Direct Memory Access (RDMA): Applications, Implementations, and Technologies (RAIT); held in conjunction with the IEEE International Conference on Cluster Computing, Boston, MA, September 2005.
[35] G. Ciaccio, G. Chiola.

GAMMA and MPI/GAMMA on GigabitEthernet, in: Proceedings of 7th EuroPVM-MPI conference, Balatonfüred, Hungary, Lecture Notes in Computer Science, Springer Verlag, September 2000, vol. 1908.
[36] G. R. Gao, T. Sterling, R. Stevens, M. Hereld, W. Zhu.

Hierarchical multithreading: programming model and system software, in: 20th International Parallel and Distributed Processing Symposium (IPDPS), April 2006.
[37] A. Mazouz, S.-A.-A. Touati, D. Barthou.

Study of Variations of Native Program Execution Times on Multi-Core Architectures, in: Intl. IEEE Workshop on Multi-Core Computing Systems, Krakow, Poland, IEEE Computer Society, February 2010, p. 919-924.
