

Bibliography

Major publications by the team in recent years
  • [1] C. Augonnet, S. Thibault, R. Namyst, P.-A. Wacrenier.

    StarPU: A Unified Platform for Task Scheduling on Heterogeneous Multicore Architectures, in: Concurrency and Computation: Practice and Experience, Special Issue: Euro-Par 2009, February 2011, vol. 23, pp. 187–198. [ DOI : 10.1002/cpe.1631 ]

    http://hal.inria.fr/inria-00550877
  • [2] F. Broquedis, J. Clet-Ortega, S. Moreaud, N. Furmento, B. Goglin, G. Mercier, S. Thibault, R. Namyst.

    hwloc: a Generic Framework for Managing Hardware Affinities in HPC Applications, in: Proceedings of the 18th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP2010), Pisa, Italy, IEEE Computer Society Press, February 2010, pp. 180–186. [ DOI : 10.1109/PDP.2010.67 ]

    http://hal.inria.fr/inria-00429889
  • [3] F. Broquedis, N. Furmento, B. Goglin, P.-A. Wacrenier, R. Namyst.

    ForestGOMP: an efficient OpenMP environment for NUMA architectures, in: International Journal of Parallel Programming, Special Issue on OpenMP; Guest Editors: Matthias S. Müller and Eduard Ayguadé, 2010, vol. 38, no 5, pp. 418-439. [ DOI : 10.1007/s10766-010-0136-3 ]

    http://hal.inria.fr/inria-00496295
  • [4] D. Buntinas, G. Mercier, W. Gropp.

    Implementation and Shared-Memory Evaluation of MPICH2 over the Nemesis Communication Subsystem, in: Recent Advances in Parallel Virtual Machine and Message Passing Interface: Proc. 13th European PVM/MPI Users Group Meeting, Bonn, Germany, September 2006.
  • [5] B. Goglin, N. Furmento.

    Finding a Tradeoff between Host Interrupt Load and MPI Latency over Ethernet, in: Proceedings of the IEEE International Conference on Cluster Computing, New Orleans, LA, IEEE Computer Society Press, September 2009.

    http://hal.inria.fr/inria-00397328
  • [6] B. Goglin.

    High-Performance Message Passing over generic Ethernet Hardware with Open-MX, in: Journal of Parallel Computing, February 2011, vol. 37, no 2, pp. 85-100. [ DOI : 10.1016/j.parco.2010.11.001 ]

    http://hal.inria.fr/inria-00533058/en
  • [7] S. Thibault, R. Namyst, P.-A. Wacrenier.

    Building Portable Thread Schedulers for Hierarchical Multiprocessors: the BubbleSched Framework, in: Euro-Par, Rennes, France, ACM, August 2007.

    http://hal.inria.fr/inria-00154506
  • [8] F. Trahay, É. Brunet, A. Denis, R. Namyst.

    A multithreaded communication engine for multicore architectures, in: CAC 2008: Workshop on Communication Architecture for Clusters, held in conjunction with IPDPS 2008, Miami, FL, IEEE Computer Society Press, April 2008.

    http://hal.inria.fr/inria-00224999
Publications of the year

Doctoral Dissertations and Habilitation Theses

Articles in International Peer-Reviewed Journals

  • [11] P.-A. Arras, D. Fuin, E. Jeannot, A. Stoutchinin, S. Thibault.

    List Scheduling in Embedded Systems Under Memory Constraints, in: International Journal of Parallel Programming, November 2014. [ DOI : 10.1007/s10766-014-0338-1 ]

    https://hal.inria.fr/hal-01087067
  • [12] D. Barthou, O. Brand-Foissac, O. Pene, G. Grosdidier, R. Dolbeau, C. Eisenbeis, M. Kruse, K. Petrov, C. Tadonki.

    Automated Code Generation for Lattice Quantum Chromodynamics and beyond, in: Journal of Physics: Conference Series, 2014, vol. 510, 11 p, LPT-Orsay-13-142. [ DOI : 10.1088/1742-6596/510/1/012005 ]

    https://hal.inria.fr/hal-00926513
  • [13] A. Hugo, A. Guermouche, P.-A. Wacrenier, R. Namyst.

    Composing multiple StarPU applications over heterogeneous machines: A supervised approach, in: The International Journal of High Performance Computing Applications, February 2014, vol. 28, pp. 285-300. [ DOI : 10.1177/1094342014527575 ]

    https://hal.inria.fr/hal-01101045
  • [14] E. Jeannot, G. Mercier, F. Tessier.

    Process Placement in Multicore Clusters: Algorithmic Issues and Practical Techniques, in: IEEE Transactions on Parallel and Distributed Systems, April 2014, vol. 25, no 4, pp. 993-1002. [ DOI : 10.1109/TPDS.2013.104 ]

    https://hal.inria.fr/hal-01109978
  • [15] E. Saillard, P. Carribault, D. Barthou.

    PARCOACH: Combining static and dynamic validation of MPI collective communications, in: International Journal of High Performance Computing Applications, 2014. [ DOI : 10.1177/1094342014552204 ]

    https://hal.archives-ouvertes.fr/hal-01078762

International Conferences with Proceedings

  • [16] M. Alaniz, S. Nesmachnow, B. Goglin, S. Iturriaga, V. Gil Costa, M. Printista.

    MBSPDiscover: An Automatic Benchmark for MultiBSP Performance Analysis, in: First HPCLATAM - CLCAR Joint Latin American High Performance Computing Conference, Valparaiso, Chile, Communications in Computer and Information Science (CCIS), Springer, October 2014, vol. 485, pp. 158-172.

    https://hal.inria.fr/hal-01062528
  • [17] D. Barthou, E. Jeannot.

    SPAGHETtI: Scheduling/Placement Approach for Task-Graphs on HETerogeneous archItecture, in: Euro-Par, Porto, Portugal, LNCS, August 2014, vol. 8632, pp. 174-185. [ DOI : 10.1007/978-3-319-09873-9_15 ]

    https://hal.archives-ouvertes.fr/hal-01100948
  • [18] A. Denis.

    pioman: a Generic Framework for Asynchronous Progression and Multithreaded Communications, in: IEEE International Conference on Cluster Computing (IEEE Cluster), Madrid, Spain, September 2014.

    https://hal.inria.fr/hal-01064652
  • [19] A. Denis.

    pioman: a pthread-based Multithreaded Communication Engine, in: Euromicro International Conference on Parallel, Distributed and Network-based Processing, Turku, Finland, March 2015.

    https://hal.inria.fr/hal-01087775
  • [20] B. Goglin.

    Managing the Topology of Heterogeneous Cluster Nodes with Hardware Locality (hwloc), in: International Conference on High Performance Computing & Simulation (HPCS 2014), Bologna, Italy, IEEE, July 2014.

    https://hal.inria.fr/hal-00985096
  • [21] B. Goglin, J. Hursey, J. M. Squyres.

    netloc: Towards a Comprehensive View of the HPC System Topology, in: Fifth International Workshop on Parallel Software Tools and Tool Infrastructures (PSTI 2014), Minneapolis, United States, IEEE, September 2014.

    https://hal.inria.fr/hal-01010599
  • [22] C. Haine, O. Aumage, P. Enguerrand, D. Barthou.

    Exploring and Evaluating Array Layout Restructuration for SIMDization, in: The 27th International Workshop on Languages and Compilers for Parallel Computing (LCPC 2014), Hillsboro, United States, Intel Corporation, September 2014.

    https://hal.inria.fr/hal-01070467
  • [23] S. Henry, A. Denis, D. Barthou, M.-C. Counilh, R. Namyst.

    Toward OpenCL Automatic Multi-Device Support, in: Euro-Par 2014, Porto, Portugal, F. Silva, I. Dutra, V. S. Costa (editors), Springer, August 2014.

    https://hal.inria.fr/hal-01005765
  • [24] A.-E. Hugo, A. Guermouche, P.-A. Wacrenier, R. Namyst.

    A runtime approach to dynamic resource allocation for sparse direct solvers, in: 2014 43rd International Conference on Parallel Processing, Minneapolis, United States, September 2014. [ DOI : 10.1109/ICPP.2014.57 ]

    https://hal.inria.fr/hal-01101054
  • [25] X. Lacoste, M. Faverge, P. Ramet, S. Thibault, G. Bosilca.

    Taking advantage of hybrid systems for sparse direct solvers via task-based runtimes, in: HCW'2014 workshop of IPDPS, Phoenix, United States, IEEE, May 2014.

    https://hal.inria.fr/hal-00987094
  • [26] B. Putigny, B. Goglin, D. Barthou.

    A Benchmark-based Performance Model for Memory-bound HPC Applications, in: International Conference on High Performance Computing & Simulation (HPCS 2014), Bologna, Italy, IEEE, July 2014.

    https://hal.inria.fr/hal-00985598
  • [27] B. Putigny, B. Ruelle, B. Goglin.

    Analysis of MPI Shared-Memory Communication Performance from a Cache Coherence Perspective, in: PDSEC - The 15th IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing, held in conjunction with IPDPS, Phoenix, AZ, United States, IEEE, May 2014.

    https://hal.inria.fr/hal-00956307
  • [28] E. Saillard, P. Carribault, D. Barthou.

    Static Validation of Barriers and Worksharing Constructs in OpenMP Applications, in: IWOMP, Salvador, Brazil, September 2014, pp. 73-86. [ DOI : 10.1007/978-3-319-11454-5_6 ]

    https://hal.archives-ouvertes.fr/hal-01078759
  • [29] M. Sergent, S. Archipoff.

    Modulariser les ordonnanceurs de tâches : une approche structurelle, in: ComPAS 2014 : conférence en parallélisme, architecture et systèmes, Neuchâtel, Switzerland, P. Felber, L. Philippe, E. Riviere, A. Tisserand (editors), April 2014.

    https://hal.inria.fr/hal-00978364
  • [30] L. Stanisic, S. Thibault, A. Legrand, B. Videau, J.-F. Méhaut.

    Modeling and Simulation of a Dynamic Task-Based Runtime System for Heterogeneous Multi-Core Architectures, in: Euro-Par - 20th International Conference on Parallel Processing, Porto, Portugal, LNCS 8632, Springer International Publishing Switzerland, August 2014, pp. 50-62.

    https://hal.inria.fr/hal-01011633
  • [31] G. Vaumourin, D. Thomas, G. Alexandre, D. Barthou.

    Specific Read Only Data Management for Memory Hierarchy Optimization, in: EWiLi 2014 - Workshop Embed With Linux, Lisboa, Portugal, J. Boukhobza, J. P. Diguet, P. Ficheux, J. Rufino, F. Singhoff (editors), Proceedings of the Embed With Linux 2014 Workshop, November 2014, vol. 1291, Session 2.

    https://hal.archives-ouvertes.fr/hal-01090218
  • [32] P. Virouleau, P. Brunet, F. Broquedis, N. Furmento, S. Thibault, O. Aumage, T. Gautier.

    Evaluation of OpenMP Dependent Tasks with the KASTORS Benchmark Suite, in: IWOMP - 10th International Workshop on OpenMP, Salvador, Brazil, Springer, September 2014, pp. 16-29. [ DOI : 10.1007/978-3-319-11454-5_2 ]

    https://hal.inria.fr/hal-01081974

Conferences without Proceedings

  • [33] E. Jeannot, G. Mercier, F. Tessier.

    Matching communication pattern with underlying hardware architecture, in: 6th European Conference on Computational Fluid Dynamics, Barcelona, Spain, July 2014.

    https://hal.inria.fr/hal-01087611

Scientific Books (or Scientific Book chapters)

  • [34] P. De Oliveira Castro, S. Louise, D. Barthou.

    DSL Stream Programming on Multicore Architectures, in: Programming multi-core and many-core computing systems, John Wiley and Sons, 2014, chapter 12.

    https://hal.archives-ouvertes.fr/hal-00952318
  • [35] T. Hoefler, E. Jeannot, G. Mercier.

    An Overview of Process Mapping Techniques and Algorithms in High-Performance Computing, in: High Performance Computing on Complex Environments, E. Jeannot, J. Žilinskas (editors), Wiley, June 2014, pp. 75-94.

    https://hal.inria.fr/hal-00921626
  • [36] L. Lopez, J. Žilinskas, A. Costan, R. G. Cascella, G. Kecskemeti, E. Jeannot, M. Cannataro, L. Ricci, S. Benkner, S. Petit, V. Scarano, J. Gracia, S. Hunold, S. L. Scott, S. Lankes, C. Lengauer, J. Carretero, J. Breitbart, M. Alexander.

    Euro-Par 2014: Parallel Processing Workshops, Part I, Lecture Notes in Computer Science, Springer, December 2014, vol. 8805.

    https://hal.inria.fr/hal-01110069
  • [37] L. Lopez, J. Žilinskas, A. Costan, R. G. Cascella, G. Kecskemeti, E. Jeannot, M. Cannataro, L. Ricci, S. Benkner, S. Petit, V. Scarano, J. Gracia, S. Hunold, S. L. Scott, S. Lankes, C. Lengauer, J. Carretero, J. Breitbart, M. Alexander.

    Euro-Par 2014: Parallel Processing Workshops, Part II, Lecture Notes in Computer Science, Springer, December 2014, vol. 8806.

    https://hal.inria.fr/hal-01110071

Books or Proceedings Editing

Internal Reports

  • [39] C. Augonnet, O. Aumage, N. Furmento, S. Thibault, R. Namyst.

    StarPU-MPI: Task Programming over Clusters of Machines Enhanced with Accelerators, May 2014, no RR-8538.

    https://hal.inria.fr/hal-00992208
  • [40] X. Lacoste, M. Faverge, P. Ramet, S. Thibault, G. Bosilca.

    Taking advantage of hybrid systems for sparse direct solvers via task-based runtimes, January 2014, no RR-8446, 25 p.

    https://hal.inria.fr/hal-00925017
  • [41] L. Stanisic, S. Thibault, A. Legrand, B. Videau, J.-F. Méhaut.

    Modeling and Simulation of a Dynamic Task-Based Runtime System for Heterogeneous Multi-Core Architectures, March 2014, no RR-8509.

    https://hal.inria.fr/hal-00966862
  • [42] A. Tate, A. Kamil, A. Dubey, A. Größlinger, B. Chamberlain, B. Goglin, C. Edwards, C. J. Newburn, D. Padua, D. Unat, E. Jeannot, F. Hannig, T. Gysi, H. Ltaief, J. Sexton, J. Labarta, J. Shalf, K. Fürlinger, K. O’Brien, L. Linardakis, M. Besta, M.-C. Sawley, M. Abraham, M. Bianco, M. Pericàs, N. Maruyama, P. H. J. Kelly, P. Messmer, R. B. Ross, R. Cledat, S. Matsuoka, T. Schulthess, T. Hoefler, V. J. Leung.

    Programming Abstractions for Data Locality, PADAL Workshop 2014, April 28–29, Swiss National Supercomputing Center (CSCS), Lugano, Switzerland, November 2014, 54 p.

    https://hal.inria.fr/hal-01083080

Scientific Popularization

  • [43] E. Agullo, O. Aumage, M. Faverge, N. Furmento, F. Pruvost, M. Sergent, S. Thibault.

    Overview of Distributed Linear Algebra on Hybrid Nodes over the StarPU Runtime, February 2014, SIAM Conference on Parallel Processing for Scientific Computing.

    https://hal.inria.fr/hal-00978602
References in notes
  • [44] P. Balaji, H.-W. Jin, K. Vaidyanathan, D. K. Panda.

    Supporting iWARP Compatibility and Features for Regular Network Adapters, in: Proceedings of the Workshop on Remote Direct Memory Access (RDMA): Applications, Implementations, and Technologies (RAIT); held in conjunction with the IEEE International Conference on Cluster Computing, Boston, MA, September 2005.
  • [45] G. Ciaccio, G. Chiola.

    GAMMA and MPI/GAMMA on Gigabit Ethernet, in: Proceedings of 7th EuroPVM-MPI conference, Balatonfüred, Hungary, Lecture Notes in Computer Science, Springer Verlag, September 2000, vol. 1908.
  • [46] G. R. Gao, T. Sterling, R. Stevens, M. Hereld, W. Zhu.

    Hierarchical multithreading: programming model and system software, in: 20th International Parallel and Distributed Processing Symposium (IPDPS), April 2006.