

Project Team Runtime


Scientific Foundations
Application Domains
Contracts and Grants with Industry
Bibliography


Bibliography

Major publications by the team in recent years
  • [1] G. Antoniu, L. Bougé, P. Hatcher, M. MacBeth, K. McGuigan, R. Namyst.

    The Hyperion system: Compiling multithreaded Java bytecode for distributed execution, in: Parallel Computing, October 2001, vol. 27, p. 1279–1297.
  • [2] O. Aumage, L. Bougé, A. Denis, L. Eyraud, J.-F. Méhaut, G. Mercier, R. Namyst, L. Prylli.

    A Portable and Efficient Communication Library for High-Performance Cluster Computing (extended version), in: Cluster Computing, January 2002, vol. 5, no 1, p. 43-54.
  • [3] O. Aumage, É. Brunet, N. Furmento, R. Namyst.

    NewMadeleine: a Fast Communication Scheduling Engine for High Performance Networks, in: CAC 2007: Workshop on Communication Architecture for Clusters, held in conjunction with IPDPS 2007, Long Beach, California, USA, March 2007, Also available as LaBRI Report 1421-07 and INRIA RR-6085.

    http://hal.inria.fr/inria-00127356
  • [4] O. Aumage, G. Mercier.

    MPICH/MadIII: a Cluster of Clusters Enabled MPI Implementation, in: Proc. 3rd IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGrid 2003), Tokyo, IEEE, May 2003, p. 26–35.
  • [5] O. Aumage, G. Mercier, R. Namyst.

    MPICH/Madeleine: a True Multi-Protocol MPI for High-Performance Networks, in: Proc. 15th International Parallel and Distributed Processing Symposium (IPDPS 2001), San Francisco, IEEE, April 2001, 51 p., Extended proceedings in electronic form only.
  • [6] F. Broquedis, J. Clet-Ortega, S. Moreaud, N. Furmento, B. Goglin, G. Mercier, S. Thibault, R. Namyst.

    hwloc: a Generic Framework for Managing Hardware Affinities in HPC Applications, in: Proceedings of the 18th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP2010), Pisa, Italy, IEEE Computer Society Press, February 2010, p. 180–186. [ DOI : 10.1109/PDP.2010.67 ]

    http://hal.inria.fr/inria-00429889
  • [7] F. Broquedis, N. Furmento, B. Goglin, P.-A. Wacrenier, R. Namyst.

    ForestGOMP: an efficient OpenMP environment for NUMA architectures, in: International Journal on Parallel Programming, Special Issue on OpenMP; Guest Editors: Matthias S. Müller and Eduard Ayguadé, 2010, vol. 38, no 5, p. 418-439. [ DOI : 10.1007/s10766-010-0136-3 ]

    http://hal.inria.fr/inria-00496295
  • [8] D. Buntinas, G. Mercier, W. Gropp.

    Implementation and Shared-Memory Evaluation of MPICH2 over the Nemesis Communication Subsystem, in: Recent Advances in Parallel Virtual Machine and Message Passing Interface: Proc. 13th European PVM/MPI Users Group Meeting, Bonn, Germany, September 2006.
  • [9] V. Danjean, R. Namyst, R. Russell.

    Linux Kernel Activations to Support Multithreading, in: Proc. 18th IASTED International Conference on Applied Informatics (AI 2000), Innsbruck, Austria, IASTED, February 2000, p. 718-723.
  • [10] B. Goglin, N. Furmento.

    Finding a Tradeoff between Host Interrupt Load and MPI Latency over Ethernet, in: Proceedings of the IEEE International Conference on Cluster Computing, New Orleans, LA, IEEE Computer Society Press, September 2009.

    http://hal.inria.fr/inria-00397328
  • [11] S. Moreaud, B. Goglin.

    Impact of NUMA Effects on High-Speed Networking with Multi-Opteron Machines, in: The 19th IASTED International Conference on Parallel and Distributed Computing and Systems (PDCS 2007), Cambridge, Massachusetts, November 2007.

    http://hal.inria.fr/inria-00175747
  • [12] R. Namyst.

    Contribution à la conception de supports exécutifs multithreads performants, Université Claude Bernard de Lyon, for work carried out at the École normale supérieure de Lyon, December 2001, Habilitation à diriger des recherches.
  • [13] S. Thibault, F. Broquedis, B. Goglin, R. Namyst, P.-A. Wacrenier.

    An Efficient OpenMP Runtime System for Hierarchical Architectures, in: International Workshop on OpenMP (IWOMP), Beijing, China, June 2007, p. 148–159.

    http://hal.inria.fr/inria-00154502
  • [14] S. Thibault, R. Namyst, P.-A. Wacrenier.

    Building Portable Thread Schedulers for Hierarchical Multiprocessors: the BubbleSched Framework, in: EuroPar, Rennes, France, ACM, August 2007.

    http://hal.inria.fr/inria-00154506
  • [15] F. Trahay, É. Brunet, A. Denis, R. Namyst.

    A multithreaded communication engine for multicore architectures, in: CAC 2008: Workshop on Communication Architecture for Clusters, held in conjunction with IPDPS 2008, Miami, FL, IEEE Computer Society Press, April 2008.

    http://hal.inria.fr/inria-00224999
  • [16] F. Trahay, A. Denis, O. Aumage, R. Namyst.

    Improving Reactivity and Communication Overlap in MPI using a Generic I/O Manager, in: EuroPVM/MPI, Recent Advances in Parallel Virtual Machine and Message Passing Interface, F. Cappello, T. Herault, J. Dongarra (editors), Lecture Notes in Computer Science, Springer, 2007, no 4757, p. 170-177.

    http://hal.inria.fr/inria-00177167
Publications of the year

Doctoral Dissertations and Habilitation Theses

  • [17] C. Augonnet.

    Scheduling Tasks over Multicore machines enhanced with Accelerators: a Runtime System's Perspective, Université Sciences et Technologies - Bordeaux I, December 2011.
  • [18] S. Moreaud.

    Mouvement de données et placement des tâches pour les communications haute performance sur machines hiérarchiques, Université Sciences et Technologies - Bordeaux I, October 2011.

    http://hal.inria.fr/tel-00635651/en

Articles in International Peer-Reviewed Journals

  • [19] C. Augonnet, S. Thibault, R. Namyst, P.-A. Wacrenier.

    StarPU: A Unified Platform for Task Scheduling on Heterogeneous Multicore Architectures, in: Concurrency and Computation: Practice and Experience, Special Issue: Euro-Par 2009, February 2011, vol. 23, p. 187–198. [ DOI : 10.1002/cpe.1631 ]

    http://hal.inria.fr/inria-00550877
  • [20] S. Benkner, S. Pllana, J. L. Träff, P. Tsigas, U. Dolinsky, C. Augonnet, B. Bachmayer, C. Kessler, D. Moloney, V. Osipov.

    PEPPHER: Efficient and Productive Usage of Hybrid Computing Systems, in: IEEE Micro, 2011, vol. 31, no 5, p. 28-41. [ DOI : 10.1109/MM.2011.67 ]

    http://hal.inria.fr/hal-00648480/en
  • [21] A. Benoit, L.-C. Canon, E. Jeannot, Y. Robert.

    Reliability of task graph schedules with transient and fail-stop failures: complexity and algorithms, in: Journal of Scheduling, May 2011.

    http://hal.inria.fr/hal-00653477/en
  • [22] B. Goglin.

    High-Performance Message Passing over generic Ethernet Hardware with Open-MX, in: Journal of Parallel Computing, February 2011, vol. 37, no 2, p. 85-100. [ DOI : 10.1016/j.parco.2010.11.001 ]

    http://hal.inria.fr/inria-00533058/en
  • [23] B. Goglin.

    NIC-assisted cache-efficient receive stack for message passing over Ethernet, in: Concurrency and Computation: Practice and Experience, 2011, vol. 23, no 2, p. 199-210. [ DOI : 10.1002/cpe.1632 ]

    http://hal.inria.fr/inria-00496301/en
  • [24] B. Goglin, J. Squyres, S. Thibault.

    Hardware Locality: Peering under the hood of your server, in: Linux Pro Magazine, July 2011, no 128, p. 28-33.

    http://hal.inria.fr/inria-00597961/en
  • [25] E. Jeannot, E. Saule, D. Trystram.

    Optimizing Performance and Reliability on Heterogeneous Parallel Systems: Approximation Algorithms and Heuristics, in: Journal of Parallel and Distributed Computing, 2012, vol. 72, no 2, p. 268-280. [ DOI : 10.1016/j.jpdc.2011.11.003 ]

International Conferences with Proceedings

  • [26] E. Agullo, C. Augonnet, J. Dongarra, M. Faverge, J. Langou, H. Ltaief, S. Tomov.

    LU Factorization for Accelerator-based Systems, in: 9th ACS/IEEE International Conference on Computer Systems and Applications (AICCSA 11), Sharm El-Sheikh, Egypt, June 2011.

    http://hal.inria.fr/hal-00654193/en
  • [27] E. Agullo, C. Augonnet, J. Dongarra, M. Faverge, H. Ltaief, S. Thibault, S. Tomov.

    QR Factorization on a Multicore Node Enhanced with Multiple GPU Accelerators, in: 25th IEEE International Parallel & Distributed Processing Symposium, Anchorage, United States, May 2011.

    http://hal.inria.fr/inria-00547614/en
  • [28] S. Benkner, S. Pllana, J. Larsson Träff, P. Tsigas, A. Richards, R. Namyst, B. Bachmayer, C. Kessler, D. Moloney, P. Sanders.

    The PEPPHER Approach to Programmability and Performance Portability for Heterogeneous many-core Architectures, in: ParCo, Ghent, Belgium, 2011.

    http://hal.inria.fr/hal-00661320
  • [29] É. Brunet, F. Trahay, A. Denis, R. Namyst.

    A sampling-based approach for communication libraries auto-tuning, in: IEEE International Conference on Cluster Computing, Austin, United States, September 2011.

    http://hal.inria.fr/inria-00605735/en
  • [30] L.-C. Canon, E. Jeannot.

    MO-Greedy: an extended beam-search approach for solving a multi-criteria scheduling problem on heterogeneous machines, in: International Heterogeneity in Computing Workshop, Anchorage, United States, September 2011.

    http://hal.inria.fr/hal-00653724/en
  • [31] L.-C. Canon, E. Jeannot, J. Weissman.

    A Scheduling and Certification Algorithm for Defeating Collusion in Desktop Grids, in: International Conference on Distributed Computing Systems, Minneapolis, United States, July 2011.

    http://hal.inria.fr/hal-00653493/en
  • [32] U. Dastgeer, C. Kessler, S. Thibault.

    Flexible runtime support for efficient skeleton programming on hybrid systems, in: International conference on Parallel Computing (ParCo), Ghent, Belgium, August 2011.

    http://hal.inria.fr/inria-00606200/en
  • [33] A. Denis.

    A High-Performance Superpipeline Protocol for InfiniBand, in: Euro-Par 2011, Bordeaux, France, E. Jeannot, R. Namyst, J. Roman (editors), Lecture Notes in Computer Science, Springer, August 2011, vol. 6853, p. 276-287.

    http://hal.inria.fr/inria-00586015/en
  • [34] B. Goglin, S. Moreaud.

    Dodging Non-Uniform I/O Access in Hierarchical Collective Operations for Multicore Clusters, in: CASS 2011: The 1st Workshop on Communication Architecture for Scalable Systems, held in conjunction with IPDPS 2011, Anchorage, United States, May 2011, 7 p.

    http://hal.inria.fr/inria-00566246/en
  • [35] T. Ma, G. Bosilca, A. Bouteiller, B. Goglin, J. Squyres, J. Dongarra.

    Kernel Assisted Collective Intra-node MPI Communication Among Multi-core and Many-core CPUs, in: 40th International Conference on Parallel Processing (ICPP-2011), Taipei, Taiwan, September 2011.

    http://hal.inria.fr/inria-00602877/en
  • [36] A. Mazouz, S.-A.-A. Touati, D. Barthou.

    Analysing the Variability of OpenMP Programs Performances on Multicore Architectures, in: Fourth Workshop on Programmability Issues for Heterogeneous Multicores (MULTIPROG-2011), Heraklion, Greece, Held in conjunction with: the 6th International Conference on High-Performance and Embedded Architectures and Compilers (HiPEAC), 2011, 14 p.

    http://hal.inria.fr/inria-00637957/en
  • [37] G. Mercier, E. Jeannot.

    Improving MPI Applications Performance on Multicore Clusters with Rank Reordering, in: EuroMPI, Santorini, Greece, Springer Verlag, September 2011, vol. 6960, p. 39-49. [ DOI : 10.1007/978-3-642-24449-0 ]

    http://hal.inria.fr/hal-00643151/en
  • [38] B. Putigny, B. Goglin, D. Barthou.

    Performance modeling for power consumption reduction on SCC, in: 4th Many-core Applications Research Community (MARC) Symposium, Potsdam, Germany, H. Plattner (editor), December 2011.

    http://hal.inria.fr/hal-00649635/en
  • [39] F. Trahay, F. Rue, M. Faverge, Y. Ishikawa, R. Namyst, J. Dongarra.

    EZTrace: a generic framework for performance analysis, in: IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), Newport Beach, CA, United States, May 2011, Poster Session.

    http://hal.inria.fr/inria-00587216/en
  • [40] S. Yi, E. Jeannot, D. Kondo, D. P. Anderson.

    Towards Real-Time, Volunteer Distributed Computing, in: 11th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing (CCGrid 2011), Newport Beach, CA, United States, 2011.

    http://hal.inria.fr/hal-00654691/en

National Conferences with Proceedings

  • [41] S. Mahmoudi, P. Manneback, C. Augonnet, S. Thibault.

    Détection optimale des coins et contours dans des bases d'images volumineuses sur architectures multicœurs hétérogènes, in: Rencontres francophones du parallélisme, Saint-Malo, France, May 2011.

    http://hal.inria.fr/inria-00606195/en
  • [42] H. Sylvain.

    Programmation multi-accélérateurs unifiée en OpenCL, in: RenPAR'20, Saint Malo, France, May 2011.

    http://hal.inria.fr/hal-00643257/en

Scientific Books (or Scientific Book chapters)

  • [43] P. Vicat-Blanc Primet, B. Goglin, R. Guillier, S. Soudan.

    Computing Networks: From Cluster to Cloud Computing, Wiley-ISTE, May 2011.

    http://hal.inria.fr/inria-00590739/en
  • [44] P. de Oliveira Castro, S. Louise, D. Barthou.

    Programming Multi-core and Many-core Computing Systems, Wiley-Blackwell, 2012, To Appear.

Books or Proceedings Editing

  • [45] E. Jeannot, R. Namyst, J. Roman (editors)

    Euro-Par 2011 Parallel Processing - 17th International Conference, Euro-Par 2011, Bordeaux, France, August 29 - September 2, 2011, Proceedings, Part I, Lecture Notes in Computer Science, Springer, 2011, vol. 6852.
  • [46] E. Jeannot, R. Namyst, J. Roman (editors)

    Euro-Par 2011 Parallel Processing - 17th International Conference, Euro-Par 2011, Bordeaux, France, August 29 - September 2, 2011, Proceedings, Part II, Lecture Notes in Computer Science, Springer, 2011, vol. 6853.

Scientific Popularization

  • [47] B. Goglin.

    De votre boulangerie à un système d'exploitation multiprocesseur, in: Interstices, February 2011.

    http://hal.inria.fr/inria-00566232/en
  • [48] B. Goglin.

    Et plus vite si affinités..., in: Interstices, June 2011.

    http://hal.inria.fr/inria-00604025/en
  • [49] R. Namyst.

    Virtualization of Hybrid Architectures, in: Super-computers: at the frontiers of extreme computing, November 2011.

Other Publications

  • [50] S. Barascou.

    Optimisation des communications pour les calculs parallèles avec SALOME/YACS et PadicoTM, Université Sciences et Technologies - Bordeaux I, September 2011.

    http://hal.inria.fr/hal-00652882/en
  • [51] A.-E. Hugo.

    Composabilité de codes parallèles sur architectures hétérogènes, Université Sciences et Technologies - Bordeaux I, 2011.

    http://hal.inria.fr/inria-00619654/en
  • [52] J. Jaeger, D. Barthou.

    Stencils sur CPU et GPU, December 2011, Quatrième rencontres de la communauté française de compilation, Saint-Hippolyte, France.
  • [53] R. Namyst.

    Programming heterogeneous, accelerator-based multicore machines: current situation and main challenges, May 2011, Invited Talk.

    http://hal.inria.fr/inria-00590670/en
  • [54] B. Putigny, D. Barthou, B. Goglin.

    Modélisation du coût de la cohérence de cache pour améliorer le tuilage de boucles, December 2011, Quatrième rencontres de la communauté française de compilation, Saint-Hippolyte, France.
  • [55] C. Roelandt.

    Association de modèles de programmation pour l'exploitation de clusters de GPUs dans le calcul intensif, Université Sciences et Technologies - Bordeaux I, June 2011.
  • [56] C. Rossignon.

    Étude du GMRES dans un code de simulation de réservoir, Université Sciences et Technologies - Bordeaux I, June 2011.
References in notes
  • [57] P. Balaji, H.-W. Jin, K. Vaidyanathan, D. K. Panda.

    Supporting iWARP Compatibility and Features for Regular Network Adapters, in: Proceedings of the Workshop on Remote Direct Memory Access (RDMA): Applications, Implementations, and Technologies (RAIT); held in conjunction with the IEEE International Conference on Cluster Computing, Boston, MA, September 2005.
  • [58] G. Ciaccio, G. Chiola.

    GAMMA and MPI/GAMMA on GigabitEthernet, in: Proceedings of 7th EuroPVM-MPI conference, Balatonfüred, Hungary, Lecture Notes in Computer Science, Springer Verlag, September 2000, vol. 1908.
  • [59] G. R. Gao, T. Sterling, R. Stevens, M. Hereld, W. Zhu.

    Hierarchical multithreading: programming model and system software, in: 20th International Parallel and Distributed Processing Symposium (IPDPS), April 2006.
  • [60] B. Goglin, S. Moreaud.

    KNEM: a Generic and Scalable Kernel-Assisted Intra-node MPI Communication Framework, in: Journal of Parallel and Distributed Computing, 2012, Submitted.
  • [61] A. Mazouz, S.-A.-A. Touati, D. Barthou.

    Study of Variations of Native Program Execution Times on Multi-Core Architectures, in: Intl. IEEE Workshop on Multi-Core Computing Systems, Krakow, Poland, IEEE Computer Society, February 2010, p. 919-924.