Bibliography
Major publications by the team in recent years
-
1M. Cornero, R. Costa, R. Fernández Pascual, A. Ornstein, E. Rohou.
An Experimental Environment Validating the Suitability of CLI as an Effective Deployment Format for Embedded Systems, in: Conference on HiPEAC, Göteborg, Sweden, P. Stenström, M. Dubois, M. Katevenis, R. Gupta, T. Ungerer (editors), Springer, January 2008, pp. 130–144. -
2R. Costa, E. Rohou.
Comparing the size of .NET applications with native code, in: 3rd Intl Conference on Hardware/software codesign and system synthesis, Jersey City, NJ, USA, P. Eles, A. Jantsch, R. A. Bergamaschi (editors), ACM, September 2005, pp. 99–104. -
3D. Hardy, I. Puaut.
WCET analysis of multi-level non-inclusive set-associative instruction caches, in: Proc. of the 29th IEEE Real-Time Systems Symposium, Barcelona, Spain, December 2008. -
4D. Hardy, I. Puaut.
Static probabilistic Worst Case Execution Time Estimation for architectures with Faulty Instruction Caches, in: 21st International Conference on Real-Time Networks and Systems, Sophia Antipolis, France, October 2013. [ DOI : 10.1145/2516821.2516842 ]
https://hal.inria.fr/hal-00862604 -
5P. Michaud, Y. Sazeides, A. Seznec, T. Constantinou, D. Fetis.
A study of thread migration in temperature-constrained multi-cores, in: ACM Transactions on Architecture and Code Optimization, 2007, vol. 4, no 2, 9 p. -
6P. Michaud, A. Seznec, S. Jourdan.
An Exploration of Instruction Fetch Requirement in Out-of-Order Superscalar Processors, in: International Journal of Parallel Programming, 2001, vol. 29, no 1, pp. 35-58. -
7A. Perais, A. Seznec.
Practical data value speculation for future high-end processors, in: International Symposium on High Performance Computer Architecture, Orlando, FL, United States, IEEE, February 2014, pp. 428 - 439. [ DOI : 10.1109/HPCA.2014.6835952 ]
https://hal.inria.fr/hal-01088116 -
8N. Prémillieu, A. Seznec.
Efficient Out-of-Order Execution of Guarded ISAs, in: ACM Transactions on Architecture and Code Optimization (TACO), December 2014, 21 p. [ DOI : 10.1145/2677037 ]
https://hal.inria.fr/hal-01103230 -
9E. Rohou, M. Smith.
Dynamically managing processor temperature and power, in: Second Workshop on Feedback-Directed Optimizations, 1999. -
10A. Seznec, P. Michaud.
A case for (partially)-tagged geometric history length predictors, in: Journal of Instruction Level Parallelism, April 2006.
http://www.jilp.org/vol8
Doctoral Dissertations and Habilitation Theses
-
11H. Li.
Extraction and traceability of annotations for WCET estimation, Université Rennes 1, October 2015.
https://tel.archives-ouvertes.fr/tel-01232613 -
12B. Narasimha Swamy.
Exploiting heterogeneous manycores on sequential code, Université de Rennes 1, March 2015.
https://hal.inria.fr/tel-01126807 -
13S. N. Natarajan.
Modeling performance of serial and parallel sections of multi-threaded programs in manycore era, Inria Rennes - Bretagne Atlantique and University of Rennes 1, France, June 2015.
https://hal.inria.fr/tel-01170039 -
14A. Perais.
Increasing the Performance of Superscalar Processors through Value Prediction, Université de Rennes 1, September 2015.
https://hal.inria.fr/tel-01235370 -
15E. Rohou.
Infrastructures and Compilation Strategies for the Performance of Computing Systems, Université de Rennes 1, November 2015, Habilitation à diriger des recherches.
https://hal.inria.fr/tel-01237164
Articles in International Peer-Reviewed Journals
-
16S. Collange, D. Defour, S. Graillat, R. Iakymchuk.
Numerical Reproducibility for the Parallel Reduction on Multi- and Many-Core Architectures, in: Parallel Computing, September 2015, vol. 49, pp. 83-97. [ DOI : 10.1016/j.parco.2015.09.001 ]
http://hal-lirmm.ccsd.cnrs.fr/lirmm-01206348 -
17M.-K. Lee, P. Michaud, J. S. Sim, D. Nyang.
A simple proof of optimality for the MIN cache replacement policy, in: Information Processing Letters, September 2015, 3 p. [ DOI : 10.1016/j.ipl.2015.09.004 ]
https://hal.inria.fr/hal-01199424 -
18P. Michaud, A. Mondelli, A. Seznec.
Revisiting Clustered Microarchitecture for Future Superscalar Cores: A Case for Wide Issue Clusters, in: ACM Transactions on Architecture and Code Optimization (TACO), August 2015, vol. 12, no 3, 22 p. [ DOI : 10.1145/2800787 ]
https://hal.inria.fr/hal-01193178 -
19A. Perais, A. Seznec.
EOLE: Toward a Practical Implementation of Value Prediction, in: IEEE Micro, June 2015, vol. 35, no 3, pp. 114 - 124. [ DOI : 10.1109/MM.2015.45 ]
https://hal.inria.fr/hal-01193287 -
20A. Suresh, B. Narasimha Swamy, E. Rohou, A. Seznec.
Intercepting Functions for Memoization: A Case Study Using Transcendental Functions, in: ACM Transactions on Architecture and Code Optimization (TACO), July 2015, vol. 12, no 2, 23 p. [ DOI : 10.1145/2751559 ]
https://hal.inria.fr/hal-01178085
International Conferences with Proceedings
-
21A. Bonenfant, F. Carrier, H. Cassé, P. Cuenot, D. Claraz, N. Halbwachs, H. Li, C. Maiza, M. De Michiel, V. Mussot, C. Parent-Vigouroux, I. Puaut, P. Raymond, E. Rohou, P. Sotin.
When the worst-case execution time estimation gains from the application semantics, in: 8th European Congress on Embedded Real-Time Software and Systems, Toulouse, France, January 2016.
https://hal.inria.fr/hal-01235781 -
22S. Eyerman, P. Michaud, W. Rogiest.
Revisiting Symbiotic Job Scheduling, in: IEEE International Symposium on Performance Analysis of Systems and Software, Philadelphia, United States, March 2015. [ DOI : 10.1109/ISPASS.2015.7095791 ]
https://hal.inria.fr/hal-01139807 -
23N. Hallou, E. Rohou, P. Clauss, A. Ketterlin.
Dynamic Re-Vectorization of Binary Code, in: International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation - SAMOS XV, Agios Konstantinos, Greece, July 2015.
https://hal.inria.fr/hal-01155207 -
24M. Hataba, A. El-Mahdy, E. Rohou.
OJIT: A Novel Obfuscation Approach Using Standard Just-In-Time Compiler Transformations, in: International Workshop on Dynamic Compilation Everywhere, Amsterdam, Netherlands, January 2015.
https://hal.inria.fr/hal-01162998 -
25R. Iakymchuk, D. Defour, S. Collange, S. Graillat.
Reproducible Triangular Solvers for High-Performance Computing, in: ITNG'2015: 12th International Conference on Information Technology - New Generations, Las Vegas, NV, United States, April 2015, pp. 353-358. [ DOI : 10.1109/ITNG.2015.63 ]
http://hal-lirmm.ccsd.cnrs.fr/lirmm-01206371 -
26H. Li, I. Puaut, E. Rohou.
Tracing Flow Information for Tighter WCET Estimation: Application to Vectorization, in: 21st IEEE International Conference on Embedded and Real-Time Computing Systems and Applications, Hong Kong, China, August 2015, 10 p.
https://hal.inria.fr/hal-01177902 -
28S. N. Natarajan, B. Narasimha Swamy, A. Seznec.
An Empirical High Level Performance Model For Future Many-cores, in: Proceedings of the 12th ACM International Conference on Computing Frontiers, Ischia, Italy, 2015. [ DOI : 10.1145/2742854.2742867 ]
https://hal.inria.fr/hal-01170038 -
29S. N. Natarajan, A. Seznec.
Sequential and Parallel Code Sections are Different: they may require different Processors, in: PARMA-DITAM '15 - 6th Workshop on Parallel Programming and Run-Time Management Techniques for Many-core Architectures and 4th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms, Amsterdam, Netherlands, ACM, January 2015, pp. 13-18. [ DOI : 10.1145/2701310.2701314 ]
https://hal.inria.fr/hal-01170061 -
30M.-M. Papadopoulou, X. Tong, A. Seznec, A. Moshovos.
Prediction-based superpage-friendly TLB designs, in: 21st IEEE symposium on High Performance Computer Architecture, San Francisco, United States, 2015. [ DOI : 10.1109/HPCA.2015.7056034 ]
https://hal.inria.fr/hal-01193176 -
31A. Perais, A. Seznec, P. Michaud, A. Sembrant, E. Hagersten.
Cost-Effective Speculative Scheduling in High Performance Processors, in: International Symposium on Computer Architecture, Portland, United States, Proceedings of the International Symposium on Computer Architecture, ACM/IEEE, June 2015, vol. 42, pp. 247-259. [ DOI : 10.1145/2749469.2749470 ]
https://hal.inria.fr/hal-01193233 -
32A. Perais, A. Seznec.
BeBoP: A Cost Effective Predictor Infrastructure for Superscalar Value Prediction, in: International Symposium on High Performance Computer Architecture, San Francisco, United States, IEEE, February 2015, vol. 21, pp. 13 - 25. [ DOI : 10.1109/HPCA.2015.7056018 ]
https://hal.inria.fr/hal-01193175 -
33E. Rohou, D. Guyon.
Sequential Performance: Raising Awareness of the Gory Details, in: International Conference on Computational Science, Reykjavik, Iceland, June 2015. [ DOI : 10.1016/j.procs.2015.05.347 ]
https://hal.inria.fr/hal-01162336 -
34E. Rohou, B. Narasimha Swamy, A. Seznec.
Branch Prediction and the Performance of Interpreters - Don't Trust Folklore, in: International Symposium on Code Generation and Optimization, Burlingame, United States, February 2015.
https://hal.inria.fr/hal-01100647 -
35A. Sembrant, T. Carlson, E. Hagersten, D. Black-Shaffer, A. Perais, A. Seznec, P. Michaud.
Long Term Parking (LTP): Criticality-aware Resource Allocation in OOO Processors, in: International Symposium on Microarchitecture, Micro 2015, Honolulu, United States, Proceedings of the International Symposium on Microarchitecture, Micro 2015, ACM, December 2015, 11 p.
https://hal.inria.fr/hal-01225019 -
36A. Seznec, J. San Miguel, J. Albericio.
The Inner Most Loop Iteration counter: a new dimension in branch history, in: 48th International Symposium On Microarchitecture, Honolulu, United States, ACM, December 2015, 11 p.
https://hal.inria.fr/hal-01208347 -
37A. Seznec.
Bank-interleaved cache or memory indexing does not require euclidean division, in: 11th Annual Workshop on Duplicating, Deconstructing and Debunking, Portland, United States, June 2015.
https://hal.inria.fr/hal-01208356 -
38C. Silvano, G. Agosta, A. Bartolini, A. R. Beccari, L. Benini, J. Bispo, R. Cmar, J. M. P. Cardoso, C. Cavazzoni, J. Martinovič, G. Palermo, M. Palkovič, P. Pinto, E. Rohou, N. Sanna, K. Slaninová.
AutoTuning and Adaptivity appRoach for Energy efficient eXascale HPC systems: the ANTAREX Approach, in: Design, Automation, and Test in Europe, Dresden, Germany, March 2016.
https://hal.inria.fr/hal-01235741 -
39C. Silvano, G. Agosta, A. Bartolini, A. Beccari, L. Benini, J. M. P. Cardoso, C. Cavazzoni, J. Martinovič, G. Palermo, M. Palkovič, E. Rohou, N. Sanna, K. Slaninová.
ANTAREX – AutoTuning and Adaptivity appRoach for Energy efficient eXascale HPC systems, in: 18th IEEE International Conference on Computational Science and Engineering, Porto, Portugal, October 2015.
https://hal.inria.fr/hal-01235713 -
40S. Milutinovic, J. Abella, D. Hardy, E. Quinones, I. Puaut, F. J. Cazorla.
Speeding up Static Probabilistic Timing Analysis, in: International Conference on Architecture of Computing Systems, Porto, Portugal, Springer Lecture Notes in Computer Science (LNCS) series, March 2015.
https://hal.inria.fr/hal-01235544
Conferences without Proceedings
-
41V. A. Nguyen, D. Hardy, I. Puaut.
Scheduling of parallel applications on many-core architectures with caches: bridging the gap between WCET analysis and schedulability analysis, in: 9th Junior Researcher Workshop on Real-Time Computing (JRWRTC 2015), Lille, France, November 2015.
https://hal.inria.fr/hal-01236191
Internal Reports
-
42S. Kalathingal, S. Collange, B. Narasimha Swamy, A. Seznec.
Transforming TLP into DLP with the Dynamic Inter-Thread Vectorization Architecture, Inria Rennes Bretagne Atlantique, December 2015, no RR-8830.
https://hal.inria.fr/hal-01244938 -
43A. Sridharan, A. Seznec.
Discrete Cache Insertion Policies for Shared Last Level Cache Management on Large Multicores, Inria-IRISA Rennes Bretagne Atlantique, équipe ALF, December 2015, no RR-8816.
https://hal.inria.fr/hal-01236706
Other Publications
-
44S. Collange, D. Defour, S. Graillat, R. Iakymchuk.
Numerical Reproducibility for the Parallel Reduction on Multi- and Many-Core Architectures, September 2015, working paper or preprint.
https://hal.archives-ouvertes.fr/hal-00949355 -
45R. Iakymchuk, S. Collange, D. Defour, S. Graillat.
ExBLAS: Reproducible and Accurate BLAS Library, July 2015, working paper or preprint.
https://hal.archives-ouvertes.fr/hal-01202396 -
46R. Iakymchuk, S. Collange, D. Defour, S. Graillat.
Reproducibility and Accuracy for High-Performance Computing, April 2015, working paper or preprint.
https://hal.archives-ouvertes.fr/hal-01140531 -
47R. Iakymchuk, D. Defour, S. Collange, S. Graillat.
Reproducible and Accurate Matrix Multiplication for GPU Accelerators, January 2015, working paper or preprint.
https://hal.archives-ouvertes.fr/hal-01102877 -
48R. Iakymchuk, D. Defour, S. Collange, S. Graillat.
Reproducible Triangular Solvers for High-Performance Computing, February 2015, working paper or preprint.
https://hal.archives-ouvertes.fr/hal-01116588 -
49R. Iakymchuk, S. Graillat, S. Collange, D. Defour.
ExBLAS: Reproducible and Accurate BLAS Library, April 2015, RAIM'2015: 7ème Rencontre Arithmétique de l'Informatique Mathématique, Poster.
https://hal.archives-ouvertes.fr/hal-01140280
-
50G. M. Amdahl.
Validity of the Single Processor Approach to Achieving Large Scale Computing Capabilities, in: SJCC, 1967, pp. 483–485. -
51L. A. Belady.
A study of replacement algorithms for a virtual-storage computer, in: IBM Systems Journal, 1966, vol. 5, no 2, pp. 78-101. -
52D. Burger, T. M. Austin.
The SimpleScalar tool set, version 2.0, 1997. -
53R. S. Chappell, J. Stark, S. P. Kim, S. K. Reinhardt, Y. N. Patt.
Simultaneous subordinate microthreading (SSMT), in: ISCA '99: Proceedings of the 26th annual international symposium on Computer architecture, Washington, DC, USA, IEEE Computer Society, 1999, pp. 186–195.
http://doi.acm.org/10.1145/300979.300995 -
54S. Eyerman, L. Eeckhout.
Probabilistic job symbiosis modeling for SMT processor scheduling, in: Proceedings of the 15th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2010. -
55C. Ferdinand, R. Wilhelm.
Efficient and Precise Cache Behavior Prediction for Real-Time Systems, in: Real-Time Systems, 1999, vol. 17, no 2-3, pp. 131–181.
http://dx.doi.org/10.1023/A:1008186323068 -
56T. S. Karkhanis, J. E. Smith.
A First-Order Superscalar Processor Model, in: Proceedings of the International Symposium on Computer Architecture, Los Alamitos, CA, USA, IEEE Computer Society, 2004, 338 p.
http://doi.ieeecomputersociety.org/10.1109/ISCA.2004.1310786 -
57B. Lee, J. Collins, H. Wang, D. Brooks.
CPR: composable performance regression for scalable multiprocessor models, in: Proceedings of the 41st International Symposium on Microarchitecture, 2008. -
58Y. Liang, T. Mitra.
Cache modeling in probabilistic execution time analysis, in: DAC '08: Proceedings of the 45th annual conference on Design automation, New York, NY, USA, ACM, 2008, pp. 319–324.
http://doi.acm.org/10.1145/1391469.1391551 -
59T. Lundqvist, P. Stenström.
Timing Anomalies in Dynamically Scheduled Microprocessors, in: RTSS '99: Proceedings of the 20th IEEE Real-Time Systems Symposium, Washington, DC, USA, IEEE Computer Society, 1999. -
60R. L. Mattson, J. Gecsei, D. R. Slutz, I. L. Traiger.
Evaluation techniques for storage hierarchies, in: IBM Systems Journal, 1970, vol. 9, no 2, pp. 78-117. -
61L. Rauchwerger, Y. Zhang, J. Torrellas.
Hardware for Speculative Run-Time Parallelization in Distributed Shared-Memory Multiprocessors, in: HPCA '98: Proceedings of the 4th International Symposium on High-Performance Computer Architecture, Washington, DC, USA, IEEE Computer Society, 1998, 162 p. -
62K. Skadron, M. Stan, W. Huang, S. Velusamy.
Temperature-aware microarchitecture, in: Proceedings of the International Symposium on Computer Architecture, 2003. -
63A. Snavely, D. M. Tullsen.
Symbiotic jobscheduling for a simultaneous multithreading processor, in: Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2000. -
64J. G. Steffan, C. Colohan, A. Zhai, T. C. Mowry.
The STAMPede approach to thread-level speculation, in: ACM Transactions on Computer Systems, 2005, vol. 23, no 3, pp. 253–300.
http://doi.acm.org/10.1145/1082469.1082471 -
65V. Suhendra, T. Mitra.
Exploring locking & partitioning for predictable shared caches on multi-cores, in: DAC '08: Proceedings of the 45th annual conference on Design automation, New York, NY, USA, ACM, 2008, pp. 300–303.
http://doi.acm.org/10.1145/1391469.1391545 -
66D. M. Tullsen, S. Eggers, H. M. Levy.
Simultaneous Multithreading: Maximizing On-Chip Parallelism, in: Proceedings of the 22nd Annual International Symposium on Computer Architecture, 1995. -
67J. Yan, W. Zhang.
WCET Analysis for Multi-Core Processors with Shared L2 Instruction Caches, in: Proceedings of Real-Time and Embedded Technology and Applications Symposium, 2008. RTAS '08, 2008, pp. 80-89.