Bibliography
Major publications by the team in recent years
-
1F. Bodin, T. Kisuki, P. M. W. Knijnenburg, M. F. P. O'Boyle, E. Rohou.
Iterative Compilation in a Non-Linear Optimisation Space, in: Workshop on Profile and Feedback-Directed Compilation (FDO-1), in conjunction with PACT '98, October 1998. -
2A. Cohen, E. Rohou.
Processor Virtualization and Split Compilation for Heterogeneous Multicore Embedded Systems, in: DAC, June 2010, pp. 102–107. -
3N. Hallou, E. Rohou, P. Clauss, A. Ketterlin.
Dynamic Re-Vectorization of Binary Code, in: SAMOS, July 2015.
https://hal.inria.fr/hal-01155207 -
4D. Hardy, I. Puaut.
Static probabilistic Worst Case Execution Time Estimation for architectures with Faulty Instruction Caches, in: 21st International Conference on Real-Time Networks and Systems, Sophia Antipolis, France, October 2013. [ DOI : 10.1145/2516821.2516842 ]
https://hal.inria.fr/hal-00862604 -
5D. Hardy, I. Sideris, A. Saidi, Y. Sazeides.
EETCO: A tool to estimate and explore the implications of datacenter design choices on the TCO and the environmental impact, in: Workshop on Energy-efficient Computing for a Sustainable World in conjunction with the 44th Annual IEEE/ACM International Symposium on Microarchitecture (Micro-44), 2011. -
6M.-K. Lee, P. Michaud, J. S. Sim, D. Nyang.
A simple proof of optimality for the MIN cache replacement policy, in: Information Processing Letters, September 2015, 3 p. [ DOI : 10.1016/j.ipl.2015.09.004 ]
https://hal.inria.fr/hal-01199424 -
7P. Michaud.
A Best-Offset Prefetcher Champion, in: 2nd Data Prefetching Championship, Portland, OR, USA, June 2015.
https://hal.inria.fr/hal-01165600 -
8P. Michaud, A. Mondelli, A. Seznec.
Revisiting Clustered Microarchitecture for Future Superscalar Cores: A Case for Wide Issue Clusters, in: ACM Transactions on Architecture and Code Optimization (TACO), August 2015, vol. 13, no 3, 22 p. [ DOI : 10.1145/2800787 ]
https://hal.inria.fr/hal-01193178 -
9P. Michaud, A. Seznec.
Pushing the branch predictability limits with the multi-poTAGE+SC predictor: Champion in the unlimited category, in: 4th JILP Workshop on Computer Architecture Competitions (JWAC-4): Championship Branch Prediction (CBP-4), Minneapolis, United States, June 2014.
https://hal.archives-ouvertes.fr/hal-01087719 -
10A. Perais.
Increasing the performance of superscalar processors through value prediction, Université Rennes 1, September 2015.
https://tel.archives-ouvertes.fr/tel-01282474 -
11A. Perais, A. Seznec.
EOLE: Paving the Way for an Effective Implementation of Value Prediction, in: International Symposium on Computer Architecture, Minneapolis, MN, United States, ACM/IEEE, June 2014, vol. 42, pp. 481 - 492. [ DOI : 10.1109/ISCA.2014.6853205 ]
https://hal.inria.fr/hal-01088130 -
12A. Perais, A. Seznec.
Practical data value speculation for future high-end processors, in: International Symposium on High Performance Computer Architecture, Orlando, FL, United States, IEEE, February 2014, pp. 428 - 439. [ DOI : 10.1109/HPCA.2014.6835952 ]
https://hal.inria.fr/hal-01088116 -
13A. Perais, A. Seznec.
EOLE: Toward a Practical Implementation of Value Prediction, in: IEEE Micro, June 2015, vol. 35, no 3, pp. 114 - 124. [ DOI : 10.1109/MM.2015.45 ]
https://hal.inria.fr/hal-01193287 -
14E. Riou, E. Rohou, P. Clauss, N. Hallou, A. Ketterlin.
PADRONE: a Platform for Online Profiling, Analysis, and Optimization, in: Dynamic Compilation Everywhere, Vienna, Austria, January 2014. -
15S. Sardashti, A. Seznec, D. A. Wood.
Skewed Compressed Caches, in: 47th Annual IEEE/ACM International Symposium on Microarchitecture, 2014, Minneapolis, United States, December 2014.
https://hal.inria.fr/hal-01088050 -
16A. Sembrant, T. Carlson, E. Hagersten, D. Black-Shaffer, A. Perais, A. Seznec, P. Michaud.
Long Term Parking (LTP): Criticality-aware Resource Allocation in OOO Processors, in: International Symposium on Microarchitecture, Micro 2015, Honolulu, United States, ACM, December 2015.
https://hal.inria.fr/hal-01225019 -
17A. Seznec, P. Michaud.
A case for (partially)-tagged geometric history length predictors, in: Journal of Instruction Level Parallelism, April 2006.
http://www.jilp.org/vol8 -
18A. Seznec, J. San Miguel, J. Albericio.
The Inner Most Loop Iteration counter: a new dimension in branch history, in: 48th International Symposium On Microarchitecture, Honolulu, United States, ACM, December 2015, 11 p.
https://hal.inria.fr/hal-01208347 -
19A. Seznec.
TAGE-SC-L Branch Predictors: Champion in 32Kbits and 256 Kbits category, in: JILP - Championship Branch Prediction, Minneapolis, United States, June 2014.
https://hal.inria.fr/hal-01086920 -
20A. Suresh, B. Narasimha Swamy, E. Rohou, A. Seznec.
Intercepting Functions for Memoization: A Case Study Using Transcendental Functions, in: ACM Transactions on Architecture and Code Optimization (TACO), July 2015, vol. 12, no 2, 23 p. [ DOI : 10.1145/2751559 ]
https://hal.inria.fr/hal-01178085 -
21D. D. C. Teixeira, S. Collange, F. M. Q. Pereira.
Fusion of calling sites, in: International Symposium on Computer Architecture and High-Performance Computing (SBAC-PAD), Florianópolis, Santa Catarina, Brazil, October 2015. [ DOI : 10.1109/SBAC-PAD.2015.16 ]
https://hal.archives-ouvertes.fr/hal-01410221
Doctoral Dissertations and Habilitation Theses
-
22S. Kalathingal.
Transforming TLP into DLP with the Dynamic Inter-Thread Vectorization Architecture, Université Rennes 1, December 2016.
https://tel.archives-ouvertes.fr/tel-01426915 -
23A. Sridharan.
Adaptive and Intelligent Memory Systems, Inria Rennes - Bretagne Atlantique and University of Rennes 1, France, December 2016.
https://hal.inria.fr/tel-01442465 -
24A. Suresh.
Intercepting Functions for Memoization, Université de Rennes 1, May 2016.
https://hal.inria.fr/tel-01410539
Articles in International Peer-Reviewed Journals
-
25P. Michaud.
Some mathematical facts about optimal cache replacement, in: ACM Transactions on Architecture and Code Optimization (TACO), December 2016, vol. 13, no 4. [ DOI : 10.1145/3017992 ]
https://hal.inria.fr/hal-01411156 -
26B. Panda.
SPAC: A Synergistic Prefetcher Aggressiveness Controller for Multi-core Systems, in: IEEE Transactions on Computers, 2016. [ DOI : 10.1109/TC.2016.2547392 ]
https://hal.inria.fr/hal-01307538 -
27A. Perais, A. Seznec.
EOLE: Combining Static and Dynamic Scheduling through Value Prediction to Reduce Complexity and Increase Performance, in: TOCS - ACM Transactions on Computer Systems, February 2016, 34 p.
https://hal.inria.fr/hal-01259139 -
28A. Perais, A. Seznec.
Storage-Free Memory Dependency Prediction, in: IEEE Computer Architecture Letters, November 2016, pp. 1 - 4. [ DOI : 10.1109/LCA.2016.2628379 ]
https://hal.inria.fr/hal-01396985 -
29S. Sardashti, A. Seznec, D. A. Wood.
Yet Another Compressed Cache: a Low Cost Yet Effective Compressed Cache, in: ACM Transactions on Architecture and Code Optimization, September 2016, 25 p.
https://hal.inria.fr/hal-01354248 -
30A. Seznec, J. San Miguel, J. Albericio.
Practical Multidimensional Branch Prediction, in: IEEE Micro, 2016. [ DOI : 10.1109/MM.2016.33 ]
https://hal.inria.fr/hal-01330510
Invited Conferences
-
31R. Iakymchuk, D. Defour, S. Collange, S. Graillat.
Reproducible and Accurate Algorithms for Numerical Linear Algebra, in: PP: Parallel Processing for Scientific Computing, Paris, France, SIAM, April 2016.
https://hal-lirmm.ccsd.cnrs.fr/lirmm-01268048 -
32C. Silvano, G. Agosta, S. Cherubin, D. Gadioli, G. Palermo, A. Bartolini, L. Benini, J. Martinovič, M. Palkovič, K. Slaninová, J. Bispo, J. M. P. Cardoso, R. Abreu, P. Pinto, C. Cavazzoni, N. Sanna, A. R. Beccari, R. Cmar, E. Rohou.
The ANTAREX Approach to Autotuning and Adaptivity for Energy Efficient HPC Systems, in: ACM International Conference on Computing Frontiers 2016, Como, Italy, May 2016. [ DOI : 10.1145/2903150.2903470 ]
https://hal.inria.fr/hal-01341826
International Conferences with Proceedings
-
33A. Bonenfant, F. Carrier, H. Cassé, P. Cuenot, D. Claraz, N. Halbwachs, H. Li, C. Maiza, M. De Michiel, V. Mussot, C. Parent-Vigouroux, I. Puaut, P. Raymond, E. Rohou, P. Sotin.
When the worst-case execution time estimation gains from the application semantics, in: 8th European Congress on Embedded Real-Time Software and Systems, Toulouse, France, January 2016.
https://hal.inria.fr/hal-01235781 -
34S. Collange, M. Joldes, J.-M. Muller, V. Popescu.
Parallel floating-point expansions for extended-precision GPU computations, in: The 27th Annual IEEE International Conference on Application-specific Systems, Architectures and Processors (ASAP), London, United Kingdom, July 2016.
https://hal.archives-ouvertes.fr/hal-01298206 -
37P. Michaud.
Best-Offset Hardware Prefetching, in: International Symposium on High-Performance Computer Architecture, Barcelona, Spain, March 2016. [ DOI : 10.1109/HPCA.2016.7446087 ]
https://hal.inria.fr/hal-01254863 -
38R. E. A. Moreira, S. Collange, F. M. Q. Pereira.
Function Call Re-Vectorization, in: ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP), Austin, Texas, United States, February 2017.
https://hal.archives-ouvertes.fr/hal-01410186 -
39B. Panda, A. Seznec.
Dictionary Sharing: An Efficient Cache Compression Scheme for Compressed Caches, in: 49th Annual IEEE/ACM International Symposium on Microarchitecture, 2016, Taipei, Taiwan, IEEE/ACM, October 2016.
https://hal.archives-ouvertes.fr/hal-01354246 -
40A. Perais, F. A. Endo, A. Seznec.
Register Sharing for Equality Prediction, in: International Symposium on Microarchitecture, Taipei, Taiwan, October 2016.
https://hal.inria.fr/hal-01354267 -
41A. Perais, A. Seznec.
Cost Effective Physical Register Sharing, in: International Symposium on High Performance Computer Architecture, Barcelona, Spain, IEEE, March 2016, vol. 22. [ DOI : 10.1109/HPCA.2016.7446105 ]
https://hal.inria.fr/hal-01259137 -
42P.-Y. Péneau, R. Bouziane, A. Gamatié, E. Rohou, F. Bruguier, G. Sassatelli, L. Torres, S. Senni.
Loop Optimization in Presence of STT-MRAM Caches: a Study of Performance-Energy Tradeoffs, in: 26th International Workshop on Power and Timing Modeling, Optimization and Simulation, Bremen, Germany, September 2016, 8 p.
https://hal.inria.fr/hal-01347354 -
43S. A. Rashid, G. Nelissen, D. Hardy, B. Akesson, I. Puaut, E. Tovar.
Cache-Persistence-Aware Response-Time Analysis for Fixed-Priority Preemptive Systems, in: 28th Euromicro Conference on Real-Time Systems (ECRTS), Toulouse, France, IEEE, July 2016. [ DOI : 10.1109/ECRTS.2016.25 ]
https://hal.inria.fr/hal-01393220 -
44S. Rokicki, E. Rohou, S. Derrien.
Hardware-Accelerated Dynamic Binary Translation, in: IEEE/ACM Design, Automation & Test in Europe Conference & Exhibition (DATE), Lausanne, Switzerland, March 2017.
https://hal.inria.fr/hal-01423639 -
47C. Silvano, G. Agosta, A. Bartolini, A. R. Beccari, L. Benini, J. Bispo, R. Cmar, J. M. P. Cardoso, C. Cavazzoni, J. Martinovič, G. Palermo, M. Palkovič, P. Pinto, E. Rohou, N. Sanna, K. Slaninová.
AutoTuning and Adaptivity appRoach for Energy efficient eXascale HPC systems: the ANTAREX Approach, in: Design, Automation, and Test in Europe, Dresden, Germany, March 2016.
https://hal.inria.fr/hal-01235741 -
49A. Suresh, E. Rohou, A. Seznec.
Compile-Time Function Memoization, in: 26th International Conference on Compiler Construction, Austin, United States, February 2017.
https://hal.inria.fr/hal-01423811
National Conferences with Proceedings
-
50S. Collange.
A Synthesizable General-Purpose SIMT Processor, in: Conférence d’informatique en Parallélisme, Architecture et Système (Compas), Lorient, France, July 2016.
https://hal.inria.fr/hal-01345070 -
51S. Rokicki, E. Rohou, S. Derrien.
Hybrid-JIT : Compilateur JIT Matériel/Logiciel pour les Processeurs VLIW Embarqués, in: Conférence d’informatique en Parallélisme, Architecture et Système (Compas), Lorient, France, July 2016.
https://hal.archives-ouvertes.fr/hal-01345306
Internal Reports
-
52S. Collange.
Simty: a Synthesizable General-Purpose SIMT Processor, Inria Rennes Bretagne Atlantique, August 2016, no RR-8944.
https://hal.inria.fr/hal-01351689 -
53S. Sardashti, A. Seznec, D. A. Wood.
Yet Another Compressed Cache: a Low Cost Yet Effective Compressed Cache, Inria, February 2016, no RR-8853, 23 p.
https://hal.inria.fr/hal-01270792
References in notes
-
54L. A. Belady.
A study of replacement algorithms for a virtual-storage computer, in: IBM Systems Journal, 1966, vol. 5, no 2, pp. 78-101. -
55M. Hataba, A. El-Mahdy, E. Rohou.
OJIT: A Novel Obfuscation Approach Using Standard Just-In-Time Compiler Transformations, in: International Workshop on Dynamic Compilation Everywhere, January 2015. -
56R. Kumar, D. M. Tullsen, N. P. Jouppi, P. Ranganathan.
Heterogeneous chip multiprocessors, in: IEEE Computer, November 2005, vol. 38, no 11, pp. 32–38. -
57R. L. Mattson, J. Gecsei, D. R. Slutz, I. L. Traiger.
Evaluation techniques for storage hierarchies, in: IBM Systems Journal, 1970, vol. 9, no 2, pp. 78-117. -
58S. Nassif, N. Mehta, Y. Cao.
A resilience roadmap, in: Design, Automation & Test in Europe Conference & Exhibition (DATE), March 2010, pp. 1011-1016. -
59J. Nickolls, W. J. Dally.
The GPU computing era, in: IEEE Micro, 2010, vol. 30, no 2, pp. 56–69. -
60R. Omar, A. El-Mahdy, E. Rohou.
Arbitrary control-flow embedding into multiple threads for obfuscation: a preliminary complexity and performance analysis, in: Proceedings of the 2nd international workshop on Security in cloud computing, ACM, 2014, pp. 51–58. -
61S. Sardashti, D. A. Wood.
Decoupled compressed cache: exploiting spatial locality for energy-optimized compressed caching, in: The 46th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO-46, Davis, CA, USA, December 7-11, 2013, pp. 62–73.
http://doi.acm.org/10.1145/2540708.2540715 -
62A. Seznec, N. Sendrier.
HAVEGE: A user-level software heuristic for generating empirically strong random numbers, in: ACM Transactions on Modeling and Computer Simulation (TOMACS), 2003, vol. 13, no 4, pp. 334–346.