Inria | Raweb 2015 | Presentation of the Project-Team ALF | ALF Web Site


Bibliography

Major publications by the team in recent years

[1] M. Cornero, R. Costa, R. Fernández Pascual, A. Ornstein, E. Rohou.
An Experimental Environment Validating the Suitability of CLI as an Effective Deployment Format for Embedded Systems, in: Conference on HiPEAC, Göteborg, Sweden, P. Stenström, M. Dubois, M. Katevenis, R. Gupta, T. Ungerer (editors), Springer, January 2008, pp. 130–144.
[2] R. Costa, E. Rohou.
Comparing the size of .NET applications with native code, in: 3rd Intl Conference on Hardware/software codesign and system synthesis, Jersey City, NJ, USA, P. Eles, A. Jantsch, R. A. Bergamaschi (editors), ACM, September 2005, pp. 99–104.
[3] D. Hardy, I. Puaut.
WCET analysis of multi-level non-inclusive set-associative instruction caches, in: Proc. of the 29th IEEE Real-Time Systems Symposium, Barcelona, Spain, December 2008.
[4] D. Hardy, I. Puaut.
Static probabilistic Worst Case Execution Time Estimation for architectures with Faulty Instruction Caches, in: 21st International Conference on Real-Time Networks and Systems, Sophia Antipolis, France, October 2013. [ DOI : 10.1145/2516821.2516842 ]
https://hal.inria.fr/hal-00862604
[5] P. Michaud, Y. Sazeides, A. Seznec, T. Constantinou, D. Fetis.
A study of thread migration in temperature-constrained multi-cores, in: ACM Transactions on Architecture and Code Optimization, 2007, vol. 4, no. 2, 9 p.
[6] P. Michaud, A. Seznec, S. Jourdan.
An Exploration of Instruction Fetch Requirement in Out-of-Order Superscalar Processors, in: International Journal of Parallel Programming, 2001, vol. 29, no. 1, pp. 35-58.
[7] A. Perais, A. Seznec.
Practical data value speculation for future high-end processors, in: International Symposium on High Performance Computer Architecture, Orlando, FL, United States, IEEE, February 2014, pp. 428-439. [ DOI : 10.1109/HPCA.2014.6835952 ]
https://hal.inria.fr/hal-01088116
[8] N. Prémillieu, A. Seznec.
Efficient Out-of-Order Execution of Guarded ISAs, in: ACM Transactions on Architecture and Code Optimization (TACO), December 2014, 21 p. [ DOI : 10.1145/2677037 ]
https://hal.inria.fr/hal-01103230
[9] E. Rohou, M. Smith.
Dynamically managing processor temperature and power, in: Second Workshop on Feedback-Directed Optimizations, 1999.
[10] A. Seznec, P. Michaud.
A case for (partially)-tagged geometric history length predictors, in: Journal of Instruction Level Parallelism (http://www.jilp.org/vol8), April 2006.
http://www.jilp.org/vol8

Publications of the year

Doctoral Dissertations and Habilitation Theses

[11] H. Li.
Extraction and traceability of annotations for WCET estimation, Université de Rennes 1, October 2015.
https://tel.archives-ouvertes.fr/tel-01232613
[12] B. Narasimha Swamy.
Exploiting heterogeneous manycores on sequential code, Université de Rennes 1, March 2015.
https://hal.inria.fr/tel-01126807
[13] S. N. Natarajan.
Modeling performance of serial and parallel sections of multi-threaded programs in manycore era, Inria Rennes - Bretagne Atlantique and University of Rennes 1, France, June 2015.
https://hal.inria.fr/tel-01170039
[14] A. Perais.
Increasing the Performance of Superscalar Processors through Value Prediction, Université de Rennes 1, September 2015.
https://hal.inria.fr/tel-01235370
[15] E. Rohou.
Infrastructures and Compilation Strategies for the Performance of Computing Systems, Université de Rennes 1, November 2015, Habilitation à diriger des recherches.
https://hal.inria.fr/tel-01237164

Articles in International Peer-Reviewed Journals

[16] S. Collange, D. Defour, S. Graillat, R. Iakymchuk.
Numerical Reproducibility for the Parallel Reduction on Multi- and Many-Core Architectures, in: Parallel Computing, September 2015, vol. 49, pp. 83-97. [ DOI : 10.1016/j.parco.2015.09.001 ]
http://hal-lirmm.ccsd.cnrs.fr/lirmm-01206348
[17] M.-K. Lee, P. Michaud, J. S. Sim, D. Nyang.
A simple proof of optimality for the MIN cache replacement policy, in: Information Processing Letters, September 2015, 3 p. [ DOI : 10.1016/j.ipl.2015.09.004 ]
https://hal.inria.fr/hal-01199424
[18] P. Michaud, A. Mondelli, A. Seznec.
Revisiting Clustered Microarchitecture for Future Superscalar Cores: A Case for Wide Issue Clusters, in: ACM Transactions on Architecture and Code Optimization (TACO), August 2015, vol. 13, no. 3, 22 p. [ DOI : 10.1145/2800787 ]
https://hal.inria.fr/hal-01193178
[19] A. Perais, A. Seznec.
EOLE: Toward a Practical Implementation of Value Prediction, in: IEEE Micro, June 2015, vol. 35, no. 3, pp. 114-124. [ DOI : 10.1109/MM.2015.45 ]
https://hal.inria.fr/hal-01193287
[20] A. Suresh, B. Narasimha Swamy, E. Rohou, A. Seznec.
Intercepting Functions for Memoization: A Case Study Using Transcendental Functions, in: ACM Transactions on Architecture and Code Optimization (TACO), July 2015, vol. 12, no. 2, 23 p. [ DOI : 10.1145/2751559 ]
https://hal.inria.fr/hal-01178085

International Conferences with Proceedings

[21] A. Bonenfant, F. Carrier, H. Cassé, P. Cuenot, D. Claraz, N. Halbwachs, H. Li, C. Maiza, M. De Michiel, V. Mussot, C. Parent-Vigouroux, I. Puaut, P. Raymond, E. Rohou, P. Sotin.
When the worst-case execution time estimation gains from the application semantics, in: 8th European Congress on Embedded Real-Time Software and Systems, Toulouse, France, January 2016.
https://hal.inria.fr/hal-01235781
[22] S. Eyerman, P. Michaud, W. Rogiest.
Revisiting Symbiotic Job Scheduling, in: IEEE International Symposium on Performance Analysis of Systems and Software, Philadelphia, United States, March 2015. [ DOI : 10.1109/ISPASS.2015.7095791 ]
https://hal.inria.fr/hal-01139807
[23] N. Hallou, E. Rohou, P. Clauss, A. Ketterlin.
Dynamic Re-Vectorization of Binary Code, in: International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation - SAMOS XV, Agios Konstantinos, Greece, July 2015.
https://hal.inria.fr/hal-01155207
[24] M. Hataba, A. El-Mahdy, E. Rohou.
OJIT: A Novel Obfuscation Approach Using Standard Just-In-Time Compiler Transformations, in: International Workshop on Dynamic Compilation Everywhere, Amsterdam, Netherlands, January 2015.
https://hal.inria.fr/hal-01162998
[25] R. Iakymchuk, D. Defour, S. Collange, S. Graillat.
Reproducible Triangular Solvers for High-Performance Computing, in: ITNG'2015: 12th International Conference on Information Technology - New Generations, Las Vegas, NV, United States, April 2015, pp. 353-358. [ DOI : 10.1109/ITNG.2015.63 ]
http://hal-lirmm.ccsd.cnrs.fr/lirmm-01206371
[26] H. Li, I. Puaut, E. Rohou.
Tracing Flow Information for Tighter WCET Estimation: Application to Vectorization, in: 21st IEEE International Conference on Embedded and Real-Time Computing Systems and Applications, Hong Kong, China, August 2015, 10 p.
https://hal.inria.fr/hal-01177902
[28] S. N. Natarajan, B. Narasimha Swamy, A. Seznec.
An Empirical High Level Performance Model For Future Many-cores, in: Proceedings of the 12th ACM International Conference on Computing Frontiers, Ischia, Italy, 2015. [ DOI : 10.1145/2742854.2742867 ]
https://hal.inria.fr/hal-01170038
[29] S. N. Natarajan, A. Seznec.
Sequential and Parallel Code Sections are Different: they may require different Processors, in: PARMA-DITAM '15 - 6th Workshop on Parallel Programming and Run-Time Management Techniques for Many-core Architectures, Amsterdam, Netherlands, 6th Workshop on Parallel Programming and Run-Time Management Techniques for Many-core Architectures and 4th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms, ACM, January 2015, pp. 13-18. [ DOI : 10.1145/2701310.2701314 ]
https://hal.inria.fr/hal-01170061
[30] M.-M. Papadopoulou, X. Tong, A. Seznec, A. Moshovos.
Prediction-based superpage-friendly TLB designs, in: 21st IEEE symposium on High Performance Computer Architecture, San Francisco, United States, 2015. [ DOI : 10.1109/HPCA.2015.7056034 ]
https://hal.inria.fr/hal-01193176
[31] A. Perais, A. Seznec, P. Michaud, A. Sembrant, E. Hagersten.
Cost-Effective Speculative Scheduling in High Performance Processors, in: International Symposium on Computer Architecture, Portland, United States, Proceedings of the International Symposium on Computer Architecture, ACM/IEEE, June 2015, vol. 42, pp. 247-259. [ DOI : 10.1145/2749469.2749470 ]
https://hal.inria.fr/hal-01193233
[32] A. Perais, A. Seznec.
BeBoP: A Cost Effective Predictor Infrastructure for Superscalar Value Prediction, in: International Symposium on High Performance Computer Architecture, San Francisco, United States, IEEE, February 2015, vol. 21, pp. 13-25. [ DOI : 10.1109/HPCA.2015.7056018 ]
https://hal.inria.fr/hal-01193175
[33] E. Rohou, D. Guyon.
Sequential Performance: Raising Awareness of the Gory Details, in: International Conference on Computational Science, Reykjavik, Iceland, June 2015. [ DOI : 10.1016/j.procs.2015.05.347 ]
https://hal.inria.fr/hal-01162336
[34] E. Rohou, B. Narasimha Swamy, A. Seznec.
Branch Prediction and the Performance of Interpreters - Don't Trust Folklore, in: International Symposium on Code Generation and Optimization, Burlingame, United States, February 2015.
https://hal.inria.fr/hal-01100647
[35] A. Sembrant, T. Carlson, E. Hagersten, D. Black-Shaffer, A. Perais, A. Seznec, P. Michaud.
Long Term Parking (LTP): Criticality-aware Resource Allocation in OOO Processors, in: International Symposium on Microarchitecture, Micro 2015, Honolulu, United States, Proceedings of the International Symposium on Microarchitecture, Micro 2015, ACM, December 2015, 11 p.
https://hal.inria.fr/hal-01225019
[36] A. Seznec, J. San Miguel, J. Albericio.
The Inner Most Loop Iteration counter: a new dimension in branch history, in: 48th International Symposium On Microarchitecture, Honolulu, United States, ACM, December 2015, 11 p.
https://hal.inria.fr/hal-01208347
[37] A. Seznec.
Bank-interleaved cache or memory indexing does not require euclidean division, in: 11th Annual Workshop on Duplicating, Deconstructing and Debunking, Portland, United States, June 2015.
https://hal.inria.fr/hal-01208356
[38] C. Silvano, G. Agosta, A. Bartolini, A. R. Beccari, L. Benini, J. Bispo, R. Cmar, J. M. P. Cardoso, C. Cavazzoni, J. Martinovič, G. Palermo, M. Palkovič, P. Pinto, E. Rohou, N. Sanna, K. Slaninová.
AutoTuning and Adaptivity appRoach for Energy efficient eXascale HPC systems: the ANTAREX Approach, in: Design, Automation, and Test in Europe, Dresden, Germany, Design, Automation, and Test in Europe, March 2016.
https://hal.inria.fr/hal-01235741
[39] C. Silvano, G. Agosta, A. Bartolini, A. Beccari, L. Benini, J. M. P. Cardoso, C. Cavazzoni, J. Martinovič, G. Palermo, M. Palkovič, E. Rohou, N. Sanna, K. Slaninova.
ANTAREX – AutoTuning and Adaptivity appRoach for Energy efficient eXascale HPC systems, in: 18th IEEE International Conference on Computational Science and Engineering, Porto, Portugal, October 2015.
https://hal.inria.fr/hal-01235713
[40] M. Suzana, J. Abella, D. Hardy, E. Quinones, I. Puaut, F. J. Cazorla.
Speeding up Static Probabilistic Timing Analysis, in: International Conference on Architecture of Computing Systems, Porto, Portugal, Springer Lecture Notes on Computer Science (LNCS) series, March 2015.
https://hal.inria.fr/hal-01235544

Conferences without Proceedings

[41] V. A. Nguyen, D. Hardy, I. Puaut.
Scheduling of parallel applications on many-core architectures with caches: bridging the gap between WCET analysis and schedulability analysis, in: 9th Junior Researcher Workshop on Real-Time Computing (JRWRTC 2015), Lille, France, November 2015.
https://hal.inria.fr/hal-01236191

Internal Reports

[42] S. Kalathingal, S. Collange, B. Narasimha Swamy, A. Seznec.
Transforming TLP into DLP with the Dynamic Inter-Thread Vectorization Architecture, Inria Rennes Bretagne Atlantique, December 2015, no. RR-8830.
https://hal.inria.fr/hal-01244938
[43] A. Sridharan, A. Seznec.
Discrete Cache Insertion Policies for Shared Last Level Cache Management on Large Multicores, Inria-IRISA Rennes Bretagne Atlantique, équipe ALF, December 2015, no. RR-8816.
https://hal.inria.fr/hal-01236706

Other Publications

[44] S. Collange, D. Defour, S. Graillat, R. Iakymchuk.
Numerical Reproducibility for the Parallel Reduction on Multi- and Many-Core Architectures, September 2015, working paper or preprint.
https://hal.archives-ouvertes.fr/hal-00949355
[45] R. Iakymchuk, S. Collange, D. Defour, S. Graillat.
ExBLAS: Reproducible and Accurate BLAS Library, July 2015, working paper or preprint.
https://hal.archives-ouvertes.fr/hal-01202396
[46] R. Iakymchuk, S. Collange, D. Defour, S. Graillat.
Reproducibility and Accuracy for High-Performance Computing, April 2015, working paper or preprint.
https://hal.archives-ouvertes.fr/hal-01140531
[47] R. Iakymchuk, D. Defour, S. Collange, S. Graillat.
Reproducible and Accurate Matrix Multiplication for GPU Accelerators, January 2015, working paper or preprint.
https://hal.archives-ouvertes.fr/hal-01102877
[48] R. Iakymchuk, D. Defour, S. Collange, S. Graillat.
Reproducible Triangular Solvers for High-Performance Computing, February 2015, working paper or preprint.
https://hal.archives-ouvertes.fr/hal-01116588
[49] R. Iakymchuk, S. Graillat, S. Collange, D. Defour.
ExBLAS: Reproducible and Accurate BLAS Library, April 2015, RAIM'2015: 7ème Rencontre Arithmétique de l'Informatique Mathématique, Poster.
https://hal.archives-ouvertes.fr/hal-01140280

References in notes

[50] G. M. Amdahl.
Validity of the Single Processor Approach to Achieving Large Scale Computing Capabilities, in: SJCC, 1967, pp. 483–485.
[51] L. A. Belady.
A study of replacement algorithms for a virtual-storage computer, in: IBM Systems Journal, 1966, vol. 5, no. 2, pp. 78-101.
[52] D. Burger, T. M. Austin.
The simplescalar tool set, version 2.0, 1997.
[53] R. S. Chappell, J. Stark, S. P. Kim, S. K. Reinhardt, Y. N. Patt.
Simultaneous subordinate microthreading (SSMT), in: ISCA '99: Proceedings of the 26th annual international symposium on Computer architecture, Washington, DC, USA, IEEE Computer Society, 1999, pp. 186–195.
http://doi.acm.org/10.1145/300979.300995
[54] S. Eyerman, L. Eeckhout.
Probabilistic job symbiosis modeling for SMT processor scheduling, in: Proceedings of the 15th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2010.
[55] C. Ferdinand, R. Wilhelm.
Efficient and Precise Cache Behavior Prediction for Real-Time Systems, in: Real-Time Systems, 1999, vol. 17, no. 2-3, pp. 131–181.
http://dx.doi.org/10.1023/A:1008186323068
[56] T. S. Karkhanis, J. E. Smith.
A First-Order Superscalar Processor Model, in: Proceedings of the International Symposium on Computer Architecture, Los Alamitos, CA, USA, IEEE Computer Society, 2004, p. 338.
http://doi.ieeecomputersociety.org/10.1109/ISCA.2004.1310786
[57] B. Lee, J. Collins, H. Wang, D. Brooks.
CPR: composable performance regression for scalable multiprocessor models, in: Proceedings of the 41st International Symposium on Microarchitecture, 2008.
[58] Y. Liang, T. Mitra.
Cache modeling in probabilistic execution time analysis, in: DAC '08: Proceedings of the 45th annual conference on Design automation, New York, NY, USA, ACM, 2008, pp. 319–324.
http://doi.acm.org/10.1145/1391469.1391551
[59] T. Lundqvist, P. Stenström.
Timing Anomalies in Dynamically Scheduled Microprocessors, in: RTSS '99: Proceedings of the 20th IEEE Real-Time Systems Symposium, Washington, DC, USA, IEEE Computer Society, 1999.
[60] R. L. Mattson, J. Gecsei, D. R. Slutz, I. L. Traiger.
Evaluation techniques for storage hierarchies, in: IBM Systems Journal, 1970, vol. 9, no. 2, pp. 78-117.
[61] L. Rauchwerger, Y. Zhan, J. Torrellas.
Hardware for Speculative Run-Time Parallelization in Distributed Shared-Memory Multiprocessors, in: HPCA '98: Proceedings of the 4th International Symposium on High-Performance Computer Architecture, Washington, DC, USA, IEEE Computer Society, 1998, p. 162.
[62] K. Skadron, M. Stan, W. Huang, S. Velusamy.
Temperature-aware microarchitecture, in: Proceedings of the International Symposium on Computer Architecture, 2003.
[63] A. Snavely, D. M. Tullsen.
Symbiotic jobscheduling for a simultaneous multithreading processor, in: Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2000.
[64] J. G. Steffan, C. Colohan, A. Zhai, T. C. Mowry.
The STAMPede approach to thread-level speculation, in: ACM Transactions on Computer Systems, 2005, vol. 23, no. 3, pp. 253–300.
http://doi.acm.org/10.1145/1082469.1082471
[65] V. Suhendra, T. Mitra.
Exploring locking & partitioning for predictable shared caches on multi-cores, in: DAC '08: Proceedings of the 45th annual conference on Design automation, New York, NY, USA, ACM, 2008, pp. 300–303.
http://doi.acm.org/10.1145/1391469.1391545
[66] D. M. Tullsen, S. Eggers, H. M. Levy.
Simultaneous Multithreading: Maximizing On-Chip Parallelism, in: Proceedings of the 22nd Annual International Symposium on Computer Architecture, 1995.
[67] J. Yan, W. Zhan.
WCET Analysis for Multi-Core Processors with Shared L2 Instruction Caches, in: Proceedings of Real-Time and Embedded Technology and Applications Symposium, 2008. RTAS '08, 2008, pp. 80-89.
