Bibliography

Major publications by the team in recent years

[1] M. Cornero, R. Costa, R. Fernández Pascual, A. Ornstein, E. Rohou.

An Experimental Environment Validating the Suitability of CLI as an Effective Deployment Format for Embedded Systems, in: Conference on HiPEAC, Göteborg, Sweden, P. Stenström, M. Dubois, M. Katevenis, R. Gupta, T. Ungerer (editors), Springer, January 2008, pp. 130–144.
[2] R. Costa, E. Rohou.

Comparing the size of .NET applications with native code, in: 3rd Intl Conference on Hardware/software codesign and system synthesis, Jersey City, NJ, USA, P. Eles, A. Jantsch, R. A. Bergamaschi (editors), ACM, September 2005, pp. 99–104.
[3] D. Hardy, I. Puaut.

WCET analysis of multi-level non-inclusive set-associative instruction caches, in: Proc. of the 29th IEEE Real-Time Systems Symposium, Barcelona, Spain, December 2008.
[4] T. Lafage, A. Seznec.

Choosing Representative Slices of Program Execution for Microarchitecture Simulations: A Preliminary Application to the Data Stream, in: Workload Characterization of Emerging Applications, Kluwer Academic Publishers, 2000, pp. 145–163.
[5] P. Michaud, Y. Sazeides, A. Seznec, T. Constantinou, D. Fetis.

A study of thread migration in temperature-constrained multi-cores, in: ACM Transactions on Architecture and Code Optimization, 2007, vol. 4, no. 2, 9 p.
[6] P. Michaud, A. Seznec, S. Jourdan.

An Exploration of Instruction Fetch Requirement in Out-of-Order Superscalar Processors, in: International Journal of Parallel Programming, 2001, vol. 29, no. 1, pp. 35-58.
[7] N. Prémillieu, A. Seznec.

Efficient Out-of-Order Execution of Guarded ISAs, Inria, November 2013, no. RR-8406, 24 p.

http://hal.inria.fr/hal-00910335
[8] E. Rohou, B. Narasimha Swamy, A. Seznec.

Branch Prediction and the Performance of Interpreters - Don't Trust Folklore, Inria, November 2013, no. RR-8405, 23 p.

http://hal.inria.fr/hal-00911146
[9] E. Rohou, M. Smith.

Dynamically managing processor temperature and power, in: Second Workshop on Feedback-Directed Optimizations, 1999.
[10] A. Seznec, P. Michaud.

A case for (partially)-tagged geometric history length predictors, in: Journal of Instruction Level Parallelism, April 2006.

http://www.jilp.org/vol8
[11] A. Seznec, N. Sendrier.

HAVEGE: a user-level software heuristic for generating empirically strong random numbers, in: ACM Transactions on Modeling and Computer Simulation, October 2003.
[12] A. Seznec.

Analysis of the O-GEHL branch predictor, in: Proceedings of the 32nd Annual International Symposium on Computer Architecture, June 2005.

Publications of the year

Doctoral Dissertations and Habilitation Theses

[13] A. Oliveira Maroneze.

Certified Compilation and Worst-Case Execution Time Estimation, Université Rennes 1, June 2014.

https://tel.archives-ouvertes.fr/tel-01064869

Articles in International Peer-Reviewed Journals

[14] G. Arnold, S. Collange.

Options for Denormal Representation in Logarithmic Arithmetic, in: Journal of VLSI Signal Processing-Systems for Signal, Image, and Video Technology, March 2014, vol. 77, no. 1-2, pp. 207-220. [ DOI : 10.1007/s11265-014-0874-3 ]

https://hal.inria.fr/hal-01087059
[15] S. Eyerman, P. Michaud, W. Rogiest.

Multiprogram Throughput Metrics: A Systematic Approach, in: ACM Transactions on Architecture and Code Optimization, October 2014, vol. 11, no. 3, 26 p. [ DOI : 10.1145/2663346 ]

https://hal.archives-ouvertes.fr/hal-01087743
[16] D. Hardy, I. Puaut.

Static Probabilistic Worst Case Execution Time Estimation for Architectures with Faulty Instruction Caches, in: Real-Time Systems, 2014, 25 p.

https://hal.inria.fr/hal-01086884
[17] T. Milanez, S. Collange, F. Magno Quintão Pereira, W. Meira, A. Ferreira.

Thread scheduling and memory coalescing for dynamic vectorization of SPMD workloads, in: Parallel Computing, October 2014, vol. 40, no. 9, pp. 548–558. [ DOI : 10.1016/j.parco.2014.03.006 ]

https://hal.inria.fr/hal-01087054
[18] N. Prémillieu, A. Seznec.

Efficient Out-of-Order Execution of Guarded ISAs, in: ACM Transactions on Architecture and Code Optimization (TACO), December 2014, 21 p. [ DOI : 10.1145/2677037 ]

https://hal.inria.fr/hal-01103230

International Conferences with Proceedings

[19] J. Abella, D. Hardy, I. Puaut, E. Quinones, F. J. Cazorla.

On the Comparison of Deterministic and Probabilistic WCET Estimation Techniques, in: 26th Euromicro Conference on Real-Time Systems, Madrid, Spain, July 2014.

https://hal.inria.fr/hal-01086875
[20] S. Elshobaky, A. El-Mahdy, E. Rohou, L. El-Sayed, N. ElDerini.

A lightweight incremental analysis and profiling framework for embedded devices, in: Proceedings of the 17th International Workshop on Software and Compilers for Embedded Systems, Sankt-Goar, Germany, June 2014, pp. 60-68. [ DOI : 10.1145/2609248.2609263 ]

https://hal.inria.fr/hal-01086903
[21] H. Li, I. Puaut, E. Rohou.

Traceability of Flow Information: Reconciling Compiler Optimizations and WCET Estimation, in: RTNS - 22nd International Conference on Real-Time Networks and Systems, Versailles, France, October 2014. [ DOI : 10.1145/2659787.2659805 ]

https://hal.inria.fr/hal-01072138
[22] P. Michaud.

Five poTAGEs and a COLT for an unrealistic predictor, in: 4th JILP Workshop on Computer Architecture Competitions (JWAC-4): Championship Branch Prediction (CBP-4), Minneapolis, United States, June 2014.

https://hal.archives-ouvertes.fr/hal-01087692
[23] P. Michaud, A. Seznec.

Pushing the branch predictability limits with the multi-poTAGE+SC predictor, in: 4th JILP Workshop on Computer Architecture Competitions (JWAC-4): Championship Branch Prediction (CBP-4), Minneapolis, United States, June 2014.

https://hal.archives-ouvertes.fr/hal-01087719
[24] B. Narasimha Swamy, A. Ketterlin, A. Seznec.

Hardware/Software Helper Thread Prefetching On Heterogeneous Many Cores, in: 2014 IEEE 26th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), Paris, France, October 2014. [ DOI : 10.1109/SBAC-PAD.2014.39 ]

https://hal.inria.fr/hal-01087752
[25] S. Narayanan, B. Narasimha Swamy, A. Seznec.

Impact of serial scaling of multi-threaded programs in many-core era, in: WAMCA - 5th Workshop on Applications for Multi-Core Architectures, Paris, France, October 2014. [ DOI : 10.1109/SBAC-PADW.2014.9 ]

https://hal.archives-ouvertes.fr/hal-01089446
[26] A. Oliveira Maroneze, S. Blazy, D. Pichardie, I. Puaut.

A Formally Verified WCET Estimation Tool, in: 14th International Workshop on Worst-Case Execution Time Analysis, Madrid, Spain, July 2014. [ DOI : 10.4230/OASIcs.WCET.2014.11 ]

https://hal.inria.fr/hal-01087194
[27] R. Omar, A. El-Mahdy, E. Rohou.

Arbitrary control-flow embedding into multiple threads for obfuscation: a preliminary complexity and performance analysis, in: Proceedings of the 2nd international workshop on Security in cloud computing, Kyoto, Japan, June 2014. [ DOI : 10.1145/2600075.2600080 ]

https://hal.inria.fr/hal-01086958
[28] A. Perais, A. Seznec.

EOLE: Paving the Way for an Effective Implementation of Value Prediction, in: International Symposium on Computer Architecture, Minneapolis, MN, United States, ACM/IEEE, June 2014, vol. 42, pp. 481–492. [ DOI : 10.1109/ISCA.2014.6853205 ]

https://hal.inria.fr/hal-01088130
[29] A. Perais, A. Seznec.

Practical data value speculation for future high-end processors, in: International Symposium on High Performance Computer Architecture, Orlando, FL, United States, IEEE, February 2014, pp. 428–439. [ DOI : 10.1109/HPCA.2014.6835952 ]

https://hal.inria.fr/hal-01088116
[30] E. Riou, E. Rohou, P. Clauss, N. Hallou, A. Ketterlin.

PADRONE: a Platform for Online Profiling, Analysis, and Optimization, in: DCE 2014 - International workshop on Dynamic Compilation Everywhere, Vienna, Austria, January 2014.

https://hal.inria.fr/hal-00917950
[31] E. Rohou, B. Narasimha Swamy, A. Seznec.

Branch Prediction and the Performance of Interpreters - Don't Trust Folklore, in: International Symposium on Code Generation and Optimization, Burlingame, United States, February 2015.

https://hal.inria.fr/hal-01100647
[32] S. Sardashti, A. Seznec, D. A. Wood.

Skewed Compressed Cache, in: 47th Annual IEEE/ACM International Symposium on Microarchitecture, Cambridge, United Kingdom, December 2014.

https://hal.inria.fr/hal-01088050
[33] A. Seznec.

TAGE-SC-L Branch Predictors, in: JILP - Championship Branch Prediction, Minneapolis, United States, June 2014.

https://hal.inria.fr/hal-01086920

Internal Reports

[34] G. Arnold, S. Collange.

Options for Denormal Representation in Logarithmic Arithmetic, January 2014, no. RR-8412, 27 p.

https://hal.inria.fr/hal-00909096

Other Publications

[35] S. Collange, D. Defour, S. Graillat, R. Iakymchuk.

Full-Speed Deterministic Bit-Accurate Parallel Floating-Point Summation on Multi- and Many-Core Architectures, February 2014.

https://hal.archives-ouvertes.fr/hal-00949355
[36] R. Iakymchuk, D. Defour, S. Collange, S. Graillat.

Reproducible and Accurate Matrix Multiplication for GPU Accelerators, January 2015.

https://hal.archives-ouvertes.fr/hal-01102877

References in notes

[37] G. M. Amdahl.

Validity of the Single Processor Approach to Achieving Large Scale Computing Capabilities, in: SJCC, 1967, pp. 483–485.
[38] D. Burger, T. M. Austin.

The SimpleScalar tool set, version 2.0, 1997.
[39] R. S. Chappell, J. Stark, S. P. Kim, S. K. Reinhardt, Y. N. Patt.

Simultaneous subordinate microthreading (SSMT), in: ISCA '99: Proceedings of the 26th annual international symposium on Computer architecture, Washington, DC, USA, IEEE Computer Society, 1999, pp. 186–195.

http://doi.acm.org/10.1145/300979.300995
[40] C. Ferdinand, R. Wilhelm.

Efficient and Precise Cache Behavior Prediction for Real-Time Systems, in: Real-Time Systems, 1999, vol. 17, no. 2-3, pp. 131–181.

http://dx.doi.org/10.1023/A:1008186323068
[41] M. D. Hill, M. R. Marty.

Amdahl's Law in the Multicore Era, in: IEEE Computer, 2008.
[42] T. S. Karkhanis, J. E. Smith.

A First-Order Superscalar Processor Model, in: Proceedings of the International Symposium on Computer Architecture, Los Alamitos, CA, USA, IEEE Computer Society, 2004, 338 p.

http://doi.ieeecomputersociety.org/10.1109/ISCA.2004.1310786
[43] B. Lee, J. Collins, H. Wang, D. Brooks.

CPR: composable performance regression for scalable multiprocessor models, in: Proceedings of the 41st International Symposium on Microarchitecture, 2008.
[44] Y. Liang, T. Mitra.

Cache modeling in probabilistic execution time analysis, in: DAC '08: Proceedings of the 45th annual conference on Design automation, New York, NY, USA, ACM, 2008, pp. 319–324.

http://doi.acm.org/10.1145/1391469.1391551
[45] T. Lundqvist, P. Stenström.

Timing Anomalies in Dynamically Scheduled Microprocessors, in: RTSS '99: Proceedings of the 20th IEEE Real-Time Systems Symposium, Washington, DC, USA, IEEE Computer Society, 1999.
[46] L. Rauchwerger, Y. Zhang, J. Torrellas.

Hardware for Speculative Run-Time Parallelization in Distributed Shared-Memory Multiprocessors, in: HPCA '98: Proceedings of the 4th International Symposium on High-Performance Computer Architecture, Washington, DC, USA, IEEE Computer Society, 1998, 162 p.
[47] S. Sardashti, D. A. Wood.

Decoupled compressed cache: exploiting spatial locality for energy-optimized compressed caching, in: The 46th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO-46, Davis, CA, USA, December 7-11, 2013, pp. 62–73.

http://doi.acm.org/10.1145/2540708.2540715
[48] T. Sherwood, E. Perelman, G. Hamerly, B. Calder.

Automatically characterizing large scale program behavior, in: Proceedings of the 10th International Conference on Architectural Support for Programming Languages and Operating Systems, 2002, pp. 45–57.
[49] K. Skadron, M. Stan, W. Huang, S. Velusamy.

Temperature-aware microarchitecture, in: Proceedings of the International Symposium on Computer Architecture, 2003.
[50] J. G. Steffan, C. Colohan, A. Zhai, T. C. Mowry.

The STAMPede approach to thread-level speculation, in: ACM Transactions on Computer Systems, 2005, vol. 23, no. 3, pp. 253–300.

http://doi.acm.org/10.1145/1082469.1082471
[51] V. Suhendra, T. Mitra.

Exploring locking & partitioning for predictable shared caches on multi-cores, in: DAC '08: Proceedings of the 45th annual conference on Design automation, New York, NY, USA, ACM, 2008, pp. 300–303.

http://doi.acm.org/10.1145/1391469.1391545
[52] D. M. Tullsen, S. Eggers, H. M. Levy.

Simultaneous Multithreading: Maximizing On-Chip Parallelism, in: Proceedings of the 22nd Annual International Symposium on Computer Architecture, 1995.
[53] J. Yan, W. Zhang.

WCET Analysis for Multi-Core Processors with Shared L2 Instruction Caches, in: Proceedings of the Real-Time and Embedded Technology and Applications Symposium (RTAS '08), 2008, pp. 80-89.
