Bibliography

Major publications by the team in recent years

[1] F. Belletti, S. F. Schifano, R. Tripiccione, F. Bodin, P. Boucaud, J. Micheli, O. Pene, N. Cabibbo, S. de Luca, A. Lonardo, D. Rossetti, P. Vicini, M. Lukyanov, L. Morin, N. Paschedag, H. Simma, V. Morenas, D. Pleiter, F. Rapuano.

Computing for LQCD: ApeNEXT, in: Computing in Science and Engineering, 2006, vol. 8, no. 1, p. 18–29.

http://dx.doi.org/10.1109/MCSE.2006.4
[2] F. Bodin, A. Seznec.

Skewed associativity improves performance and enhances predictability, in: IEEE Transactions on Computers, May 1997.
[3] M. Cornero, R. Costa, R. Fernández Pascual, A. Ornstein, E. Rohou.

An Experimental Environment Validating the Suitability of CLI as an Effective Deployment Format for Embedded Systems, in: Conference on HiPEAC, Göteborg, Sweden, P. Stenström, M. Dubois, M. Katevenis, R. Gupta, T. Ungerer (editors), Springer, January 2008, p. 130–144.
[4] R. Costa, E. Rohou.

Comparing the size of .NET applications with native code, in: 3rd Intl Conference on Hardware/software codesign and system synthesis, Jersey City, NJ, USA, P. Eles, A. Jantsch, R. A. Bergamaschi (editors), ACM, September 2005, p. 99–104.
[5] J. Dusser, T. Piquet, A. Seznec.

Zero-content augmented caches, in: Proceedings of the 23rd international conference on Supercomputing, New York, NY, USA, ICS '09, ACM, 2009, p. 46–55.

http://doi.acm.org/10.1145/1542275.1542288
[6] D. Hardy, I. Puaut.

WCET analysis of multi-level non-inclusive set-associative instruction caches, in: Proc. of the 29th IEEE Real-Time Systems Symposium, Barcelona, Spain, December 2008.
[7] T. Lafage, A. Seznec.

Choosing Representative Slices of Program Execution for Microarchitecture Simulations: A Preliminary Application to the Data Stream, in: Workload Characterization of Emerging Applications, Kluwer Academic Publishers, 2000, p. 145–163.
[8] P. Michaud.

Exploiting the Cache Capacity of a Single-chip Multi-core Processor with Execution Migration, in: Proceedings of the 10th International Conference on High-Performance Computer Architecture (HPCA-10 2004), IEEE Computer Society, January 2004.
[9] P. Michaud.

STiMuL: a Software for Modeling Steady-State Temperature in Multilayers - Description and user manual, INRIA, Apr 2010, RT-0385.

http://hal.inria.fr/inria-00474286
[10] P. Michaud, Y. Sazeides, A. Seznec, T. Constantinou, D. Fetis.

A study of thread migration in temperature-constrained multi-cores, in: ACM Transactions on Architecture and Code Optimization, 2007, vol. 4, no. 2, 9 p.
[11] P. Michaud, Y. Sazeides, A. Seznec.

Proposition for a Sequential Accelerator in Future General-Purpose Manycore Processors and the Problem of Migration-Induced Cache Misses, in: ACM International Conference on Computing Frontiers, Bertinoro, Italy, May 2010.

http://hal.inria.fr/inria-00471410
[12] P. Michaud, A. Seznec, S. Jourdan.

An Exploration of Instruction Fetch Requirement in Out-of-Order Superscalar Processors, in: International Journal of Parallel Programming, 2001, vol. 29, no. 1, p. 35–58.
[13] E. Rohou, M. Smith.

Dynamically managing processor temperature and power, in: Second Workshop on Feedback-Directed Optimizations, 1999.
[14] A. Seznec, S. Felix, V. Krishnan, Y. Sazeides.

Design trade-offs on the EV8 branch predictor, in: Proceedings of the 29th International Symposium on Computer Architecture (IEEE-ACM), Anchorage, May 2002.
[15] A. Seznec, P. Michaud.

A case for (partially)-tagged geometric history length predictors, in: Journal of Instruction Level Parallelism, April 2006.

http://www.jilp.org/vol8
[16] A. Seznec, N. Sendrier.

HAVEGE: a user-level software heuristic for generating empirically strong random numbers, in: ACM Transactions on Modeling and Computer Simulation, October 2003.
[17] A. Seznec.

Analysis of the O-GEHL branch predictor, in: Proceedings of the 32nd Annual International Symposium on Computer Architecture, June 2005.
[18] A. Seznec.

The L-TAGE Branch Predictor, in: Journal of Instruction Level Parallelism, May 2007.

http://www.jilp.org/vol9
[19] A. Seznec.

A Phase Change Memory as a Secure Main Memory, in: IEEE Computer Architecture Letters, Feb 2010.

http://hal.inria.fr/inria-00468866
[20] A. Seznec.

Decoupled sectored caches: conciliating low tag implementation cost, in: SIGARCH Comput. Archit. News, 1994, vol. 22, no. 2, p. 384–393.

http://doi.acm.org/10.1145/192007.192072

Publications of the year

Articles in International Peer-Reviewed Journal

[21] D. Hardy, I. Puaut.

WCET analysis of instruction cache hierarchies, in: Journal of Systems Architecture, August 2011, vol. 57, no. 7. [ DOI : 10.1016/j.sysarc.2010.08.007 ]

http://hal.inria.fr/hal-00639454/en
[22] N. Prémillieu, A. Seznec.

SYRANT: SYmmetric Resource Allocation on Not-taken and Taken Paths, in: ACM Transactions on Architecture and Code Optimization, Special Issue: Proceedings of the 2012 International Conference on High Performance and Embedded Architectures and Compilers (HiPEAC'12), January 2012, to appear.
[23] E. Rohou, K. Williams, D. Yuste.

Vectorization Technology to Improve Interpreter Performance, in: ACM Transactions on Architecture and Code Optimization, Special Issue: Proceedings of the 2012 International Conference on High Performance and Embedded Architectures and Compilers (HiPEAC'12), January 2012, to appear.
[24] H. Vandierendonck, A. Seznec.

Fairness Metrics for Multithreaded Processors, in: IEEE Computer Architecture Letters, January 2011. [ DOI : 10.1109/L-CA.2011.1 ]

http://hal.inria.fr/inria-00564560/en
[25] H. Vandierendonck, A. Seznec.

Managing SMT resource usage through speculative instruction window weighting, in: ACM Transactions on Architecture and Code Optimization, October 2011. [ DOI : 10.1145/2019608.2019611 ]

http://hal.inria.fr/hal-00639171/en

International Conferences with Proceedings

[26] A. Bouakaz, I. Puaut, E. Rohou.

Predictable Binary Code Cache: A First Step Towards Reconciling Predictability and Just-In-Time Compilation, in: The 17th IEEE Real-Time and Embedded Technology and Applications Symposium, Chicago, United States, Marco Caccamo, April 2011.

http://hal.inria.fr/inria-00589690/en
[27] J. Dusser, A. Seznec.

Decoupled Zero-Compressed Memory, in: HiPEAC '11: Proceedings of the 6th International Conference on High Performance and Embedded Architectures and Compilers, Heraklion, Greece, HiPEAC, January 2011. [ DOI : 10.1145/1944862.1944876 ]

http://hal.inria.fr/inria-00638904/en
[28] D. Hardy, B. Lesage, I. Puaut.

Scalable Fixed-Point Free Instruction Cache Analysis, in: The 32nd IEEE Real-Time Systems Symposium (RTSS 2011), Vienna, Austria, November 2011.

http://hal.inria.fr/inria-00638698/en
[29] J. Marinho, V. Nélis, S. M. Petters, I. Puaut.

Preemption Delay Analysis for Floating Non-Preemptive Region Scheduling, in: DATE 2012: Design, Automation and Test in Europe, March 2012, to appear.
[30] P. Michaud.

Replacement policies for shared caches on symmetric multicores: a programmer-centric point of view, in: 6th International Conference on High-Performance and Embedded Architectures and Compilers, Heraklion, Greece, January 2011. [ DOI : 10.1145/1944862.1944890 ]

http://hal.inria.fr/inria-00531188/en
[31] D. Nuzman, S. Dyshel, E. Rohou, I. Rosen, K. Williams, D. Yuste, A. Cohen, A. Zaks.

Vapor SIMD: Auto-Vectorize Once, Run Everywhere, in: International Symposium on Code Generation and Optimization, Chamonix, France, Olivier Temam, April 2011.

http://hal.inria.fr/inria-00589692/en
[32] M. Qureshi, A. Seznec, L. Lastras, M. Franceschini.

Practical and secure PCM systems by online detection of malicious write streams, in: 2011 IEEE 17th International Symposium on High Performance Computer Architecture, San Antonio, United States, IEEE, February 2011. [ DOI : 10.1109/HPCA.2011.5749753 ]

http://hal.inria.fr/inria-00638950/en
[33] E. Rohou, S. Dyshel, D. Nuzman, I. Rosen, K. Williams, A. Cohen, A. Zaks.

Speculatively Vectorized Bytecode, in: International Conference on High-Performance and Embedded Architectures and Compilers, Heraklion, Greece, ACM, January 2011.

http://hal.inria.fr/inria-00525139/en
[34] A. Seznec.

A 64 Kbytes ISL-TAGE branch predictor, in: JWAC-2: Championship Branch Prediction, San Jose, United States, JILP, June 2011.

http://hal.inria.fr/hal-00639040/en
[35] A. Seznec.

A 64-Kbytes ITTAGE indirect branch predictor, in: JWAC-2: Championship Branch Prediction, San Jose, United States, JILP, June 2011.

http://hal.inria.fr/hal-00639041/en
[36] A. Seznec.

A New Case for the TAGE Branch Predictor, in: MICRO 2011: The 44th Annual IEEE/ACM International Symposium on Microarchitecture, Porto Alegre, Brazil, ACM (editor), ACM-IEEE, December 2011.

http://hal.inria.fr/hal-00639193/en
[37] A. Seznec.

Storage Free Confidence Estimator for the TAGE predictor, in: 17th High Performance Computer Architecture, San Antonio, United States, IEEE, February 2011. [ DOI : 10.1109/HPCA.2011.5749750 ]

http://hal.inria.fr/inria-00638890/en
[38] M. M. Waliullah, P. Stenström.

Classification and Elimination of Conflicts in Hardware Transactional Memory Systems, in: 23rd International Symposium on Computer Architecture and High Performance Computing - SBAC-PAD'2011, Vitoria, Brazil, IEEE, October 2011.

http://hal.inria.fr/hal-00640813/en

Internal Reports

[39] J. Lai, A. Seznec.

TEG: GPU Performance Estimation Using a Timing Model, INRIA, November 2011, no. RR-7804.

http://hal.inria.fr/hal-00641726/en
[40] P. Michaud.

Hardware acceleration of sequential loops, INRIA, November 2011, no. RR-7802.

http://hal.inria.fr/hal-00641350/en
[41] E. Rohou.

Tiptop: Hardware Performance Counters for the Masses, INRIA, November 2011, no. RR-7789.

http://hal.inria.fr/hal-00639173/en
[42] R. A. Velasquez, P. Michaud, A. Seznec.

BADCO: Behavioral Application-Dependent superscalar Core Models, INRIA, November 2011, no. RR-7795.

http://hal.inria.fr/hal-00641446/en

References in notes

[43] G. M. Amdahl.

Validity of the Single Processor Approach to Achieving Large Scale Computing Capabilities, in: SJCC, 1967, p. 483–485.
[44] D. Burger, T. M. Austin.

The SimpleScalar tool set, version 2.0, 1997.
[45] R. S. Chappell, J. Stark, S. P. Kim, S. K. Reinhardt, Y. N. Patt.

Simultaneous subordinate microthreading (SSMT), in: ISCA '99: Proceedings of the 26th annual international symposium on Computer architecture, Washington, DC, USA, IEEE Computer Society, 1999, p. 186–195.

http://doi.acm.org/10.1145/300979.300995
[46] C. Ferdinand, R. Wilhelm.

Efficient and Precise Cache Behavior Prediction for Real-Time Systems, in: Real-Time Syst., 1999, vol. 17, no. 2–3, p. 131–181.

http://dx.doi.org/10.1023/A:1008186323068
[47] T. S. Karkhanis, J. E. Smith.

A First-Order Superscalar Processor Model, in: Proceedings of the International Symposium on Computer Architecture, Los Alamitos, CA, USA, IEEE Computer Society, 2004, 338 p.

http://doi.ieeecomputersociety.org/10.1109/ISCA.2004.1310786
[48] B. Lee, J. Collins, H. Wang, D. Brooks.

CPR: composable performance regression for scalable multiprocessor models, in: Proceedings of the 41st International Symposium on Microarchitecture, 2008.
[49] Y. Liang, T. Mitra.

Cache modeling in probabilistic execution time analysis, in: DAC '08: Proceedings of the 45th annual conference on Design automation, New York, NY, USA, ACM, 2008, p. 319–324.

http://doi.acm.org/10.1145/1391469.1391551
[50] T. Lundqvist, P. Stenström.

Timing Anomalies in Dynamically Scheduled Microprocessors, in: RTSS '99: Proceedings of the 20th IEEE Real-Time Systems Symposium, Washington, DC, USA, IEEE Computer Society, 1999.
[51] L. Rauchwerger, Y. Zhang, J. Torrellas.

Hardware for Speculative Run-Time Parallelization in Distributed Shared-Memory Multiprocessors, in: HPCA '98: Proceedings of the 4th International Symposium on High-Performance Computer Architecture, Washington, DC, USA, IEEE Computer Society, 1998, 162 p.
[52] T. Sherwood, E. Perelman, G. Hamerly, B. Calder.

Automatically characterizing large scale program behavior, in: Proceedings of the 10th International Conference on Architectural Support for Programming Languages and Operating Systems, 2002, p. 45–57.
[53] K. Skadron, M. Stan, W. Huang, S. Velusamy.

Temperature-aware microarchitecture, in: Proceedings of the International Symposium on Computer Architecture, 2003.
[54] J. G. Steffan, C. Colohan, A. Zhai, T. C. Mowry.

The STAMPede approach to thread-level speculation, in: ACM Trans. Comput. Syst., 2005, vol. 23, no. 3, p. 253–300.

http://doi.acm.org/10.1145/1082469.1082471
[55] V. Suhendra, T. Mitra.

Exploring locking & partitioning for predictable shared caches on multi-cores, in: DAC '08: Proceedings of the 45th annual conference on Design automation, New York, NY, USA, ACM, 2008, p. 300–303.

http://doi.acm.org/10.1145/1391469.1391545
[56] D. M. Tullsen, S. Eggers, H. M. Levy.

Simultaneous Multithreading: Maximizing On-Chip Parallelism, in: Proceedings of the 22nd Annual International Symposium on Computer Architecture, 1995.
[57] D. M. Tullsen, S. Eggers, H. M. Levy.

Simultaneous Multithreading: Maximizing On-Chip Parallelism, in: Proceedings of the 22nd Annual International Symposium on Computer Architecture, June 1995.
[58] J. Yan, W. Zhang.

WCET Analysis for Multi-Core Processors with Shared L2 Instruction Caches, in: Proceedings of the Real-Time and Embedded Technology and Applications Symposium (RTAS '08), 2008, p. 80–89.
