Publications of the year

Doctoral Dissertations and Habilitation Theses

Articles in International Peer-Reviewed Journals

International Conferences with Proceedings

  • [6] A. Ahmed, N. Shervashidze, S. Narayanamurthy, V. Josifovski, A. J. Smola.

    Distributed Large-scale Natural Graph Factorization, in: IW3C2 - International World Wide Web Conference, Rio de Janeiro, Brazil, May 2013, 37 p.

  • [7] F. Bach.

    Sharp analysis of low-rank kernel matrix approximations, in: International Conference on Learning Theory (COLT), United States, 2013.

  • [8] F. Bach, E. Moulines.

    Non-strongly-convex smooth stochastic approximation with convergence rate O(1/n), in: Neural Information Processing Systems (NIPS), United States, 2013.

  • [9] P. Bojanowski, F. Bach, I. Laptev, J. Ponce, C. Schmid, J. Sivic.

    Finding Actors and Actions in Movies, in: ICCV 2013 - IEEE International Conference on Computer Vision, Sydney, Australia, IEEE, 2013.

  • [10] M. Cuturi, A. d'Aspremont.

    Mean Reversion with a Variance Threshold, in: International Conference on Machine Learning, United States, October 2013, pp. 271-279.

  • [11] M. Eickenberg, F. Pedregosa, S. Mehdi, A. Gramfort, B. Thirion.

    Second order scattering descriptors predict fMRI activity due to visual textures, in: PRNI 2013 - 3rd International Workshop on Pattern Recognition in NeuroImaging, Philadelphia, United States, Conference Publishing Services, June 2013.

  • [12] F. Fogel, R. Jenatton, F. Bach, A. d'Aspremont.

    Convex Relaxations for Permutation Problems, in: Neural Information Processing Systems (NIPS) 2013, United States, August 2013.

    http://nips.cc/Conferences/2013/Program/speaker-info.php?ID=12863, http://hal.inria.fr/hal-00907528
  • [13] E. Grave, G. Obozinski, F. Bach.

    Hidden Markov tree models for semantic class induction, in: CoNLL - Seventeenth Conference on Computational Natural Language Learning, Sofia, Bulgaria, 2013.

  • [14] P. Gronat, G. Obozinski, J. Sivic, T. Pajdla.

    Learning and calibrating per-location classifiers for visual place recognition, in: CVPR 2013 - 26th IEEE Conference on Computer Vision and Pattern Recognition, Portland, United States, June 2013.

  • [15] S. Jegelka, F. Bach, S. Sra.

    Reflection methods for user-friendly submodular optimization, in: NIPS 2013 - Neural Information Processing Systems, Lake Tahoe, Nevada, United States, 2013.

  • [16] S. Lacoste-Julien, M. Jaggi, M. Schmidt, P. Pletscher.

    Block-Coordinate Frank-Wolfe Optimization for Structural SVMs, in: ICML 2013 International Conference on Machine Learning, Atlanta, United States, 2013, pp. 53-61.

  • [17] S. Lacoste-Julien, K. Palla, A. Davies, G. Kasneci, T. Graepel, Z. Ghahramani.

    SiGMa: Simple Greedy Matching for Aligning Large Knowledge Bases, in: KDD 2013 - The 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Chicago, United States, August 2013, pp. 572-580. [ DOI : 10.1145/2487575.2487592 ]

  • [18] N. Le Roux, F. Bach.

    Local Component Analysis, in: ICLR - International Conference on Learning Representations 2013, Scottsdale, United States, 2013.

  • [19] A. Nelakanti, C. Archambeau, J. Mairal, F. Bach, G. Bouchard.

    Structured Penalties for Log-linear Language Models, in: EMNLP - Empirical Methods in Natural Language Processing - 2013, Seattle, United States, Association for Computational Linguistics, October 2013, pp. 233-243.

  • [20] F. Pedregosa, M. Eickenberg, B. Thirion, A. Gramfort.

    HRF estimation improves sensitivity of fMRI encoding and decoding models, in: 3rd International Workshop on Pattern Recognition in NeuroImaging, Philadelphia, United States, May 2013.

  • [21] E. Richard, F. Bach, J.-P. Vert.

    Intersecting singularities for multi-structured estimation, in: ICML 2013 - 30th International Conference on Machine Learning, Atlanta, United States, 2013.

  • [22] G. Rigaill, T. D. Hocking, F. Bach, J.-P. Vert.

    Learning Sparse Penalties for Change-Point Detection using Max Margin Interval Regression, in: ICML 2013 - 30th International Conference on Machine Learning, Atlanta, United States, Supported by the International Machine Learning Society (IMLS), May 2013.

  • [23] T. Schatz, V. Peddinti, F. Bach, A. Jansen, H. Hermansky, E. Dupoux.

    Evaluating speech features with the Minimal-Pair ABX task: Analysis of the classical MFC/PLP pipeline, in: INTERSPEECH 2013: 14th Annual Conference of the International Speech Communication Association, Lyon, France, 2013, pp. 1-5.

  • [24] K. S. Sesh Kumar, F. Bach.

    Convex Relaxations for Learning Bounded Treewidth Decomposable Graphs, in: International Conference on Machine Learning, Atlanta, United States, 2013, Extended version of the ICML-2013 paper.


Conferences without Proceedings

  • [25] E. Grave, G. Obozinski, F. Bach.

    Domain adaptation for sequence labeling using hidden Markov models, in: New Directions in Transfer and Multi-Task: Learning Across Domains and Tasks (NIPS Workshop), Lake Tahoe, United States, 2013.


Scientific Books (or Scientific Book chapters)

  • [26] F. Bach.

    Learning with Submodular Functions: A Convex Optimization Perspective, Foundations and Trends in Machine Learning, Now Publishers, 2013, 228 p. [ DOI : 10.1561/2200000039 ]


Other Publications

References in notes
  • [38] F. Bach.

    Learning with Submodular Functions: A Convex Optimization Perspective, in: ArXiv e-prints, 2011.
  • [39] F. Bach, M. Jordan.

    Thin junction trees, in: Adv. NIPS, 2002.
  • [40] F. Bach, M. Jordan.

    Learning spectral clustering, in: Adv. NIPS, 2003.
  • [41] A. Bar-Hillel, T. Hertz, N. Shental, D. Weinshall.

    Learning a Mahalanobis metric from equivalence constraints, in: Journal of Machine Learning Research, 2006, vol. 6, no 1, p. 937.
  • [42] C. Bishop et al.

    Pattern recognition and machine learning, Springer, New York, 2006.
  • [43] D. Blatt, A. O. Hero, H. Gauchman.

    A convergent incremental gradient method with a constant step size, in: SIOPT, 2007, vol. 18, no 1, pp. 29–51.
  • [44] Y. Boykov, O. Veksler, R. Zabih.

    Fast approximate energy minimization via graph cuts, in: IEEE Trans. PAMI, 2001, vol. 23, no 11, pp. 1222–1239.
  • [45] L. Burget, P. Matejka, P. Schwarz, O. Glembek, J. Cernocky.

    Analysis of Feature Extraction and Channel Compensation in a GMM Speaker Recognition System, in: IEEE Transactions on Audio, Speech and Language Processing, September 2007, vol. 15, no 7, pp. 1979-1986.
  • [46] M. A. Carlin, S. Thomas, A. Jansen, H. Hermansky.

    Rapid evaluation of speech representations for spoken term discovery, in: Proceedings of Interspeech, 2011.
  • [47] Y.-W. Chang, M. Collins.

    Exact Decoding of Phrase-based Translation Models through Lagrangian Relaxation, in: Proceedings of the Conference on Empirical Methods for Natural Language Processing, 2011, pp. 26–37.
  • [48] A. Chechetka, C. Guestrin.

    Efficient Principled Learning of Thin Junction Trees, in: Adv. NIPS, 2007.
  • [49] J. Chen, A. K. Gupta.

    Parametric Statistical Change Point Analysis, Birkhäuser, 2011.
  • [50] S. Chen, R. Rosenfeld.

    A survey of smoothing techniques for ME models, in: IEEE Transactions on Speech and Audio Processing, 2000, vol. 8, no 1, pp. 37–50.
  • [51] Y. Cheng.

    Mean shift, mode seeking, and clustering, in: IEEE Trans. PAMI, 1995, vol. 17, no 8, pp. 790–799.
  • [52] C. I. Chow, C. N. Liu.

    Approximating discrete probability distributions with dependence trees, in: IEEE Trans. Inf. Theory, 1968, vol. 14.
  • [53] F. De la Torre, T. Kanade.

    Discriminative cluster analysis, in: Proc. ICML, 2006.
  • [54] F. Desobry, M. Davy, C. Doncarli.

    An online kernel change detection algorithm, in: IEEE Trans. Sig. Proc., 2005, vol. 53, no 8, pp. 2961–2974.
  • [55] B. Efron, C. N. Morris.

    Stein's paradox in statistics, in: Scientific American, 1977, vol. 236, pp. 119–127.
  • [56] T. Evgeniou, C. A. Micchelli, M. Pontil.

    Learning Multiple Tasks with Kernel Methods, in: Journal of Machine Learning Research, 2005, vol. 6, pp. 615–637.
  • [57] P. Fousek, P. Svojanovsky, F. Grezl, H. Hermansky.

    New Nonsense Syllables Database – Analyses and Preliminary ASR Experiments, in: Proceedings of the International Conference on Spoken Language Processing (ICSLP), 2004, pp. 2004-29.
  • [58] S. Fujishige.

    Submodular Functions and Optimization, Annals of Discrete Mathematics, Elsevier, 2005.
  • [59] V. Gogate, W. Webb, P. Domingos.

    Learning Efficient Markov Networks, in: Adv. NIPS, 2010.
  • [60] J. Goodman.

    A bit of progress in language modelling, in: Computer Speech and Language, October 2001, pp. 403–434.
  • [61] J. C. Gower, G. J. S. Ross.

    Minimum spanning trees and single linkage cluster analysis, in: Applied Statistics, 1969, pp. 54–64.
  • [62] T. D. Hocking, G. Schleiermacher, I. Janoueix-Lerosey, O. Delattre, F. Bach, J.-P. Vert.

    Learning smoothing models of copy number profiles using breakpoint annotations, in: HAL, archives ouvertes, 2012.
  • [63] L. Jacob, F. Bach, J.-P. Vert.

    Clustered Multi-Task Learning: A Convex Formulation, in: Computing Research Repository, 2008.
  • [64] W. James, C. Stein.

    Estimation with quadratic loss, in: Proceedings of the fourth Berkeley symposium on mathematical statistics and probability, 1961, vol. 1, no 1961, pp. 361–379.
  • [65] R. Jenatton, J. Mairal, G. Obozinski, F. Bach.

    Proximal Methods for Hierarchical Sparse Coding, in: Journal of Machine Learning Research, 2011, pp. 2297-2334.
  • [66] R. Kneser, H. Ney.

    Improved backing-off for m-gram language modeling, in: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 1995, vol. 1.
  • [67] D. Koller, N. Friedman.

    Probabilistic graphical models: principles and techniques, MIT Press, 2009.
  • [68] V. Kolmogorov, T. Schoenemann.

    Generalized sequential tree-reweighted message passing, in: ArXiv e-prints, May 2012.
  • [69] A. Krause, C. Guestrin.

    Submodularity and its Applications in Optimized Information Gathering, in: ACM Transactions on Intelligent Systems and Technology, 2011, vol. 2, no 4.
  • [70] H. Lin, J. Bilmes.

    A Class of Submodular Functions for Document Summarization, in: Proc. NAACL/HLT, 2011.
  • [71] D. Luenberger, Y. Ye.

    Linear and nonlinear programming, Springer Verlag, 2008.
  • [72] N. A. Macmillan, C. D. Creelman.

    Detection theory: A user's guide, Lawrence Erlbaum, 2004.
  • [73] F. Malvestuto.

    Approximating discrete probability distributions with decomposable models, in: IEEE Trans. Systems, Man, Cybernetics, 1991, vol. 21, no 5.
  • [74] A. F. T. Martins, N. A. Smith, A. M. Q. Pedro, M. A. T. Figueiredo.

    Structured sparsity in structured prediction, in: Proceedings of the Conference on Empirical Methods for Natural Language Processing, 2011, pp. 1500–1511.
  • [75] M. Narasimhan, J. Bilmes.

    PAC-learning bounded tree-width graphical models, in: Proc. UAI, 2004.
  • [76] G. Nemhauser, L. Wolsey, M. Fisher.

    An analysis of approximations for maximizing submodular set functions–I, in: Mathematical Programming, 1978, vol. 14, no 1, pp. 265–294.
  • [77] A. Nemirovski, A. Juditsky, G. Lan, A. Shapiro.

    Robust stochastic approximation approach to stochastic programming, in: SIOPT, 2009, vol. 19, no 4, pp. 1574–1609.
  • [78] A. Nemirovski.

    Efficient methods in convex programming, in: Lecture notes, 1994.
  • [79] Y. Nesterov.

    Introductory lectures on convex optimization: A basic course, Springer, 2004.
  • [80] A. Y. Ng, M. Jordan, Y. Weiss.

    On spectral clustering: Analysis and an algorithm, in: Adv. NIPS, 2002.
  • [81] B. Roark, M. Saraclar, M. Collins, M. Johnson.

    Discriminative language modeling with conditional random fields and the perceptron algorithm, in: Proceedings of the Association for Computational Linguistics, 2004.
  • [82] L. Saul, M. Jordan.

    Exploiting Tractable Substructures in Intractable Networks, in: Adv. NIPS, 1995.
  • [83] H. D. Sherali, W. P. Adams.

    A Hierarchy of Relaxations Between the Continuous and Convex Hull Representations for Zero-One Programming Problems, in: SIAM J. Discrete Math., 1990.
  • [84] J. Shi, J. Malik.

    Normalized Cuts and Image Segmentation, in: IEEE Trans. PAMI, 2000, vol. 22, pp. 888–905.
  • [85] G. S. V. S. Sivaram, H. Hermansky.

    Sparse Multilayer Perceptron for Phoneme Recognition, in: IEEE Transactions on Audio, Speech, and Language Processing, 2012, vol. 20, no 1, pp. 23-29.
  • [86] M. Solnon, S. Arlot, F. Bach.

    Multi-task Regression using Minimal Penalties, in: Journal of Machine Learning Research, September 2012, vol. 13, pp. 2773-2812.
  • [87] M. Solodov.

    Incremental gradient algorithms with stepsizes bounded away from zero, in: Computational Optimization and Applications, 1998, vol. 11, no 1, pp. 23–35.
  • [88] C. Stein.

    Inadmissibility of the usual estimator for the mean of a multivariate normal distribution, in: Proceedings of the Third Berkeley symposium on mathematical statistics and probability, 1956, vol. 1, no 399, pp. 197–206.
  • [89] T. Szántai, E. Kovács.

    Discovering a junction tree behind a Markov network by a greedy algorithm, in: ArXiv e-prints, April 2011.
  • [90] P. Tseng.

    An incremental gradient(-projection) method with momentum term and adaptive stepsize rule, in: SIOPT, 1998, vol. 8, no 2, pp. 506-531.
  • [91] I. Tsochantaridis, T. Hofmann, T. Joachims, Y. Altun.

    Support Vector Machine Learning for Interdependent and Structured Output Spaces, in: Proc. ICML, 2004.
  • [92] S. Vargas, P. Castells, D. Vallet.

    Explicit relevance models in intent-oriented information retrieval diversification, in: Proceedings of the 35th ACM SIGIR International Conference on Research and development in information retrieval, Portland, Oregon, USA, SIGIR'12, ACM, 2012, pp. 75-84.

  • [93] M. Wainwright, M. Jordan.

    Graphical models, exponential families, and variational inference, in: Found. and Trends in Mach. Learn., 2008, vol. 1, no 1-2.
  • [94] F. Wood, C. Archambeau, J. Gasthaus, J. Lancelot, Y.-W. Teh.

    A Stochastic Memoizer for Sequence Data, in: Proceedings of the 26th International Conference on Machine Learning, 2009.
  • [95] E. P. Xing, A. Y. Ng, M. Jordan, S. Russell.

    Distance metric learning with applications to clustering with side-information, in: Adv. NIPS, 2002.
  • [96] P. Zhao, G. Rocha, B. Yu.

    The composite absolute penalties family for grouped and hierarchical variable selection, in: The Annals of Statistics, 2009, vol. 37(6A), pp. 3468-3497.