Bibliography
Publications of the year
Doctoral Dissertations and Habilitation Theses
[1] E. Grave.
A Markovian approach to distributional semantics, Université Pierre et Marie Curie - Paris VI, January 2014.
http://hal.inria.fr/tel-00940575
[2] M. Solnon.
Apprentissage statistique multi-tâches, Université Pierre et Marie Curie - Paris VI, November 2013.
http://hal.inria.fr/tel-00911498
Articles in International Peer-Reviewed Journals
[3] Z. Harchaoui, F. Bach, O. Cappé, E. Moulines.
Kernel-Based Methods for Hypothesis Testing: A Unified View, in: IEEE Signal Processing Magazine, June 2013, vol. 30, no 4, pp. 87-97. [ DOI : 10.1109/MSP.2013.2253631 ]
http://hal.inria.fr/hal-00841978
[4] B. Mishra, G. Meyer, F. Bach, R. Sepulchre.
Low-rank optimization with trace norm penalty, in: SIAM Journal on Optimization, 2013, vol. 23, no 4, pp. 2124-2149. [ DOI : 10.1137/110859646 ]
http://hal.inria.fr/hal-00924110
[5] A. d'Aspremont, N. E. Karoui.
Weak Recovery Conditions from Graph Partitioning Bounds and Order Statistics, in: Mathematics of Operations Research, July 2013, vol. 38, no 2, Final version.
http://pubsonline.informs.org/doi/abs/10.1287/moor.1120.0581, http://hal.inria.fr/hal-00907541
International Conferences with Proceedings
[6] A. Ahmed, N. Shervashidze, S. Narayanamurthy, V. Josifovski, A. J. Smola.
Distributed Large-scale Natural Graph Factorization, in: IW3C2 - International World Wide Web Conference, Rio de Janeiro, Brazil, May 2013, 37 p.
http://hal.inria.fr/hal-00918478
[7] F. Bach.
Sharp analysis of low-rank kernel matrix approximations, in: International Conference on Learning Theory (COLT), United States, 2013.
http://hal.inria.fr/hal-00723365
[8] F. Bach, E. Moulines.
Non-strongly-convex smooth stochastic approximation with convergence rate O(1/n), in: Neural Information Processing Systems (NIPS), United States, 2013.
http://hal.inria.fr/hal-00831977
[9] P. Bojanowski, F. Bach, I. Laptev, J. Ponce, C. Schmid, J. Sivic.
Finding Actors and Actions in Movies, in: ICCV 2013 - IEEE International Conference on Computer Vision, Sydney, Australia, IEEE, 2013.
http://hal.inria.fr/hal-00904991
[10] M. Cuturi, A. d'Aspremont.
Mean Reversion with a Variance Threshold, in: International Conference on Machine Learning, United States, October 2013, pp. 271-279.
http://hal.inria.fr/hal-00939566
[11] M. Eickenberg, F. Pedregosa, S. Mehdi, A. Gramfort, B. Thirion.
Second order scattering descriptors predict fMRI activity due to visual textures, in: PRNI 2013 - 3rd International Workshop on Pattern Recognition in NeuroImaging, Philadelphia, United States, Conference Publishing Services, June 2013.
http://hal.inria.fr/hal-00834928
[12] F. Fogel, R. Jenatton, F. Bach, A. d'Aspremont.
Convex Relaxations for Permutation Problems, in: Neural Information Processing Systems (NIPS) 2013, United States, August 2013.
http://nips.cc/Conferences/2013/Program/speaker-info.php?ID=12863, http://hal.inria.fr/hal-00907528
[13] E. Grave, G. Obozinski, F. Bach.
Hidden Markov tree models for semantic class induction, in: CoNLL - Seventeenth Conference on Computational Natural Language Learning, Sofia, Bulgaria, 2013.
http://hal.inria.fr/hal-00833288
[14] P. Gronat, G. Obozinski, J. Sivic, T. Pajdla.
Learning and calibrating per-location classifiers for visual place recognition, in: CVPR 2013 - 26th IEEE Conference on Computer Vision and Pattern Recognition, Portland, United States, June 2013.
http://hal.inria.fr/hal-00934332
[15] S. Jegelka, F. Bach, S. Sra.
Reflection methods for user-friendly submodular optimization, in: NIPS 2013 - Neural Information Processing Systems, Lake Tahoe, Nevada, United States, 2013.
http://hal.inria.fr/hal-00905258
[16] S. Lacoste-Julien, M. Jaggi, M. Schmidt, P. Pletscher.
Block-Coordinate Frank-Wolfe Optimization for Structural SVMs, in: ICML 2013 International Conference on Machine Learning, Atlanta, United States, 2013, pp. 53-61.
http://hal.inria.fr/hal-00720158
[17] S. Lacoste-Julien, K. Palla, A. Davies, G. Kasneci, T. Graepel, Z. Ghahramani.
SiGMa: Simple Greedy Matching for Aligning Large Knowledge Bases, in: KDD 2013 - The 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Chicago, United States, August 2013, pp. 572-580. [ DOI : 10.1145/2487575.2487592 ]
http://hal.inria.fr/hal-00918671
[18] N. Le Roux, F. Bach.
Local Component Analysis, in: ICLR - International Conference on Learning Representations 2013, Scottsdale, United States, 2013.
http://hal.inria.fr/inria-00617965
[19] A. Nelakanti, C. Archambeau, J. Mairal, F. Bach, G. Bouchard.
Structured Penalties for Log-linear Language Models, in: EMNLP - Empirical Methods in Natural Language Processing - 2013, Seattle, United States, Association for Computational Linguistics, October 2013, pp. 233-243.
http://hal.inria.fr/hal-00904820
[20] F. Pedregosa, M. Eickenberg, B. Thirion, A. Gramfort.
HRF estimation improves sensitivity of fMRI encoding and decoding models, in: 3rd International Workshop on Pattern Recognition in NeuroImaging, Philadelphia, United States, May 2013.
http://hal.inria.fr/hal-00821946
[21] E. Richard, F. Bach, J.-P. Vert.
Intersecting singularities for multi-structured estimation, in: ICML 2013 - 30th International Conference on Machine Learning, Atlanta, United States, 2013.
http://hal.inria.fr/hal-00918253
[22] G. Rigaill, T. D. Hocking, F. Bach, J.-P. Vert.
Learning Sparse Penalties for Change-Point Detection using Max Margin Interval Regression, in: ICML 2013 - 30th International Conference on Machine Learning, Atlanta, United States, Supported by the International Machine Learning Society (IMLS), May 2013.
http://hal.inria.fr/hal-00824075
[23] T. Schatz, V. Peddinti, F. Bach, A. Jansen, H. Hermansky, E. Dupoux.
Evaluating speech features with the Minimal-Pair ABX task: Analysis of the classical MFC/PLP pipeline, in: INTERSPEECH 2013 : 14th Annual Conference of the International Speech Communication Association, Lyon, France, 2013, pp. 1-5.
http://hal.inria.fr/hal-00918599
[24] K. S. Sesh Kumar, F. Bach.
Convex Relaxations for Learning Bounded Treewidth Decomposable Graphs, in: International Conference on Machine Learning, Atlanta, United States, 2013, Extended version of the ICML-2013 paper.
http://hal.inria.fr/hal-00763921
Conferences without Proceedings
[25] E. Grave, G. Obozinski, F. Bach.
Domain adaptation for sequence labeling using hidden Markov models, in: New Directions in Transfer and Multi-Task: Learning Across Domains and Tasks (NIPS Workshop), Lake Tahoe, United States, 2013.
http://hal.inria.fr/hal-00918371
Scientific Books (or Scientific Book chapters)
[26] F. Bach.
Learning with Submodular Functions: A Convex Optimization Perspective, Foundations and Trends in Machine Learning, Now Publishers, 2013, 228 p. [ DOI : 10.1561/2200000039 ]
http://hal.inria.fr/hal-00645271
Other Publications
[27] F. Bach.
Adaptivity of averaged stochastic gradient descent to local strong convexity for logistic regression, October 2013.
http://hal.inria.fr/hal-00804431
[28] F. Bach.
Convex relaxations of structured matrix factorizations, September 2013.
http://hal.inria.fr/hal-00861118
[29] F. Fogel, I. Waldspurger, A. d'Aspremont.
Phase retrieval for imaging problems, 2013.
http://hal.inria.fr/hal-00907529
[30] R. Gribonval, R. Jenatton, F. Bach, M. Kleinsteuber, M. Seibert.
Sample Complexity of Dictionary Learning and other Matrix Factorizations, December 2013, submitted.
http://hal.inria.fr/hal-00918142
[31] R. Lajugie, S. Arlot, F. Bach.
Large-Margin Metric Learning for Partitioning Problems, March 2013.
http://hal.inria.fr/hal-00796921
[32] M. Schmidt, N. Le Roux, F. Bach.
Minimizing Finite Sums with the Stochastic Average Gradient, September 2013.
http://hal.inria.fr/hal-00860051
[33] M. Schmidt, N. Le Roux.
Fast Convergence of Stochastic Gradient Descent under a Strong Growth Condition, August 2013.
http://hal.inria.fr/hal-00855113
[34] K. S. Sesh Kumar, F. Bach.
Maximizing submodular functions using probabilistic graphical models, September 2013.
http://hal.inria.fr/hal-00860575
[35] M. Solnon.
Comparison between multi-task and single-task oracle risks in kernel ridge regression, 2013, Submitted to the Electronic Journal of Statistics.
http://hal.inria.fr/hal-00846715
[36] I. Waldspurger, A. d'Aspremont, S. Mallat.
Phase Recovery, MaxCut and Complex Semidefinite Programming, 2013, Submitted revision.
http://hal.inria.fr/hal-00907535
[37] A. d'Aspremont, M. Jaggi.
An Optimal Affine Invariant Smooth Minimization Algorithm, 2013.
http://hal.inria.fr/hal-00907547
[38] F. Bach.
Learning with Submodular Functions: A Convex Optimization Perspective, in: ArXiv e-prints, 2011.
[39] F. Bach, M. Jordan.
Thin junction trees, in: Adv. NIPS, 2002.
[40] F. Bach, M. Jordan.
Learning spectral clustering, in: Adv. NIPS, 2003.
[41] A. Bar-Hillel, T. Hertz, N. Shental, D. Weinshall.
Learning a Mahalanobis metric from equivalence constraints, in: Journal of Machine Learning Research, 2006, vol. 6, no 1, 937 p.
[42] C. Bishop, et al.
Pattern recognition and machine learning, Springer, New York, 2006.
[43] D. Blatt, A. O. Hero, H. Gauchman.
A convergent incremental gradient method with a constant step size, in: SIOPT, 2007, vol. 18, no 1, pp. 29–51.
[44] Y. Boykov, O. Veksler, R. Zabih.
Fast approximate energy minimization via graph cuts, in: IEEE Trans. PAMI, 2001, vol. 23, no 11, pp. 1222–1239.
[45] L. Burget, P. Matejka, P. Schwarz, O. Glembek, J. Cernocky.
Analysis of Feature Extraction and Channel Compensation in a GMM Speaker Recognition System, in: IEEE Transactions on Audio, Speech and Language Processing, September 2007, vol. 15, no 7, pp. 1979-1986.
[46] M. A. Carlin, S. Thomas, A. Jansen, H. Hermansky.
Rapid evaluation of speech representations for spoken term discovery, in: Proceedings of Interspeech, 2011.
[47] Y.-W. Chang, M. Collins.
Exact Decoding of Phrase-based Translation Models through Lagrangian Relaxation, in: Proceedings of the Conference on Empirical Methods for Natural Language Processing, 2011, pp. 26–37.
[48] A. Chechetka, C. Guestrin.
Efficient Principled Learning of Thin Junction Trees, in: Adv. NIPS, 2007.
[49] J. Chen, A. K. Gupta.
Parametric Statistical Change Point Analysis, Birkhäuser, 2011.
[50] S. Chen, R. Rosenfeld.
A survey of smoothing techniques for ME models, in: IEEE Transactions on Speech and Audio Processing, 2000, vol. 8, no 1, pp. 37–50.
[51] Y. Cheng.
Mean shift, mode seeking, and clustering, in: IEEE Trans. PAMI, 1995, vol. 17, no 8, pp. 790–799.
[52] C. I. Chow, C. N. Liu.
Approximating discrete probability distributions with dependence trees, in: IEEE Trans. Inf. Theory, 1968, vol. 14.
[53] F. De la Torre, T. Kanade.
Discriminative cluster analysis, in: Proc. ICML, 2006.
[54] F. Desobry, M. Davy, C. Doncarli.
An online kernel change detection algorithm, in: IEEE Trans. Sig. Proc., 2005, vol. 53, no 8, pp. 2961–2974.
[55] B. Efron, C. N. Morris.
Stein's paradox in statistics, in: Scientific American, 1977, vol. 236, pp. 119–127.
[56] T. Evgeniou, C. A. Micchelli, M. Pontil.
Learning Multiple Tasks with Kernel Methods, in: Journal of Machine Learning Research, 2005, vol. 6, pp. 615–637.
[57] P. Fousek, P. Svojanovsky, F. Grezl, H. Hermansky.
New Nonsense Syllables Database – Analyses and Preliminary ASR Experiments, in: Proceedings of the International Conference on Spoken Language Processing (ICSLP), 2004, pp. 2004-29.
[58] S. Fujishige.
Submodular Functions and Optimization, Annals of Discrete Mathematics, Elsevier, 2005.
[59] V. Gogate, W. Webb, P. Domingos.
Learning Efficient Markov Networks, in: Adv. NIPS, 2010.
[60] J. Goodman.
A bit of progress in language modelling, in: Computer Speech and Language, October 2001, pp. 403–434.
[61] J. C. Gower, G. J. S. Ross.
Minimum spanning trees and single linkage cluster analysis, in: Applied statistics, 1969, pp. 54–64.
[62] T. D. Hocking, G. Schleiermacher, I. Janoueix-Lerosey, O. Delattre, F. Bach, J.-P. Vert.
Learning smoothing models of copy number profiles using breakpoint annotations, in: HAL, archives ouvertes, 2012.
[63] L. Jacob, F. Bach, J.-P. Vert.
Clustered Multi-Task Learning: A Convex Formulation, in: Computing Research Repository, 2008.
[64] W. James, C. Stein.
Estimation with quadratic loss, in: Proceedings of the fourth Berkeley symposium on mathematical statistics and probability, 1961, vol. 1, no 1961, pp. 361–379.
[65] R. Jenatton, J. Mairal, G. Obozinski, F. Bach.
Proximal Methods for Hierarchical Sparse Coding, in: Journal of Machine Learning Research, 2011, pp. 2297-2334.
[66] R. Kneser, H. Ney.
Improved backing-off for m-gram language modeling, in: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 1995, vol. 1.
[67] D. Koller, N. Friedman.
Probabilistic graphical models: principles and techniques, MIT Press, 2009.
[68] V. Kolmogorov, T. Schoenemann.
Generalized sequential tree-reweighted message passing, in: ArXiv e-prints, May 2012.
[69] A. Krause, C. Guestrin.
Submodularity and its Applications in Optimized Information Gathering, in: ACM Transactions on Intelligent Systems and Technology, 2011, vol. 2, no 4.
[70] H. Lin, J. Bilmes.
A Class of Submodular Functions for Document Summarization, in: Proc. NAACL/HLT, 2011.
[71] D. Luenberger, Y. Ye.
Linear and nonlinear programming, Springer Verlag, 2008.
[72] N. A. Macmillan, C. D. Creelman.
Detection theory: A user's guide, Lawrence Erlbaum, 2004.
[73] F. Malvestuto.
Approximating discrete probability distributions with decomposable models, in: IEEE Trans. Systems, Man, Cybernetics, 1991, vol. 21, no 5.
[74] A. F. T. Martins, N. A. Smith, A. M. Q. Pedro, M. A. T. Figueiredo.
Structured sparsity in structured prediction, in: Proceedings of the Conference on Empirical Methods for Natural Language Processing, 2011, pp. 1500–1511.
[75] M. Narasimhan, J. Bilmes.
PAC-learning bounded tree-width graphical models, in: Proc. UAI, 2004.
[76] G. Nemhauser, L. Wolsey, M. Fisher.
An analysis of approximations for maximizing submodular set functions–I, in: Mathematical Programming, 1978, vol. 14, no 1, pp. 265–294.
[77] A. Nemirovski, A. Juditsky, G. Lan, A. Shapiro.
Robust stochastic approximation approach to stochastic programming, in: SIOPT, 2009, vol. 19, no 4, pp. 1574–1609.
[78] A. Nemirovski.
Efficient methods in convex programming, in: Lecture notes, 1994.
[79] Y. Nesterov.
Introductory lectures on convex optimization: A basic course, Springer, 2004.
[80] A. Y. Ng, M. Jordan, Y. Weiss.
On spectral clustering: Analysis and an algorithm, in: Adv. NIPS, 2002.
[81] B. Roark, M. Saraclar, M. Collins, M. Johnson.
Discriminative language modeling with conditional random fields and the perceptron algorithm, in: Proceedings of the Association for Computational Linguistics, 2004.
[82] L. Saul, M. Jordan.
Exploiting Tractable Substructures in Intractable Networks, in: Adv. NIPS, 1995.
[83] H. D. Sherali, W. P. Adams.
A Hierarchy of Relaxations Between the Continuous and Convex Hull Representations for Zero-One Programming Problems, in: SIAM J. Discrete Math., 1990.
[84] J. Shi, J. Malik.
Normalized Cuts and Image Segmentation, in: IEEE Trans. PAMI, 1997, vol. 22, pp. 888–905.
[85] G. S. V. S. Sivaram, H. Hermansky.
Sparse Multilayer Perceptron for Phoneme Recognition, in: IEEE Transactions on Audio, Speech, and Language Processing, 2012, vol. 20, no 1, pp. 23-29.
[86] M. Solnon, S. Arlot, F. Bach.
Multi-task Regression using Minimal Penalties, in: Journal of Machine Learning Research, September 2012, vol. 13, pp. 2773-2812.
[87] M. Solodov.
Incremental gradient algorithms with stepsizes bounded away from zero, in: Computational Optimization and Applications, 1998, vol. 11, no 1, pp. 23–35.
[88] C. Stein.
Inadmissibility of the usual estimator for the mean of a multivariate normal distribution, in: Proceedings of the Third Berkeley symposium on mathematical statistics and probability, 1956, vol. 1, no 399, pp. 197–206.
[89] T. Szántai, E. Kovács.
Discovering a junction tree behind a Markov network by a greedy algorithm, in: ArXiv e-prints, April 2011.
[90] P. Tseng.
An incremental gradient(-projection) method with momentum term and adaptive stepsize rule, in: SIOPT, 1998, vol. 8, no 2, pp. 506-531.
[91] I. Tsochantaridis, T. Hofmann, T. Joachims, Y. Altun.
Support Vector Machine Learning for Interdependent and Structured Output Spaces, in: Proc. ICML, 2004.
[92] S. Vargas, P. Castells, D. Vallet.
Explicit relevance models in intent-oriented information retrieval diversification, in: Proceedings of the 35th ACM SIGIR International Conference on Research and development in information retrieval, Portland, Oregon, USA, SIGIR'12, ACM, 2012, pp. 75-84.
http://doi.acm.org/10.1145/2348283.2348297
[93] M. Wainwright, M. Jordan.
Graphical models, exponential families, and variational inference, in: Found. and Trends in Mach. Learn., 2008, vol. 1, no 1-2.
[94] F. Wood, C. Archambeau, J. Gasthaus, J. Lancelot, Y.-W. Teh.
A Stochastic Memoizer for Sequence Data, in: Proceedings of the 26th International Conference on Machine Learning, 2009.
[95] E. P. Xing, A. Y. Ng, M. Jordan, S. Russell.
Distance metric learning with applications to clustering with side-information, in: Adv. NIPS, 2002.
[96] P. Zhao, G. Rocha, B. Yu.
The composite absolute penalties family for grouped and hierarchical variable selection, in: The Annals of Statistics, 2009, vol. 37(6A), pp. 3468-3497.