

Section: New Results

Weakly supervised named entity classification

Participant: Edouard Grave.

In this paper, we describe a new method for named entity classification in specialized or technical domains, based on distant supervision. Our approach relies on a simple observation: in some specialized domains, named entities are almost unambiguous. Thus, given a seed list of entity names, positive examples can be obtained cheaply and easily from unlabeled text by simple string matching. These positive examples can then be used to train a named entity classifier within the PU learning paradigm, that is, learning from positive and unlabeled examples. We introduce a new convex formulation to solve this problem, and apply our technique to extract named entities from financial reports of healthcare companies.
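
The pipeline described above (seed list, string matching, then learning from positive and unlabeled examples) can be illustrated with a minimal sketch. It does not reproduce the convex formulation introduced in the paper, which is not detailed here. As a stand-in it uses a common PU baseline: a weighted logistic loss in which unlabeled examples count as down-weighted negatives. All function names, the `unlabeled_weight` parameter, and the toy data are assumptions made for illustration only.

```python
import math
from collections import defaultdict


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def label_by_string_match(candidates, seed_names):
    """Return 1 for mentions found in the seed list (positive), 0 otherwise (unlabeled)."""
    seeds = {name.lower() for name in seed_names}
    return [1 if mention.lower() in seeds else 0 for mention, _ in candidates]


def featurize(context):
    """Bag of lowercased context words; the mention itself is deliberately excluded."""
    return {"w=" + token.lower() for token in context.split()}


def train_pu_logistic(features, labels, unlabeled_weight=0.3, steps=300, lr=0.1):
    """Batch gradient descent on a weighted logistic loss (a convex objective).

    Assumption: unlabeled examples are treated as negatives with a reduced
    weight, because some of them are positives missing from the seed list.
    """
    weights = defaultdict(float)
    bias = 0.0
    for _ in range(steps):
        grad_w = defaultdict(float)
        grad_b = 0.0
        for feats, y in zip(features, labels):
            p = sigmoid(bias + sum(weights[f] for f in feats))
            g = (1.0 if y == 1 else unlabeled_weight) * (p - y)
            grad_b += g
            for f in feats:
                grad_w[f] += g
        bias -= lr * grad_b
        for f, g in grad_w.items():
            weights[f] -= lr * g
    return dict(weights), bias


def score(weights, bias, context):
    """Estimated probability that a mention in this context belongs to the target class."""
    feats = featurize(context)
    return sigmoid(bias + sum(weights.get(f, 0.0) for f in feats))


# Toy data: (mention, context with the mention removed). All values are made up.
candidates = [
    ("Acme Pharma", "shares of rose after the drug approval"),
    ("Zenith Biotech", "acquired for its drug pipeline"),
    ("Boston", "headquartered in since 1990"),
    ("Novacure", "shares of fell after the drug trial"),
    ("Chicago", "an office located in"),
]
seed_names = ["Acme Pharma", "Zenith Biotech"]

labels = label_by_string_match(candidates, seed_names)
features = [featurize(context) for _, context in candidates]
weights, bias = train_pu_logistic(features, labels)
```

In this toy set, "Novacure" is a company absent from the seed list, so it ends up among the unlabeled examples. This is the situation PU learning is meant to handle, and the reason the sketch gives unlabeled examples a smaller weight than the string-matched positives. Calling `score(weights, bias, "shares of climbed after the drug launch")` would then rate an unseen mention from its context alone.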