I’ve been starting work on a problem related to active learning, and I wanted to get caught up on the literature. Luckily for me, Sanjoy Dasgupta has a nice survey (non-paywall version here) from 2011 on the subject. It’s a nice read, although I didn’t know “aggressive” and “mellow” were terms of art in active learning.

In active learning you query unlabeled points and ask for their labels. The goal is usually to learn something like a classifier, so you want to get by with a small number of queries by being judicious about which points you ask about. A mellow algorithm queries any informative point, whereas an aggressive algorithm queries the “most informative” point. The former are often easier to analyze, because the latter end up sampling a “nonrepresentative” set of labeled points: if the points come i.i.d. from some distribution, the set of points you would label under an aggressive strategy will not look like it came from that distribution. Future work may look at semi-aggressive strategies. Perhaps we could call this line of research “harshing the mellow” by developing “harsh functions” which score points according to informativeness…
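To make the distinction concrete, here is a toy sketch in Python. It is an illustration under simplifying assumptions, not something taken from the survey: the points lie in (0, 1), the classifier is a single threshold, the labels are noiseless, and a point counts as "informative" if the labels seen so far do not already determine its label. The function names `mellow_queries` and `aggressive_queries` are made up for this example, and "most informative" is taken to mean "closest to the midpoint of the interval of thresholds still consistent with the data," which is only one of several reasonable scores.

```python
import random


def true_label(x, threshold):
    """Noiseless oracle: 1 if x is at or above the hidden threshold, else 0."""
    return 1 if x >= threshold else 0


def mellow_queries(pool, threshold):
    """Go through the pool in order and query ANY point whose label is
    not yet determined by the labels seen so far."""
    lo, hi = 0.0, 1.0  # consistent thresholds lie in (lo, hi]
    queried = []
    for x in pool:
        if lo < x < hi:  # label of x is still uncertain, so it is informative
            queried.append(x)
            if true_label(x, threshold) == 1:
                hi = x  # threshold is at most x
            else:
                lo = x  # threshold is above x
    return queried, (lo, hi)


def aggressive_queries(pool, threshold):
    """Repeatedly query the uncertain point with the best score, here the
    one closest to the midpoint of the current interval (an assumption)."""
    lo, hi = 0.0, 1.0
    queried = []
    while True:
        candidates = [x for x in pool if lo < x < hi]
        if not candidates:
            break
        mid = (lo + hi) / 2
        x = min(candidates, key=lambda c: abs(c - mid))
        queried.append(x)
        if true_label(x, threshold) == 1:
            hi = x
        else:
            lo = x
    return queried, (lo, hi)


if __name__ == "__main__":
    rng = random.Random(0)
    pool = [rng.random() for _ in range(1000)]  # i.i.d. uniform points
    threshold = 0.37  # hidden from the learner, used only by the oracle
    for name, strategy in [("mellow", mellow_queries),
                           ("aggressive", aggressive_queries)]:
        queried, interval = strategy(pool, threshold)
        print(name, "queries:", len(queried), "final interval:", interval)
```

Both strategies stop once no point in the pool is uncertain, so on the same pool they end with the same interval around the threshold. What differs is which points get labeled along the way: the aggressive version picks points by score rather than in the order they arrive, which is the sense in which its labeled set is selected rather than sampled.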
