Outline for

Tong, Simon, and Daphne Koller. "Support vector machine active learning with applications to text classification." The Journal of Machine Learning Research 2 (2002): 45-66. 3/4/15

• Objective: - Active learning for SVM classification: query the “best” instances and return a classifier after a fixed number of queries have been made. - The experiments consider binary text classification.

• Motivation for active learning: - Obtaining labelled training data is costly. Having the learner actively choose data points and request their labels reduces human effort. - Pool-based active learning: request labels for instances drawn from a pool of unlabelled data (a generic loop is sketched below).
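(My own sketch of the pool-based loop, not code from the paper; it assumes scikit-learn's SVC, and the `query_strategy` and `oracle_label` callables are hypothetical placeholders for the selection rule and the human annotator.)

```python
# Sketch of a pool-based active learning loop (not the authors' code).
# Assumes X_pool is a numpy array; query_strategy and oracle_label are
# hypothetical: the first picks an index to query, the second returns its label.
from sklearn.svm import SVC

def active_learning_loop(X_pool, seed_idx, seed_labels, oracle_label,
                         query_strategy, n_queries=50):
    labelled_idx = list(seed_idx)      # indices with known labels
    labels = list(seed_labels)         # their +1 / -1 labels
    clf = SVC(kernel="linear")
    for _ in range(n_queries):
        clf.fit(X_pool[labelled_idx], labels)
        # Candidate pool = everything not yet labelled.
        unlabelled = [i for i in range(len(X_pool)) if i not in labelled_idx]
        q = query_strategy(clf, X_pool, unlabelled)
        labelled_idx.append(q)
        labels.append(oracle_label(q))  # ask the annotator for the label
    clf.fit(X_pool[labelled_idx], labels)
    return clf                          # classifier returned after the budget is spent
```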

• Existing querying algorithms: - Query by committee: maintain a number of hypotheses, all consistent with the labelled data, and have them vote on the labels of unlabelled instances; query the instance on which they disagree the most. There are infinitely many such consistent hypotheses. - A randomized variant: draw two hypotheses at random and an unlabelled instance at random; if the two hypotheses disagree on it, request its label, otherwise pick another instance and repeat. - Uncertainty sampling: use Bayes' rule to find the instance whose label is least certain and query it (a sketch follows below). - The methods in this paper outperform these baselines.
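(A hedged sketch of uncertainty sampling for a probabilistic binary classifier, for comparison only; it is not the paper's method. It assumes the classifier exposes `predict_proba`, e.g. SVC(probability=True), and plugs into the loop above as `query_strategy`.)

```python
# Uncertainty sampling sketch: query the pool instance whose predicted
# positive-class probability is closest to 0.5 (i.e. the least certain one).
import numpy as np

def uncertainty_query(clf, X_pool, unlabelled_idx):
    proba = clf.predict_proba(X_pool[unlabelled_idx])[:, 1]
    return unlabelled_idx[int(np.argmin(np.abs(proba - 0.5)))]
```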

• Inductive SVM vs. transductive SVM: - An inductive SVM finds a hyperplane that separates the labelled training data (assuming the feature space is high-dimensional, so the data are linearly separable). - A transductive SVM also considers the unlabelled data and finds a hyperplane that maximizes the margin over labelled and unlabelled points together (contrasted below).
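(A compact way to contrast the two objectives; these are standard hard-margin, unit-norm formulations written from memory, not copied from the paper.)

```latex
% Inductive SVM: maximize the margin over the labelled data only.
\max_{\mathbf{w}:\ \|\mathbf{w}\|=1} \; \min_i \; y_i \,\bigl(\mathbf{w}\cdot\Phi(\mathbf{x}_i)\bigr)

% Transductive SVM: also choose labels y^*_j for the unlabelled pool so that
% the margin over labelled and unlabelled data together is maximized.
\max_{\mathbf{w}:\ \|\mathbf{w}\|=1,\;\; y^*_1,\dots,y^*_m} \;
  \min\Bigl( \min_i y_i\,\bigl(\mathbf{w}\cdot\Phi(\mathbf{x}_i)\bigr),\;
             \min_j y^*_j\,\bigl(\mathbf{w}\cdot\Phi(\mathbf{x}^*_j)\bigr) \Bigr)
```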

• Version space: the set of all hypotheses that classify every labelled data point correctly. Each hypothesis is defined by its parameter vector w, so the version space can be described as a region in parameter space.

- Version space “duality”: Points in feature space correspond to hyperplanes in parameter space.

- In parameter space, the version space is a region of the surface of the unit hypersphere, bounded by the hyperplanes corresponding to the labelled data points.

- The SVM solution is the center of the largest hypersphere that fits inside the version space without intersecting the labelled-data hyperplanes. The radius of this sphere is proportional to the margin (formalized in the sketch below).
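(In symbols, following the paper's unit-norm setup as best I recall; treat as a sketch. Φ is the kernel feature map and the proportionality to the margin assumes the feature vectors are normalized to equal length.)

```latex
% Version space: unit-norm hypotheses consistent with all labelled data.
\mathcal{V} = \bigl\{\, \mathbf{w} \in \mathcal{W} : \|\mathbf{w}\| = 1,\;
      y_i\,(\mathbf{w}\cdot\Phi(\mathbf{x}_i)) > 0 \ \ \forall i \,\bigr\}

% The SVM weight vector is the center of the largest hypersphere inscribed in V;
% its radius is proportional to the margin
m = \min_i \, y_i\,\bigl(\mathbf{w}_{\mathrm{SVM}}\cdot\Phi(\mathbf{x}_i)\bigr)
```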

• Active learning querying: - Goal: shrink the version space as much as possible with each query. - Lemma 4: the strategy that halves the size of the version space with each query is the best greedy strategy for reducing the remaining version space; so we want to query instances that approximately halve the version-space area. - Computing the exact size of the version space after labelling each candidate query is computationally expensive. Three approximations (sketched in code below):
1. Simple Margin: idea: the SVM unit vector is approximately centered in the version space, and every unlabelled instance corresponds to a hyperplane in parameter space; the hyperplane closest to this center approximately halves the version space. Query the data point whose hyperplane is closest to the SVM weight vector (equivalently, with normalized feature vectors, the point closest to the decision boundary). Caveat: relies on the version space being fairly symmetric; Figure 3b shows a case where this method queries a, although b is the better choice.
2. MaxMin Margin: idea: the radius of the SVM hypersphere is proportional to the margin, so use the margin as a proxy for version-space size. For each candidate query, train an SVM with the candidate labelled positive and again labelled negative, and use the resulting margins m+ and m- to approximate the two resulting version-space sizes. Query the data point with the largest min(m+, m-). Computationally expensive, since two SVMs must be trained per candidate. Caveat: works poorly when the version space is elongated (Figure 4).
3. Ratio Margin: same as MaxMin except it uses the relative sizes of m+ and m-: query the data point with the largest min(m+/m-, m-/m+). Also computationally expensive.
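(My own Python reconstruction of the Simple Margin and Ratio Margin rules from the description above, not the authors' code. It assumes dense feature arrays, a linear kernel, scikit-learn's SVC, and a large C to approximate the hard-margin SVM; the MaxMin variant is noted in a comment.)

```python
# Sketch of the Simple Margin and Ratio Margin query rules (reconstruction).
import numpy as np
from sklearn.svm import SVC

def simple_margin_query(clf, X_pool, unlabelled_idx):
    # Distance of each unlabelled point to the current SVM decision boundary;
    # the closest point approximately halves the version space.
    dist = np.abs(clf.decision_function(X_pool[unlabelled_idx]))
    return unlabelled_idx[int(np.argmin(dist))]

def _margin(X, y):
    # Train a (nearly) hard-margin linear SVM and return its geometric margin 1/||w||.
    svm = SVC(kernel="linear", C=1e6).fit(X, y)
    return 1.0 / np.linalg.norm(svm.coef_)

def ratio_margin_query(X_lab, y_lab, X_pool, unlabelled_idx):
    best_idx, best_score = None, -np.inf
    for i in unlabelled_idx:
        x = X_pool[i:i + 1]
        # Margins if the candidate were labelled +1 (m_plus) or -1 (m_minus).
        m_plus = _margin(np.vstack([X_lab, x]), np.append(y_lab, +1))
        m_minus = _margin(np.vstack([X_lab, x]), np.append(y_lab, -1))
        score = min(m_plus / m_minus, m_minus / m_plus)   # Ratio Margin
        # For MaxMin Margin, use instead: score = min(m_plus, m_minus)
        if score > best_score:
            best_idx, best_score = i, score
    return best_idx
```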

• Experiments on Reuters data: - The three methods perform approximately the same (Figure 5). - Active learning is more beneficial for infrequent classes (Table 2). - Increasing the pool size improves results (Figure 7).

- Active learning provides more benefit than training a transductive SVM (Figure 8).

• Experiments on Newsgroups data: - The Simple method fails badly; it queries data that are already labelled (Figure 9). - Simple is fast but mostly fails in the first queries; MaxMin and Ratio are slow when the number of labelled data points is large. - The Hybrid method uses MaxMin or Ratio for the first queries and then switches to Simple (Table 3); a sketch follows below.
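(A sketch of the hybrid strategy under the assumption that the switch point is a fixed query count k; the helpers are the sketches above and k is a hypothetical tuning choice, not a value from the paper.)

```python
# Hybrid querying sketch: expensive Ratio/MaxMin queries early, cheap Simple queries later.
def hybrid_query(step, k, clf, X_lab, y_lab, X_pool, unlabelled_idx):
    if step < k:
        return ratio_margin_query(X_lab, y_lab, X_pool, unlabelled_idx)
    return simple_margin_query(clf, X_pool, unlabelled_idx)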

"Support vector machine active learning with ... -

"Support vector machine active learning with applications ... The Journal of Machine Learning Research 2 ... Increasing the pool size improves results (Figure 7) ...

63KB Sizes 2 Downloads 246 Views

Recommend Documents

Video Concept Detection Using Support Vector Machine with ...
Video Concept Detection Using Support Vector Machine with Augmented. Features. Xinxing Xu. Dong Xu. Ivor W. Tsang ... port Vector Machine with Augmented Features (AFSVM) for video concept detection. For each visual ..... International Journal of Comp

Support Vector Echo-State Machine for Chaotic ... - Semantic Scholar
Dalian University of Technology, Dalian ... SVESMs are especially efficient in dealing with real life nonlinear time series, and ... advantages of the SVMs and echo state mechanisms. ...... [15] H. Jaeger, and H. Haas, Harnessing nonlinearity: Predic

Improving Support Vector Machine Generalisation via Input ... - IJEECS
[email protected]. Abstract. Data pre-processing always plays a key role in learning algorithm performance. In this research we consider data.

Support Vector Echo-State Machine for Chaotic Time ...
Keywords: Support Vector Machines, Echo State Networks, Recurrent neural ... Jordan networks, RPNN (Recurrent Predictor Neural networks) [14], ESN ..... So the following job will be ...... performance of SVESM does not deteriorate, and sometime it ca

Exploiting Geometry for Support Vector Machine Indexing
describing its key operations: index creation, top-k ..... 4.1.3 Constructing intra-ring index For each ring, KDX ..... index = Bin search( Arr[R ][ inverted index[x]], τ ).

Support vector machine based multi-view face detection and recognition
theless, a new problem is normally introduced in these view- ...... Face Recognition, World Scientific Publishing and Imperial College. Press, 2000. [9] S. Gong ...

Improving Support Vector Machine Generalisation via Input ... - IJEECS
for a specific classification problem. The best normalization method is also selected by SVM itself. Keywords: Normalization, Classification, Support Vector.

Fuzzy Logic and Support Vector Machine Approaches to ... - IEEE Xplore
IEEE TRANSACTIONS ON PLASMA SCIENCE, VOL. 34, NO. 3, JUNE 2006. 1013. Fuzzy Logic and Support Vector Machine Approaches to Regime ...

Improving Support Vector Machine Generalisation via ...
Abstract. Data pre-processing always plays a key role in learning algorithm performance. In this research we consider data pre-processing by normalization for Support Vector. Machines (SVMs). We examine the normalization effect across 112 classificat

Efficient Active Learning with Boosting
compose the set Dn. The whole data set now is denoted by Sn = {DL∪n,DU\n}. We call it semi-supervised data set. Initially S0 = D. After all unlabeled data are labeled, the data set is called genuine data set G,. G = Su = DL∪u. We define the cost

Support Vector Echo-State Machine for Chaotic ... - Semantic Scholar
1. Support Vector Echo-State Machine for Chaotic Time. Series Prediction ...... The 1-year-ahead prediction and ... of SVESM does not deteriorate, and sometime it can improve to some degree. ... Lecture Notes in Computer Science, vol.

Support vector machine based multi-view face ... - Brunel University
determine the bounding boxes on which face detection is performed. .... words, misalignment in views may lead to a significant drop in performance.

Support Vector Machine Fusion of Multisensor Imagery ...
support vector machine (SVM) fusion for the classification of multisensors images ... These tools are also critical for biodiversity science and conservation [2].

Efficient Active Learning with Boosting
unify semi-supervised learning and active learning boosting. Minimization of ... tant, we derive an efficient active learning algorithm under ... chine learning and data mining fields [14]. ... There lacks more theoretical analysis for these ...... I

Support Vector Machines
Porting some non-trivial application to SVM tool and analyze. OR а. Comparison of Neural Network and SVM using tools like SNNS and. SVMLight. : 30 ...

Machine Learning with OpenCV2 - bytefish.de
Feb 9, 2012 - 7.3 y = sin(10x) . ... support and OpenCV 2.3.1 now comes with a programming interface to C, C++, Python and Android. OpenCV is released ...

Efficient Active Learning with Boosting
[email protected], [email protected]} handle. For each query, a ...... can be easily generalized to batch mode active learn- ing methods. We can ...

Interacting with VW in active learning - GitHub
Nikos Karampatziakis. Cloud and Information Sciences Lab. Microsoft ... are in human readable form (text). ▷ Connects to the host:port VW is listening on ...