ICML2009, Montreal

Robust Feature Extraction via Information Theoretic Learning

Xiao-Tong Yuan

Bao-Gang Hu

June 17, 2009

Institute of Automation, Chinese Academy of Sciences

National Laboratory of Pattern Recognition

Supervised Feature Extraction

• Training data

• Goal: search for a projection matrix

• Criterion: to describe certain desired or undesired statistical or geometric properties

Training Outliers

• Feature outliers: image occlusion, image noise, illumination

• Label outliers: mislabeling of training data

3 or 9?

• Robust feature extraction from noisy features and labels is of particular interest in practice

Renyi’s Quadratic Entropy

• Renyi’s entropy of order α: H_α(X) = (1/(1−α)) log ∫ p(x)^α dx

• Renyi’s quadratic entropy (α = 2): H_2(X) = −log ∫ p(x)^2 dx

• With Gaussian kernel density estimation: H_2(X) ≈ −log( (1/N²) Σ_i Σ_j G_{√2σ}(x_i − x_j) )
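To make the estimator concrete: plugging a Gaussian KDE into the quadratic entropy gives a closed-form sample estimate. A minimal NumPy sketch, with kernel normalization constants dropped (they only shift the entropy by a constant, as is common in information theoretic learning) and an illustrative kernel width:

```python
import numpy as np

def renyi_quadratic_entropy(X, sigma=1.0):
    """Quadratic Renyi entropy of the rows of X under a Gaussian KDE.

    H2(X) = -log( (1/N^2) sum_ij G_{sqrt(2)*sigma}(x_i - x_j) );
    convolving two width-sigma Gaussians yields width sqrt(2)*sigma."""
    diffs = X[:, None, :] - X[None, :, :]              # (N, N, d) pairwise differences
    sq_dist = np.sum(diffs ** 2, axis=-1)              # (N, N) squared distances
    ip = np.exp(-sq_dist / (4.0 * sigma ** 2)).mean()  # information potential
    return -np.log(ip)
```

Since each kernel value is at most 1 and the diagonal terms equal 1, the argument of the log lies in (0, 1], so the estimate is non-negative and grows as the sample spreads out.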

Information Potential and Correntropy

• Information potential [Principe et al., 2000]: V(X) = (1/N²) Σ_i Σ_j G_σ(x_i − x_j)

• Correntropy [Liu et al., 2007]: V(X, Y) = E[G_σ(X − Y)] ≈ (1/N) Σ_i G_σ(x_i − y_i)
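The two sample estimators above can be written in a few lines of NumPy; the function names, toy kernel width, and data shapes are illustrative assumptions, not code from the paper:

```python
import numpy as np

def gaussian_kernel(u, sigma):
    """Unnormalized Gaussian kernel G_sigma(u) applied to difference vectors u."""
    return np.exp(-np.sum(u ** 2, axis=-1) / (2.0 * sigma ** 2))

def information_potential(X, sigma=1.0):
    """V(X) = (1/N^2) sum_i sum_j G_sigma(x_i - x_j)  [Principe et al., 2000]."""
    diffs = X[:, None, :] - X[None, :, :]    # (N, N, d) pairwise differences
    return gaussian_kernel(diffs, sigma).mean()

def correntropy(X, Y, sigma=1.0):
    """Sample correntropy (1/N) sum_i G_sigma(x_i - y_i)  [Liu et al., 2007]."""
    return gaussian_kernel(X - Y, sigma).mean()
```

Both quantities are large when samples (or paired samples) are close on the scale of σ, which is what makes them natural similarity measures for robust criteria.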

Problem Formulation

• Map the training data into a low-dimensional space via a projection matrix

• The objective combines an information potential term, a correntropy term, and a Tikhonov regularization term

Robustness Justification

• IP regularization term

• Redescending M-estimator of SRDA [Cai et al., 2008]
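The redescending property can be seen directly from the Gaussian kernel: viewed as an M-estimator weight, a sample's influence decays to zero as its residual grows, so gross outliers are effectively ignored. A minimal sketch of the Welsch-type weight (a standard construction in the M-estimation literature, shown here purely for illustration):

```python
import numpy as np

def welsch_weight(r, sigma=1.0):
    """Redescending M-estimator weight w(r) = exp(-r^2 / (2 sigma^2)).

    Inliers (|r| << sigma) keep weight ~1; gross outliers are
    down-weighted toward 0 and barely affect a re-weighted fit."""
    return np.exp(-np.asarray(r) ** 2 / (2.0 * sigma ** 2))
```

For example, a residual of 10σ receives a weight of exp(−50) ≈ 2 × 10⁻²², i.e., it is effectively discarded.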

Optimization: Half-Quadratic Optimization

• An augmented objective function

• Alternate maximization: Renyi’s Entropy Discriminative Analysis (REDA)

• On convergence: the alternate maximization monotonically increases the objective
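The half-quadratic scheme alternates a closed-form update of auxiliary variables with a weighted least-squares step. A toy sketch on robust mean estimation under a correntropy objective (illustrative only; the paper applies the same alternation to the projection matrix):

```python
import numpy as np

def hq_robust_mean(x, sigma=1.0, iters=20):
    """Maximize (1/N) sum_i G_sigma(x_i - m) over the scalar m via
    half-quadratic alternate maximization."""
    m = np.median(x)                              # robust initialization
    for _ in range(iters):
        r = x - m
        p = np.exp(-r ** 2 / (2.0 * sigma ** 2))  # auxiliary variables, closed form
        m = np.sum(p * x) / np.sum(p)             # weighted least-squares update
    return m
```

Each step can only increase the augmented objective, which is what underlies the convergence argument for this family of algorithms.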

Special Case: REDA-LPP

• Based on LPP [He and Niyogi, 2004]

Special Case: REDA-SRDA

• Based on SRDA [Cai et al., 2008]

Special Case: REDA-LapRLS

• Based on LapRLS [Belkin et al., 2006]

Algorithmic Connections

Extensions

• Learning of response

• Global optima: deterministic annealing

• Kernel extension [Cai et al., 2008]

Experiments

• Data sets: YaleB, MNIST, TDT2

• Outlier generation
  ◦ YaleB: randomly select training images and partially occlude some key facial features in them
  ◦ MNIST and TDT2: randomly select samples and relabel each as one of the other classes with equal probability
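The label-outlier generation step for MNIST and TDT2 can be sketched as follows; the corruption rate and RNG seed are illustrative parameters, not values from the paper:

```python
import numpy as np

def corrupt_labels(y, n_classes, rate=0.2, seed=0):
    """Randomly pick a fraction `rate` of the samples and relabel each
    as one of the *other* classes with equal probability."""
    rng = np.random.RandomState(seed)
    y = np.asarray(y).copy()
    idx = rng.choice(len(y), size=int(rate * len(y)), replace=False)
    for i in idx:
        others = [c for c in range(n_classes) if c != y[i]]
        y[i] = rng.choice(others)   # uniform over the remaining classes
    return y
```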

Experiments (cont.)

• On YaleB and MNIST
  ◦ LPP vs. REDA-LPP
  ◦ SRDA vs. REDA-SRDA
  ◦ LapRLS vs. REDA-LapRLS
  ◦ Baselines: RLDA, Robust PCA

• On TDT2
  ◦ Compare the kernel extensions of the above algorithms

Experiments (cont.)

• MNIST digits {3, 8, 9}

• [Figure: learned bases of REDA-LPP, REDA-SRDA, and REDA-LapRLS at iterations t = 1 and t = 6]

Experiments (cont.)

• [Figure: performance comparison on the MNIST set]

• [Figure: performance comparison on the YaleB set]

• [Figure: performance comparison on the TDT2 set]

Conclusion

• We present a robust feature extraction framework based on information potential and correntropy maximization.

• Robust against training outliers in both features and labels.

• Connections with LPP, SRDA, and LapRLS.

• Very simple to implement.

• Future research: apply REDA to robust semi-supervised learning.

Thank you!
