Large-Scale Graph-Guided Feature Selection with Maximum Flows

Chloé-Agathe Azencott∗

Knowledge discovery from structured data is one of the central topics in data mining. In particular, graphs, or networks, have attracted considerable attention in the community, as they may represent molecular, biological, social, or other types of systems whose functionality and mechanisms are far from being completely understood. A crucial concern when studying such systems is to determine which part of the graph is responsible for performing a particular function. Hence the general problem of feature selection on graphs, where features coincide with vertices and the graph topology can be viewed as a priori knowledge about the relationships between features, is of broad interest across disciplines. As one example among many, identifying sets of mutations in interacting genes that may influence heritable traits is a core concern of association genetics.

The common approach to this problem is to use Lasso-based regression [3], with an ℓ1-regularizer of the weight vector and additional structured regularizers that represent relationships between features. In spite of their success, we see a number of drawbacks to regression-based approaches in this context. First, they do not easily scale to millions or even hundreds of thousands of features, although such a setting is common, for instance, in genetics. Second, regression-based approaches concentrate on optimizing a prediction loss, while the problem to solve is often formulated in terms of finding features that are relevant for, correlated with, or associated with a property of interest.

These two issues have been addressed by our recent work in statistical genetics, which proposes a new formulation of graph-constrained feature selection called SConES [1]. This method directly maximizes a score of association rather than minimizing a prediction error. Its optimization scheme is exact and efficient, thanks to a maximum-flow reformulation (sketched below), and it has been empirically shown to recover more causal features than its regression-based counterparts.

We have also proposed a new formulation of SConES in a multi-task setting, to improve feature selection in each task by combining and solving multiple tasks simultaneously. Multi-SConES [2] is flexible enough to allow selecting overlapping but non-identical sets of features across related tasks, and to incorporate different structural constraints for different tasks (see the second sketch below).

We propose to discuss this recently published work, and related issues in the context of the framework we developed, pertaining in particular to the choice of graph regularizer (SConES being particularly suitable for selecting modules of a modular graph), the choice of regularization parameters (currently based on stability/consistency criteria, but which could, for instance, rely on p-values), and the incorporation of non-linear models.
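To make the maximum-flow reformulation concrete: SConES selects a binary indicator vector f in {0,1}^n maximizing c^T f - eta*||f||_0 - lambda*f^T L f, where c holds the per-feature association scores and L is the Laplacian of the feature network. Below is a minimal sketch of the corresponding minimum-cut construction in Python, using networkx; the function name and interface are illustrative rather than the authors' implementation, and the construction follows the standard reduction of such an objective to an s/t cut.

```python
import networkx as nx

def scones_maxflow(scores, edges, lam, eta):
    """Illustrative sketch: solve
        max_f  c^T f - eta * ||f||_0 - lam * f^T L f,   f in {0,1}^n
    by a single s/t minimum cut, in the spirit of [1].

    scores -- per-feature association scores c_i
    edges  -- iterable of (i, j, w_ij) feature-network edges
    lam    -- connectivity penalty (assumed > 0)
    eta    -- sparsity penalty
    """
    G = nx.DiGraph()
    s, t = "source", "sink"
    G.add_node(s)
    G.add_node(t)
    for i, c in enumerate(scores):
        gain = c - eta
        if gain > 0:
            # cutting s -> i means *not* selecting a beneficial feature
            G.add_edge(s, i, capacity=gain)
        elif gain < 0:
            # cutting i -> t means selecting a feature that does not pay off
            G.add_edge(i, t, capacity=-gain)
    for i, j, w in edges:
        # cutting a network edge pays the connectivity penalty lam * w_ij
        G.add_edge(i, j, capacity=lam * w)
        G.add_edge(j, i, capacity=lam * w)
    _, (source_side, _) = nx.minimum_cut(G, s, t)
    # features on the source side of the minimum cut are selected
    return sorted(v for v in source_side if v != s)

# Toy usage: four features and a small network; with these values the
# minimum cut selects features 0, 1 and 3 and leaves out the weakly
# scoring feature 2.  Prints [0, 1, 3].
print(scones_maxflow([2.5, 1.8, 0.2, 2.1],
                     [(0, 1, 1.0), (1, 3, 1.0), (2, 3, 1.0)],
                     lam=0.5, eta=1.0))
```

The point of the reformulation is that the whole combinatorial selection problem is solved exactly by one cut computation, which is what allows the method to scale to very large feature sets.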
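The multi-task extension can be sketched in the same way, reusing scones_maxflow from above. The following is an assumption-laden illustration, not the published algorithm: it duplicates the feature network once per task and couples the copies of each feature across every pair of tasks with a parameter mu, so that the joint problem is again a single cut on an augmented network (see [2] for the exact formulation).

```python
def multi_scones_maxflow(task_scores, task_edges, lam, eta, mu):
    """Illustrative multi-task sketch in the spirit of [2]: one copy of
    the feature network per task, plus cross-task edges coupling the
    copies of each feature.  The all-pairs coupling and the parameter
    mu are assumptions made for illustration."""
    n_tasks = len(task_scores)
    n_feats = len(task_scores[0])
    scores, edges = [], []
    for t in range(n_tasks):
        offset = t * n_feats
        scores.extend(task_scores[t])
        edges.extend((offset + i, offset + j, w) for i, j, w in task_edges[t])
    for t1 in range(n_tasks):
        for t2 in range(t1 + 1, n_tasks):
            for i in range(n_feats):
                # coupling edge between the two copies of feature i; the
                # weight mu / lam yields a cut capacity of mu, paid only
                # when the two tasks disagree on feature i (lam > 0)
                edges.append((t1 * n_feats + i, t2 * n_feats + i, mu / lam))
    selected = scones_maxflow(scores, edges, lam, eta)
    # map global indices back to per-task feature indices
    return [[v - t * n_feats for v in selected
             if t * n_feats <= v < (t + 1) * n_feats]
            for t in range(n_tasks)]
```

Because the coupling only penalizes disagreement between tasks, related tasks can still select overlapping but non-identical feature sets, which is the flexibility mentioned above.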

References

[1] C.-A. Azencott, D. Grimm, M. Sugiyama, Y. Kawahara, and K. Borgwardt. Efficient network-guided multi-locus association mapping with graph cuts. Bioinformatics, 29(13):i171–i179, 2013.

[2] M. Sugiyama, C.-A. Azencott, D. Grimm, Y. Kawahara, and K. Borgwardt. Multi-task feature selection with multiple networks via maximum flows. In Proceedings of the 14th SIAM International Conference on Data Mining, 2014.

[3] R. Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 58(1):267–288, 1996.

∗ Mines ParisTech, Centre for Computational Biology (CBIO), 77300 Fontainebleau, France – Institut Curie, 75248 Paris Cedex 05, France – INSERM U900, 75248 Paris Cedex 05, France
