Experiments with Random Projections for Machine Learning

Dmitriy Fradkin and David Madigan
Rutgers University, Piscataway, NJ

2

Supervised Learning Problem

Inductive supervised learning infers a functional relation $y = f(x)$ from a set of training examples $T = \{(x_1, y_1), \ldots, (x_n, y_n)\}$. In what follows the inputs are $p$-dimensional vectors $x_i = \langle x_{i1}, \ldots, x_{ip} \rangle$.
3

The Need for Dimensionality Reduction

Data with large dimensionality presents problems for many machine learning algorithms, since their computational complexity can be superlinear in $p$ and they may need complexity control to avoid overfitting. Traditional methods (PCA/SVD) are computationally expensive:
• PCA is $O(p^2 n) + O(p^3)$ [Golub and van Loan, 1983]
• SVD is somewhat more efficient: for sparse matrices with $r$ non-zero entries there are $O(prn)$ algorithms [Papadimitriou et al., 1998].
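
To give a feel for these costs, the following minimal sketch evaluates the leading terms for given $n$, $p$ and $r$. It is only illustrative: big-O expressions hide constant factors, and the function names and example sizes are assumptions introduced here, not part of the original analysis.

```python
def pca_cost(n: int, p: int) -> int:
    """Leading terms for PCA: p^2 * n (covariance matrix) + p^3 (eigendecomposition)."""
    return p * p * n + p ** 3


def sparse_svd_cost(n: int, p: int, r: int) -> int:
    """Leading term p * r * n quoted above for sparse SVD algorithms."""
    return p * r * n


if __name__ == "__main__":
    # Hypothetical sizes, chosen only for illustration.
    n, p, r = 1000, 2000, 10
    print(f"PCA:        ~{pca_cost(n, p):.2e} operations")
    print(f"sparse SVD: ~{sparse_svd_cost(n, p, r):.2e} operations")
```

For large $p$ the $p^3$ term dominates the PCA estimate, which is why the cost grows quickly with dimensionality.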

4

Random Projections

A theorem due to Johnson and Lindenstrauss (JL Theorem) states that for a set of $n$ points in $p$-dimensional Euclidean space there exists a linear transformation of the data into a $q$-dimensional space, $q \geq O(\epsilon^{-2} \log n)$, that preserves distances up to a factor $1 \pm \epsilon$ [Johnson and Lindenstrauss, 1984].

Theorem 1 [Achlioptas, 2001] Given $n$ points in $\mathbb{R}^p$ (the rows of an $n \times p$ matrix $X$), $\epsilon, \beta > 0$ and

$$q \geq \frac{4 + 2\beta}{\epsilon^2/2 - \epsilon^3/3} \ln(n),$$

let

$$E = \frac{1}{\sqrt{q}} X P$$

for projection matrix $P$. Then the mapping from $X$ to $E$ preserves distances up to a factor $1 \pm \epsilon$ for all rows in $X$ with probability $(1 - n^{-\beta})$.

The projection matrix $P = (r_{ij})$, of size $p \times q$, can be constructed in one of the following ways:
• $r_{ij} = \pm 1$ with probability 0.5 each
• $r_{ij} = \sqrt{3} \cdot (\pm 1$ with probability 1/6 each, or 0 with probability 2/3)
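
As a rough illustration of Theorem 1, the smallest admissible projection dimension can be computed directly. The sketch below reads the bound as $q \geq (4 + 2\beta) / (\epsilon^2/2 - \epsilon^3/3) \cdot \ln(n)$; the function name and the restriction to $0 < \epsilon < 1$ are our assumptions.

```python
import math


def min_projection_dim(n: int, eps: float, beta: float) -> int:
    """Smallest integer q with q >= (4 + 2*beta) / (eps^2/2 - eps^3/3) * ln(n)."""
    if n < 2:
        raise ValueError("n must be at least 2")
    # Assumption: restrict to 0 < eps < 1 so that the denominator is positive.
    if not 0.0 < eps < 1.0 or beta <= 0.0:
        raise ValueError("expected 0 < eps < 1 and beta > 0")
    denom = eps ** 2 / 2 - eps ** 3 / 3
    return math.ceil((4 + 2 * beta) / denom * math.log(n))


if __name__ == "__main__":
    # Illustrative values: n = 72 points, eps = 0.5, beta = 1.
    print(min_projection_dim(72, eps=0.5, beta=1.0))  # 308
```

The bound depends on the number of points $n$ but not on the original dimensionality $p$.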

The above projections are easy to implement and to compute. Constructing a $p \times q$ random matrix is $O(pq)$. Performing the projection for $n$ points is $O(npq)$. We chose to implement the first of the methods suggested by Achlioptas: $r_{ij} = \pm 1$ with probability 0.5 each.
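
A minimal NumPy sketch of the two constructions and of the projection step is given below. It is not the code used for the experiments: the function names, the `scale` flag and the synthetic data are assumptions made for illustration.

```python
from typing import Optional

import numpy as np


def achlioptas_matrix(p: int, q: int, sparse: bool = False,
                      seed: Optional[int] = None) -> np.ndarray:
    """Return a p x q random projection matrix.

    sparse=False: entries are +1 or -1 with probability 1/2 each.
    sparse=True:  entries are sqrt(3) * (+1 or -1 with probability 1/6 each,
                  0 with probability 2/3).
    """
    rng = np.random.default_rng(seed)
    if sparse:
        signs = rng.choice([-1.0, 0.0, 1.0], size=(p, q), p=[1 / 6, 2 / 3, 1 / 6])
        return np.sqrt(3.0) * signs
    return rng.choice([-1.0, 1.0], size=(p, q))


def random_project(X: np.ndarray, q: int, scale: bool = True,
                   sparse: bool = False, seed: Optional[int] = None) -> np.ndarray:
    """Project the rows of X (n x p) to q dimensions: E = XP, optionally divided by sqrt(q)."""
    P = achlioptas_matrix(X.shape[1], q, sparse=sparse, seed=seed)
    E = X @ P  # O(npq) matrix product
    return E / np.sqrt(q) if scale else E


def pairwise_distances(A: np.ndarray) -> np.ndarray:
    """Euclidean distances between all pairs of rows of A."""
    diff = A[:, None, :] - A[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(20, 1000))  # synthetic data: n = 20, p = 1000
    E = random_project(X, q=500, seed=1)
    iu = np.triu_indices(len(X), k=1)  # each pair of points once
    ratios = pairwise_distances(E)[iu] / pairwise_distances(X)[iu]
    print(f"distance ratios: min {ratios.min():.3f}, max {ratios.max():.3f}")
```

With `scale=True` the ratios of projected to original distances should lie close to 1; with `scale=False` all distances are multiplied by the same factor $\sqrt{q}$, so their relative order is unchanged.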

5

Related Work

• Theoretical Approximate Nearest Neighbor algorithm with polynomial preprocessing and query time polynomial in p and log n [Indyk and Motwani, 1998]. Also, the first tight bounds on the quality of randomized dimensionality reduction.
• Learning mixtures of Gaussians in high dimensions [Dasgupta, 1999], [Dasgupta, 2000]. Combination of RP with the EM algorithm gives good classification results on a handwritten digit dataset.
• Preservation of volumes and affine distances [Magen, 2002].
• Deterministic algorithm for constructing JL mappings [Engebretsen, Indyk and O'Donnell, 2002], used to derandomize several randomized algorithms.
• Approximate kernel computations [Achlioptas, McSherry and Schölkopf, 2001], similarity computations for histogram models [Thaper et al., 2002].

[Bingham and Mannila, 2001] experimentally show that RPs preserve similarity (inner products) well even when the dimensionality of the projection is moderate (they also compared RP to PCA, SVD and DCT). Their data had p = 5000, n = 2262 for text data, and p = 2500, n = 1000 for image data. Projections were done to q ∈ [1, 800].

6

Description of Data

Ionosphere, Spambase and Internet Ads were taken from the UCI repository. Colon and Leukemia were first used in [Alon et al., 1999] and [Golub et al., 1999] respectively.

Table 1:

Name       # Instances   # Attributes
Ion        351           34
Spam       4601          57
Ads        3279          1554
Colon      62            2000
Leukemia   72            3571

• Colon and Leukemia datasets have high dimensionality but few points. Thus we would expect RP to high dimensions to lead to good results, while PCA results should stop changing after some point. For these datasets we perform projections into spaces of dimensionality 5, 10, 25, 50, 100, 200 and 500.
• Ionosphere and Spam are relatively low-dimensional but have many more points.
• Ads dataset is both large and high-dimensional. Projections are done to 5, 10, 25, 50, 100, 200 and 500.

7

Experimental Setup

We compare PCA and RP using a number of standard machine learning tools:
• decision trees (C4.5 - [Quinlan, 1993])
• linear SVM (SVMLight - [Joachims, 1999])
• nearest neighbor (NN)

The purpose of these experiments is to evaluate the effectiveness of Random Projections (RPs) compared with PCA for machine learning.

Test set sizes were kept constant over different splits: Ionosphere - 51, Spambase - 1601, Colon - 12, Leukemia - 12, Ads - 1079.

Ionosphere and Spam combine low dimensionality with many more points than the Colon and Leukemia datasets. Such a combination in theory leaves little space for RP to improve, while PCA should be able to do well. For these two datasets we project to dimensions 5, 10, 15, 20, 25 and 30.

The RP entries are $\pm 1$ with probability 0.5 each. Since we are not concerned with preserving distances per se, but only with preserving separation between points, we do not scale our projection: $E = XP$ instead of $E = \frac{1}{\sqrt{q}} XP$.

[Figure: accuracy plots for each dataset (Ion, Spam, Ads, Colon, Leukemia) and each classifier (C4.5, 1NN, 5NN, SVM), with curves labeled Original, PCA and RP.]

Table 2: Accuracy (Y-axis) using PCA and RP, compared to performance in the original dimension, plotted against the projection dimension (X-axis)
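
The kind of comparison reported here can be sketched in a few lines of NumPy. This is a hedged illustration only: it uses a simple 1NN classifier as a stand-in (C4.5, SVMLight and the kNN code used in the experiments are not reproduced), synthetic data, PCA fitted on the training split (an assumption), and hypothetical function names.

```python
import numpy as np


def pca_fit(X_train: np.ndarray, q: int):
    """Return (mean, W), where the columns of W are the top-q principal directions."""
    mean = X_train.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(X_train - mean, full_matrices=False)
    return mean, Vt[:q].T


def one_nn_accuracy(train_X, train_y, test_X, test_y) -> float:
    """Accuracy of a 1-nearest-neighbor classifier with Euclidean distance."""
    d2 = ((test_X[:, None, :] - train_X[None, :, :]) ** 2).sum(axis=-1)
    pred = train_y[d2.argmin(axis=1)]
    return float((pred == test_y).mean())


def compare(X: np.ndarray, y: np.ndarray, q: int, n_test: int, seed: int = 0) -> dict:
    """1NN accuracy in the original space, after PCA, and after unscaled RP (E = XP)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    test, train = idx[:n_test], idx[n_test:]  # fixed test set size
    results = {"Original": one_nn_accuracy(X[train], y[train], X[test], y[test])}

    mean, W = pca_fit(X[train], q)
    results["PCA"] = one_nn_accuracy((X[train] - mean) @ W, y[train],
                                     (X[test] - mean) @ W, y[test])

    P = rng.choice([-1.0, 1.0], size=(X.shape[1], q))  # entries +1/-1, probability 0.5 each
    results["RP"] = one_nn_accuracy(X[train] @ P, y[train], X[test] @ P, y[test])
    return results


if __name__ == "__main__":
    # Synthetic two-class data (not one of the datasets in Table 1).
    rng = np.random.default_rng(0)
    y = np.arange(200) % 2
    X = rng.normal(size=(200, 100))
    X[y == 1, :5] += 3.0  # classes differ in the first five attributes
    for q in (5, 10, 25):
        print(q, compare(X, y, q=q, n_test=50))
```

Repeating `compare` over several seeds and values of `q` gives accuracy-versus-dimension curves of the kind described in the caption above.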

8

Conclusions

• RP performance was (predictably) below the level of PCA
• But: RP performance improved noticeably with increasing dimensionality
• RPs seem well suited for use with Nearest Neighbor methods
• Decision trees did not combine with RP in a satisfactory way.

9

Directions for Further Study

• Train multiple classifiers on several different projections and combine their decisions
  – different projections to the same dimension
  – projections to different dimensions
• Explore performance on significantly larger datasets
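
One possible form of the first direction is sketched below: a 1NN classifier is trained on each of several random projections and the decisions are combined by majority vote. This has not been evaluated here; the base classifier, the voting rule and all names are assumptions chosen for illustration.

```python
import numpy as np


def one_nn_predict(train_X, train_y, test_X) -> np.ndarray:
    """Labels of the nearest training point (Euclidean distance) for each test point."""
    d2 = ((test_X[:, None, :] - train_X[None, :, :]) ** 2).sum(axis=-1)
    return train_y[d2.argmin(axis=1)]


def rp_ensemble_predict(train_X, train_y, test_X, dims, seed: int = 0) -> np.ndarray:
    """Majority vote of 1NN classifiers, one per random projection.

    dims lists the target dimension of each projection, e.g. [10, 10, 10]
    (same dimension) or [5, 10, 25] (different dimensions).
    Labels are assumed to be non-negative integers.
    """
    rng = np.random.default_rng(seed)
    votes = []
    for q in dims:
        P = rng.choice([-1.0, 1.0], size=(train_X.shape[1], q))  # fresh projection
        votes.append(one_nn_predict(train_X @ P, train_y, test_X @ P))
    votes = np.stack(votes)  # shape: (len(dims), n_test)
    n_classes = int(train_y.max()) + 1
    # Count votes per class for every test point; ties go to the smallest label.
    counts = np.apply_along_axis(np.bincount, 0, votes, minlength=n_classes)
    return counts.argmax(axis=0)


if __name__ == "__main__":
    # Synthetic two-class data, for illustration only.
    rng = np.random.default_rng(0)
    y = np.arange(120) % 2
    X = rng.normal(size=(120, 20))
    X[y == 1] += 5.0
    pred = rp_ensemble_predict(X[:90], y[:90], X[90:], dims=[5, 10, 25])
    print("ensemble accuracy:", (pred == y[90:]).mean())
```

Passing a list with repeated values in `dims` corresponds to different projections to the same dimension, and a list of distinct values to projections to different dimensions.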

10

Acknowledgments

We would like to thank Andrei Anghelescu for providing the kNN code.
