IMAGE PATCH ANALYSIS AND CLUSTERING OF SUNSPOTS: A DIMENSIONALITY REDUCTION APPROACH

Kevin R. Moon (1), Jimmy J. Li (1), Véronique Delouille (2), Fraser Watson (3), Alfred O. Hero III (1)

(1) EECS Dept., University of Michigan; (2) SIDC, Royal Observatory of Belgium; (3) National Solar Observatory, USA

Introduction

• Sunspots are associated with active regions (areas of locally increased magnetic flux on the Sun)
• Sunspot and active region morphology is correlated with solar flares, i.e. sudden increases of the photon flux
  • Large flares can be disruptive to technology on Earth
  • Prediction is desirable
• The current sunspot classification scheme is the Mt. Wilson scheme
  • Based on global features identified by eye
  • Suffers from bias
• Recent work has focused on supervised classification techniques
  • Reduces human bias, but still potentially suboptimal
• Goal: build a spatially adaptive descriptive model of the sunspot and active region image modalities for predictive modeling
  • i.e. perform unsupervised classification on the images
  • Use both global and local features
• Current contributions: a local feature based analysis
• We answer the following questions:
  1. How many intrinsic parameters are required to describe spatial and modal dependencies?
  2. What dependencies exist between the two image modalities, and are they captured by linear correlation?
  3. What phenomena exist at different scales within the images?
• We use this information to cluster the images

Clustering

• Evidence Accumulating Clustering with MST (EAC-DC) [5]
  • Forms a metric from the hitting time of two minimal spanning trees grown sequentially from a pair of points
  • Applies spectral clustering to the resulting dissimilarity matrix
  • Found to be robust and competitive
• Clustering procedure:
  1. Learn a dictionary from each image based on information about the intrinsic dimension and spatial and modal correlations
  2. Cluster the dictionaries using EAC-DC

Results

Intrinsic Dimension Estimation Results

• Used $3 \times 3$ patches
• PCA results are within 1 STD of the $k$-nn results (except for the umbra in the single sunspot)
• Linear decomposition methods are sufficient
• Sunspots and magnetic fragments (regions outside of sunspots with magnetic activity) exhibit stronger spatial and modal dependencies ($\hat{m} \le 7$)
• Multiresolution analysis also applied (see paper)

                       Background   Penumbra   Umbra
Single $k$-nn              8.9        4.5       3.4
Single PCA (97%)          10.1        4.3       6.3
Multiple $k$-nn            8.6        4.8       4.0
Multiple PCA (97%)         8.9        4.8       3.4

Figure 4: Estimated local intrinsic dimension of example images

Data

CCA Results

Figure 2: Examples of continuum (top) and magnetogram (bottom) images

Methods

• Image patches are used to extract local features
• Extract patches from both modalities at each pixel into a data matrix (see Fig. 3)
• Number of image pixels = $n$, patch size = $m \times m$ ⇒ data matrix is $2m^2 \times n$

Intrinsic dimension estimation

• $\rho_1 \ge 0.9$ within the sunspot using $3 \times 3$ patches
• Magnetic fragments and edge regions have the highest correlation
• Magnetic fragments have high positive or negative correlations
• ⇒ Processing should include both modalities

• Two image modalities [1]
  • Continuum (white light)
  • Magnetogram (magnetic field strength)
• Masks mark penumbra and umbra [2]
• Sunspot group information [3], e.g. Mt. Wilson label, longitudinal extent, etc.

Figure 1: Masks for the images in Fig. 2

Table 1: Avg. est. intrinsic dimension for 20 images with single sunspots and multiple sunspots using both methods.

Figure 5: Canonical variable images using different patch sizes of the single sunspot image

Clustering Results

• Image dictionaries learned using linear methods (PCA shown)
• Some clusters are correlated with sunspot group longitudinal extent
  • Cluster 5 only includes groups with longitudinal extent ≤ 10°
  • Cluster 2 has many groups with extent between 2° and 4°
• Multidimensional scaling (MDS) used to visualize the results
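The poster does not say which MDS variant was used for the visualization. As an illustration only, the sketch below implements classical (Torgerson) MDS, which embeds items in a few dimensions from a symmetric dissimilarity matrix. The function name `classical_mds` and the input `D` are hypothetical.

```python
import numpy as np

def classical_mds(D, dims=2):
    """Embed items in `dims` dimensions from a symmetric dissimilarity matrix D.

    Classical MDS is an assumption of this sketch; the poster does not
    specify the MDS variant.
    """
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n           # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                    # double-centered Gram matrix
    eigvals, eigvecs = np.linalg.eigh(B)
    top = np.argsort(eigvals)[::-1][:dims]         # largest eigenvalues first
    scale = np.sqrt(np.clip(eigvals[top], 0.0, None))
    return eigvecs[:, top] * scale                 # one row per item
```

Plotting the two returned columns against each other, colored by cluster assignment, would give a scatter plot of the kind described in Fig. 6.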

Figure 3: Mapping of image patches to a single vector
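The patch-to-vector mapping of Fig. 3 can be sketched in NumPy as follows. This is a minimal illustration, not the authors' code: the function name `patch_data_matrix` is hypothetical, and the sketch assumes that only pixels whose full patch fits inside the image are used, since the poster does not describe border handling.

```python
import numpy as np

def patch_data_matrix(continuum, magnetogram, m=3):
    """Stack m x m patches from both modalities into a (2*m*m, n) matrix.

    Assumption: border pixels without a complete patch are skipped, so n is
    the number of interior patch positions rather than all image pixels.
    """
    if continuum.shape != magnetogram.shape:
        raise ValueError("both modalities must have the same shape")
    windows = np.lib.stride_tricks.sliding_window_view
    # Each view has shape (rows - m + 1, cols - m + 1, m, m).
    c = windows(continuum, (m, m)).reshape(-1, m * m)
    g = windows(magnetogram, (m, m)).reshape(-1, m * m)
    # One column per patch position: continuum values on top, magnetogram below.
    return np.hstack([c, g]).T
```

For a $3 \times 3$ patch each column has 18 entries, matching the $2m^2$ rows stated in the Methods section.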

• Used to answer question 1
• Linear method (PCA)
  • The estimate is the number of eigenvalues of the covariance matrix required to account for a certain percentage of the variance (e.g. 97%)
• Non-linear method ($k$-nn local dimension estimator) [4]
  • Intuition: the $k$-nn graph approximates the shape of the data manifold
  1. Construct the $k$-nn graph of the set of $n$ points $\mathbf{Z}_n$
  2. Calculate the total edge length: $L_{\gamma,k}(\mathbf{Z}_n) = \sum_{i=1}^{n} \sum_{z \in N_{k,i}} \| z - z_i \|^{\gamma}$
     • $\gamma > 0$ and $N_{k,i}$ is the $k$-nn neighborhood of $z_i$
  3. Then for large $n$, $L_{\gamma,k}(\mathbf{Z}_n) = n^{\alpha(m)} c + \epsilon_n$
     • $\epsilon_n \to 0$ as $n \to \infty$, $\alpha(m) = (m - \gamma)/m$, $m$ is the intrinsic dimension, $c$ a constant
  4. Use non-linear least squares over different values of $n$ to estimate $\hat{m}$
  5. Estimate local intrinsic dimension by using smaller neighborhoods
     • Useful when data lie on different manifolds
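Both estimators can be sketched in a few lines of NumPy. The sketch below is a simplified illustration under stated assumptions, not the estimator of [4]: it replaces the non-linear least squares of step 4 with a straight-line fit of $\log L_{\gamma,k}$ against $\log n$, then solves $\alpha = (m - \gamma)/m$ for $m$, and it returns a single global estimate rather than a local one. All function names and default parameters are hypothetical.

```python
import numpy as np

def pca_dimension(Z, variance=0.97):
    """Number of covariance eigenvalues needed to reach the variance share.

    Z holds one sample per column, as in the patch data matrix.
    """
    eigvals = np.linalg.eigvalsh(np.cov(Z))[::-1]       # descending order
    eigvals = np.clip(eigvals, 0.0, None)
    share = np.cumsum(eigvals) / eigvals.sum()
    return int(min(np.searchsorted(share, variance) + 1, len(eigvals)))

def knn_total_length(X, k=5, gamma=1.0):
    """Total gamma-weighted edge length of the k-nn graph (one point per row)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T      # squared distances
    np.fill_diagonal(d2, np.inf)                        # exclude self-matches
    d2 = np.clip(d2, 0.0, None)                         # guard against round-off
    nearest = np.partition(d2, k - 1, axis=1)[:, :k]    # k smallest per row
    return float(np.sum(nearest ** (gamma / 2.0)))

def knn_dimension(X, sizes=(250, 500, 1000, 2000), k=5, gamma=1.0,
                  trials=5, seed=0):
    """Global dimension estimate from the growth rate of the k-nn graph length.

    Simplification: fits log L = alpha * log n + log c by linear least
    squares, then inverts alpha = (m - gamma) / m. Requires len(X) >= max(sizes).
    """
    rng = np.random.default_rng(seed)
    mean_log_len = []
    for n in sizes:
        lengths = [
            knn_total_length(X[rng.choice(len(X), n, replace=False)], k, gamma)
            for _ in range(trials)
        ]
        mean_log_len.append(np.log(np.mean(lengths)))
    alpha = np.polyfit(np.log(sizes), mean_log_len, 1)[0]
    return gamma / (1.0 - alpha)
```

A local estimate in the sense of step 5 would apply `knn_dimension` to the points in a small neighborhood of each sample instead of to the full set.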

Canonical Correlation Analysis (CCA)

• Used to answer questions 2 and 3
• Finds vectors $a_i$ and $b_i$ that maximize $\rho_i = \mathrm{corr}(a_i^T \mathbf{x}, b_i^T \mathbf{y})$
• $u_i = a_i^T \mathbf{x}$ and $v_i = b_i^T \mathbf{y}$ are uncorrelated with all other $u_j$ and $v_j$
• Apply with $\mathbf{x}$ = continuum patch and $\mathbf{y}$ = magnetogram patch
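One standard way to compute the canonical correlations is to whiten each modality and take the SVD of the whitened cross-covariance. The sketch below follows that route. It is an illustration, not the authors' implementation: the function name `cca` and the small ridge term `reg` (added for numerical stability) are assumptions.

```python
import numpy as np

def cca(X, Y, reg=1e-8):
    """Canonical correlations and directions for X (p x n) and Y (q x n).

    Columns are paired samples, e.g. continuum and magnetogram patches.
    `reg` is a small ridge term, an assumption of this sketch.
    """
    Xc = X - X.mean(axis=1, keepdims=True)
    Yc = Y - Y.mean(axis=1, keepdims=True)
    n = X.shape[1]
    Sxx = Xc @ Xc.T / (n - 1) + reg * np.eye(X.shape[0])
    Syy = Yc @ Yc.T / (n - 1) + reg * np.eye(Y.shape[0])
    Sxy = Xc @ Yc.T / (n - 1)
    # Whitening transforms from Cholesky factors: Wx @ Sxx @ Wx.T = I.
    Wx = np.linalg.inv(np.linalg.cholesky(Sxx))
    Wy = np.linalg.inv(np.linalg.cholesky(Syy))
    U, rho, Vt = np.linalg.svd(Wx @ Sxy @ Wy.T)
    A = Wx.T @ U        # columns are the vectors a_i
    B = Wy.T @ Vt.T     # columns are the vectors b_i
    return rho, A, B
```

With the patch data matrix, `X` would be the first $m^2$ rows (continuum) and `Y` the last $m^2$ rows (magnetogram). `rho[0]` then corresponds to $\rho_1$, and `A[:, 0] @ X` gives the first canonical variable $u_1$ at each patch position.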

Figure 6: Scatter plot of the first two MDS projections. Colors indicate the cluster assignment (left) and Mt. Wilson label (right)

• 3 clearly separable regions are visible
• Comparison with Mt. Wilson labels
  • NMI = 0.1, ARI = 0.07
  • Consistent with the global approach of Mt. Wilson vs. our local approach
  • Small groupings of Mt. Wilson labels
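Agreement scores of this kind (normalized mutual information and adjusted Rand index) can be computed with scikit-learn, as sketched below. The helper name `compare_labelings` is hypothetical, and the poster does not state which NMI normalization was used; this sketch uses scikit-learn's default.

```python
from sklearn.metrics import adjusted_rand_score, normalized_mutual_info_score

def compare_labelings(cluster_labels, reference_labels):
    """Return NMI and ARI between a clustering and reference labels.

    Both scores ignore label names and are close to 0 for unrelated
    labelings; identical partitions score 1.
    """
    return {
        "NMI": normalized_mutual_info_score(reference_labels, cluster_labels),
        "ARI": adjusted_rand_score(reference_labels, cluster_labels),
    }
```

Here `cluster_labels` would hold the cluster index of each image and `reference_labels` its Mt. Wilson class. Values near 0, such as those reported above, indicate that the two partitions share little structure.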

Conclusion

• There are strong spatial and modal correlations within sunspots
• Linear methods are sufficient to capture these correlations
• Magnetic fragments and transition regions are the most coupled
• Image patch dictionary clustering results in clearly separable regions

Acknowledgments

This work was partially supported by NSF grant CCF-1217880 and an NSF Graduate Research Fellowship to the first author under Grant No. F031543.

References

[1] P.H. Scherrer et al., “The Solar Oscillations Investigation - Michelson Doppler Imager,” Solar Physics, vol. 162, pp. 129–188, Dec. 1995.
[2] F.T. Watson, L. Fletcher, and S. Marshall, “Evolution of sunspot properties during solar cycle 23,” Astronomy & Astrophysics, vol. 533, p. A14, Sept. 2011.
[3] http://www.swpc.noaa.gov/ftpdir/forecasts/SRS
[4] K.M. Carter, R. Raich, and A.O. Hero III, “On local intrinsic dimension estimation and its applications,” IEEE Trans. on Signal Processing, vol. 58, no. 2, pp. 650–663, 2010.
[5] L. Galluccio, O. Michel, P. Comon, M. Kliger, and A.O. Hero III, “Clustering with a new distance measure based on a dual-rooted tree,” Information Sciences, vol. 251, pp. 96–113, 2013.
