A three-layer model of natural image statistics

Michael U. Gutmann
Dept of Mathematics and Statistics, University of Helsinki
[email protected]


The presentation is based on the paper: M. Gutmann and A. Hyvärinen, A three-layer model of natural image statistics, Journal of Physiology-Paris, 2013, in press.


Introduction


Natural scenes contain regularities


"Apgar 10/10; Feet", by Jacquelyn Berl.


Dimensions of the image: 150 × 360 (54000 pixels). There are $2^{54000} > 10^{16000}$ different binary 150 × 360 images (since $54000 \cdot \log_{10} 2 \approx 16256$). Only a very small fraction depicts scenes that we may see in our natural environment.


The regularities are used by the visual system


They serve as prior information in perception.


Natural environment and the brain


Natural scenes contain a lot of structure (regularities). Basic assumption: The sensory system is adapted to its sensory environment (ecological adaptation). Research topic in general: Relate properties of the natural environment to properties of the sensory (visual) system.




This talk: the relation of natural image structure to neural selectivity and invariance (tolerance).


Neural selectivity and tolerance



Some “definitions” of neural selectivity and tolerance:
◆ Neurons are selective to certain properties of the stimulus if their response increases strongly when those properties become present.
◆ Neurons are tolerant to them if their response does not change much when those properties vary.
Example for cells in the primary visual cortex:


Simple cells: Selective to orientation and location of the bar

Complex cells: Tolerant to exact location


Tolerant selectivities



Combining selectivity with tolerance (tolerant selectivities) is helpful in higher visual tasks. Example: To recognize a face, we need to find visual cues that are
◆ specific to the person at hand (selectivity), and
◆ somewhat invariant to facial expression (tolerance).


(Figure from “Facial Expressions – A Visual Reference for Artists” by M. Simon.)


Emergence of higher-level tolerant selectivities (1/3)



Basic hypothesis: Higher-level tolerant selectivities emerge through a sequence of elementary selectivity and tolerance computations. The hypothesis goes back to Kunihiko Fukushima’s “neocognitron”, which is a multi-layer extension of Hubel & Wiesel’s simple-cell/complex-cell cascade.


(Figure: a cascade of alternating selectivity and tolerance stages.)

Figure adapted from “Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position”, Biol Cybernetics, 1980.


Emergence of higher-level tolerant selectivities (2/3)

A similar idea was put forward by Riesenhuber and Poggio (Nature, 1999), and others.

(Figure: a sequence of elementary tolerance and selectivity computations reformats the neural representation of the stimulus into a representation suitable for object recognition tasks. Adapted from Kouh and Poggio, Neural Computation, 2008.)


Emergence of higher-level tolerant selectivities (3/3)



There is (indirect) experimental evidence for an increase in selectivity and tolerance along the ventral pathway (Rust and DiCarlo, J. Neurosci., 2010).



What remains poorly understood is the nature of the tolerance and selectivity computations along the hierarchy.

(Figure: the same cascade of tolerance and selectivity computations, with each stage marked by question marks: tolerance to what? selectivity to what? Adapted from Kouh and Poggio, Neural Computation, 2008.)


Question asked and methodology


Basic hypothesis: Higher-level tolerant selectivities emerge through a sequence of elementary selectivity and invariance computations.
Question asked: In a visual system with three processing layers, what should be selected and tolerated at each level of the hierarchy?
Methodology: Learn the selectivity and invariance computations from natural images. Learning = fitting a statistical model to natural image data.


Methods


Data

We learn the computations for two kinds of image data sets:
1. Image patches of size 32 × 32, extracted from larger images (left).
2. The “tiny images” dataset, converted to grayscale: complete scenes downsampled to 32 × 32 images (right) (Torralba et al., TPAMI 2008).
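As a rough sketch of how such 32 × 32 samples can be assembled (the use of scikit-image and the specific parameters are assumptions for illustration, not the paper's pipeline):

```python
import numpy as np
from skimage import io, transform

# Minimal sketch (assumed pipeline, not the paper's code): build 32 x 32
# grayscale samples for the two kinds of data sets.
def random_patches(image_files, patch_size=32, patches_per_image=100, seed=0):
    """Data set 1: random 32 x 32 patches cut out of larger images."""
    rng = np.random.default_rng(seed)
    patches = []
    for fname in image_files:
        img = io.imread(fname, as_gray=True)        # grayscale, values in [0, 1]
        h, w = img.shape
        for _ in range(patches_per_image):
            r = rng.integers(0, h - patch_size + 1)
            c = rng.integers(0, w - patch_size + 1)
            patches.append(img[r:r + patch_size, c:c + patch_size].ravel())
    return np.array(patches)                        # shape: (n_patches, 1024)

def tiny_image(image_file, size=32):
    """Data set 2: a whole scene downsampled to a 32 x 32 grayscale image."""
    img = io.imread(image_file, as_gray=True)
    return transform.resize(img, (size, size), anti_aliasing=True).ravel()
```

As described on the next slide, the data additionally undergo luminance and contrast gain control and low-pass filtering before entering the model.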


The three processing layers (1/2)

Let x be a vectorized image after preprocessing (luminance and contrast gain control, low-pass filtering). The three processing layers are:

$y_i^{(1)} = \max\big(w_i^{(1)} \cdot x,\, 0\big), \qquad i = 1 \ldots 600$
$y_i^{(2)} = \ln\big(w_i^{(2)} \cdot (y^{(1)})^2 + 1\big), \qquad i = 1 \ldots 100$
$z^{(2)} = \text{gain control}\big(y^{(2)}\big)$
$y_i^{(3)} = \max\big(w_i^{(3)} \cdot z^{(2)},\, 0\big), \qquad i = 1 \ldots 50$

Gain control is similar to the preprocessing: centering, normalizing the norm after whitening, possibly dimension reduction.

Free parameters: $w_i^{(1)}, w_i^{(2)}, w_i^{(3)}$. They govern the computations of the three layers.

Constraint: the $w_i^{(2)}$ have nonnegative elements, $w_{ki}^{(2)} \geq 0$.
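A minimal NumPy sketch of this forward pass (the weight shapes follow the slide above; the gain-control step is simplified, with whitening and dimension reduction left as optional placeholders, so this is an illustration rather than the paper's implementation):

```python
import numpy as np

# Minimal sketch of the three-layer forward pass described above.
# W1: (600, d), W2: (100, 600) with nonnegative entries, W3: (50, 100).
def gain_control(y, whitening_matrix=None):
    # Centering, optional whitening, and normalization of the norm.
    y = y - y.mean()
    if whitening_matrix is not None:
        y = whitening_matrix @ y
    return y / (np.linalg.norm(y) + 1e-8)

def forward(x, W1, W2, W3, whitening_matrix=None):
    y1 = np.maximum(W1 @ x, 0.0)                 # layer 1: projection + rectification
    y2 = np.log(W2 @ y1**2 + 1.0)                # layer 2: energy-model pooling
    z2 = gain_control(y2, whitening_matrix)      # gain control of layer-2 outputs
    y3 = np.maximum(W3 @ z2, 0.0)                # layer 3: projection + rectification
    return y1, y2, y3

# Example with random weights and a random preprocessed "image" of dimension d.
rng = np.random.default_rng(0)
d = 1024
W1 = rng.normal(size=(600, d))
W2 = np.abs(rng.normal(size=(100, 600)))         # constraint: nonnegative elements
W3 = rng.normal(size=(50, 100))
x = rng.normal(size=d)
y1, y2, y3 = forward(x, W1, W2, W3)
print(y1.shape, y2.shape, y3.shape)              # (600,) (100,) (50,)
```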

The three processing layers (2/2)

First and third layer: $y_i^{(1)} = \max\big(w_i^{(1)} \cdot x,\, 0\big)$. A linear projection followed by rectification. This is a (very) simple model for the steady-state firing rate of neurons.

Second layer: $y_i^{(2)} = \ln\big(w_i^{(2)} \cdot (y^{(1)})^2 + 1\big)$. Functional form of the energy model for complex cells (Adelson, J Opt Soc Am A, 1985).

Linear projections/pooling patterns are not specified in advance, but learned from the data. The outputs $y_i^{(1)}, y_i^{(2)}, y_i^{(3)}$ are used to define the statistical model (probability density function) of the natural images (see paper for details). Fitting the model allows us to learn the parameters $w_i^{(1)}, w_i^{(2)}, w_i^{(3)}$.

Results


Computations on the first two layers (in brief)

$y_i^{(2)} = \ln\Big(\sum_k w_{ki}^{(2)} \big(w_k^{(1)} \cdot x\big)^2 + 1\Big)$

First layer: Selectivity to localized, oriented (“Gabor-like”) image structure (“simple cells”, similar to previous work).

The learned computation on the second layer resembles a max operation over selected first-layer outputs. Second layer: Selectivity to localized, oriented image structure; tolerance to exact localization (“complex cells”, similar to previous work).
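To illustrate in what sense the learned pooling “resembles a max”, here is a toy NumPy check with invented pooling weights and heavy-tailed stand-in responses (not the paper's analysis): when only a few first-layer outputs are pooled and those outputs are sparse, the log-energy output tracks the largest of them.

```python
import numpy as np

# Toy check: the log-energy pooling ln(sum_k w_k y_k^2 + 1) over a few sparse,
# nonnegative responses is dominated by the largest pooled response, i.e. it
# behaves like a (soft) max over the selected units.
rng = np.random.default_rng(0)

n_selected, n_samples = 3, 5000
w = rng.uniform(0.5, 1.5, size=n_selected)              # invented nonnegative pooling weights
y1 = np.abs(rng.laplace(size=(n_samples, n_selected)))  # sparse "first-layer" responses

log_energy = np.log(y1**2 @ w + 1.0)                    # second-layer computation
max_based = np.log(np.max(y1, axis=1)**2 + 1.0)         # max over the selected units

corr = np.corrcoef(log_energy, max_based)[0, 1]
print(f"correlation between log-energy pooling and max-based output: {corr:.2f}")
```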


Layer three: example unit for patch data

$z^{(2)} = \text{gain control}\big(y^{(2)}\big), \qquad y_i^{(3)} = \max\big(w_i^{(3)} \cdot z^{(2)},\, 0\big)$

Black frame: space-orientation receptive field. Visualizes the response to local gratings of different orientations (Anzai et al., “Neurons in monkey visual area V2 encode combinations of orientations”, Nat Neurosci, 2007).

Red frame: “inhibitory” space-orientation receptive field. Shows the location and orientation of the local gratings which inhibit the unit most.

(Figure: receptive field (RF), inhibitory RF, and strongly activating images for unit 07.)


Layer three results: more examples for patch data

(Figure: receptive fields (RF), inhibitory RFs, and strongly activating images for units 04, 12, and 28.)


Layer three results: examples for tiny image data

(Figure: receptive fields (RF), inhibitory RFs, and strongly activating images for units 07, 09, and 35.)


Qualitative observations




Receptive fields are well structured and often localized. Emergence of non-classical receptive fields. For tiny images, the receptive fields are more inhomogeneous than for patch data. Excitatory and inhibitory gratings form large angles (orientation inhibition). Selectivity on the third layer:
◆ For patch data: longer contours and texture
◆ For tiny images: longer contours and curvature


Population analysis of homogeneity



Maximal difference δ in orientation tuning within an L3 receptive field: δ < 30°: 70%; δ > 60°: 10% (patches), 20% (tiny images).
Experimental findings (V2 in macaque monkeys):
◆ Anzai, 2007: δ < 30°: 60–70%; δ > 60°: 30%
◆ Tao, 2012: δ < 30°: 80%; δ > 60°: 5%

(Figure: empirical distribution functions of the maximal orientation difference δ (0°–90°) within a receptive field, for image patches and tiny images; r = 0.75.)
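A minimal sketch of how such a homogeneity measure can be computed (the preferred orientations below are random placeholders, not the learned units); the key detail is that orientation differences are taken with period 180° and therefore lie in [0°, 90°]:

```python
import numpy as np

# Minimal sketch (not the paper's code): maximal pairwise orientation
# difference within each third-layer receptive field. `preferred_oris` is a
# hypothetical list where each entry holds the preferred orientations (deg,
# in [0, 180)) of the local gratings that make up one unit's RF.
def max_orientation_difference(oris_deg):
    """Largest pairwise orientation difference within one RF, in [0, 90] deg."""
    o = np.asarray(oris_deg, dtype=float)
    d = np.abs(o[:, None] - o[None, :]) % 180.0
    d = np.minimum(d, 180.0 - d)              # orientations are circular with period 180 deg
    return d.max()

rng = np.random.default_rng(0)
preferred_oris = [rng.uniform(0, 180, size=rng.integers(3, 8)) for _ in range(50)]

delta = np.array([max_orientation_difference(o) for o in preferred_oris])
print(f"delta < 30 deg: {np.mean(delta < 30):.0%}, delta > 60 deg: {np.mean(delta > 60):.0%}")
```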

Population analysis of orientation inhibition

We computed the angle between the preferred and the least preferred orientation for all third-layer units. The mode of the distribution is at 83° ± 7°. The strongest inhibition occurs for local gratings which are (roughly) orthogonal to the preferred orientation.

(Figure: histograms of the orientation difference (deg) between preferred and least preferred orientation, as a fraction of occurrence, for image patches and tiny images; both distributions peak near 90°.)

Lifetime sparsity across the three layers


We use three different indices S1, S2, S3 to measure lifetime sparsity (see paper for details). Sparsity on layers one (“L1”) and three (“L3”) is about the same. Squaring (“sq”) increases sparsity; pooling (“pool”) and taking the logarithm (“L2”) reduce it. Iterating between selectivity and tolerance computations balances sparsity (no net increase).

(Figure: bar plots of the sparsity indices S1, S2, S3 at the stages L1, sq, pool, L2, L3, for patch data (left) and tiny images (right).)
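As a concrete example of a lifetime-sparsity measure (the paper's indices S1, S2, S3 are not reproduced here; this is the standard Treves–Rolls / Vinje–Gallant sparseness applied to invented response vectors):

```python
import numpy as np

# Minimal sketch: Treves-Rolls / Vinje-Gallant lifetime sparseness of one
# unit's responses across many images. It is 0 for a uniform response profile
# and approaches 1 when the responses are concentrated on a few images.
def lifetime_sparseness(responses):
    r = np.abs(np.asarray(responses, dtype=float))
    n = r.size
    a = (r.mean() ** 2) / np.mean(r ** 2)       # "activity ratio", in [1/n, 1]
    return (1.0 - a) / (1.0 - 1.0 / n)

rng = np.random.default_rng(0)
dense = rng.uniform(0.5, 1.0, size=10000)        # hypothetical non-sparse unit
sparse = np.maximum(rng.laplace(size=10000), 0)  # hypothetical rectified, sparse unit
print(f"dense unit: {lifetime_sparseness(dense):.2f}, "
      f"sparse unit: {lifetime_sparseness(sparse):.2f}")
```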

Conclusions


What the talk was about


The basic hypothesis of our work: Higher-level tolerant selectivities emerge through a sequence of elementary selectivity and invariance computations.
We asked: In a visual system with three processing layers, what should be selected and tolerated at each level of the hierarchy?
Our approach: Learn the selectivity and invariance computations from natural images by fitting a statistical model.


What we found






Computations in the first two layers are in line with previous research. For both patch data and tiny images:
◆ First layer: Emergence of selectivity to Gabor-like image structure (“simple cells”).
◆ Second layer: Emergence of tolerance to the exact orientation or localization of the stimulus (“complex cells”).
Computations on the third layer:
◆ Patch data: Emergence of selectivity to longer contours and, to some extent, texture.
◆ Tiny images: Emergence of selectivity to longer contours and, to some extent, curvature.
◆ The receptive fields are mostly homogeneous, in line with experimental results. They are more inhomogeneous for tiny images than for patch data.
◆ Emergence of (orientation) inhibition to facilitate the selectivity computations.
No net increase of sparsity as we go from layer one to layer three.
