Active sensing in the categorization of visual patterns

Scott Cheng-Hsin Yang, Máté Lengyel, and Daniel M. Wolpert
Computational and Biological Learning Lab, Department of Engineering, University of Cambridge

The information that can be extracted from a visual scene depends on the sequence of eye movements chosen to scan it. Passive, precomputed scan paths are suboptimal, because the optimal eye movement depends both on the information already gathered about the actual scene and on prior knowledge about scene statistics and categories. Previous studies have identified active sensing in tasks in which the variable about which information needs to be sought is defined in the image space itself, such as the spatial location of a target in a scene during simple visual search (Najemnik & Geisler 2005) or when memorizing a shape (Renninger et al. 2007). However, it is still unknown whether eye movements are optimized to gather information about more abstract features of the visual scene, such as visual categories.

We examined eye movements in a visual categorization task. We generated fur-like images of three types: patchy, horizontal stripy, and vertical stripy (Fig. 1a). Participants had to categorize each image as patchy or stripy (disregarding whether a stripy image was horizontal or vertical). The images were generated by two-dimensional Gaussian processes, so that individual pixel values varied widely even within a type and only higher-order statistical information (i.e., spatial correlation length scales) could be used for categorization. We first presented participants with examples of full images to familiarize them with the statistics of each image type. We then used a gaze-contingent display in which the entire pattern was initially occluded by a black mask and the underlying image was revealed through a small aperture at each fixation location (Fig. 1b).
This gaze-contingent display allowed us to control the visual information available on each fixation, while giving the impression of viewing fur partially occluded by foliage.

We developed a Bayesian active sensor (BAS) algorithm for the task that uses knowledge of the statistics of the different visual patterns, together with the evidence accumulated from previous saccades, to compute the probability of each pattern category and, hence, where to look next to achieve the maximal reduction in categorization error. We constructed an ideal observer which computes a posterior distribution, P(c | D), over image category c given the data D (the collection of previous revealing locations and revealed pixel values in the trial) and knowledge of the length scales corresponding to the different image types. The aim of BAS is to choose the next fixation location, x*, so as to maximally reduce uncertainty about the category. This objective is formalized by the BAS scoring function, which expresses the expected information gain when choosing x* and can be conveniently computed as

Score(x*) = H[ P(z | x*, D) ] − 〈 H[ P(z | x*, c, D) ] 〉_{P(c | D)}

where H[·] denotes entropy (a measure of uncertainty), z is the possible pixel value at x*, and 〈·〉_{P(c | D)} denotes averaging over the two categories weighted by their posterior probabilities (the subscript). This score expresses a trade-off between two terms. The first term encourages the selection of locations, x*, where we have the most overall uncertainty about the pixel value, z, while the second term prefers locations at which the pixel value expected under each individual category is highly certain.

We first examined whether our participants used an active eye movement strategy. We show that their eye movement patterns, that is, the densities of their fixation locations, depend on the underlying image patterns (Fig. 2a, first four rows, and 2b), which could not be the case if a passive sensing strategy were used. To assess whether this active strategy contributed to performance, we examined the same participants in a passive revealing condition.
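The trade-off in the BAS score can be illustrated numerically: the score at a candidate location is the entropy of the category-averaged predictive distribution over the pixel value there, minus the posterior-weighted average of the per-category predictive entropies. In the sketch below the per-category predictives are taken to be Gaussian, and all means, variances, and posterior weights are illustrative values rather than quantities from the study; the mixture entropy is computed by numerical integration:

```python
import numpy as np

def entropy_grid(p, dz):
    """Differential entropy (nats) of a density evaluated on a grid."""
    p = np.clip(p, 1e-300, None)
    return -np.sum(p * np.log(p)) * dz

def bas_score(mu, sigma, post):
    """Expected information gain at one candidate location:
    entropy of the category-averaged predictive over the pixel value,
    minus the posterior-weighted average per-category predictive entropy.
    mu[c], sigma[c]: per-category Gaussian predictives; post[c]: weights."""
    zs = np.linspace(-6, 6, 2001)
    dz = zs[1] - zs[0]
    dens = np.array([np.exp(-0.5 * ((zs - m) / s) ** 2) / (s * np.sqrt(2 * np.pi))
                     for m, s in zip(mu, sigma)])
    mix = post @ dens                       # category-averaged predictive
    h_mix = entropy_grid(mix, dz)           # first term (numerical)
    h_cond = post @ np.array([0.5 * np.log(2 * np.pi * np.e * s ** 2)
                              for s in sigma])  # second term (closed form)
    return h_mix - h_cond

# A location where the two categories predict different pixel values is
# informative...
informative = bas_score(mu=[-1.0, 1.0], sigma=[0.5, 0.5],
                        post=np.array([0.5, 0.5]))
# ...whereas one where they predict identical distributions scores zero.
useless = bas_score(mu=[0.0, 0.0], sigma=[0.5, 0.5],
                    post=np.array([0.5, 0.5]))
```

For Gaussian predictives the second term has the closed form 0.5·log(2πeσ²), so only the mixture entropy requires numerical treatment; the score is bounded above by the entropy of the category posterior.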
In the passive condition, when revealings were drawn randomly from an isotropic Gaussian centered on the image, performance was substantially impaired; in contrast, when revealings were generated by noiseless BAS, performance improved substantially (Fig. 3a). This might suggest that participants employed an active but suboptimal strategy for selecting fixation locations; alternatively, the difference may be due to more trivial factors upstream or downstream of the fixation-selection process itself, such as noise and variability in perception or in motor execution, respectively.

To address this issue, we used the ideal observer model to quantify the amount of information accumulated about image category over successive revealings in a trial under different revealing strategies. Importantly, when we computed the information gain provided by BAS operating with participants' perceptual noise (obtained by fitting our ideal observer model to their category choices) and with typical saccadic variability reported in the literature, the discrepancy between the informativeness of BAS-generated revealings and that of participants' revealings disappeared (Fig. 3b). This suggests that the central component of choosing where to fixate was near-optimal in our participants, and that suboptimality arose from peripheral processes. Furthermore, the eye movement patterns generated by BAS for the same images shown to our participants closely matched those of the participants: they were positively correlated with participants' eye movements for the same image type, but negatively correlated with those for different image types (Fig. 2a, last row, and 2c).

In conclusion, using our novel task analyzed in the framework of a Bayesian active sensor algorithm, we showed that participants were near-optimal in selecting each individual fixation location so as to maximize information about image category, with performance limited only by low-level perceptual and motor variability.
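The ideal observer's accumulation of category information over revealings can be sketched in the same spirit. The squared-exponential covariance `se_cov`, the length-scale values, and the uniform two-category prior are all illustrative assumptions; information is measured as the reduction of category entropy from its 1-bit prior value:

```python
import numpy as np

def se_cov(pts, lx, ly, jitter=1e-6):
    """Squared-exponential covariance between 2-D points (assumed kernel)."""
    dx = pts[:, None, 0] - pts[None, :, 0]
    dy = pts[:, None, 1] - pts[None, :, 1]
    return np.exp(-0.5 * ((dx / lx) ** 2 + (dy / ly) ** 2)) + jitter * np.eye(len(pts))

def log_mvn(z, K):
    """Log density of a zero-mean multivariate normal with covariance K."""
    _, logdet = np.linalg.slogdet(K)
    return -0.5 * (z @ np.linalg.solve(K, z) + logdet + len(z) * np.log(2 * np.pi))

def category_information(pts, z, scales):
    """Bits of information about the category after each successive revealing
    of pixel values z at locations pts; scales maps each category to its
    (lx, ly) length scales. With a uniform prior over two categories,
    information = 1 bit minus the posterior entropy."""
    info = []
    for n in range(1, len(pts) + 1):
        logp = np.array([log_mvn(z[:n], se_cov(pts[:n], lx, ly))
                         for lx, ly in scales.values()])
        post = np.exp(logp - logp.max())
        post /= post.sum()
        ent = -np.sum(post * np.log2(np.clip(post, 1e-12, 1.0)))
        info.append(1.0 - ent)
    return info

# Simulate 25 revealings of a horizontally striped image at random locations.
rng = np.random.default_rng(1)
pts = rng.uniform(0, 20, size=(25, 2))
z = np.linalg.cholesky(se_cov(pts, 8.0, 1.0)) @ rng.standard_normal(25)
gains = category_information(pts, z, {"horizontal": (8.0, 1.0),
                                      "vertical":   (1.0, 8.0)})
```

Note that in this sketch a single revealed pixel has the same marginal distribution under both categories, so the first revealing carries no category information by itself; discrimination comes entirely from correlations between revealed pixels.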

[Fig. 1 graphic: (a) example stimuli (Patchy; Stripy horizontal and vertical); (b) trial sequence: fixation cross; image revealings (free-scan at each fixation location, or passive, computer-controlled); random stopping after 5–25 revealings; categorization choice (P = Patchy, S = Stripy); feedback.]

Fig. 1 Image categorization task. a. Example stimuli for each of the three image types, sampled from two-dimensional Gaussian processes. b. Experimental design. Participants started each trial by fixating the center cross. In the free-scan condition, an aperture of the underlying image was revealed at each fixation location. In the passive condition, revealing locations were chosen by the computer. In both conditions, after a random number of revealings, participants were required to make a category choice (P vs. S) and were given feedback.



[Fig. 2 graphic: (a) mean and mean-corrected revealing density maps (Patchy; Stripy horizontal and vertical) for Participants 1–3, the average participant, and BAS; (b) participant-self correlations; (c) participant-BAS correlations.]
Fig. 2 a. Revealing density maps for participants and BAS. The last three columns show mean-corrected revealing densities for each of the three underlying image types (after removing the mean density across image types, first column). Bottom: color scales used for all mean densities (left) and for all mean-corrected densities (right). The color scale for the mean density does not include the initial fixation location (green dots). All density maps use the same scale, such that a density of 1 corresponds to the peak mean density across all maps. b-c. Correlations between mean-corrected revealing maps. Orange shows within-image-type correlations, i.e., correlations between revealing densities obtained for images of the same type, while purple shows across-image-type correlations. b. Correlations for individual participants and their average (pooling data from all participants). c. Correlations between individual participants' (and the average participant's) revealing densities and those generated by BAS.

[Fig. 3 graphic: (a) fraction correct and (b) cumulative information (bits) as a function of revealing number (5–25) under the Noiseless BAS, BAS, Free-scan, and Random revealing strategies, for Participants 1–3 and the average participant.]
Fig. 3 Participants' performance in the task. a. Categorization performance as a function of revealing number for each of the three participants, and their average, under the free-scan and passive conditions corresponding to different revealing strategies. Error bars show s.e.m. across trials. b. Cumulative information gain of an ideal observer with different revealing strategies. Error bars show s.e.m. across trials.
