Image and Vision Computing 29 (2011) 64–77


A spatial variant approach for vergence control in complex scenes

Xuejie Zhang, Leng Phuan Tay ⁎

School of Computer Engineering, Nanyang Technological University, Blk N4, Nanyang Avenue, Singapore 639798

Article history: Received 13 July 2009; Received in revised form 10 May 2010; Accepted 5 August 2010

Keywords: Vergence control; Log polar transformation; Image pyramid; Disparity estimation

Abstract

The flexibility of primate vision, which utilizes active binocular vision, is unparalleled even by modern fixed-stereo-based vision systems. However, to follow the path of active binocular vision, the difficulty of attaining fixative capabilities is of primary concern. This paper presents a binocular vergence model that utilizes the retino-cortical log polar mapping of the primate vision system. Individual images of the binocular pair were converted to multi-resolution pyramids bearing a coarse-to-fine architecture (low resolution to high resolution), and disparity estimation on these pyramidal resolutions was performed using normalized cross correlation on the log polar images. The model was deployed on an actual binocular vergence system with independent pan-tilt controls for each camera, and the system was able to robustly verge on objects even in cluttered environments with real-time performance. This paper also presents experimental results of the system functioning under unbalanced contrast exposures between the two cameras. The results proved favorable for real-world robotic vision applications where noise is prevalent. The proposed vergence control model was also compared with a standard window-based Cartesian stereo matching method and showed superior performance. © 2010 Elsevier B.V. All rights reserved.

1. Introduction

Active binocular vision systems mimic primate biological vision by using motorized cameras to emulate the extra-ocular muscles attached to the eyeball. While other methods such as fixed-camera stereopsis can retrieve the needed depth information, active binocular systems can potentially be more robust, removing the delicate reliance on lens parameters to determine depth information. Active binocular systems can also provide a wider field of vision with fewer concerns over image distortions at the lens periphery. From the psychological aspect of communication, human body-language interactions are usually complemented with visual fixations that function as cognitive cues to project one's focus of attention. It is from this perspective that we place our emphasis on vergence control, the pillar of fixation. The comprehensive primate vision system includes several distinct subsystems that govern processes such as the attention mechanism, binocular fusion, saccade and smooth pursuit. Without vergence, many of these primal functions would be impaired. Vergence is the process of directing both eyes to foveate singularly on a target, providing visual information for 3D perception. The combination of vergence and other eye movements such as saccade allows the visual system to examine the scene rapidly through multiple fixation points [1]. It is thus believed that this process of multiple saccadic

observations reduces the processing load on the early visual areas and, through organized pathways, the brain is able to multiplex thought processes. It has been discovered that the retino-cortical mapping in the primate vision system can be modeled by a log polar geometry [2], and this motivated investigations to determine any apparent advantage that can be derived from this geometric formation. A vergence control model is proposed using coarse-to-fine disparity estimation in the log polar space. This computational architecture utilizes the inherent foveal magnification properties of the log polar transformation to systematically focus towards the foveal information, eventually providing a stable vergence. The proposed method can be used reliably for real-time vergence control. This paper discusses the rationale for the system implementation and further illustrates the performance of the proposed vergence control model through the experimental results obtained. The understanding of three essential components is necessary for conceptualization of the robust vergence: the significance of the log polar transformation, the log polar based normalized cross correlation, and the coarse-to-fine image pyramidal search strategy. Section 2 is a review of some existing vergence control methods. Section 3 presents the proposed log polar correlation model using an image pyramid. Section 4 presents experimental results and Section 5 concludes the paper.

2. Strategic overview of the proposed vergence control

⁎ Corresponding author. Tel.: + 65 6790 4604/6790 6965; fax: + 65 6792 6559. E-mail addresses: [email protected] (X. Zhang), [email protected] (L.P. Tay). 0262-8856/$ – see front matter © 2010 Elsevier B.V. All rights reserved. doi:10.1016/j.imavis.2010.08.005

Primate raw binocular images consist of two slightly offset perspectives which reveal the presence of a mismatched double


image when overlaid across each other. 3D perception is achieved when these images are processed in the binocular fusion system located higher in the visual cortex [3]. Such fusion of perspectives within the cortex has been actively studied by Grossberg et al. in 3D LAMINART [4]. When a binocular system verges on an object, the spatial reference position of the object can be geometrically derived from the camera baseline and motor angles, providing spatial referencing for the point of fixation with respect to the camera platform. A continuous observation of the environment, as discovered by Yarbus in his study of human saccades [1], provides a means to observe the environment over a quick sequence of short-duration micro saccades in the order of milliseconds. There are four sources of vergence in the human vision system: binocular disparity, accommodation, tonic, and proximal vergence [5]. Most existing vergence control models in the literature, such as those presented in Refs. [6–9], belong to the category of binocular disparity estimation. Disparity refers to the positional difference between corresponding points in the left and right images. Disparity can be used as a motor differential signal to verge the cameras. There are many disparity estimation models based on biological and physiological studies of the primate vision system. The disparity tuning responses of binocular simple cells and complex cells have been investigated, and the disparity energy model was proposed to simulate the responses of binocular visual neurons [7,10–18]. The spatial organization of visual information in the visual cortex was also studied and utilized for the design of disparity estimation mechanisms. In the primary visual cortex, the visual information from the left eye and the right eye is interlaced as sliced image segments, forming a structure called the ocular dominance columns.
This gave rise to the cepstrum based disparity estimation model proposed by Yeshurun and Schwartz [8,10,19]. Apart from the biologically inspired methods, a direct way of establishing correspondence is to use explicit matching in the binocular images. One such method uses correlation to search for matching regions in the binocular pair of images [20–26]. Despite the performance of correlation measures, a major issue of search-based disparity estimation methods is the working range limit. When the disparity in the image exceeds the working range, the algorithms return erroneous disparity results. This occurs especially when the fixation point is in the process of switching from a near focus to a significantly distant position. An approach to solve this is to use a multi-resolution image pyramid and process from coarser to finer levels [6–8,26]. The multi-resolution image pyramid increases the working range but is still limited by the pyramid levels and the localized range at each level. Although exhaustive search can be employed to resolve the range issue, it is not efficient and would likely introduce noise to the estimation due to the increased likelihood of uncorrelated visual information present in perspective views. Inspired by the visual mapping geometry in the visual cortex, we propose to enlarge the working range through the use of spatially variant transforms of the original image, such as the log polar transformation [2,27–30]. In this transformation, a large portion of the primary visual cortex is mapped to the small, central portion of the visual field. This is referred to as cortical magnification. This magnification provides a higher resolution at the fovea of the retina and is considered important for many vision components such as visual attention and vergence control.
Compared with Cartesian images, log polar images magnify the image center and diminish the periphery of the original image, creating a natural bias for information residing within the fovea. This creates a natural possibility of estimating the disparity in cluttered environments, as there is a foveal focusing priority over the periphery. The spatially variant transformation is expected to show advantages over Cartesian images for the vergence control task, since the latter attempts to resolve the entire image with an equal


weight. The log polar transformation has been used widely and successfully for disparity estimation and vergence control [20,21,23,24]. In this paper we combine the pyramidal approach and the log polar transformation to build a robust and efficient model for vergence control that can survive cluttered environments. The left and right images are initially converted into image pyramids. The search for the corresponding position is carried out through the pyramid architecture, beginning from the coarser levels and working towards the finer levels. At each level, the reference image (the left camera image) is transformed to a log polar image with the origin as the reference position. The backward log polar transformation maps each log polar pixel to the nearest pixel in Cartesian space. At each candidate position in the candidate image (the right image), a log polar transformation is performed with this position as origin. Normalized cross correlation between the left reference log polar image and the right candidate log polar images is used to derive the disparity. The disparity at the image center for each pyramid level is used iteratively to adjust the vergence angle until the vergence of the binocular vision system is achieved.

3. The pyramidal log polar correlation model

Given that binocular vergence is the primary building block for primate visual perception, the precise objective of this paper is to provide a robust and reliable vergence control system. Vergence control in this paper refers to the manipulation of two cameras to fixate on a single point of interest within a scene. Since our objective is to address the low level vision faculty, the point of fixation can be any dynamically defined point that is visible in both camera images. The system consists of a coarse-to-fine model with varying image resolutions, the images from the different resolution layers resembling stacked pyramidal layers.
On each layer, a normalized cross correlation of the log polar transformed images is used to determine disparity. The log polar image inherently empowers the cross correlation disparity estimator with a global abstraction for region matching. At the same time, the log polar transformation magnifies the center of the image, creating an emphasis on the central region of fixation for the cross correlation computations. This combined disparity information from both the center and extreme edges of the binocular image pair is used for vergence control.

As illustrated in Fig. 1, the system consists of two CCD cameras mounted on two independent pan-tilt control units. Each camera captures 1000 × 1000 resolution images with a 46° angle of view. In the proposed vergence control model, the 1000 × 1000 images were converted to 200 × 200 images and used for the disparity estimation. The baseline distance between the two cameras was set at 24 cm. Each pan-tilt control has two stepper motors for panning and tilting, which provide rigid and repeatable positioning. The motors are capable of rotating at a speed of 300°/s with a resolution of 3.086 arc min (0.0514°).

3.1. Significance of the log polar transformation

Referencing the visual plane in the typical X–Y plane and setting the coordinates (x0, y0) at the middle of the image as the origin, the log polar transformation can be represented by Eq. (1):

$$\begin{cases} \rho = \log_a \sqrt{(x-x_0)^2 + (y-y_0)^2} \\[1mm] \varphi = \tan^{-1}\!\left(\dfrac{y-y_0}{x-x_0}\right) \end{cases} \tag{1}$$

where ρ is the logarithm of the Euclidean distance between the Cartesian coordinates of a pixel (x, y) and the origin (x0, y0) and φ is the polar angle of the point (x, y).
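Eq. (1) can be sketched directly in code. This is a minimal illustration rather than the paper's implementation; the function name and the guard at the singular origin are our own choices:

```python
import math

def log_polar_coords(x, y, x0, y0, a):
    """Map a Cartesian pixel (x, y) to log polar coordinates (rho, phi)
    about the origin (x0, y0), following Eq. (1)."""
    r = math.hypot(x - x0, y - y0)
    rho = math.log(r, a) if r > 0 else 0.0  # guard the singularity at the origin
    phi = math.atan2(y - y0, x - x0) % (2 * math.pi)  # polar angle in [0, 2*pi)
    return rho, phi

# A point at distance a**10 along the positive x-axis maps to rho = 10, phi = 0.
rho, phi = log_polar_coords(25 + 1.07 ** 10, 25, 25, 25, 1.07)
```

Note that `atan2` handles the quadrant disambiguation that a bare arctangent of (y−y0)/(x−x0) would miss.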


Fig. 1. The binocular cameras and the pan-tilt controls.

Given the size of the original image as (W, H) and the size of the log polar image as (ρmax, φmax), the logarithmic base a can be determined from the original image size and the transformed log polar image size [31] according to Eq. (2):

$$a = \exp\!\left(\frac{\ln\left(\min\left(H/2,\; W/2\right)\right)}{\rho_{\max}}\right) = \exp\!\left(\frac{\ln(r_{\max})}{\rho_{\max}}\right) \tag{2}$$

where rmax is the maximum radius (distance from the origin) of the log polar pixels in the spatial domain. A more general treatment of this transformation can be found in Ref. [2]. An ideal log polar image would be one obtained from a log polar image sensor. However, such sensors are not always available and require a special manufacturing process at potentially much higher cost than normal Cartesian CCD sensors. Therefore most existing studies on vergence control have relied on transforming normal Cartesian images to log polar images [20,23]. The transformation or mapping of corresponding pixels between Cartesian and log polar space is non-linear. There has been research on how to optimally

transform a Cartesian image to a log polar image [32,33]. In this paper, for the purpose of simplicity, a log polar sampling method was adopted. We will show in Section 3.2.1 that the sampling method shows almost equivalent performance to a log polar bilinear interpolation method in disparity estimation. This sampling method ensures sufficient image quality while providing adequate performance with significantly reduced computational cost. The log polar sampling process requires a backward log polar transformation which can be pre-computed as a mapping table. The value of a pixel in the log polar image is set to the value of the nearest corresponding pixel in the Cartesian image. The backward log polar mapping was modeled using Eq. (3), which provides an inverse transform of Eq. (1):

$$x = \begin{cases} +\sqrt{\dfrac{a^{2\rho}}{1+\tan^2(\varphi)}} & \text{if } \varphi \in \left[0,\, \dfrac{\pi}{2}\right] \cup \left[\dfrac{3\pi}{2},\, 2\pi\right] \\[3mm] -\sqrt{\dfrac{a^{2\rho}}{1+\tan^2(\varphi)}} & \text{otherwise} \end{cases} \qquad y = x \tan(\varphi) \tag{3}$$

Fig. 2. Log polar transformation on the Baboon image [34]: (a) original image of size 512 × 512 and (b) log polar image of size 200 × 360.
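The backward mapping with nearest-neighbor sampling can be sketched as follows. This is a simplified illustration (integer ρ steps, images as nested lists, our own function names), not the authors' code:

```python
import math

def build_backward_map(rho_max, phi_max, a, x0, y0):
    """Pre-compute, for every log polar cell (rho, phi), the nearest
    Cartesian pixel it samples -- the inverse mapping of Eq. (1)."""
    table = []
    for rho in range(rho_max):
        row = []
        for p in range(phi_max):
            phi = 2 * math.pi * p / phi_max
            r = a ** rho  # radial distance in the Cartesian image
            row.append((round(x0 + r * math.cos(phi)),
                        round(y0 + r * math.sin(phi))))
        table.append(row)
    return table

def log_polar_sample(img, table):
    """Sample a Cartesian image (list of rows) through the mapping table;
    out-of-bounds cells are set to 0, as in the paper."""
    h, w = len(img), len(img[0])
    return [[img[y][x] if 0 <= x < w and 0 <= y < h else 0
             for (x, y) in row] for row in table]
```

Because the table depends only on the image geometry, it is computed once and reused for every frame, which is what makes the sampling approach cheap at run time.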


Fig. 2 illustrates the effects of the backward log polar transformation, which is similar to a spatially variant circular sampling of the input image. Using an image-centered origin and starting from the positive x-axis, the backward log polar transformation samples points in an anticlockwise sweep. The resultant log polar image magnifies the center, reducing the contributions from the image periphery. In such a transformation, all points in the original image are in some way represented in the transformed image, though in varying degrees of contribution. In Fig. 2, the red nose of the baboon, appearing in the middle of the original image, occupies about three quarters of the transformed log polar space, while the eye, which is further from the original image center, has a reduced representation and appears rather beady in the top right quadrant of the transformed image. Such properties of the log polar transformation provide ideal conditions for object registration in paired perspective images. Objects that lie in the periphery are subjected to a many-to-one correspondence of the visual-cortical mapping, and this compensates for the typically adverse disparity experienced at the periphery by a factor proportional to the distance from the point of fixation. In essence, it reduces the contribution of the peripheral regions without entirely omitting their presence. For pixels at the fovea, a one-to-many correspondence of the visual-cortical space results in a cortical magnification, thus providing a heavier emphasis on the fixation point of the image. Since the log polar transformation magnifies any disparity at the fovea, it provides a means for high resolution matching at the fovea. This paper leverages these two properties to provide an efficient and effective means of binocular vergence through foveal-centric matching.

3.2. Normalized cross correlation in the log polar domain

Normalized cross correlation (NCC) has often been used in disparity estimation [35,36]. The NCC, expressed in Eq. (4), defines the cross correlation between two normalized image patches f(x,y) and g(x,y). Its typical application in the spatial domain has been successful through the use of a sliding window matching process, which provides effective stability in area-based stereo matching. The NCC method for disparity estimation was adopted and applied in the log polar domain, utilizing the cortical magnification property of the globally inclusive transformation to provide better vergence [20,23].

$$NCC = \frac{1}{n-1} \sum_{x,y} \frac{\left(f(x,y)-\bar{f}\right)\left(g(x,y)-\bar{g}\right)}{\sigma_f\,\sigma_g} \tag{4}$$
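Eq. (4) amounts to the following computation over flattened patches. This is a sketch with our own function name; the patches are assumed to be equal-sized and non-constant:

```python
import math

def ncc(f, g):
    """Normalized cross correlation of two equal-length pixel lists, per Eq. (4)."""
    n = len(f)
    f_mean, g_mean = sum(f) / n, sum(g) / n
    sigma_f = math.sqrt(sum((v - f_mean) ** 2 for v in f) / (n - 1))
    sigma_g = math.sqrt(sum((v - g_mean) ** 2 for v in g) / (n - 1))
    cov = sum((u - f_mean) * (v - g_mean) for u, v in zip(f, g))
    return cov / ((n - 1) * sigma_f * sigma_g)
```

The normalization makes the measure invariant to affine intensity changes, which is why it tolerates the unbalanced contrast exposures examined later in the paper.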

Unlike the Cartesian domain, where matches can be derived through linear shifts of the image, NCC in the log polar domain cannot be performed by linear shifts and matching. The correlation based method becomes more complex in the log polar domain because the matching window is non-linearly transformed. However, regardless of the shift, the unique matching position will preserve the highest correlation [23]. It is thus expected that the transformed image of the sliding window will possess the highest correlation in the log polar domain only when the two perspectives coincide under the log polar transformation. To illustrate the effectiveness of using log polar images for correlation based disparity estimation, an experiment was conducted to plot the NCC against the ground truth disparity. Fig. 3(a) shows a sample pair of binocular images used in the experiment. The foveas of the left and right images reside on the same point in space, targeting the bottle. To derive a disparity tuning curve using the log polar transformation and normalized cross correlation, the left image was initially transformed into a log polar representation with the image center as the origin. The right camera was panned at 0.5° intervals to the left and right of the original position and at


each position, the image was transformed to the corresponding log polar image and the NCC between the left and right log polar images was calculated. The resolution of the original image pair was 200 × 200. The resolution of the log polar images was set at 25 × 45, 50 × 90, 100 × 180 and 200 × 360 respectively, to evaluate the effect of log polar resolution on the NCC measure. The NCC values were plotted against the true disparity, shown in Fig. 3(b). The curves show high consistency, indicating that the estimation is stable and that a low resolution log polar image can be used for effective vergence control. The center position at zero disparity exhibits the highest NCC, indicating that both image perspectives possess the maximum log polar correlation at this position of the 0.5°-interval sliding window. This illustrates the feasibility and efficacy of the log polar based NCC computations. Subsequently, the resolution of the log polar images was fixed at 50 × 90 and the original binocular image pair was scaled at three ratios (1, 0.5, and 0.25). The disparity tuning curves were produced using different resolutions of the Cartesian images and are shown in Fig. 3(c). The results show that a reduction of Cartesian image resolution has little effect on the disparity tuning curves. This opens the possibility of designing a multi-level pyramidal algorithm for disparity estimation, which provides the basis for the pyramidal log polar correlation model presented in this paper. As the log polar image is a cortical magnification of the original image, the correlation in log polar space can be viewed as a weighted correlation in Cartesian space. Each Cartesian pixel contributes to the correlation measure with an associated weight factor. The weight for each pixel in the correlation measure is determined by the cortical magnification factor produced by the log polar transformation.
This cortical magnification factor is defined as the number of pixels in log polar space that correspond to one pixel at a given location in Cartesian space. The factor can be decomposed into two components: the magnification along the ρ-axis and along the φ-axis. In a continuous transformation, one pixel in the Cartesian domain at radial distance r occupies a step from r−0.5 to r+0.5 along the direction of the ρ-axis. This interval corresponds to the following distance on the ρ-axis:

$$R_\rho = \log_a(r+0.5) - \log_a(r-0.5) \approx \frac{d\left(\log_a(r)\right)}{dr} = \frac{1}{r\,\ln(a)} \tag{5}$$

Along the φ-axis, one ring of pixels at radius r corresponds to φmax pixels in log polar space. Therefore the magnification ratio on the φ-axis is:

$$R_\varphi = \frac{\varphi_{\max}}{2\pi r} \tag{6}$$

The overall magnification is the product of these two components:

$$R = R_\rho\, R_\varphi = \frac{\varphi_{\max}}{2\pi \ln(a)}\cdot\frac{1}{r^2} = \frac{\varphi_{\max}}{2\pi \ln(a)}\cdot\frac{1}{(x-x_0)^2 + (y-y_0)^2} \tag{7}$$
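Eq. (7) can be evaluated per pixel; a small sketch (our naming) that makes the inverse-square falloff explicit:

```python
import math

def magnification(x, y, x0, y0, a, phi_max):
    """Cortical magnification R = R_rho * R_phi at Cartesian pixel (x, y), per Eq. (7)."""
    r_sq = (x - x0) ** 2 + (y - y0) ** 2  # squared distance from the fixation origin
    return phi_max / (2 * math.pi * math.log(a) * r_sq)

# Doubling the distance from the origin divides the weight by four.
ratio = magnification(2, 0, 0, 0, 1.07, 90) / magnification(4, 0, 0, 0, 1.07, 90)
```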

The magnification factor is therefore inversely proportional to the squared distance from the origin of the log polar transformation. Assuming the system is fixating on a foreground object, the correlation measure can be considered as the sum of the correlations from the foreground region and the background region. The weight for the foreground region is significantly larger than that for the background, as the foreground is at the fovea of the input image in vergence control tasks. When the foreground regions of the binocular image pair match and produce a high correlation, the difference in the background contributes little to the correlation measure because of the rapid, inverse-square decrease of the magnification. However, when the foreground regions do not fuse, both the


Fig. 3. (a) The binocular image pair, (b) disparity tuning curves with log polar images of different resolutions, (c) disparity tuning curves with log polar images of fixed resolution (50 × 90) transformed from different resolutions of Cartesian images.

foreground correlation and background correlation values are small. Thus the disparity tuning curve has a distinct peak only when the correct disparity is achieved. This significantly stabilizes the vergence on the foreground object, especially when the foreground is small and has large disparity differences from the background. In Cartesian space, due to the even contribution from the whole image, vergence is directed to an intermediate depth, depending on the sizes of the foreground and background regions in the matching window.

3.2.1. Comparison with a log polar bilinear interpolation approach

In the proposed model, a log polar sampling method was used. Here we show that the sampling method provides almost equivalent performance to a bilinear interpolation method. Utilizing the binocular image series in Fig. 3(a), the transformed log polar images and disparity tuning curves are shown in Fig. 4. Although the image quality was slightly degraded in the sampling approach, the disparity tuning curves were generally identical to those produced by bilinear interpolation. In a real time robot application, the backward

Fig. 4. The disparity tuning curves with log polar images by (a) nearest neighbor sampling and (b) bilinear interpolation.


log polar sampling reduces computation because it is merely a mapping process with a pre-computed map. Thus the backward log polar sampling approach was adopted.

3.2.2. Comparison with a Gaussian weighted approach

Similar to the log polar foveal magnification, a spatially variant weight such as a Gaussian weight can be applied to the correlation measure in Cartesian space to focus on the center of the image. Fig. 5 shows a comparison of the log polar magnification and the Gaussian magnification. The log polar transformation significantly magnifies the center of the image. The Gaussian weighted approach also diminishes the background, but in a smoother manner, and the magnification of the center is not as sharp as in the log polar approach. To compare the backward log polar sampling approach with a Gaussian magnification approach, another experiment was conducted using the experimental images in Fig. 6. The two cameras were initially verged on a foreground object at 1.8 m depth in front of a cluttered background at about 5 m depth. The right camera was subsequently panned to the left and right in the range of [−10°, 10°] at a step of 1°. Twenty-one pairs of binocular images were captured and used for generating the disparity tuning curve. The Gaussian weighted approach is realized by weighting the original binocular image pair with a Gaussian magnification function and then computing the NCC between the weighted binocular pair, as shown in Eq. (8):

$$GNCC = \frac{1}{n-1} \sum_{x,y} \frac{\left(f'(x,y)-\bar{f}'\right)\left(g'(x,y)-\bar{g}'\right)}{\sigma_{f'}\,\sigma_{g'}} \tag{8}$$

where f′ = f·G and g′ = g·G are the Gaussian weighted binocular images and G(x, y) is a Gaussian function centered at (x0, y0) with variance σ².
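Eq. (8) corresponds to weighting both images by the same Gaussian before the normalized cross correlation of Eq. (4); a sketch under our own naming:

```python
import math

def gaussian_weight(w, h, x0, y0, sigma):
    """Gaussian magnification G(x, y) centred at (x0, y0) with std. dev. sigma."""
    return [[math.exp(-((x - x0) ** 2 + (y - y0) ** 2) / (2 * sigma ** 2))
             for x in range(w)] for y in range(h)]

def gncc(f, g, G):
    """Eq. (8): NCC between the Gaussian-weighted images f' = f.G and g' = g.G."""
    fp = [f[y][x] * G[y][x] for y in range(len(f)) for x in range(len(f[0]))]
    gp = [g[y][x] * G[y][x] for y in range(len(g)) for x in range(len(g[0]))]
    n = len(fp)
    fm, gm = sum(fp) / n, sum(gp) / n
    sf = math.sqrt(sum((v - fm) ** 2 for v in fp) / (n - 1))
    sg = math.sqrt(sum((v - gm) ** 2 for v in gp) / (n - 1))
    return sum((u - fm) * (v - gm) for u, v in zip(fp, gp)) / ((n - 1) * sf * sg)
```

The choice of σ plays the role of the matching window size in standard window based methods, which is exactly the sensitivity the experiment below demonstrates.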

Fig. 7 shows the disparity tuning curves produced by the Gaussian magnification approach and the log polar approach. It can be seen that with a smaller or larger Gaussian width (σ), the disparity tuning curves were incorrect, lacking a peak at position 0. With σ = 40, the disparity tuning curve has a peak at the correct position. When σ was small (σ = 5), the disparity tuning curve exhibited many confusing peaks. As σ tended towards 100, the peak of the tuning curve shifted to the background depth, indicated by the dashed vertical line. It is


natural that a larger σ will attempt to match the whole image instead of the fovea region, while a small σ tends to follow local variations. As expected, the Gaussian weighted approach resembles the standard window based methods, with the standard deviation σ acting as the matching window size. The Gaussian magnification is therefore a smoothly varying weight, while the log polar magnification decreases much faster. In Fig. 7, the disparity tuning curves for different radial resolutions of log polar images are quite consistent, indicating the robust and stable performance of the log polar approach for vergence control tasks in cluttered scenes.

3.2.3. Epipolar geometry

The searching range in a stereo matching method can be limited to a straight line by the epipolar geometry, given both the intrinsic and extrinsic parameters of the binocular system. In this system, the slope of the epipolar line of the center of the left image can be estimated from the camera configuration (baseline 0.24 m, focal length 8 mm, 6.45 μm square pixels). When the two cameras are verging at a distance of 1 m directly in front of the platform, the epipolar line of the center of the left camera has a slope of 0.05, which is fairly small. Thus the vertical correspondence does not deviate much from the horizontal axis, especially in the fovea region. This slope decreases as the verging distance or target depth increases. When the two cameras are parallel, the epipolar line is just a horizontal scan line and a horizontal direction search is sufficient. Considering the task of vergence control, we adopted a horizontal-then-vertical search approach instead of searching along the epipolar line. This is possible for the following reasons. Firstly, in a fixed-stereo system, the epipolar geometry can limit the searching range to a single line, reducing the problem from 2D to 1D and thus reducing computational cost.
However, in an active vision system, the epipolar geometry constantly changes, and each time the vergence angle is adjusted, the fundamental matrix for the epipolar geometry should be recalculated. In a real-time system, the vergence is adjusted at frame rate and this would increase the computational load. A better approach is to constrain the search to an epipolar stripe region considering all the possible epipolar lines due to the verging operation [37]. Secondly, the binocular system may undergo vertical misalignment in the process of vergence control. In this case, the geometry is more complex and the estimation of the epipolar line is more difficult. Thirdly, the fundamental matrix relies on the intrinsic parameters

Fig. 5. The magnification of log polar and Gaussian weight function. Left column: the magnification (ρmax = 50, φmax = 90) along φ-axis, along ρ-axis, and the total log polar magnification. Right column: the Gaussian magnification (σ = 10, 40, and 100).


Fig. 6. (a) The left view and (b) the right view. The right camera was panned to the left and right in the range of [−10°, 10°] at a step of 1°. The angle of view of the cameras is about 46°.

of the cameras. This is not flexible when the lenses or cameras are changed. In the horizontal-then-vertical approach, the correspondence search is first applied on the horizontal direction. The vertical searching is started at the column where the horizontal searching gives the highest correlation. This is an approximation of the 2D searching and at the same time saves computation. A small vertical searching range is enough to cover the possible epipolar line constraints. Through iterative vergence control, the disparity at the

center of the image can be minimized. We ensure that our vertical searching range covers the epipolar correspondence range.

3.3. Coarse-to-fine disparity estimation using image pyramid

The coarse-to-fine (multi-resolution) search strategy is the third component in this paper that is essential for robust vergence. This strategy utilizes multi-level processing to enlarge the working range of the disparity estimation at a reduced computational complexity

Fig. 7. Disparity tuning curves by the Gaussian magnification approach (upper row) and the log polar approach (lower row). The pan angle of the right camera corresponding to the foreground (depth = 1.8 m) is shown as a vertical solid line and the pan angle corresponding to the background (depth = 5 m) is shown as a vertical dashed line.


[7,38]. The coarse resolution results are used as an initial estimate of the foveal match, funneling the search space into a smaller region and directing the finer level search in the subsequent cascading process to a more specific region. In the proposed model, image pyramids are used on a binocular pair of perspective images to estimate disparity in a coarse-to-fine progressive manner. The original binocular image pair is sub-sampled to form a multi-level image pyramid through Eq. (9). The images are sub-sampled in a 2:1 ratio between consecutive levels of the pyramid. The lowest level (highest resolution) is level 0, which is the original image of size H × W, and the resolution at level n of the pyramid is H/2^n × W/2^n.

Lₙ(x, y) = L₀(x·2ⁿ, y·2ⁿ), where x ∈ [0, H/2ⁿ − 1] and y ∈ [0, W/2ⁿ − 1]   (9)
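Eq. (9) amounts to keeping every 2ⁿ-th pixel of the original image. A minimal NumPy sketch:

```python
import numpy as np

def build_pyramid(img, levels=3):
    """Build a coarse-to-fine image pyramid by direct sub-sampling,
    L_n(x, y) = L_0(x * 2**n, y * 2**n), as in Eq. (9)."""
    return [img[::2 ** n, ::2 ** n] for n in range(levels)]
```

For the 200 × 200 input used in the experiments this yields levels of 200 × 200, 100 × 100 and 50 × 50. (Low-pass filtering before decimation would reduce aliasing; Eq. (9) as stated is a pure decimation.)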

A pyramid of 3 levels, level 0 to 2, as shown in Fig. 8 was used. The resolution at level 0 was 200 × 200; the resolutions of levels 1 and 2 were 100 × 100 and 50 × 50 respectively. The disparity estimation process proceeds from the coarsest resolution to the finest. As each level is processed, the best NCC position of that level sets the point of fixation for processing at the next level. The log base a of the log polar transform is determined by the top level (coarsest) image size (H_T, W_T) and the transformed log polar image size (ρ_max, φ_max) using Eq. (10). The transformed log polar image covers the maximum inner circle of the top level image, as shown in Fig. 8. In the experiments, log base a was calculated from the top level image of size 50 × 50, which was converted to a 50 × 90 log polar image using the backward log polar transformation. The value of a is maintained for subsequent resolutions, implying that the area of pixel coverage is constant; since each subsequent level increases the pixel resolution, the search for the point of fixation focuses only on the best match position obtained from the previous resolution. This funneling process gives rise to a pyramidal growth of resolutions as the focus proceeds toward the point of fixation. As shown in Fig. 8, due to the fixed log base a, the log polar transformation covers a region that becomes smaller toward the bottom of the Cartesian domain image pyramid.

a = exp( ln(min(H_T/2, W_T/2)) / ρ_max )   (10)
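The log base of Eq. (10) and a backward log polar transform can be sketched as follows. The nearest-neighbour sampling and function names are our assumptions; samples falling outside the image are set to 0, matching the boundary handling described in the text.

```python
import math
import numpy as np

def log_base(h_top, w_top, rho_max):
    """Log base a from Eq. (10): a = exp(ln(min(H_T/2, W_T/2)) / rho_max)."""
    return math.exp(math.log(min(h_top / 2, w_top / 2)) / rho_max)

def log_polar_backward(img, cx, cy, a, rho_max=50, phi_max=90):
    """Backward log polar transform: output pixel (rho, phi) samples the
    Cartesian point at radius a**rho and angle 2*pi*phi/phi_max around
    the origin (cx, cy); samples outside the image are set to 0."""
    h, w = img.shape
    out = np.zeros((rho_max, phi_max), dtype=img.dtype)
    for rho in range(rho_max):
        r = a ** rho
        for phi in range(phi_max):
            t = 2.0 * math.pi * phi / phi_max
            x = int(round(cx + r * math.cos(t)))
            y = int(round(cy + r * math.sin(t)))
            if 0 <= x < w and 0 <= y < h:
                out[rho, phi] = img[y, x]
    return out
```

For the 50 × 50 top level and a 50 × 90 log polar image, a = 25^(1/50) ≈ 1.066, so the outermost ring (ρ = 49) reaches a radius of about 23.4 pixels, just inside the inner circle.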


Taking the left image as the reference image (master eye) and the right image as the candidate image, the disparity estimation process is based on shifting the origin of the log polar transformation in the candidate image. This is biologically analogous to the excitatory and antagonistic competitions that occur between neighbouring receptive fields. Instead of a parallel implementation of such receptive field processes, a sequential implementation resorts to the sliding window method as proposed. For a pixel (x_L, y_L) in the left image, the objective of disparity estimation is to find the corresponding pixel (x_L + d_H, y_L + d_V) in the right image, where the pair (d_H, d_V) is the horizontal and vertical disparity. The left image is transformed to a log polar image with (x_L, y_L) as the origin. For the right image, the origin of the log polar transformation is shifted horizontally with respect to the reference position. A series of log polar images with shifted origins is thus obtained, as illustrated in Fig. 9(b). The right log polar image with the maximum normalized cross correlation against the left log polar image indicates the best match. This centers the two images at the current resolution, and processing continues at the next resolution level until the finest resolution is reached. When the desired point of fixation is near a corner or boundary of the image, the corresponding positions of some log polar pixels may fall outside the original image; the values of these pixels are set to 0. In the process, the permissible search space at each subsequent lower level is restricted to a narrower range than at the initial level. Let n be the number of levels in the pyramid, enumerated from 0 to n−1, i.e. from finest to coarsest resolution. Disparity estimation starts at the coarsest level n−1. An initial search radius of 5 pixels was used for level n−1 and a radius of 2 pixels for the subsequent levels. According to Eq. (11), the 3-level pyramid model can cover a maximum disparity of ±26 pixels. The computation time is only (5 + 2 + 2)/26 ≈ 35% of a direct search covering the same range of disparity.

Disparity range D = ∑_{l=0}^{n−1} d_l·2^l

Computational time T = ∑_{l=0}^{n−1} d_l   (11)

where d_l is the search radius for level l.
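The working range and cost of Eq. (11) can be checked with a few lines (a sketch; `radii[l]` holds the search radius d_l at level l, level 0 being the finest):

```python
def coverage_and_cost(radii):
    """Eq. (11): disparity coverage D = sum(d_l * 2**l) and search
    cost T = sum(d_l) for a coarse-to-fine pyramid search."""
    D = sum(d * 2 ** l for l, d in enumerate(radii))
    T = sum(radii)
    return D, T
```

With the radii used here (2 and 2 at the fine levels, 5 at the coarse level), coverage is ±26 pixels for a cost of 9 correlation offsets, i.e. 9/26 ≈ 35% of a direct search over the same range.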

Vertical disparity estimation at each level follows the completion of the horizontal disparity estimation. The position with the maximum log polar normalized cross correlation determines the vertical disparity. The searching range in our implementation

Fig. 8. The log polar transform on the image pyramid.


Fig. 9. Disparity estimation using log polar images: black pixels denote the reference position in the left image and the candidate positions in the right image. (a) Transformation of the left image with the reference position as origin and (b) transformation of the right image with the candidate positions as origins.

is ±2 pixels for vertical disparity because the binocular system has a horizontal baseline, which yields smaller disparities in the vertical direction.

3.4. Vergence control

The disparity computations derived at the different levels of the pyramid can be used directly for vergence control in both the horizontal and vertical directions. The control criterion is the weighted sum of the disparities of all levels, shown in Eq. (12). A multiplicative factor of 2 per level compensates for the 2:1 sub-sampling of the original image between consecutive levels.

C = ∑_{l=0}^{n−1} 2^l D_l   (12)

where D_l is the estimated disparity at level l.

The left camera was assigned as the master camera and the right camera as the slave camera. To omit the complexities of the cognitive attention selection process, manual target selection was used. The master camera was manually directed towards a fixation point and the slave camera performed an equal angular movement in the same direction. The vergence control process subsequently shifts the slave camera towards the fixation point of the master camera. Controlling the slave eye is an iterative process through the levels until (C_H, C_V) < (10, 10) (both less than 10). As shown in Fig. 10, the input signal to the controller is an angle θ corresponding to 2ⁿDₙ pixels at level n. The disparity 2ⁿDₙ was converted to θ assuming a pinhole camera model. This angular signal was sent to a proportional controller driving the motors. The controller gain was set to 0.8 to achieve fast response while avoiding overshoot. The threshold of 10 is equivalent to an average disparity of ±1 at each level of the pyramid. Once the threshold is reached, the process moves to the next level. The control stops if either (C_H, C_V) < (10, 10) or oscillation occurs. Oscillation is detected when opposite disparities appear in consecutive vergence adjustments; three such consecutive occurrences trigger termination. The final state of the vergence control is therefore either verged or oscillating. The detailed control process is shown in Fig. 10.

4. Experimental results

One of the greatest difficulties in assessing the success of vergence control is the system of measurement. In order to appreciate the effectiveness of this work, a series of qualitative and quantitative experiments was devised and is described below. The first test is to qualitatively judge the accuracy of the fixation point placements attained by the algorithm, which physically locates the points of fixation on each pair of corresponding binocular images. A second experiment provides a quantitative depth of a fixated object from the cyclopean eye of the binocular pair, and this value is compared with the ground truth. The cyclopean depth can be calculated mathematically from the geometry of the binocular setup given a fixed baseline width and the angular rotations of the individual cameras at fixation. In the experiments conducted, the horizontal searching range in the three-level pyramid was ±5, ±2, ±2 for levels 2, 1 and 0 respectively. The vertical searching range was ±2 for all three levels. The log polar resolution used was 50 × 90. The proposed pyramidal log polar correlation method was compared with the Pyramidal Cartesian Correlation (PCC) model [39]. The PCC model performs matching between corresponding Cartesian points using a sliding window based normalized cross correlation. Apart from the different transform spaces, all other experimental parameters were identical to

Fig. 10. Iterative vergence control.
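The iterative control step of Fig. 10 can be sketched as follows. This is a sketch under stated assumptions: `focal_px` is the focal length in pixels of an assumed pinhole model, the per-level disparities D_l come from the disparity estimator, and we aggregate them through the Eq. (12) criterion rather than reproducing the exact per-level scheme of the paper.

```python
import math

def control_criterion(disparities):
    """Eq. (12): C = sum(2**l * D_l) over the pyramid levels (l = 0 finest)."""
    return sum((2 ** l) * d for l, d in enumerate(disparities))

def vergence_command(disparities, focal_px, gain=0.8, threshold=10):
    """One proportional-control iteration: convert the weighted pixel
    disparity into a pan-angle command (pinhole model) and report
    whether the verged threshold is met."""
    c = control_criterion(disparities)
    theta = math.atan(c / focal_px)  # pixel offset -> angle
    return gain * theta, abs(c) < threshold
```

In the full loop, oscillation detection (opposite-signed commands in three consecutive adjustments) terminates the process, as described in the text.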


Fig. 11. Vergence control results: (a–f) captured frames of binocular image pairs after vergence control.

the log polar method. In the PCC method, we used a 3-level image pyramid (resolutions 200 × 200, 100 × 100 and 50 × 50) and a matching window of size 21 × 21. The searching radius was likewise ±5, ±2, ±2 for levels 2, 1 and 0 respectively, and the vertical searching range was ±2 for all three levels. The window size is fixed for all three levels, so the equivalent matching window is 84 × 84, 42 × 42 and 21 × 21 at the original resolution.

4.1. Vergence control

Vergence control was tested in a lab environment with a wide range of disparities in the scene. Three objects were placed in front of the cameras at a distance of about 1 m, and the background environment varied from 2 to 7 m. The two cameras were initially parallel. The master (left) camera was directed manually to a position and the slave camera automatically verged to fixate on the same position. A series of binocular image pairs captured after vergence control is presented in Fig. 11. The system successfully verged on the foreground objects, the background objects and the ceiling. It achieved a stable final state for all runs of vergence control with no oscillation. The system showed robust performance when depth changed significantly in the local area. This is illustrated in Fig. 11(c), where the cameras were fixating on the bottle and the

background of the two views varied drastically. The plants and other objects seen through the door had significant disparity changes, and yet the system was still able to verge on the bottle. This is possible through the cortical magnification of the log polar transformation. The same experiment was repeated in normal Cartesian space with the PCC model, but the PCC model became unstable, swinging into large oscillations that often led to failure in vergence. This is the major advantage of the log polar transformation over normal Cartesian domain based methods. In order to simulate resilience to noisy drifts in camera parameters, the right camera was deliberately adjusted to different levels of brightness. The system was robust enough to accommodate these changes and showed promising performance. This is very important for real world robotic applications, where the hardware may suffer disturbances from the environment and should be able to adapt to different conditions. The results are shown in Fig. 12.

4.2. Depth estimation

For lack of direct methods of evaluating vergence performance, the indirect method of depth estimation was harnessed. The basic concept of this experiment hinges on the duality of depth estimation and


Fig. 12. Vergence control results when the two cameras have different brightness: (a–f) captured frames of binocular image pairs after vergence control.

vergence. This stems from the concept that if the depth of the foveated object can be estimated correctly, the geometry derived for that estimate confirms that both cameras are fixating on the same point. The log polar model was again compared with the Cartesian space PCC model.

The depth can be estimated from the geometry of the binocular system, shown in Fig. 13. The depth and height of the target can be estimated given the motor parameters. The system can be divided into two planes. The horizontal plane associated with the motor panning is

Fig. 13. The geometry of the system: (a) two cameras verging on a target, (b) top view for depth estimation (α and β are the pan angles) and (c) side view for height estimation (θ is the tilt angle).


Fig. 14. The experimental environment: fixation positions were marked with crosshairs.

relevant to the depth estimation, and the vertical plane for each camera determines the vertical position, or height, of the target. Note that regardless of the tilt angle, the depth is determined only by the pan angles of the base motors. When the two cameras are verging on a target, the depth of the target can be estimated directly using Eq. (13), and the height follows from the side-view geometry:

D = B / (tan α + tan β)   (13)

H = H₁ + (H₂ − D tan θ) / cos θ
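Eq. (13) in code form (a sketch; angles in radians, baseline B in metres):

```python
import math

def depth_from_pan(baseline, alpha, beta):
    """Eq. (13): D = B / (tan(alpha) + tan(beta)) for a verged camera pair,
    where alpha and beta are the two pan angles."""
    return baseline / (math.tan(alpha) + math.tan(beta))
```

For the 0.24 m baseline used here, a target 2 m away on the midline requires symmetric pan angles of atan(0.12/2) ≈ 3.4° per camera.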

The experiment was carried out in a lab environment where specific objects were selected for the system to foveate. Once fixation was achieved, the pan-tilt configurations of the two cameras were used to calculate the depth of the target. Fig. 14 shows a synthesized view of the experimental setup with the objects marked with crosshairs. The Mug and Green Bottle on the Dell Box were also moved in depth to obtain varying depth settings relative to the camera frame. Objects in the experiments were scattered over a wide range of depths, from less than 1 m to 8 m. Distances below 0.5 m were not considered because the system, with a baseline of 0.24 m, cannot accommodate such close depths. Table 1 shows the performance of the two models. The average absolute error and the standard deviation of the absolute error are tabulated. The pyramidal log polar correlation model exhibited depth estimation performance with average accuracy above 90%


and consistently small standard deviations. There are two positions where the system produced a larger estimation error: the Plant and Speaker L2. These errors tend to occur when both cameras rotate beyond 30° from their optical axes. A given angular vergence error causes a larger depth estimation error when the pan angle from the center of the platform is large; the estimation is thus worse when the target deviates too far from the center. Another source of error is the perspective and scale change of the target in the binocular images. If the pan angle is large (for example, beyond 30°), both cameras suffer perspective distortion and scale change, so the projected images of the object in the two frames may differ in both perspective and scale, increasing the disparity estimation error. This incidentally agrees with the human vision system, where humans compensate for the error by rotating the head about the neck. The PCC model fails for distances less than 1.5 m. For the Green Bottle at 2.06 m and Speaker R at 3.5 m, it also failed several times; these two objects have very different backgrounds in the two camera views. The PCC model does not work consistently when the local area contains several major components of depth. In comparison, the log polar model is clearly superior. Fig. 15 shows the plots of error and standard deviation of error against the ground truth depth. The proposed log polar model shows consistent performance throughout the whole range of depths, while the PCC model fails several times and shows unstable performance. However, the PCC model shows superior performance on flat surfaces such as the Poster and the Dell Box. Under such conditions, the binocular matching for PCC was very accurate because the matching window of the PCC model covered regions of uniform depth accurately. This is not possible for the log polar method because the actual area used for disparity estimation is much larger than the local window of the PCC method. From all the above observations, it can be concluded that the proposed pyramidal log polar correlation model possesses stable performance over a wider working range. This is important for real world robot applications, where reliability and robustness matter in noisy environments. The working range of the pyramidal log polar correlation method is wider and deeper than that of the PCC method under the same search range settings. In the experiments conducted, one round of disparity estimation takes about 25 ms, giving a real time control frequency above 30 Hz. The PCC method takes a shorter 10 ms because its matching window (21 × 21) is smaller than the log polar image (50 × 90). One possible concern is the non-rotation-centered CCD position in the camera setup. Intuitively, this would pose a problem for distance estimation due to a bias from the off-centered rotation.

Table 1
Depth estimation results.

Object          True depth (m)   Log polar pyramidal model            The PCC model
                                 Avg est. (m)   Avg error ± std       Avg est. (m)   Avg error ± std
Green bottle    0.80             0.84           5% ± 2%               Fail           Fail
Mug             1.00             1.06           6% ± 3%               Fail           Fail
Green bottle    1.30             1.24           5% ± 2%               Fail           Fail
Dell box        1.76             1.74           4% ± 2%               1.78           1% ± 0%
Mug             1.80             1.82           3% ± 3%               1.87           4% ± 1%
Green bottle    2.06             1.92           7% ± 3%               2.51           23% ± 23%
Speaker R       3.50             3.27           7% ± 4%               4.14           26% ± 47%
Door            4.00             3.68           8% ± 5%               3.81           5% ± 2%
Poster          4.54             4.71           7% ± 4%               4.57           1% ± 0%
Plant           7.40             6.40           14% ± 4%              7.17           3% ± 1%
Poster printer  4.65             4.33           7% ± 4%               4.43           5% ± 3%
Speaker L1      3.20             3.18           5% ± 4%               3.51           10% ± 8%
Speaker L2      2.75             2.90           12% ± 4%              2.94           7% ± 4%


Fig. 15. Error plots of depth estimation. Top row: estimated depth versus true depth. Bottom row: estimation error and standard deviation of error versus true depth. The crosses indicate the failure cases.

However, this is true only for tilted scenarios, as illustrated in Fig. 13(c). If the cameras are not tilted and maintain a horizontal pose, the panned cameras do not experience an off-centered bias error, because the angles used in the computations remain consistent whether they are taken from the point of rotation or from the optical centers. This immunity arises because fixation is achieved by visually centering the cameras through iterative disparity minimization, rather than by a geometrical fixation from camera panning parameters. Furthermore, when the cameras are at a tilt, the estimation of the vertical position of the target under vergence can be compensated by the geometry presented in Fig. 13(c).

5. Conclusions

A spatial variant approach for vergence control was proposed in this paper. The model was inspired by the log polar transformation in the retino-cortical pathway of the primate vision system. The foveal magnification property of the log polar transformation was exploited for accurate and reliable vergence control, producing reliable camera fixation. In the proposed model, image pyramids were generated from the binocular images and coarse-to-fine disparity estimation was carried out by searching through the image pyramids. Coupled with normalized cross correlation in the log polar domain, this yields the disparity in the image. The model was deployed in a binocular vision system for real time vergence control in complex environments covering a wide range of depth and disparity settings. Through qualitative and quantitative experiments,

the system was shown to be able to verge on objects in a cluttered environment. The log polar transformation based method was shown to be stable and reliable in vergence control and statistically more reliable than the PCC model in the Cartesian domain.

References

[1] A.L. Yarbus, Eye Movements and Vision, Plenum Press, New York, USA, 1967.
[2] E.L. Schwartz, Spatial mapping in the primate sensory projection: analytic structure and relevance to perception, Biological Cybernetics 25 (1977) 181–194.
[3] Y. Cao, S. Grossberg, A laminar cortical model of stereopsis and 3D surface perception: closure and da Vinci stereopsis, Spatial Vision 18 (2005) 515–578.
[4] S. Grossberg, P.D.L. Howe, A laminar cortical model of stereopsis and three-dimensional surface perception, Vision Research 43 (2003) 801–829.
[5] L.R. Squire, F.E. Bloom, S.K. McConnell, J.L. Roberts, N.C. Spitzer, M.J. Zigmond, Fundamental Neuroscience, 2nd ed., Academic Press, San Diego, California, USA, 2003.
[6] T.J. Olson, Stereopsis for verging systems, IEEE Conference on Computer Vision and Pattern Recognition, New York City, USA, 1993, pp. 55–60.
[7] J.P. Siebert, D.F. Wilson, Foveated vergence and stereo, International Conference on Visual Search, Nottingham, UK, 1992.
[8] J.R. Taylor, T.J. Olson, W.N. Martin, Accurate vergence control in complex scenes, IEEE Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 1994, pp. 540–545.
[9] A.X. Zhang, A.L.P. Tay, A. Saxena, Vergence control of 2 DOF pan-tilt binocular cameras using a log-polar representation of the visual cortex, International Joint Conference on Neural Networks, Vancouver, Canada, 2006, pp. 4277–4283.
[10] D.J. Coombs, C.M. Brown, Intelligent gaze control in binocular vision, IEEE International Symposium on Intelligent Control, Philadelphia, PA, USA, 1990, pp. 239–245.
[11] G.C. DeAngelis, I. Ohzawa, R.D. Freeman, Depth is encoded in the visual cortex by a specialized receptive field structure, Nature 352 (1991) 156–159.
[12] J. Diaz, E. Ros, S.P. Sabatini, F. Solari, S. Mota, A phase-based stereo vision system-on-a-chip, Journal of BioSystems 87 (2005) 314–321.
[13] D.J. Fleet, A.D. Jepson, Stability of phase information, IEEE Transactions on Pattern Analysis and Machine Intelligence 15 (1993) 1253–1268.
[14] D.J. Fleet, H. Wagner, D.J. Heeger, Neural encoding of binocular disparity: energy models, position shifts and phase shifts, Vision Research 36 (1996) 1839–1857.
[15] M. Hansen, G. Sommer, Active depth estimation with gaze and vergence control using Gabor filters, International Conference on Pattern Recognition, Vienna, Austria, 1996, pp. 287–291.
[16] M.M. Marefat, L. Wu, C.C. Yang, Gaze stabilization in active vision—I. Vergence error extraction, Pattern Recognition 30 (1997) 1829–1842.
[17] I. Ohzawa, G.C. DeAngelis, R.D. Freeman, Stereoscopic depth discrimination in the visual cortex: neurons ideally suited as disparity detectors, Science 249 (1990) 1037–1041.
[18] W.M. Theimer, H.A. Mallot, S. Tolg, Phase method for binocular vergence control and depth reconstruction, Proceedings of SPIE 1826 (1992) 76–87.
[19] Y. Yeshurun, E.L. Schwartz, Cepstral filtering on a columnar image architecture: a fast algorithm for binocular stereo segmentation, IEEE Transactions on Pattern Analysis and Machine Intelligence 11 (1989) 759–767.
[20] A. Bernardino, J. Santos-Victor, Vergence control for robotic heads using log-polar images, International Conference on Intelligent Robots and Systems, Osaka, Japan, 1996.
[21] C. Capurro, F. Panerai, G. Sandini, Dynamic vergence using log-polar images, International Journal of Computer Vision 24 (1997) 79–94.
[22] W.-S. Ching, P.-S. Toh, K.-L. Chan, M.-H. Er, Robust vergence with concurrent detection of occlusion and specular highlights, International Conference on Computer Vision, Berlin, Germany, 1993.
[23] E. Grosso, R. Manzotti, R. Tiso, G. Sandini, A space-variant approach to oculomotor control, International Symposium on Computer Vision, Coral Gables, Florida, USA, 1995, p. 509.
[24] R. Manzotti, A. Gasteratos, G. Metta, G. Sandini, Disparity estimation on log-polar images and vergence control, Computer Vision and Image Understanding 83 (2001) 97–117.
[25] J.H. Piater, R.A. Grupen, K. Ramamritham, Learning real-time stereo vergence control, International Symposium on Intelligent Control/Intelligent Systems and Semiotics, Cambridge, MA, USA, 1999, pp. 272–277.
[26] C. Yim, A.C. Bovik, Vergence control using a hierarchical image structure, IEEE Southwest Symposium on Image Analysis and Interpretation, Dallas, Texas, USA, 1994.
[27] P.M. Daniel, D. Whitteridge, The representation of the visual field on the cerebral cortex in monkeys, Journal of Physiology 159 (1961) 203–221.
[28] B. Fischer, Overlap of receptive field centers and representation of the visual field in the cat's optic tract, Vision Research 13 (1973) 2113–2120.
[29] R.B. Tootell, M.S. Silverman, E. Switkes, R.L. De Valois, Deoxyglucose analysis of retinotopic organization in primate striate cortex, Science 218 (1982) 902–904.
[30] D.C. Van Essen, W.T. Newsome, J.H. Maunsell, The visual representation in striate cortex of macaque monkey: asymmetries, anisotropies, and individual variability, Vision Research 24 (1984) 429–448.
[31] R.A. Peters II, M. Bishay, T. Rogers, On the computation of the log-polar transform, Technical Report, Intelligent Robotics Laboratory, Center for Intelligent Systems, Vanderbilt University School of Engineering, 1996.
[32] C. Mehanian, S.J. Rak, Bi-directional log-polar mapping for invariant object recognition, Proceedings of SPIE 1471 (1991) 200–209.
[33] V.J. Traver, F. Pla, Log-polar mapping template design: from task-level requirements to geometry parameters, Image and Vision Computing 26 (2008) 1354–1370.
[34] The USC-SIPI Image Database, http://sipi.usc.edu/database/. Accessed on 23 Aug 2010.
[35] W.-S. Ching, P.-S. Toh, M.-H. Er, Robust vergence with concurrent detection of occlusion and specular highlights, Computer Vision and Image Understanding 62 (1995) 298–308.
[36] T. Kanade, M. Okutomi, A stereo matching algorithm with an adaptive window: theory and experiment, IEEE Transactions on Pattern Analysis and Machine Intelligence 16 (1994) 920–932.
[37] J. Monaco, A.C. Bovik, L.K. Cormack, Epipolar spaces for active binocular vision systems, International Conference on Image Processing, San Antonio, Texas, USA, 2007, pp. 549–551.
[38] Y. Chen, N. Qian, A coarse-to-fine disparity energy model with both phase-shift and position-shift receptive field mechanisms, Neural Computation 16 (2004) 1545–1577.
[39] X. Zhang, A.L.P. Tay, A physical system for binocular vision through saccade generation and vergence control, Cybernetics and Systems: An International Journal 40 (2009) 549–568.
