TH-H3-1

SCIS & ISIS 2008

Robust Eye Localization for Lip Reading in Mobile Phone Environments

Thanh Trung Pham*, Jin Young Kim*, Seung Yu Na*, Sung Taek Hwang**

* Dept. of ECE, Chonnam National University, Yongbong-Dong 300, Buk-Gu, Gwangju 500-757, South Korea
** Telecommunication R&D Center, Samsung Electronics, 416 Metan-Dong, Yeongtong-Gu, Suwon-si, Gyeonggi-do 443-747, South Korea

Emails: {[email protected], [email protected]}

Abstract- In this paper we present a new robust approach to eye localization for lip reading in mobile environments, where the input image is assumed to contain a single face. First, we segment eye candidate regions using intensity information in the YCbCr color space. Then, coupled eye candidate regions, including assumed eyebrows, are extracted under constraints derived from the morphological and geometric characteristics of eyes in a frontal face. Finally, a Gaussian mixture model (GMM) is applied to validate the exact eye couple.

Keywords: eye localization, lip reading, coupled eyes, Gaussian mixture model

I. INTRODUCTION

In recent years, lip reading, or speech reading, has attracted much attention as a way to enhance speech recognition performance, because the visual movements of the lips carry information about speech articulation. The first step in a lip reading system is to detect the lip region in face images. A common approach is face detection based: the face region is detected using skin color information, and lip detection is then performed using knowledge of the facial structure. However, color information varies so much with illumination that accurate face detection is difficult under the dynamic lighting conditions of indoor and outdoor environments. Fortunately, in the mobile application of lip reading, only one face is assumed to be located near the image center, so we can detect the eye centers directly without face detection. Eye centers are very good features for setting a proper region that includes the lips.

In this paper, we propose a robust eye localization method for mobile phone based lip reading that tolerates small localization errors. The main purpose of the eye detection is to set a sufficient lip region, in which the fine lip analysis can then be performed.

Various approaches to eye localization have been proposed during the last decade. They are generally grouped into three categories: image matching, machine learning, and image processing based methods. Image matching based methods [1, 2] first construct eye templates and then compare them with sub-images in the areas to be searched; the sub-image with the highest matching score is considered the eye. Machine learning methods usually construct a classifier or detector to distinguish eye from non-eye regions. Such methods need a large amount of training data to obtain good performance

and consume much time searching for eye regions. In [3], an AdaBoost detector is applied to segment eye regions, and a fast radial symmetry process is then used to locate the eye centers. Image processing based methods usually use edges, corners, intensity, color, or other characteristics of the face to locate the eyes. In [9], the face is first extracted based on skin color information, then a vertical projection of the hue image is used to locate the eye region; finally, the eye center is located by a peak value extraction algorithm.

In this paper, we present a new robust eye localization algorithm applicable to lip reading systems. The proposed method uses both intensity and geometry information to extract eye candidate regions. In mobile phone environments, the image or video is assumed to contain a single frontal face. The image is converted to the YCbCr color space and binarized using a thresholding technique. As a result, we segment eye candidate regions such as eyes, eyebrows, nostrils, lips, spectacles, hair, and ears. Then, geometric characteristics of eyes are applied to reject non-eye regions and keep the possible ones. Finally, we use a GMM to validate the eye candidates and pick the eye couple with the highest probability. Because this eye detection is applied to a lip reading system, an approximately correct eye center is acceptable instead of a precise one, so the center point of the eye region can be taken as the eye center. In the following sections, the whole eye localization algorithm and the experimental results are explained in detail.
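To make the role of the eye centers concrete, the Python sketch below derives a lip region of interest from the two detected eye centers. The paper only states that the eye centers are used to set the lip region; the mapping and all proportionality constants here are common rule-of-thumb values for frontal faces, not figures given in the paper.

```python
import numpy as np

def lip_roi_from_eyes(left_eye, right_eye, scale_w=1.2, drop=1.1, depth=0.8):
    """Hypothetical mapping from eye centers to a lip bounding box.
    The scale factors are illustrative frontal-face proportions."""
    lx, ly = left_eye
    rx, ry = right_eye
    d = np.hypot(rx - lx, ry - ly)         # inter-ocular distance
    cx, cy = (lx + rx) / 2, (ly + ry) / 2  # midpoint between the eyes
    top = cy + drop * d                    # lips sit roughly one eye-distance below
    w, h = scale_w * d, depth * d
    return int(cx - w / 2), int(top), int(w), int(h)
```

Because the box is expressed purely in units of the inter-ocular distance, it stays valid under moderate scale changes of the face, which is why a roughly localized eye pair is sufficient for lip reading.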

II. EYE LOCALIZATION

Our eye localization method is summarized in Figure 1.

[Figure 1: Flow chart of eye localization - Input RGB Image -> Y Image -> Eye Candidate Segmentation -> Eye Validation -> Eye Center.]

A. Eye Candidate Segmentation

In the mobile lip reading application, the face is assumed to be located at the center of the screen, so the eyes should lie within a potential region of the image, called the mask (Figure 2). The input image is first converted to the YCbCr color space, and the Y component is then used to segment eye candidate regions. One important property of the eye is that it is always darker than the skin and the other facial regions, and we take advantage of this to segment the image into regions that can be considered eye candidates. The image inside the mask region is binarized by thresholding: pixels with intensity below the threshold value are set to 1, and all others are set to 0. In the resulting binary image, the eye candidate regions are the pixels with value 1 (Figures 3 and 4).
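As a minimal sketch of this first step, the following Python/OpenCV fragment converts the input frame to YCbCr, keeps the Y (luma) channel, and builds the central mask; the exact mask geometry is not stated in the paper, so the upper-central window used here is an illustrative assumption.

```python
import cv2
import numpy as np

def y_channel_and_mask(bgr_frame):
    # Convert to YCbCr (OpenCV's YCrCb ordering) and keep the luma channel.
    y_img = cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2YCrCb)[:, :, 0]
    # Upper-central window where the eyes are assumed to lie; these
    # proportions are an illustrative choice, not values from the paper.
    h, w = y_img.shape
    mask = np.zeros((h, w), np.uint8)
    mask[h // 10 : h // 2, w // 5 : 4 * w // 5] = 1
    return y_img, mask
```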

[Figure 2: Mask image.]

[Figure 5: Adaptive segmentation algorithm - the thresholds feed eye candidate segmentation and eye couple extraction; if fewer than one couple of eyes is found, the thresholds are decreased and segmentation is repeated; if at least one couple is found, the process proceeds to eye validation.]

[Figure 3: Input Y image and binarized image.]

The segmentation algorithm (Figure 4) proceeds as follows:

1. Initialize the mask image containing the potential eye region.
2. Compute the minimum and maximum thresholds based on the intensity information of the Y image.
3. Set every pixel of the Y image whose intensity is less than the minimum threshold to 1 (object), and all other pixels to 0 (background).
4. Create an edge map image edgeIm from the Y image.
5. Dilate edgeIm with a rectangular structuring element.
6. Invert the edge map: edgeIm = 1 - edgeIm.
7. The binarized image is the product of the thresholded Y image, edgeIm, and the mask.
8. Apply connected component analysis to this binarized image for the subsequent processing.

[Figure 4: Segmentation algorithm.]

Because the intensity value is very sensitive to illumination changes, our segmentation algorithm is made adaptive: the threshold value is increased or decreased until the eye candidates are well segmented (Figure 5).
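A minimal Python/OpenCV sketch of steps 3-8 and of the adaptive loop of Figure 5 is given below. The Canny edge detector, the 7x3 structuring element, the threshold step size, and the "at least two regions" stopping test are all illustrative assumptions; the paper specifies neither the edge detector nor the numeric parameters, and its loop adjusts the threshold in either direction, while this sketch simply raises it from the minimum.

```python
import cv2
import numpy as np

def segment_eye_candidates(y_img, mask, thresh):
    """Steps 3-8 of the segmentation algorithm (sketch)."""
    # Step 3: dark pixels (intensity below the threshold) become objects.
    dark = (y_img < thresh).astype(np.uint8)
    # Steps 4-5: edge map of the Y image, dilated with a rectangular element
    # (Canny and the 7x3 element are assumptions; the paper names neither).
    edges = cv2.Canny(y_img, 100, 200)
    edges = cv2.dilate(edges, cv2.getStructuringElement(cv2.MORPH_RECT, (7, 3)))
    # Step 6: invert, so strong-edge areas are suppressed in the product.
    inv_edges = (edges == 0).astype(np.uint8)
    # Step 7: product of the thresholded image, inverted edge map, and mask.
    binary = dark * inv_edges * mask
    # Step 8: connected components give the candidate regions.
    n, labels, stats, centroids = cv2.connectedComponentsWithStats(binary)
    return stats[1:], centroids[1:]  # drop the background component

def adaptive_segment(y_img, mask, t_min, t_max, step=5):
    """Adaptive loop of Figure 5 (sketch): raise the threshold until enough
    candidate regions appear; the step size and the ">= 2 regions" test
    stand in for the paper's eye-couple check."""
    stats, cents = segment_eye_candidates(y_img, mask, t_min)
    t = t_min + step
    while len(stats) < 2 and t <= t_max:
        stats, cents = segment_eye_candidates(y_img, mask, t)
        t += step
    return stats, cents
```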

B. Eye Validation using GMM

This step aims to validate which candidate region is an eye region. To reduce the time cost of the validation step, the process first rejects non-eye regions using the physical characteristics of the eye: regions whose area is too large or too small, or whose height is more than three times their width, are considered non-eyes and rejected. Figure 6 shows the result after rejecting some non-eye regions.

[Figure 6: Remaining eye candidates.]

A GMM is a statistical density model that comprises a number of component functions. We construct a Gaussian mixture probability density function (pdf) from training data; then, for each eye candidate, we calculate its probability, and the candidate with the highest probability is chosen as the eye region (Figure 7).

[Figure 7: Eye validation diagram - eye candidates and training data pass through the feature extraction module; the resulting feature vectors are scored by the Gaussian mixture pdf (probability), and the maximum over the candidates selects the eye.]

In the feature extraction module (Figure 11), the Haar wavelet transform is applied to extract features from each eye image; it is well known that the Haar wavelet transform deals very well with facial features such as the nostrils, eyes, and lips. PCA is then used to reduce the feature dimension.
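The following sketch shows one plausible realization of this validation pipeline using PyWavelets and scikit-learn. Keeping only the Haar approximation sub-band (which matches the 36x8 -> 18x4 feature size reported in Section II.B.2), and the default sizes of 8 PCA components and 3 mixtures (the single-eye settings of Section II.B.1), are our reading of the paper, not a confirmed implementation.

```python
import numpy as np
import pywt
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture

def haar_features(patch):
    # One-level 2D Haar transform; keeping only the approximation sub-band
    # halves each dimension (e.g. a 36x8 patch yields 18x4 coefficients).
    cA, _ = pywt.dwt2(patch.astype(float), 'haar')
    return cA.ravel()

def build_eye_model(train_patches, n_pca=8, n_mix=3):
    """Fit PCA and a Gaussian mixture pdf on Haar features of eye samples."""
    feats = np.array([haar_features(p) for p in train_patches])
    pca = PCA(n_components=n_pca).fit(feats)
    gmm = GaussianMixture(n_components=n_mix).fit(pca.transform(feats))
    return pca, gmm

def most_probable_candidate(pca, gmm, candidate_patches):
    """Score every candidate under the mixture and keep the most probable."""
    feats = np.array([haar_features(c) for c in candidate_patches])
    log_p = gmm.score_samples(pca.transform(feats))  # log-likelihoods
    return int(np.argmax(log_p))
```

All patches are assumed to be pre-normalized to a common size, so that every feature vector has the same length.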

386

In our study, we construct three types of training data for the experiments (Figures 8, 9, 10). The first type contains single eyes, in which the left eye and the right eye are independent. In the second type, each image is a couple of a left eye and a right eye. In the last type, each image is a couple of eyes including the eyebrow regions.

[Figure 8: Single eye samples.]

[Figure 9: Couple eye samples.]

[Figure 10: Couple eye with eyebrow samples.]

1. Validate single eye

All images in the first sample set are normalized to a standard size of 71x31 and downsampled to 36x16 before the Haar wavelet transform is applied to obtain 16x8 features. We also use PCA to reduce the dimensionality of the feature vectors, so that each eye image is represented by 8 features. All 600 feature vectors are used to estimate a Gaussian mixture pdf with 3 components. From the center of each eye candidate region, we take an eye image of size 71x31 and apply the same feature extraction procedure to it. Finally, each eye candidate has its own probability, and the candidates with the greatest probabilities are chosen as the true eyes in the image. This validation method is simple but does not show excellent performance: most errors are due to eyebrows, thick spectacles, hair, and other objects. In the next subsection, a coupled eye validation approach is introduced, in which physical properties of the eyes, such as symmetry and correlation, are taken into account to improve the detection rate.

[Figure 11: Feature extraction module - downsampling -> Haar wavelet transform -> PCA.]

2. Validate couple eye

Among the eye candidates obtained in the last step, a preprocessing step selects eye couples by considering the geometric relationship between the left and right eyes: all couples with a too large or too small distance, a low correlation, or a large height difference are rejected. Figure 12 shows the result of the couple eye extraction.

[Figure 12: Couple eye candidates.]

In this case, we model the second data set with 5 Gaussian mixtures; the validation procedure is the same as before. All sample images are normalized to size 142x31 and downsampled to 36x8. After taking the Haar wavelet transform, we get 18x4 features for each image, and PCA then extracts the first 15 principal components. The performance of this validation method is much better than that of the single eye method. However, in some cases an eyebrow is very similar to a closed eye, which may cause wrong detections. To overcome this shortcoming, the eyebrow region is included in the eye-couple image for both GMM training and validation (Figure 13).

[Figure 13: Some eye and eye-couple candidates.]
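A sketch of this geometric pre-selection is shown below. Every numeric threshold (distance bounds, height difference, correlation floor) is an illustrative placeholder, since the paper does not report its values, and mirroring one patch horizontally before measuring correlation is our assumption about how left/right similarity is computed; patches are assumed pre-normalized to a common size.

```python
import numpy as np

def select_eye_couples(centroids, patches, min_dx=40, max_dx=200,
                       max_dy=15, min_corr=0.5):
    """Keep candidate pairs that could plausibly be a left/right eye couple."""
    couples = []
    for i in range(len(centroids)):
        for j in range(i + 1, len(centroids)):
            (x1, y1), (x2, y2) = centroids[i], centroids[j]
            if not (min_dx <= abs(x2 - x1) <= max_dx):  # plausible eye distance
                continue
            if abs(y2 - y1) > max_dy:                   # eyes lie at similar height
                continue
            # Left and right eyes should look alike: mirror one normalized
            # patch and measure the correlation coefficient.
            a = patches[i].astype(float).ravel()
            b = np.fliplr(patches[j].astype(float)).ravel()
            if np.corrcoef(a, b)[0, 1] >= min_corr:
                couples.append((i, j))
    return couples
```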

III. EXPERIMENT RESULTS

In our study, we recorded videos of 105 different persons in standard, indoor, and outdoor environments; each frame has size 640x480. This data is used to verify the proposed algorithm. We constructed 600 single eye samples, 1000 eye couple samples, and 1000 eye couple with eyebrow samples to build the Gaussian mixture pdfs separately. Figures 14-16 show some detection results for different eye states and different environments. The experiments show that validating eye candidates together with the eyebrows is considerably more effective.

[Figure 14: Detection results on the standard DB - normal eye; closed eye with gap; eye with spectacles; eye with spectacles and cap.]


[Figure 15: Detection results on the indoor DB - normal eye; eye with spectacles; closed eye with spectacles; closed eye with gap.]

[Figure 16: Detection results on the outdoor DB - normal eye; eye with spectacles; closed eye; closed eye with gap.]

Table 1 compares the detection rates of the three validation methods in the standard, indoor, and outdoor environments.

Table 1: Comparison of the different validation methods.

  Validation method          Standard DB   Indoor DB   Outdoor DB   Number of samples
  Single eye                 60%           --          --           105
  Couple eye                 90.3%         87.2%       85.6%        105
  Couple eye with eyebrow    98.1%         96.6%       94%          105

IV. CONCLUSION

In this paper, we presented a simple and robust method for eye localization. Both intensity information and physical properties of the eyes are used to find the centers of the left and right eyes. The main idea is to validate the candidate regions of the left and right eyes simultaneously using coupled images. The intensity information is used to segment eye candidate regions, and the potential eye couples are then extracted based on the geometric characteristics of eyes in a frontal face image. These eye couples are finally validated using a GMM to decide the best candidate. The experiments show that our method copes well with varying eye states, lighting conditions, and environments. Our method also has a low computational cost, which makes it feasible for lip reading systems in mobile environments.

REFERENCES

[1] S. Kim, S. T. Chung, S. Jung, D. Oh, J. Kim, and S. Cho. Multi-scale Gabor Feature based Eye Localization, Proceedings of World Academy of Science, Vol. 21, pp. 483-487, Jan. 2007.
[2] K. Peng and L. Chen. A Robust Algorithm for Eye Detection on Gray Intensity Face without Spectacles, Journal of Computer Science & Technology, Vol. 5, No. 3, 2005.
[3] Z. Wencong, C. Hong, Y. Peng, L. Bin, and Z. Zhenquan. Precise Eye Localization with AdaBoost and Fast Radial Symmetry, Proceedings of Computational Intelligence and Security, Vol. 1, pp. 725-730, Nov. 2006.
[4] H. Lu, W. Zhang, and D. Yang. Eye Detection Based on Rectangle Features and Pixel-pattern-based Texture Features, Proceedings of the International Symposium on Intelligent Signal Processing and Communication Systems, pp. 746-749, Nov. 28-Dec.
[5] Z. Niu, S. Shan, S. Yan, X. Chen, and W. Gao. 2D Cascaded AdaBoost for Eye Localization, Proceedings of the 18th International Conference on Pattern Recognition, Vol. 2, pp. 1216-1219, 2006.
[6] P. Wang, M. Green, Q. Ji, and J. Wayman. Automatic eye detection and its validation, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Vol. 3, pp. 164-164, 2005.
[7] X. Tang, Z. Ou, T. Su, H. Sun, and P. Zhao. Robust Precise Eye Location by AdaBoost and SVM Techniques, Proceedings of the International Symposium on Neural Networks, pp. 93-98, 2005.
[8] S. Du and R. Ward. A Robust Approach for Eye Localization Under Variable Illuminations, Proceedings of ICIP 2007, Vol. 1, pp. 377-380, 2007.
[9] W. T. Wang, C. Xu, and H. Shen. Eye localization based on hue image processing, Proceedings of ISPACS 2007, pp. 730-733, 2007.
[10] Y. Chen and K. Kubo. A Robust Eye Detection and Tracking Technique using Gabor Filters, Proceedings of the Third International Conference on IIHMSP 2007, Vol. 1, pp. 109-112, 2007.
[11] O. Jesorsky, K. J. Kirchberg, and R. W. Frischholz. Robust Face Detection Using the Hausdorff Distance, Proceedings of Audio- and Video-Based Biometric Person Authentication, pp. 91-95, 2001.
[12] M. Hamouz, J. Kittler, J. K. Kamarainen, and H. Kalviainen. Affine-Invariant Face Detection and Localization Using GMM-Based Feature Detectors and Enhanced Appearance Model, Proceedings of the IEEE Sixth International Conference on Automatic Face and Gesture Recognition, pp. 67-72, 2004.

