Gesture Recognition of Nintendo Wiimote Input Using an Artificial Neural Network

Henry Owen Wiggins
University of British Columbia
Cognitive Systems
April 17, 2008

Abstract

In this paper I demonstrate the use of a feed-forward multilayer perceptron to classify gestures from acceleration samples. The release of Nintendo's seventh-generation video game console, the Nintendo Wii, brought with it a challenge to the traditional video game controller. Its Wiimote controller includes an ADXL330 3-axis accelerometer for motion sensing and a PixArt optical sensor for pointing. Innovative use of the controller in games requires more involved analysis of the input data than traditional controllers demand. This paper focuses on using the back-propagation algorithm for supervised learning, and also shows how simulated annealing, momentum, and localized learning rates can speed up training. Together, these techniques allow successful classification of complex gestures with significantly lower training times.


Introduction and Motivation

In this paper I will show that feed-forward multilayer perceptrons can be used to accurately classify complex gestures from Wiimote acceleration data. Work by Hebb (1949), Rosenblatt (1958), and Minsky & Papert (1969) showed that perceptrons can successfully classify linearly separable data sets of optical patterns. Due to the limits imposed by the linear-separability requirement, perceptrons and connectionism fell out of favor until Rumelhart, Hinton, and McClelland (1986; Rumelhart, Hinton, & Williams, 1986) popularized the multilayer perceptron trained with the back-propagation algorithm. Cybenko (1989) later proved that multilayer perceptrons are in fact universal function approximators, a result now known as the Universal Approximation Theorem (Hornik, Stinchcombe, & White, 1989; Funahashi, 1989). Engelbrecht (2005) and Orr, Schraudolph, & Cummins (1999) list several improvements to the general back-propagation algorithm, including data preparation methods, simulated annealing, momentum, and localized learning rates. The goal of this paper is to give readers an introduction to multilayer perceptrons and to apply the aforementioned research to the classification of gestures from Wiimote accelerometer data.

Background

Wiimote

The Wiimote has a remote-control-style design well suited to one-handed pointing. It measures 144mm long, 36.2mm wide, and 30.8mm thick. Communication with the console is via Bluetooth radio, which has a range of approximately 10m. Pointer functionality relies on an infrared sensor in the Wiimote and infrared-emitting LEDs in the "sensor bar," which has a range of approximately 5m. The Wiimote also has a 3-axis accelerometer for motion sensing. (Wikipedia contributors, 2008)

The Bluetooth controller in the Wiimote is a Broadcom 2042 chip and follows the Bluetooth human interface device (HID) standard. Reports are sent to the console at a maximum frequency of 100Hz. When queried, the Wiimote reports an HID descriptor block that includes report ID (a.k.a. channel ID) and payload size in bytes. Data packets include a Bluetooth header, report ID, and payload. (WiiLi contributors, 2008)

The infrared sensor is made by PixArt, with a PixArt System-on-a-Chip (SoC) processing the input. The Wiimote can detect up to four infrared hotspots, and can send any combination of position, size, and pixel value. Data is sent at 100Hz via a serial bus shared with the Broadcom chip. (WiiLi contributors, 2008)

The accelerometer is an Analog Devices ADXL330, a 3-axis linear accelerometer with a range of +/- 3g and 10% sensitivity. The chip contains a small micromechanical structure of silicon springs and a test mass; differential capacitance measurements convert displacement of the mass into voltage. This measures the net force of the mass on the springs, with a rest measurement of +g and a freefall measurement of approximately 0g, i.e. acceleration is measured in a freefall frame of reference. (WiiLi contributors, 2008)

Please note that the above data is approximate and available from public sources. The precise capabilities of the Wii and Wiimote are covered by a non-disclosure agreement with Nintendo Company Ltd.

Single Layer Perceptrons

Perceptrons are a type of linear classifier, meaning the output categories must be linearly separable. In geometric terms, the inputs must be dividable by a line or hyperplane into the output categories. A classic example is that both the logical AND and OR operators can be represented by a perceptron, but neither the logical XOR nor XNOR operators can. To see why, we can look at how a perceptron works. A perceptron is divided into two layers: an input layer and an output layer. The layers are fully connected with each other, but no connections exist within a layer. When inputs are received, each output layer neuron takes a weighted sum of the input values and passes it through a threshold function. A typical threshold function is a hard limiter or step function:

f(x) = \begin{cases} 0 & x \le \theta \\ 1 & x > \theta \end{cases}

Another common threshold function is a linear function:

f(x) = x

To represent the logical AND operator, we require two inputs and one output. True can be represented with 1, and false with 0. Using a hard limiter with \theta = 0.9 for the threshold function, we set both input-output weights to 0.5. The results are consistent with the logical AND operator:

Table 1 - Logical AND Perceptron Output

                 Input 2 = 1.0                           Input 2 = 0.0
Input 1 = 1.0    1.0*0.5 + 1.0*0.5 = 1.0 > 0.9 -> 1.0    1.0*0.5 + 0.0*0.5 = 0.5 <= 0.9 -> 0.0
Input 1 = 0.0    0.0*0.5 + 1.0*0.5 = 0.5 <= 0.9 -> 0.0   0.0*0.5 + 0.0*0.5 = 0.0 <= 0.9 -> 0.0
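As a concrete check, the following minimal Python sketch (all names are illustrative, not from the original implementation) reproduces Table 1 by evaluating the hard-limiter AND perceptron just described:

    def hard_limiter(x, theta=0.9):
        # step function: 0 if x <= theta, 1 if x > theta
        return 1.0 if x > theta else 0.0

    def and_perceptron(x1, x2, w=(0.5, 0.5)):
        net = x1 * w[0] + x2 * w[1]  # weighted sum of the two inputs
        return hard_limiter(net)

    for x1 in (1.0, 0.0):
        for x2 in (1.0, 0.0):
            print(x1, x2, "->", and_perceptron(x1, x2))  # matches Table 1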

There is no similar way to represent an XOR function (Minsky & Papert, 1969).

A perceptron network is trained by adjusting the weights connecting input and output neurons. A common method for determining the change in weights is a gradient descent learning rule, such as the Delta rule (a.k.a. the Widrow-Hoff rule). The Delta rule determines the change in weight for the connection between output j and input i:

\Delta w_{ji} = \alpha (t_j - y_j) g'(h_j) x_i

where α is the learning rate, t_j is the target output, y_j is the actual output, g'(h_j) is the derivative of the activation function evaluated at the net input h_j, and x_i is the input. In other words, the change in weight is determined by the difference between target and actual output values (Widrow & Hoff, 1960). Finally, each weight is updated:

w_{ji} = w_{ji} + \Delta w_{ji}

By the Perceptron Convergence Theorem, this procedure is guaranteed to find a solution in a finite number of steps if the solution can be expressed by a perceptron (Rosenblatt, 1962, pp. 111-116).
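To make the update rule concrete, here is a minimal sketch of delta-rule training in Python, assuming a linear activation g(x) = x (so g'(x) = 1); the function and variable names are illustrative:

    import random

    def train_delta_rule(patterns, n_inputs, alpha=0.1, epochs=100):
        # patterns: list of (input tuple, target) pairs
        w = [random.uniform(-0.5, 0.5) for _ in range(n_inputs)]
        for _ in range(epochs):
            for x, t in patterns:
                y = sum(wi * xi for wi, xi in zip(w, x))  # linear activation
                for i in range(n_inputs):
                    # Delta rule: dw_ji = alpha*(t_j - y_j)*g'(h_j)*x_i, g' = 1 here
                    w[i] += alpha * (t - y) * x[i]
        return w

    # learn an approximation of logical AND from its truth table
    and_patterns = [((1.0, 1.0), 1.0), ((1.0, 0.0), 0.0),
                    ((0.0, 1.0), 0.0), ((0.0, 0.0), 0.0)]
    weights = train_delta_rule(and_patterns, n_inputs=2)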


Multilayer Perceptrons

The multilayer perceptron is similar to the single layer perceptron discussed above, with the addition of hidden layers between the input and output layers. Input layer neurons are connected to hidden layer neurons, which are in turn connected to output layer neurons; each connection has an associated weight, as in single layer perceptrons. Unlike single layer perceptrons, multilayer perceptrons are universal function approximators and are not restricted to linearly separable problems; i.e., any function with finitely many discontinuities can be approximated with arbitrary precision given enough hidden neurons and a non-linear, monotonically increasing, differentiable activation function (Cybenko, 1989; Funahashi, 1989; Hornik, Stinchcombe, & White, 1989). Processing is similar to that of a single layer perceptron: the weighted sum of all inputs into a neuron is taken and passed into an activation function (usually with a bias value). The results are then used as inputs into the next layer until the output layer is reached. Engelbrecht (2005, pp. 18-20) lists several frequently used activation functions, including linear, step, ramp, sigmoid, hyperbolic, and Gaussian functions. The historically popular activation function for multilayer perceptrons is the sigmoid logistic function:

f(x) = \frac{1}{1 + e^{-x}}

Another popular activation function is the hyperbolic tangent function:

f(x) = \frac{e^{2x} - 1}{e^{2x} + 1} = \frac{2}{1 + e^{-2x}} - 1

Training is accomplished with an algorithm called back-propagation, based on the generalized delta rule developed by Rumelhart, Hinton, & Williams (1986). The squared error of the network is given by:

E(\vec{w}) = \frac{1}{2} \sum_{k \in output} (t_k - y_k)^2

where \vec{w} is the vector of network weights, output is the set of output neurons, t_k is the target output, and y_k is the actual output. Similar to the delta rule used in training a single layer perceptron, back-propagation finds the error for each connection weight. First, the error term is computed for each output neuron (here y_k(1 - y_k) is the derivative of the logistic activation function):

\delta_k = y_k (1 - y_k)(t_k - y_k)

These are then propagated backwards through the hidden layers:

\delta_h = y_h (1 - y_h) \sum_{k \in P_h} w_{kh} \delta_k

where P_h is the set of neurons posterior to neuron h, w_{kh} is the weight connecting hidden neuron h with neuron k in the next layer, and δ_k is the error term of neuron k in the next layer. Next, the change in weight for each connection is found:

\Delta w_{ji} = \alpha \delta_j x_{ji}

where α is the learning rate, δ_j is the error term for neuron j, and x_{ji} is the input from neuron i into neuron j. Finally, each weight is updated:

w_{ji} = w_{ji} + \Delta w_{ji}


Data Preparation and Initialization

Engelbrecht (2005, pp. 90-97) lists several aspects that influence the performance of multilayer perceptrons, foremost among these being data preparation. Missing values, coding of input values, outliers, scaling and normalization, noise injection, and training set manipulation can all affect the performance of a perceptron.

Missing values can be handled in three ways. The first method is to simply remove any patterns with missing values, which has the consequence of reducing the information available for training and possibly losing important information. The second is to replace each missing value with the mean (for continuous values) or the median (for nominal or discrete values); this method introduces no bias. The third method is to add an additional input unit to indicate patterns for which parameters are missing; the influence of missing values can then be determined after training.

Coding of input values is also an issue, especially where nominal values are concerned. For each nominal parameter with n different values, n binary inputs should be used, where the input corresponding to the observed value is set to 1 and the rest are set to 0. Alternatively, a single input can be used, with each nominal value mapped to a numerical value; the consequence is that the perceptron will interpret the parameter as continuous and lose the discrete characteristic of the original data.

Outliers have a large effect on perceptron accuracy, especially with a gradient descent algorithm using the sum-of-squares error function. Three methods are proposed for dealing with outliers. First, outliers can be removed before training starts using statistical methods, although important information can be lost. Second, a more robust error function can be used than sum of squares. Engelbrecht (2005) suggests the Huber function, which is essentially a clamped version of sum of squares: outliers with a large error contribute a constant cost with zero derivative, and therefore have no influence when weights are updated. Third, some researchers have suggested Bimodal Distribution Removal, which removes outliers from training sets during training and reduces to standard learning when no outliers are present (Slade & Gedeon, 1993; Gedeon, Wong, & Harris, 1995).

Scaling and normalization of input data and target values can greatly impact perceptron performance. Without scaling or normalization, input values can "saturate" a neuron by approaching the asymptotic ends of the activation function. For the most popular activation functions, this often means a very small derivative and a correspondingly small influence on weight changes, leading to "paralysis" of the neuron. Furthermore, the bias value for each layer is constant and may be scaled orders of magnitude differently from the input values, requiring a large learning rate or increased training time to compensate. Scaling to the active domain of the activation function, where small changes in input have large effects on weight changes, is therefore recommended. Since the inputs are now approximately normalized, it is reasonable to model the connection weights as independent, normalized random variables. The variance of the net input is then:

Var(net_i) \approx \sum_{j \in A_i} Var(w_{ji} y_j) = \sum_{j \in A_i} w_{ji}^2 Var(y_j) \approx \sum_{j \in A_i} w_{ji}^2 \le |A_i| r^2

where initial weights are drawn from [-r, r] and |A_i| is the number of inputs into neuron i, a.k.a. its "fan-in." Setting this variance to 1 gives the range for each connection weight:

r_i = \frac{1}{\sqrt{|A_i|}}

where w_{ji} is initialized from a uniformly random distribution in the range [-r_i, r_i] (Orr, Schraudolph, & Cummins, 1999). A similar argument can be extended to hidden neurons.

Target values must similarly be scaled to the range of the activation function. For target values outside the range of the activation function, the goal of the perceptron will always be out of reach and weights will approach extreme values until stopped. Care must be taken with problems using nominal outputs mapped to 1; the logistic sigmoid and hyperbolic tangent activation functions have ranges that only approach 1 (Engelbrecht, 2005; Orr, Schraudolph, & Cummins, 1999). However, Engelbrecht, Cloete, & Geldenhuys (1995) showed that linearly scaling target values increases training time, with a hyperbolic tangent activation function resulting in faster training than a logistic sigmoid activation function.

Noise injection can be used to generate new training patterns without biasing network output, assuming the noise is sampled from a Normal distribution with zero mean and small variance (Holmström & Koistinen, 1992). Controlled noise also results in reduced training time and increased accuracy by producing a convolutional smoothing of the target function (Reed, Marks II, & Oh, 1995).

Training pattern presentation order can affect training time and accuracy. Engelbrecht (2005, pp. 96-97) summarizes several methods, three of which are mentioned here. The first is Selective Presentation, where patterns are alternated between "typical" (far from decision boundaries) and "confusing" (close to decision boundaries) (Ohnishi, Okamoto, & Sugie, 1990). The second is Increased Complexity Training, where a perceptron is first trained on easy problems, followed by a gradual increase in problem complexity (Cloete & Ludik, 1993). The third is simply to randomly sample a subset of the training data (Engelbrecht, 2005).
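A sketch of this initialization scheme in Python (the layer sizes here are placeholders):

    import math, random

    def init_weights(fan_in, n_neurons):
        # each weight into a neuron drawn uniformly from [-r, r], r = 1/sqrt(fan-in)
        r = 1.0 / math.sqrt(fan_in)
        return [[random.uniform(-r, r) for _ in range(fan_in)]
                for _ in range(n_neurons)]

    hidden_weights = init_weights(fan_in=25, n_neurons=50)  # sizes illustrative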

Localized Learning Rates

Local learning rates can be set per neuron by modeling the error terms as independent random variables. The variance of each error term is then:

Var(\delta_j) = Var\left(\sum_{i \in P_j} \delta_i w_{ji}\right) = \sum_{i \in P_j} Var(\delta_i) w_{ji}^2 \le \sum_{i \in P_j} \frac{Var(\delta_i)}{|A_i|}

Using a back-propagation-style procedure, we can estimate the variance for each neuron:

v_0 = \frac{1}{|A_0|}, \qquad v_j = \frac{1}{|A_j|} \sum_{i \in P_j} v_i

where v_0 is the estimated variance of an output neuron's error term, and v_j is the estimated variance of a hidden neuron's error term. With a normalized activation, we can set the local learning rate with:

\alpha_i = \frac{\alpha}{\sqrt{|A_i| v_i}}

where α_i is used for all weights into neuron i (Orr, Schraudolph, & Cummins, 1999).
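For the one-hidden-layer network used later in this paper, the estimates collapse to two values; a sketch under that assumption (names illustrative):

    import math

    def local_rates(alpha, n_in, n_hidden, n_out):
        # output neurons: fan-in is n_hidden, so v_0 = 1 / |A_0|
        v_out = 1.0 / n_hidden
        # each hidden neuron feeds all n_out outputs: v_j = (1/|A_j|) * sum of v_i
        v_hid = (n_out * v_out) / n_in
        # alpha_i = alpha / sqrt(|A_i| * v_i) for each layer
        return alpha / math.sqrt(n_in * v_hid), alpha / math.sqrt(n_hidden * v_out)

    alpha_hidden, alpha_output = local_rates(1.0, n_in=25, n_hidden=100, n_out=4)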

Momentum

The most common modification to the standard back-propagation algorithm is the addition of a momentum term:

\Delta w_{ji}(t) = \alpha_i \delta_i y_j + m \Delta w_{ji}(t-1)

where Δw_ji(t) is the change in weight at training step t, and 0 < m < 1. A momentum term adds a fraction of the previous change in weight to the current one. This results in increasingly large steps towards the minimum when the error gradient points in the same direction over several time steps. The benefits are improved training time and accuracy, due to smoothing of the error gradient, damping of oscillations in narrow valleys of the error surface, and the ability to escape local minima.

Annealing

One method of automatically adapting the learning rate is simulated annealing. To prevent a perceptron from skirting around minima during online learning, it is necessary to gradually lower the learning rate. Simulated annealing with a Search-then-Converge schedule does just that:

\alpha_t = \frac{\alpha_0}{1 + t/T}

where α_0 is the original learning rate, t is the current training step, and T is the number of training steps for which the learning rate is held nearly constant. The Search-then-Converge schedule keeps the learning rate nearly constant until step T, then gradually lowers it to allow convergence to a minimum.
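The schedule is one line of code; a sketch using the α_0 = 1.0 and T = 20 values that appear later in Table 2:

    def annealed_rate(alpha_0, t, T):
        # search-then-converge: near alpha_0 while t << T, decays roughly as 1/t after
        return alpha_0 / (1.0 + t / T)

    rates = [annealed_rate(1.0, t, 20) for t in range(100)]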

Method

A multilayer perceptron was constructed for classifying gestures from processed Wiimote accelerometer data. Raw accelerometer data consisted of between 8 and 400 samples in 3 axes. Samples were taken at a constant rate and clamped by the maximum range of the ADXL330 accelerometer. Swings with the Wiimote were recorded, displayed onscreen, and then, depending on user input, either immediately processed by the perceptron or stored as part of a training data set. Two variations of the multilayer perceptron were used to solve two problems. The first variation was a "pure" multilayer perceptron as described in the section Multilayer Perceptrons; the second included the improvements listed in the sections Localized Learning Rates, Momentum, and Annealing. The two problems were to 1) classify two "simple" gestures by distinguishing between left and right swings, and 2) classify four "complex" gestures by distinguishing between slow left, slow right, fast left, and fast right swings. The simple gesture problem was chosen because the signs of the majority of the inputs would differ between the two categories, giving a very clear boundary between them. The complex gesture problem was chosen because, in addition to distinguishing between the simple swings, the perceptron had to identify an ill-defined boundary on one of its inputs: the swing length. This section details how the data was prepared for the perceptrons and how each perceptron was constructed.

Data Preparation

Raw input data from the Wiimote was processed in 6 steps:

1. Acceleration values clamped by the ADXL330 maximum range were interpolated with a cubic Bezier spline.
2. Velocity was determined by Verlet integration of the acceleration values.
3. Velocity was sampled at 8 equally spaced points.
4. Velocity samples were normalized to the active domain of the activation function.
5. Each normalized velocity sample was mapped to an input neuron.
6. Swing length was incorporated by mapping the normalized (max-min scaling with min=0, max=400) number of acceleration samples to an input neuron - since samples are taken at a fixed interval, their count is linearly dependent on the swing length.
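The following Python sketch illustrates steps 2-4 of the pipeline above for a single axis; the cubic Bezier interpolation of step 1 is omitted, and the 100Hz sample interval and a tanh active domain of [-1, 1] are assumptions:

    def integrate_velocity(accel, dt=0.01):        # 100 Hz -> dt = 10 ms (assumed)
        v, vel = 0.0, []
        for a0, a1 in zip(accel, accel[1:]):
            v += 0.5 * (a0 + a1) * dt              # velocity-Verlet style step
            vel.append(v)
        return vel

    def resample(vel, n=8):                        # 8 equally spaced samples
        idx = [round(i * (len(vel) - 1) / (n - 1)) for i in range(n)]
        return [vel[i] for i in idx]

    def normalize(samples, lo=-1.0, hi=1.0):       # scale to tanh's active domain
        mn, mx = min(samples), max(samples)
        span = (mx - mn) or 1.0
        return [lo + (hi - lo) * (s - mn) / span for s in samples]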

Network Architecture

The perceptron used was a multilayer perceptron with 1 hidden layer. The input layer consisted of 25 inputs: one for each axis of each velocity sample, and one for the swing length. The output layer consisted of either 2 or 4 output neurons, depending on the number of gestures to be categorized. The hidden layer size was equal to the output layer size multiplied by the input layer size.

The hyperbolic tangent function was chosen over the historically popular logistic sigmoid function for the activation function. This meant that inputs and outputs from the hidden layer were already normalized, compared to the asymmetrical range of the logistic sigmoid. The maximum derivative of the hyperbolic tangent is 1.0, compared to a maximum derivative of 0.25 for the logistic sigmoid, which meant that the error term was not unnecessarily magnified or attenuated during back-propagation (Orr, Schraudolph, & Cummins, 1999). Linearly scaling target values also does not have as large a negative effect on training time when using the hyperbolic tangent function as it does when using the logistic sigmoid function (Engelbrecht, Cloete, & Geldenhuys, 1995). Both functions are non-linear, monotonically increasing, and easily differentiable.

Weights were initialized as recommended in the section Data Preparation and Initialization. The two neural networks were initialized with the following parameters:

Table 2 - Perceptron Parameters

                     "Pure" Multilayer Perceptron    "Improved" Multilayer Perceptron
Learning Rate (α)    0.1                             1.0
Momentum (m)         N/A                             0.7
Annealing (T)        N/A                             20
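Tying the pieces together, a hypothetical construction of the "improved" network for the complex problem, reusing the illustrative MLP sketch from the Multilayer Perceptrons section (the paper's actual implementation is not shown):

    n_inputs = 25                        # 8 velocity samples x 3 axes + swing length
    n_outputs = 4                        # complex problem: four gesture classes
    n_hidden = n_inputs * n_outputs      # hidden size = input size x output size

    net = MLP(n_inputs, n_hidden, n_outputs)     # sketch class from earlier
    ALPHA, MOMENTUM, ANNEAL_T = 1.0, 0.7, 20     # "improved" column of Table 2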

Training

The simple classification problem had a training data set of 20 patterns; the complex classification problem had a training data set of 80 patterns. Training patterns in both cases were presented using the Selective Presentation method mentioned in the section Data Preparation and Initialization.
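The paper does not give implementation details for Selective Presentation; one plausible sketch ranks patterns by how close the network's output is to a decision boundary and alternates "typical" with "confusing" patterns (the margin criterion here is an assumption, and net is the MLP sketch from earlier):

    def selective_order(patterns, net):
        def margin(p):
            # distance of the top output from its runner-up; small = "confusing"
            y = net.forward(p[0])
            return max(y) - sorted(y)[-2] if len(y) > 1 else abs(y[0] - 0.5)
        ranked = sorted(patterns, key=margin)          # most confusing first
        half = len(ranked) // 2
        confusing, typical = ranked[:half], ranked[half:]
        order = []
        for t, c in zip(typical, confusing):           # alternate the two kinds
            order += [t, c]
        return order + typical[len(confusing):]        # leftover when count is odd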

Preliminary Results and Discussion

Results thus far are not extensive enough to definitively confirm which methods are optimal for gesture classification from Wiimote accelerometer data. However, preliminary results confirm that the improved multilayer perceptron is capable of successfully classifying gestures 95+% of the time within a bounded number of training epochs. 10 trials were completed for each of the 4 problem variations, with the averaged results below:


Table 3 - Preliminary Results: Average Number of Training Epochs for 95+% Successful Classification

                           Pure Multilayer Perceptron    Improved Multilayer Perceptron
Simple Gesture Problem     68                            16
Complex Gesture Problem    N/A                           124

The pure multilayer perceptron was not able to classify the complex gesture problem at 95+% success within the maximum number of epochs we established (400). After approximately 250 epochs, the pure multilayer perceptron appeared to converge at around 80% successful classification. This is most likely due to getting stuck in a local minimum, although it is possible that the learning rate was too high and the network was skittering around a global minimum without converging. The improved multilayer perceptron performed much better; it successfully classified the simple gesture problem in only 16 epochs, much faster than the pure multilayer perceptron. It should be noted that the quality of the training data set had a much larger effect with so few training epochs. The improved perceptron also converged on a successful solution for the complex gesture problem in all 10 trials. Informal tests distinguishing between left-starting and right-starting figure-8 swings, over-handed and under-handed swings, and fast and slow swings were also successful. Informal tests using noise injection (see section Data Preparation and Initialization) to generate additional training patterns showed no effect.

Summary and Future Work

This paper gave a literature review of the most common techniques used with perceptrons. It further detailed how these techniques can be used to construct a multilayer perceptron capable of successfully classifying complex gestures from Wiimote accelerometer data. Unfortunately, optimal values for the perceptron parameters were not confirmed. Future work could take two approaches: 1) determine an optimal solution via extensive cross-validation tests, or 2) create a tool to automatically determine an optimal or near-optimal solution. The first would require improving the perceptron prototype constructed for this paper by adding support for sending and receiving large amounts of data between the Wii development kit and the development PC, which was not possible in the time allotted for technical reasons. The second would most likely involve a stochastic local search, such as a genetic algorithm, although further investigation is necessary. A genetic algorithm may be a good first candidate despite its limitations, due to the non-linearity of the search space, unknown dependencies between the parameters, and unknown dependencies between the parameters and the specific problem.


Works Cited

Cloete, I., & Ludik, J. (1993). Increased Complexity Training. In J. Mira, J. Cabestany, & A. Prieto, New Trends in Neural Computation (pp. 267-271). Berlin: Springer-Verlag.

Cybenko, G. (1989). Approximation by Superpositions of a Sigmoidal Function. Mathematics of Control, Signals and Systems, 303-314.

Engelbrecht, A. P. (2005). Computational Intelligence: An Introduction. West Sussex: John Wiley & Sons, Ltd.

Engelbrecht, A., Cloete, I., & Geldenhuys, J. Z. (1995). Automatic Scaling using Gamma Learning for Feedforward Neural Networks. In J. Mira, & F. Sandoval, From Natural to Artificial Neural Computing (pp. 374-381). Torremolinos, Spain: Springer.

Funahashi, K. (1989). On the approximate realization of continuous mappings by neural networks. Neural Networks, 183-192.

Gedeon, T. D., Wong, P. M., & Harris, D. (1995). Network Topology and Pattern Set Reduction Techniques. In J. Mira, & F. Sandoval, From Natural to Artificial Neural Computing (pp. 551-558). Torremolinos, Spain: Springer.

Hebb, D. O. (1949). The Organization of Behavior. New York: Wiley.

Holmström, L., & Koistinen, P. (1992). Using Additive Noise in Back-propagation Training. IEEE Transactions on Neural Networks, 24-38.

Hornik, K., Stinchcombe, M., & White, H. (1989). Multilayer feedforward networks are universal approximators. Neural Networks, 359-366.

McClelland, J., Rumelhart, D., & Hinton, G. (1986). The appeal of parallel distributed processing. In D. E. Rumelhart, J. L. McClelland, & the PDP Research Group, Parallel Distributed Processing, Volume 1: Foundations. Cambridge: MIT Press.

Minsky, M., & Papert, S. (1969). Perceptrons. Cambridge: MIT Press.

Ohnishi, N., Okamoto, A., & Sugie, N. (1990). Selective Presentation of Learning Samples for Efficient Learning in Multi-Layer Perceptron. Proceedings of the IEEE International Joint Conference on Neural Networks, 688-691.

Orr, G., Schraudolph, N., & Cummins, F. (1999). The Backprop Toolbox. Retrieved April 11, 2008, from Neural Networks: http://www.willamette.edu/~gorr/classes/cs449/intro.html

Reed, R., Marks II, R. J., & Oh, S. (1995). Similarities of Error Regularization, Sigmoid Gain Scaling, Target Smoothing, and Training with Jitter. IEEE Transactions on Neural Networks, 529-538.

Rosenblatt, F. (1962). Principles of Neurodynamics. Washington D.C.: Spartan Books.

Rosenblatt, F. (1958). The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain. Psychological Review, 386-408.

Rumelhart, D., Hinton, G. E., & Williams, R. J. (1986). Learning internal representations by error propagation. In D. E. Rumelhart, & J. L. McClelland, Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. 1: Foundations (pp. 318-362). Cambridge: MIT Press.

Siegelmann, H. T., & Sontag, E. D. (1994). Analog computation via neural networks. Theoretical Computer Science, 331-360.

Slade, P., & Gedeon, T. D. (1993). Bimodal Distribution Removal. In J. Mira, J. Cabestany, & A. Prieto, New Trends in Neural Computation (pp. 249-254). Berlin: Springer-Verlag.


Widrow, B., & Hoff, M. E. (1960). Adaptive switching circuits. 1960 IRE Western Electronic Show and Convention, Convention Record, 96-104.

WiiLi contributors. (2008, March 20). Wiimote. Retrieved April 16, 2008, from WiiLi, a GNU/Linux port for the Nintendo Wii: http://www.wiili.org/index.php?title=Wiimote&oldid=11016

Wikipedia contributors. (2008, April 17). Wii Remote. Retrieved April 17, 2008, from Wikipedia, The Free Encyclopedia: http://en.wikipedia.org/w/index.php?title=Wii_Remote&oldid=206155037
