Learning Subjective Image Concepts
EECS 349: Machine Learning, Professor Bryan Pardo
Edward Scott
Pubudu Silva
[email protected]
[email protected]
Introduction
The goal of this project is to create a system that learns image concepts from an individual user's subjective preferences. Ultimately, we would like individuals to manipulate a subjective quality of an image directly instead of adjusting multiple parameters. Given training data from a specific user for some high-level concept, such a system would generate a controller that maps that concept onto low-level image parameters. This poster presents the framework of our desired system, as well as the results of our current research into suitable learning methods.
Learning Methods
We chose support vector regression (SVR) as the learning method for our system because it learns on continuous input and output variables and requires fewer training examples than methods such as neural networks. The number of training examples needs to be fairly small, since each example must be collected from a user and is specific to one concept and one user. Another strength of SVR is that, because it uses a kernel function to map its inputs into a higher-dimensional space, it can learn arbitrary non-linear functions if its hyperparameters are tuned correctly. SVR was performed using the Spider Machine Learning Toolbox.
We collected training data for three different image concepts: “Light,” “Crisp,” and “Pleasant.” Each subject labeled 60 training examples for each concept. Within each set of 60 training examples, 20 were repeated to allow us to calculate the error between a subject's first and second ratings of an identical example (the subject's “self-error”).
In the fourth phase of the system (see System Overview), we approximate the inverse of the learned SVR model with a hill climber, because finding the exact inverse of a non-linear function of four variables is not practical.
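To illustrate why a kernel lets SVR represent non-linear rating functions, here is a minimal sketch of how a trained SVR model produces a prediction. It covers only the prediction step, not training (which the poster did with the Spider toolbox). The RBF kernel, the `gamma` value, and every number in the example model are assumptions made for illustration; the poster does not state which kernel or hyperparameters were used.

```python
import math

def rbf_kernel(x, z, gamma=1.0):
    """Gaussian (RBF) kernel: exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

def svr_predict(x, support_vectors, dual_coefs, bias, gamma=1.0):
    """Evaluate f(x) = bias + sum_i coef_i * k(sv_i, x) for a trained model."""
    return bias + sum(
        c * rbf_kernel(sv, x, gamma) for sv, c in zip(support_vectors, dual_coefs)
    )

# Hypothetical trained model over the four parameters (L*, a*, b*, s).
# These values are made up for illustration and do not come from the poster.
support_vectors = [(0.0, 0.0, 0.0, 0.0), (0.5, 0.0, 0.0, 0.2)]
dual_coefs = [0.6, -0.3]
bias = 0.4

# Predicted rating for the unmodified image (all shifts equal to zero).
rating = svr_predict((0.0, 0.0, 0.0, 0.0), support_vectors, dual_coefs, bias)
```

Because each kernel term is a non-linear function of the distance to a support vector, the sum can rise and then fall along a parameter axis, which a single linear model cannot do.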
We compared the proposed learning method with a simple linear regression model. Each method's performance was evaluated in two ways: by cross-validation on the training data and by subjective evaluation of the final controller.
Motivation
This system would make it possible for any user, without significant image editing experience, to apply varied concepts to an image directly without having to manipulate a large set of parameters. All that would be required is a quick training “wizard” through which the user can “teach” our software what they have in mind. Once that is done, the resulting controller could be applied to any image. This software could be a powerful addition to any image editing application.
System Overview
[Figure: System Work Flow. Stages shown: Original Image, Generate Examples, 60 Examples, Collect Training Data, SVR Learning, Learned Concept Controller (“Soft”), Output Image. Examples are shifted along the axes L*, a*, b*, and S.]
The system consists of four phases. In the first phase, we generate 60 examples of one original image, where each example is a version of the original image shifted in four dimensions: L*, a*, b* (in the CIELAB color space), and s, where s represents sharpening (positive values) and blurring (negative values). In the second phase, the system collects training data for the 60 examples from the user, who labels each example with a rating of how well it corresponds to the concept the system should learn. In the third phase, an SVR (Support Vector Regression) learning algorithm is applied to the training data collected in the second phase. The SVR algorithm learns a function that maps four image parameters to user ratings. The first three parameters are a function of the distance between each example and the original image in CIELAB space, one parameter each for L*, a*, and b*. The fourth parameter is the s value of each example. In the fourth phase, given a user rating (entered through the concept controller) and the learned SVR model of the concept, the system calculates values for the four parameters and generates an output image accordingly. The system is intended only for concepts that can be expressed through color and sharpness.
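The first phase (generating 60 shifted examples) can be sketched as follows. This is only an illustration under stated assumptions: the shift ranges in `SHIFT_RANGES`, the uniform random sampling, and the helper names are hypothetical, since the poster does not say how the shifts were chosen. The sharpening and blurring step is omitted because it requires an image filter.

```python
import random

# Hypothetical shift ranges; the poster does not state the ranges it used.
SHIFT_RANGES = {
    "L": (-20.0, 20.0),   # lightness offset in CIELAB
    "a": (-20.0, 20.0),   # green-red offset in CIELAB
    "b": (-20.0, 20.0),   # blue-yellow offset in CIELAB
    "s": (-1.0, 1.0),     # negative = blur, positive = sharpen
}

def generate_shifts(n_examples=60, seed=0):
    """Draw one four-dimensional shift per example (uniform sampling is assumed)."""
    rng = random.Random(seed)
    return [
        {name: rng.uniform(lo, hi) for name, (lo, hi) in SHIFT_RANGES.items()}
        for _ in range(n_examples)
    ]

def shift_lab_pixel(lab, shift):
    """Apply the L*, a*, b* offsets to one CIELAB pixel, clamping L* to [0, 100]."""
    L, a, b = lab
    return (min(100.0, max(0.0, L + shift["L"])), a + shift["a"], b + shift["b"])

shifts = generate_shifts()
```

Each entry of `shifts` describes one example image; applying `shift_lab_pixel` to every pixel of the original (after converting it to CIELAB) would produce the color-shifted version.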
Example: Linear Regression vs. SVR
[Figure: two “Pleasant” concept controllers with slider ends labeled “Not Pleasant” and “Pleasant.” Top: Linear Regression Model. Bottom: SVR Model (model learned by Support Vector Regression).]
Top: The “Pleasant” controller generated by linear regression has a point of maximum pleasantness. If the slider is moved past that point, toward “Pleasant,” the image actually becomes less pleasant. Bottom: The controller generated by SVR learns this non-linearity.
Analysis of Results
Each set of 60 examples was cross-validated separately. For each set, two concept controllers were generated: one by SVR and one by linear regression. Each subject was asked to rate the performance of both controllers for every training set they had labeled; the rating reflected how well the controller corresponded to the subject's perception of the concept. The cross-validation and subjective-validation results below show that SVR performed better than linear regression, particularly for the concept “Pleasant.” The one exception is the subjective rating for “Light,” where the linear controller scored 0.88 against 0.83 for SVR. It is not surprising that the advantage of SVR was greatest for “Pleasant”: “Light” and “Crisp” were expected to correspond linearly to L* and s, respectively, while “Pleasant” was expected to have a more complex relationship to the parameters. Because a learner cannot be expected to predict a subject's ratings more consistently than the subject reproduces them, the subjects' self-error gives a rough upper limit on the performance that can be expected from the learner.
[Figure: bar chart of cross-validated root mean square error for the SVR and Linear models, grouped by concept (“Light,” “Crisp,” “Pleasant”).]
10-fold cross-validated root mean square error of SVR and Linear Regression models for three image concepts.
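The 10-fold cross-validated RMSE in this figure can be computed along the following lines. This is a minimal sketch, not the poster's evaluation code: the interleaved fold assignment, the pooling of squared errors across folds, and the one-input least-squares model standing in for the learner are all assumptions. Any model can be plugged in through the `fit` and `predict` arguments.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one input)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, my - slope * mx

def predict_line(model, x):
    slope, intercept = model
    return slope * x + intercept

def cross_validated_rmse(xs, ys, fit, predict, k=10):
    """Hold out each of k folds once and pool the squared errors over all examples."""
    n = len(xs)
    squared_errors = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))  # every k-th example belongs to this fold
        train_x = [xs[i] for i in range(n) if i not in test_idx]
        train_y = [ys[i] for i in range(n) if i not in test_idx]
        model = fit(train_x, train_y)
        squared_errors += [(predict(model, xs[i]) - ys[i]) ** 2 for i in sorted(test_idx)]
    return math.sqrt(sum(squared_errors) / n)

# Synthetic data: 60 examples whose ratings depend linearly on one parameter.
xs = [i / 59 for i in range(60)]
ys = [0.2 + 0.6 * x for x in xs]
rmse = cross_validated_rmse(xs, ys, fit_line, predict_line)
```

On ratings that rise and then fall along the parameter, such as the “Pleasant” example described earlier, the same linear model leaves a large cross-validated error, which is the kind of gap a non-linear learner is meant to close.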
Subjective Validation
Concept       SVR    Linear
“Light”       0.83   0.88
“Crisp”       0.79   0.73
“Pleasant”    0.77   0.63
Mean subject ratings of generated concept controllers, by model and concept. Ratings are in the range [0, 1].
Subjects' Self-Error
Concept       Mean Self-Error
“Light”       0.21
“Crisp”       0.42
“Pleasant”    0.32
The mean of the subjects' RMS self-error for each concept.
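The self-error statistic can be sketched as below: for each subject, take the root mean square difference between the first and second ratings of the repeated examples, then average across subjects. The function names are hypothetical and the code makes no assumption about the rating scale.

```python
import math

def rms_self_error(first_ratings, second_ratings):
    """RMS difference between a subject's two ratings of the same examples."""
    if len(first_ratings) != len(second_ratings) or not first_ratings:
        raise ValueError("need two equal-length, non-empty rating lists")
    squared = [(a - b) ** 2 for a, b in zip(first_ratings, second_ratings)]
    return math.sqrt(sum(squared) / len(squared))

def mean_self_error(per_subject_pairs):
    """Average the RMS self-error over subjects for one concept."""
    errors = [rms_self_error(first, second) for first, second in per_subject_pairs]
    return sum(errors) / len(errors)
```

A subject who repeats every rating exactly has a self-error of zero; larger values mean the labels themselves are noisier, so no learner should be expected to reach a lower error on them.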
Challenges and Future Work
Several aspects of the problem we are trying to solve have proved challenging:
● Low number of training examples, but high number of parameters
● Noisy data: human ratings are inconsistent
● Synthesis part of the system (fourth phase) restricts the choice of parameters
● Must approximate the inverse of the learned regression to generate output images
In the future, we would like to develop a better method to approximate the inverse of the learned SVR model, since the hill climber we are currently using does not always produce consistent results.
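A hill climber that approximates the inverse of a learned model might look like the sketch below. It is a generic stochastic hill climber written for illustration, not the authors' implementation: the parameter bounds, step size, iteration count, starting point, and the `toy_model` stand-in for the learned SVR function are all assumptions.

```python
import random

def invert_by_hill_climbing(model, target, n_params=4, bounds=(-1.0, 1.0),
                            step=0.1, iterations=2000, seed=0):
    """Search for parameters whose predicted rating is close to the target rating."""
    rng = random.Random(seed)
    lo, hi = bounds
    params = [0.0] * n_params  # start from "no change to the image"
    best_err = abs(model(params) - target)
    for _ in range(iterations):
        candidate = list(params)
        i = rng.randrange(n_params)  # perturb one parameter at a time
        candidate[i] = min(hi, max(lo, candidate[i] + rng.uniform(-step, step)))
        err = abs(model(candidate) - target)
        if err < best_err:  # keep the move only if it gets closer to the target
            params, best_err = candidate, err
    return params, best_err

def toy_model(p):
    """Hypothetical stand-in for the learned rating function (peak at the origin)."""
    return 1.0 - sum(v * v for v in p)

params, err = invert_by_hill_climbing(toy_model, target=0.75)
```

Many different parameter vectors can produce the same rating, so runs with different seeds or starting points generally settle on different images. That is one plausible source of the inconsistency noted above.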
Conclusion
We have implemented the initial framework for a system to learn subjective, high-level image concepts and map those concepts onto low-level image parameters in real time. We have developed an efficient method of collecting user training data and a learning method which outperforms linear regression (the first approach we tried). SVR demonstrated error similar to the subjects' self-error, which imposes a rough limit on possible learner performance. Our next goal is to implement a more accurate, more efficient way to approximate the inverse of the model learned by SVR. This is a key step toward our ultimate goal of a system which can function in real time.
For more information, see our website and extended abstract at: http://sites.google.com/site/imageconceptlearning/