Scalable Perceptual Metric for Evaluating Audio Quality Rahul Vanam Dept. of Electrical Engineering University of Washington Charles D. Creusere Klipsch School of Electrical and Computer Engineering New Mexico State University

Background • Because modern audio compression algorithms are optimized for the human auditory system, conventional objective metrics such as segmental signal-to-noise ratio are not effective • This has forced researchers to rely upon human subjective testing in order to validate and compare different algorithms – Time consuming – Ill-suited to online implementation – The results are often difficult to repeat

Background • Since the late 1980s, there has been a strong push to develop objective metrics capable of quantifying subjective audio quality • This culminated in the development of ITU-R Recommendation BS.1387-1, called PEAQ – Contains a lower-complexity basic version and a more accurate advanced version

Background • Problem: Both the basic and advanced versions of PEAQ are designed to evaluate the quality of mildly impaired audio • Because we are interested in scalable audio compression, we would like an objective metric that is accurate over a wide range of audio impairments

Solution Framework • In previous work, we found that an alternative metric, namely the Energy Equalization Approach (EEA), was far more accurate in characterizing the quality of highly impaired audio than either version of PEAQ • In this paper, we combine EEA with PEAQ-advanced to create a metric that is fidelity scalable, i.e., one that is accurate over a wide range of audio qualities.

Energy Equalization Approach • Idea: Apply a truncation threshold to the original audio sequence, adjusting it until the energy of this sequence is the same as that of the reconstructed audio sequence – Mimics the process of band truncation that occurs in perceptual audio codecs

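Energy Equalization Metric Define:

• Energy of reconstructed audio

$$e_k = \sum_{i=0}^{\text{total\_blocks}} \sum_{j=51}^{100} \left( \text{rec\_spec}(i,j)_k \right)^2$$

• Modified time-frequency spectrum

$$\text{m\_spec}(i,j)_{T_{kn}} = \begin{cases} \text{o\_spec}(i,j), & \text{if } \text{o\_spec}(i,j) \ge T_{kn} \\ 0, & \text{if } \text{o\_spec}(i,j) < T_{kn} \end{cases}$$

• Energy of modified spectrum

$$e_{T_{kn}} = \sum_{i=0}^{\text{total\_blocks}} \sum_{j=51}^{100} \left( \text{m\_spec}(i,j)_{T_{kn}} \right)^2$$

GOAL: Select $T$ so that: $e_T = e_k$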
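The threshold search defined on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `band_energy` and `eea_threshold` are hypothetical, the spectra are assumed to be non-negative magnitude arrays of shape (blocks, bins), and the threshold is chosen from the original spectrum's own values so that the retained energy is as close as possible to the energy of the reconstructed audio. The slide does not say how T is searched, so the sort-and-accumulate strategy is an assumption.

```python
import numpy as np

def band_energy(spec, lo=51, hi=100):
    """Sum of squared spectral values over all blocks and bins lo..hi (inclusive)."""
    band = spec[:, lo:hi + 1]
    return float(np.sum(band ** 2))

def eea_threshold(o_spec, rec_spec, lo=51, hi=100):
    """Pick T so that the energy of the thresholded original spectrum
    is as close as possible to the energy of the reconstructed spectrum.

    Assumption: both inputs hold non-negative magnitudes, shape (blocks, bins).
    """
    target = band_energy(rec_spec, lo, hi)
    # Candidate thresholds are the original values, largest first.
    vals = np.sort(o_spec[:, lo:hi + 1].ravel())[::-1]
    # cum[n] is the energy retained when the n+1 largest values are kept.
    cum = np.cumsum(vals ** 2)
    idx = int(np.argmin(np.abs(cum - target)))
    return float(vals[idx])

def apply_threshold(o_spec, T):
    """Modified spectrum: keep values >= T, zero the rest."""
    return np.where(o_spec >= T, o_spec, 0.0)
```

Because the retained energy is a step function of T, exact equality is generally not reachable; the sketch returns the closest candidate. It always keeps at least one value, so a reconstructed energy of zero is not handled, and repeated values in the original spectrum can make the retained energy slightly larger than the accumulated sum suggests.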

New Metric Design • We combine the ‘T’ parameter generated by EEA with the five Model Output Variables (MOVs) that are already part of the PEAQ-advanced recommendation – Existing MOVs quantify the distortion loudness, the changes in modulation, the linear distortion, the harmonic structure of the error, and the noise-to-mask ratio

• A simple optimal linear weighting is used to fuse the MOVs into a single value
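A linear fusion of this kind can be sketched as an ordinary least-squares fit. The sketch below assumes that "optimal" means optimal in the least-squares sense (the plots in these slides refer to LS fits), that each row of `movs` holds the MOVs for one test item (for example the five PEAQ-advanced MOVs plus T), and that a constant bias term is included. The bias term and the function names are assumptions made for illustration.

```python
import numpy as np

def fit_linear_weights(movs, subjective):
    """Least-squares weights (last entry is the bias) mapping MOV rows
    to subjective scores."""
    X = np.column_stack([movs, np.ones(len(movs))])  # append bias column
    w, *_ = np.linalg.lstsq(X, subjective, rcond=None)
    return w

def predict(movs, w):
    """Objective score for each row of MOVs under the fitted weights."""
    X = np.column_stack([movs, np.ones(len(movs))])
    return X @ w
```

With more test items than weights, `np.linalg.lstsq` returns the weight vector that minimizes the sum of squared differences between the predicted and subjective scores.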

Subjective Test Data • The data used to design and test the proposed objective metrics were collected using the Comparison Category Rating (CCR) approach – 20 test subjects – 7 different audio sequences – Encoded bitrates of 16 and 32 kb/s – Using MPEG-4 codecs: AAC, BSAC, and TVQ

Comparisons Optimal Linear Combination of PEAQ MOVs: Predictor Fit

[Figure: Subjective Measurement versus Objective Measurement, both on a 0 to 3 scale, with data points and least-squares (LS) fit lines for the Modified Advanced version, the Advanced version, and EAQUAL.]

Comparisons Optimal Linear Combination of PEAQ MOVs: Holdout Case

[Figure: Squared Error and Change in slope plotted against Holdout Case.]
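The slides do not spell out the holdout procedure. One common reading, assumed here, is leave-one-out validation: each case withholds one data point, the linear weights are refit on the rest, and the squared prediction error on the withheld point is recorded. The sketch below covers only the squared-error curve; the "change in slope" quantity is not defined in the slides and is left out. The function name and the bias term are hypothetical.

```python
import numpy as np

def leave_one_out_errors(movs, subjective):
    """Squared prediction error for each withheld data point.

    Assumption: each holdout case removes exactly one row, and the
    predictor is a least-squares linear fit with a bias term.
    """
    n = len(subjective)
    X = np.column_stack([movs, np.ones(n)])  # append bias column
    errors = np.empty(n)
    for k in range(n):
        keep = np.arange(n) != k  # every row except the held-out one
        w, *_ = np.linalg.lstsq(X[keep], subjective[keep], rcond=None)
        errors[k] = (X[k] @ w - subjective[k]) ** 2
    return errors
```

A predictor that generalizes well shows small and even errors across the cases, while a large error on one case suggests the fit depends heavily on that data point.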

Comparisons Optimal Linear Combination, PEAQ MOVs plus EEA

[Figure, two panels. Optimal Fit: Subjective Measurement versus Objective Measurement, with data points and LS fit lines for the Advanced version with EEA MOV and single-layer NN, the Advanced version, and the Advanced version with single-layer NN. Error in Holdout Case: Squared Error and Change in slope plotted against Holdout Case.]

Comparisons Optimal Linear Combination, Bitrate Optimized: Low/Mid Quality Audio

[Figure, two panels. Optimal Fit: Subjective Measurement versus Objective Measurement, with data points and LS fit lines for the Advanced version with energy equalization and the Advanced version with bitrate-based weight selection. Error in Holdout Case: Squared Error and Change in slope plotted against Holdout Case.]

Comparisons Optimal Linear Combination, Bitrate Optimized: High Quality Audio

[Figure, two panels. Optimal Fit: Subjective Measurement versus Objective Measurement. Error in Holdout Case: Squared Error and Change in slope plotted against Holdout Case.]

Note: Perceptual measurements are simulated by treating the ODG values generated by PEAQ-advanced as if they were SDG values acquired through perceptual testing

Conclusions • Combining the EEA truncation threshold with the PEAQ MOVs clearly improves the predictive performance of the metric – The correlation coefficient is increased – The MSE of the prediction error is decreased

• If bitrate information is also available, performance improves significantly further
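The two figures of merit named in the conclusions, the correlation coefficient and the MSE of the prediction error, can be computed as follows. This is a minimal sketch: the function name is hypothetical, and it assumes the correlation meant is the Pearson coefficient between objective predictions and subjective scores.

```python
import numpy as np

def correlation_and_mse(objective, subjective):
    """Pearson correlation and mean squared error between objective
    predictions and subjective scores."""
    objective = np.asarray(objective, dtype=float)
    subjective = np.asarray(subjective, dtype=float)
    r = float(np.corrcoef(objective, subjective)[0, 1])
    mse = float(np.mean((objective - subjective) ** 2))
    return r, mse
```

The two measures are complementary: a constant offset between predictions and subjective scores leaves the correlation at 1 but shows up directly in the MSE.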

Future Work • Design a more complex 3-layer neural network similar to that used in PEAQ to generate the metric’s output from the MOVs • Generate additional subjective data using the more recent MUSHRA testing protocol and use it to more thoroughly validate the proposed metric
