Flexible Integer DCT Architectures for HEVC

Sang Yoon Park and Pramod Kumar Meher
Institute for Infocomm Research, 1 Fusionopolis Way, Singapore-138632
Email: {sypark, pkmeher}@i2r.a-star.edu.sg

Abstract—In this paper, we present high-throughput and power-efficient architectures for the implementation of integer DCT of different lengths to be used in the upcoming High Efficiency Video Coding (HEVC) standard. We show that efficient matrix-multiplication schemes can be used to derive parallel architectures for 1-D integer DCT of different lengths. In addition, we propose three flexible architectures that can implement the DCT of any of the prescribed lengths (4, 8, 16 and 32), each having a particular advantage in terms of area, delay, or power. The proposed architectures provide higher throughput at a lower operating frequency than the existing architectures for HEVC. Furthermore, they can support Ultra-High-Definition (UHD) 7680×4320 @ 30 fps video, which is one of the applications of HEVC.

I. INTRODUCTION

The Joint Collaborative Team on Video Coding (JCT-VC) of the Moving Picture Experts Group (MPEG) and the Video Coding Experts Group (VCEG) is developing a new video coding standard called High Efficiency Video Coding (HEVC) [1], which is positioned as a successor to H.264/MPEG-4 AVC. The upcoming HEVC standard aims to double the compression ratio of H.264/AVC at comparable video quality, at the expense of increased computational complexity, while offering the flexibility to trade off compression ratio, robustness, and processing time. The JCT-VC has recently decided on the integer Discrete Cosine Transform (DCT) of lengths 4, 8, 16, and 32 to be used for HEVC. In this paper, we present a generalized architecture for N-point integer DCT based on optimized hardware-oriented algorithms for different lengths. In addition, we propose three flexible architectures for the implementation of integer DCT of any of the lengths prescribed for HEVC. The proposed architectures provide higher throughput at a lower operating frequency than existing architectures [2], [3].

II. REVIEW OF INTEGER DCT FOR HEVC

Let $X = [x(0), x(1), \cdots, x(N-1)]^T$ be an N-point input vector and $Y = [y(0), y(1), \cdots, y(N-1)]^T$ be the corresponding integer DCT output, given by $Y = C_N X$, where $C_N$ denotes the N-point integer DCT kernel. The kernel matrices $C_N$ for N = 4, 8, 16 and 32 are defined by JCT-VC [4]. The integer DCT $Y = [y(0), y(1), y(2), y(3)]^T$ of a 4-point input vector $X = [x(0), x(1), x(2), x(3)]^T$ is given by [4]

$$
\begin{bmatrix} y(0) \\ y(1) \\ y(2) \\ y(3) \end{bmatrix}
=
\begin{bmatrix}
64 & 64 & 64 & 64 \\
83 & 36 & -36 & -83 \\
64 & -64 & -64 & 64 \\
36 & -83 & 83 & -36
\end{bmatrix}
\begin{bmatrix} x(0) \\ x(1) \\ x(2) \\ x(3) \end{bmatrix}.
\tag{1}
$$

The DCT of (1) can be separated into even-indexed and odd-indexed components and simplified as

$$
\begin{bmatrix} y(0) \\ y(2) \end{bmatrix}
=
\begin{bmatrix} 64 & 64 \\ 64 & -64 \end{bmatrix}
\begin{bmatrix} a(0) \\ a(1) \end{bmatrix}
\tag{2}
$$

and

$$
\begin{bmatrix} y(1) \\ y(3) \end{bmatrix}
=
\begin{bmatrix} 83 & 36 \\ 36 & -83 \end{bmatrix}
\begin{bmatrix} b(0) \\ b(1) \end{bmatrix}
\tag{3}
$$

where a(0) = x(0) + x(3), a(1) = x(1) + x(2), b(0) = x(0) − x(3) and b(1) = x(1) − x(2). As in the case of the 4-point DCT, the computation of the 8-, 16- and 32-point DCT can also be simplified to calculate the even-indexed and odd-indexed DCT coefficient vectors as

$$
\begin{bmatrix} y(0) \\ y(2) \\ \vdots \\ y(N-4) \\ y(N-2) \end{bmatrix}
= C_{N/2}
\begin{bmatrix} a(0) \\ a(1) \\ \vdots \\ a(N/2-2) \\ a(N/2-1) \end{bmatrix}
\tag{4a}
$$

and

$$
\begin{bmatrix} y(1) \\ y(3) \\ \vdots \\ y(N-3) \\ y(N-1) \end{bmatrix}
= M_{N/2}
\begin{bmatrix} b(0) \\ b(1) \\ \vdots \\ b(N/2-2) \\ b(N/2-1) \end{bmatrix}
\tag{4b}
$$

where

$$ a(i) = x(i) + x(N-i-1) \tag{5a} $$

$$ b(i) = x(i) - x(N-i-1) \tag{5b} $$

for i = 0, 1, ..., N/2 − 1. $C_{N/2}$ is an N/2-point integer DCT kernel matrix and $M_{N/2}$ is a matrix of size (N/2) × (N/2) whose (i, j)th entry is defined as

$$ m_{N/2}^{i,j} = c_N^{2i+1,j} \quad \text{for } 0 \le i, j \le N/2 - 1 \tag{6} $$

where $c_N^{2i+1,j}$ is the (2i + 1, j)th entry of $C_N$. Note that the N/2-point DCT given by (4a) could be computed using a similar hierarchical decomposition.
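As a quick numerical check of the decomposition in (1) to (3), the following Python sketch computes the 4-point integer DCT twice: directly from the kernel of (1), and through the even/odd split of (2), (3) and (5). The function names are illustrative only and do not come from the paper.

```python
# Kernel C4 of Eq. (1).
C4 = [
    [64,  64,  64,  64],
    [83,  36, -36, -83],
    [64, -64, -64,  64],
    [36, -83,  83, -36],
]


def dct4_direct(x):
    """Y = C4 X, evaluated as a plain matrix-vector product."""
    return [sum(c * v for c, v in zip(row, x)) for row in C4]


def dct4_even_odd(x):
    """4-point DCT via the even/odd split of Eqs. (2), (3) and (5)."""
    # Eq. (5) with N = 4: butterfly sums and differences.
    a = [x[0] + x[3], x[1] + x[2]]
    b = [x[0] - x[3], x[1] - x[2]]
    # Eq. (2): even-indexed outputs from the 2x2 kernel.
    y0 = 64 * a[0] + 64 * a[1]
    y2 = 64 * a[0] - 64 * a[1]
    # Eq. (3): odd-indexed outputs from M2.
    y1 = 83 * b[0] + 36 * b[1]
    y3 = 36 * b[0] - 83 * b[1]
    return [y0, y1, y2, y3]


if __name__ == "__main__":
    sample = [10, -3, 7, 255]  # arbitrary test vector
    print(dct4_direct(sample))
    print(dct4_even_odd(sample))
```

Both functions should return the same vector for any integer input. As written, the direct product uses 16 multiplications while the split form uses 8.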

III. HARDWARE-ORIENTED ALGORITHMS FOR INTEGER DCT COMPUTATION

As shown in Section II, the computation of an N-point integer DCT involves the computation of an (N/2)-point integer DCT and the multiplication of an (N/2)-point vector with a constant matrix $M_{N/2}$ of size (N/2) × (N/2). The computation of the integer DCT is thus reduced to the problem of constant matrix multiplication (CMM), which can be implemented as a set of multiple constant multiplications (MCM). As an example, for the implementation of the 4-point integer DCT, according to (3), we have y(1) = 83b(0) + 36b(1) and y(3) = 36b(0) − 83b(1). One can perform two pairs of constant multiplications, {r00 = 83b(0), r10 = 36b(0)} and {r01 = 36b(1), r11 = 83b(1)}, and then obtain y(1) = r00 + r01 and y(3) = r10 − r11. Similarly, this feature can be used to implement the multiplication of an M × M matrix with an M-point vector by a set of M multiple constant multiplications. Based on this approach, we can obtain hardware-oriented algorithms for integer DCT of lengths 4, 8, 16 and 32.

In general, the N-point integer DCT given in Section II can be computed in three stages. STAGE-1 involves the additions/subtractions of (5). STAGE-2 involves the multiplication of each coefficient in $M_{N/2}$ with b(i). For the 4-point DCT, STAGE-2 is computed as

$$ m_{i,9} = 9b(i) = (b(i) \ll 3) + b(i) \tag{7a} $$

$$ t_{i,83} = 83b(i) = (b(i) \ll 6) + (m_{i,9} \ll 1) + b(i) \tag{7b} $$

$$ t_{i,36} = 36b(i) = m_{i,9} \ll 2 \tag{7c} $$

for i = 0 and 1. Detailed computations of STAGE-2 for lengths 8, 16, and 32 can be obtained from a multiplier block generator using the MCM approach [5]. In STAGE-3, the final additions/subtractions involved in the matrix calculation of (4b) are performed. For the 4-point DCT, the third stage is calculated as

$$ y(1) = t_{0,83} + t_{1,36}, \qquad y(3) = t_{0,36} - t_{1,83}, \tag{8} $$

and STAGE-3 for the DCT of higher lengths can also be derived from (4b) directly. For the 4-point DCT, [y(0), y(2)] is obtained using a simple calculation as y(0) = (a(0) << 6) + (a(1) << 6) and y(2) = (a(0) << 6) − (a(1) << 6). For any N-point DCT, the even-indexed output [y(0), y(2), ..., y(N − 2)] can be computed using the N/2-point DCT of [a(0), a(1), ..., a(N/2 − 1)].

IV. FIXED ARCHITECTURE FOR INTEGER DCT

A generalized architecture for N-point integer DCT based on the optimized hardware-oriented algorithm of Section III is shown in Fig. 1. It consists of four units, namely the input-adder-unit (IAU), the N/2-point integer DCT unit, the shift-add-unit (SAU) and the output-adder-unit (OAU). The IAU computes a(i) and b(i) for i = 0, 1, ..., N/2 − 1 according to STAGE-1 of Section III. The SAU provides the multiplication results of b(i) with the coefficients of $M_{N/2}$ by the algorithm of STAGE-2. Finally, the OAU generates the output of the DCT via an adder-tree with (log2 N − 1) stages using the algorithm of STAGE-3. The detailed structures of the IAU, SAU and OAU for an 8-point integer DCT are shown in Fig. 2. Each SAU computes $t_{i,89}$, $t_{i,75}$, $t_{i,50}$, and $t_{i,18}$ according to STAGE-2 of the algorithm for i = 0, 1, 2, and 3, and four equivalent SAUs are accordingly instantiated. The outputs of the SAUs are finally added by a two-stage adder-tree based on STAGE-3, as shown in Fig. 2(c).

Fig. 1. A generalized architecture of 8-, 16- and 32-point integer DCT.

Fig. 2. Proposed architecture of 8-point integer DCT. (a) input adder unit. (b) SAU-i, for i = 0, 1, 2 and 3. (c) output adder unit.

V. FLEXIBLE ARCHITECTURES FOR INTEGER DCT

In this section, we discuss three flexible architectures for the computation of integer DCT of different lengths for HEVC.

A. Flexible Architecture-1

The architecture of Fig. 3(a) is the simplest modification of that of Fig. 1: the input to the (N/2)-point DCT unit is fed through a MUX-unit that selects either [a(0), ..., a(N/2 − 1)] or [x(0), ..., x(N/2 − 1)], depending on whether the structure is used for N-point DCT computation or for the DCT of a lower size, respectively. Such a structure for the 32-point DCT can be used to compute a 16-point, an 8-point or a 4-point integer DCT by appropriate selection of inputs by the MUX-unit, which consists of N/2 2:1 MUXes.
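The input selection of architecture-1 can be illustrated with a small behavioural model for N = 8. This is a sketch under stated assumptions, not the authors' implementation: the function names and the `sel` convention are hypothetical, the entries of M4 are assumed to be the odd rows of the HEVC 8-point core transform (the paper does not list them), and the SAU/OAU pair is modelled as a plain matrix product rather than as shift-add logic. The 4-point unit follows the shift-add form of (7) and (8).

```python
# ASSUMPTION: M4 holds the odd rows of the HEVC 8-point core transform
# restricted to the first four columns; the paper does not list these values.
M4 = [
    [89,  75,  50,  18],
    [75, -18, -89, -50],
    [50, -89,  18,  75],
    [18, -50,  75, -89],
]


def dct4_shift_add(x):
    """4-point unit built from the shift-add products of Eqs. (7) and (8)."""
    a0, a1 = x[0] + x[3], x[1] + x[2]            # Eq. (5a)
    b = [x[0] - x[3], x[1] - x[2]]               # Eq. (5b)
    t83, t36 = [], []
    for bi in b:
        m9 = (bi << 3) + bi                      # Eq. (7a): 9 b(i)
        t83.append((bi << 6) + (m9 << 1) + bi)   # Eq. (7b): 83 b(i)
        t36.append(m9 << 2)                      # Eq. (7c): 36 b(i)
    y0 = (a0 << 6) + (a1 << 6)
    y2 = (a0 << 6) - (a1 << 6)
    y1 = t83[0] + t36[1]                         # Eq. (8)
    y3 = t36[0] - t83[1]
    return [y0, y1, y2, y3]


def flexible_arch1(x, sel):
    """sel = 1: 8-point DCT of x[0..7]; sel = 0: 4-point DCT of x[0..3]."""
    n = 8
    # Input adder unit, Eq. (5).
    a = [x[i] + x[n - 1 - i] for i in range(n // 2)]
    b = [x[i] - x[n - 1 - i] for i in range(n // 2)]
    # MUX-unit: a(i) for the full length, x(i) directly for the lower length.
    unit_in = a if sel else x[: n // 2]
    even = dct4_shift_add(unit_in)
    if not sel:
        return even
    # SAU + OAU modelled as the matrix product of Eq. (4b).
    odd = [sum(m * v for m, v in zip(row, b)) for row in M4]
    y = [0] * n
    y[0::2] = even   # y(0), y(2), y(4), y(6)
    y[1::2] = odd    # y(1), y(3), y(5), y(7)
    return y
```

In the lower-size mode the samples x(4) to x(7) are simply ignored by this model, which is consistent with the statement that only the input of the (N/2)-point unit is switched.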

Fig. 3. Flexible architectures for integer DCT computation. (a) architecture-1. (b) architecture-2.

B. Flexible Architecture-2

The architecture of Fig. 3(b) involves an extra (N/2)-point DCT unit over the structure of Fig. 3(a), which takes the input [x(N/2), ..., x(N − 1)]. The output of this additional (N/2)-point DCT unit is multiplexed with that of the OAU fed by the SAUs of the structure. For N = 32, this structure can compute one 32-point DCT, two 16-point DCTs, four 8-point DCTs or eight 4-point DCTs, while the throughput remains the same at 32 DCT coefficients per cycle irrespective of the desired transform size.
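The constant throughput of architecture-2 can be illustrated at the behavioural level for the 4-point mode: a 32-sample input is treated as eight independent 4-point blocks, and all 32 coefficients are produced in one pass. The sketch below only models the input/output relation; it says nothing about how the blocks are routed through the DCT units, SAUs and MUXes in hardware, and the function names are hypothetical.

```python
# Kernel C4 of Eq. (1).
C4 = [
    [64,  64,  64,  64],
    [83,  36, -36, -83],
    [64, -64, -64,  64],
    [36, -83,  83, -36],
]


def dct4(x):
    """4-point integer DCT as a matrix-vector product."""
    return [sum(c * v for c, v in zip(row, x)) for row in C4]


def arch2_cycle_4pt(x32):
    """One cycle in 4-point mode: eight independent 4-point DCTs."""
    if len(x32) != 32:
        raise ValueError("expected 32 input samples")
    out = []
    for k in range(0, 32, 4):
        out.extend(dct4(x32[k:k + 4]))
    return out  # always 32 coefficients
```

The same bookkeeping applies to the other modes, with four 8-point blocks, two 16-point blocks, or a single 32-point block per cycle.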

C. Flexible Architecture-3

Fig. 4(a) shows the third of the proposed architectures for flexible DCT computation. To compute the N-point DCT, the (N/2)-point integer DCT unit performs the computation of (4a), whereas the configurable SAU (CSAU) and the OAU perform the computation of (4b), which is similar to that of the architecture of Fig. 1. To reuse this architecture for the computation of the (N/2)-point DCT, the (N/2)-point integer DCT unit computes one (N/2)-point DCT, and the CSAU and OAU compute the other (N/2)-point DCT, providing two (N/2)-point DCTs. Similarly, for the computation of the (N/4)-point DCT, the (N/2)-point DCT unit computes two (N/4)-point DCTs, and the CSAU and OAU compute the other two (N/4)-point DCTs, producing a total of four (N/4)-point DCTs.

Fig. 4. Flexible architectures for integer DCT computation. (a) architecture-3. (b) reusable SAU-1 for N = 8. (c) OAU for N = 8.

Fig. 4(b) shows the structure of CSAU-1, which is one of the four CSAUs used for N = 8. For the 4-point DCT, CSAU-1 multiplies x(5) with [64, 36, 64, 83], the second column of (1) ignoring the signs of the coefficients. This is part of the 4-point DCT computation of [x(4), x(5), x(6), x(7)]. For the computation of the 8-point DCT, CSAU-1 multiplies b(1) with [89, 75, 50, 18], which is the second column of $M_4$. As shown in Fig. 4(b), the CSAU can be designed based on the SAU in Fig. 2(b) with several MUXes; it does not need any additional adder. Each CSAU is designed differently, so that its last four MUXes select different combinations of coefficients corresponding to the configuration, whereas the SAUs in Fig. 1 and Fig. 3 all have the same structure. Since the signs of the coefficients may also differ between configurations, the OAU is implemented using add/sub units instead of adders or subtractors in some locations, as shown in Fig. 4(c).

The MUX-unit-3 is not required for N = 8, since the CSAU and the OAU can generate the outputs corresponding to the 4-point DCT or the 8-point DCT using the necessary MUXes. However, when architecture-3 for N = 16 is used for the 4-point DCT, the outputs of stage-2 in the three-stage adder-tree should be directed to the output y(i), unlike in the case of the 8-point DCT and the 16-point DCT. Therefore, a MUX-unit-3 consisting of an array of 2:1 MUXes is required to select between the output of stage-2 and the output of stage-3. Similarly, a MUX-unit-3 consisting of 3:1 MUXes is used in the architecture for N = 32.

VI. HARDWARE AND TIME COMPLEXITY

The proposed architectures have been coded in VHDL and synthesized by Synopsys Design Compiler using the TSMC 90nm CMOS library for N = 32. The data-arrival-time (DAT), gate count, pixels/cycle, area-delay-product (ADP), and energy-per-sample (EPS) (at 100 MHz clock frequency) are shown in Table I. Since flexible architecture-2 and architecture-3 yield 32 samples in every cycle, the 2-D transform of a 32×32 block can be processed in 64 cycles (16 pixels/cycle) with a 2-D transposition memory for all DCT sizes. The throughput of architecture-1 varies with the DCT size. Therefore, we have averaged it over all the transform sizes (4, 8, 16 and 32) and listed the average in Table I. The ADP and EPS of architecture-1 are almost the same as those of architecture-2, whereas architecture-3 involves more area and a longer delay but consumes less energy than the other two.

In Table II, we compare the proposed flexible architecture-2 and architecture-3 with existing architectures for HEVC for N = 32. The proposed designs offer much higher throughput than the existing designs with a marginal overhead in gate count. To support 8K Ultra-High-Definition (UHD) video (7680 × 4320) at 30 frames/sec in 4:2:0 YUV format, which is one of the applications of HEVC [6], the proposed flexible architecture-2 and architecture-3 require a maximum operating frequency of 94 MHz (7680 × 4320 × 30 × 1.5/16 ≈ 93.3 × 10^6 cycles per second). The designs of [2] and [3], however, require frequencies of 381 MHz and 702 MHz, since their 2-D transform of a 32×32 block takes 261 cycles and 481 cycles, respectively, instead of 64. The higher operating frequency leads to more power consumption in the transposition buffer for the 2-D transform as well as in the DCT core. In contrast, an operating frequency of 94 MHz can be obtained by the proposed architectures using TSMC 0.15 µm or newer technologies, as shown in Table II.

TABLE I
SYNTHESIS RESULT OF THE PROPOSED 1-D INTEGER DCT ARCHITECTURES

| design | DAT (ns) | gate count | pels/cycle | ADP (ns/(pels·gate)) | EPS (mW/pels) |
|---|---|---|---|---|---|
| Fixed | 2.65 | 53K | − | − | − |
| Flexible-1 | 4.54 | 92K | 10.06 | 41855 | 2.192 |
| Flexible-2 | 4.84 | 141K | 16 | 42819 | 2.040 |
| Flexible-3 | 5.95 | 156K | 16 | 58234 | 1.574 |

TABLE II
COMPARISON OF DIFFERENT 1-D INTEGER DCT ARCHITECTURES WITH THE PROPOSED ARCHITECTURES

| design | tech. (µm) | gate count | freq. (MHz) | pels/cycle | support |
|---|---|---|---|---|---|
| [2] | 0.13 | 134K | 350 | 4 | 4096x2048@30fps |
| [3] | 0.18 | 52K | 300 | 2.12 | 3840x2160@30fps |
| Flexible-2 | 0.15 | 105K | 94 | 16 | 7680x4320@30fps |
| Flexible-3 | 0.15 | 127K | 94 | 16 | 7680x4320@30fps |

VII. SUMMARY AND CONCLUSIONS

In this paper, we have proposed high-throughput and power-efficient architectures for the implementation of integer DCT of different lengths to be used in HEVC. The computation of the N-point 1-D DCT is performed recursively using an (N/2)-point 1-D DCT unit and a constant matrix multiplication unit. We have proposed three flexible architectures that can compute the integer DCT of different sizes. One major advantage of the proposed architectures is that they provide higher throughput at lower operating frequencies than existing architectures for HEVC. This feature could be leveraged to realize low-energy computation of the DCT. Furthermore, they can support an Ultra-High-Definition (UHD) 7680×4320 4:2:0 video sequence at 30 fps with a 94 MHz operating frequency and a 105K gate count.

REFERENCES

[1] Joint Collaborative Team on Video Coding (JCT-VC), "High Efficiency Video Coding (HEVC) Text Specification Draft 8, JCTVC-J1003."
[2] S. Shen, W. Shen, Y. Fan, and X. Zeng, "A unified 4/8/16/32-point integer IDCT architecture for multiple video coding standards," in IEEE International Conference on Multimedia and Expo (ICME), 2012, pp. 788–793.
[3] J.-S. Park, W.-J. Nam, S.-M. Han, and S. Lee, "2D large inverse transform (16 × 16, 32 × 32) for HEVC (high efficiency video coding)," Journal of Semiconductor Technology and Science, vol. 12, no. 2, pp. 203–211, 2012.
[4] Joint Collaborative Team on Video Coding (JCT-VC), "CE10: Core Transform Design for HEVC, JCTVC-G495."
[5] Software/Hardware Generation of DSP Algorithms. [Online]. Available: http://www.spiral.net/
[6] G. J. Sullivan, J.-R. Ohm, W.-J. Han, and T. Wiegand, "Overview of the high efficiency video coding (HEVC) standard," IEEE Transactions on Circuits and Systems for Video Technology, to be published.
