Journal of Shanghai University (English Edition), 2007, 11(4): 380–384 Digital Object Identifier (DOI): 10.1007/s11741-007-0412-2

A reordered first fit algorithm based novel storage scheme for parallel turbo decoder

ZHANG Le, HE Xiang, XU You-yun, LUO Han-wen

Department of Electronic Engineering, Shanghai Jiaotong University, Shanghai 200240, P. R. China

Abstract In this paper we discuss a novel storage scheme for simultaneous memory access in a parallel turbo decoder. The new scheme employs vertex coloring from graph theory. Compared to a similar method that also stores data in an unnatural order, our scheme requires 2∼5 more memory blocks but allows a simpler configuration for variable code lengths, one that can be implemented on-chip. Experiments show that for a moderate to high decoding throughput (40∼100 Mbps), the hardware cost is still affordable for the 3GPP (3rd Generation Partnership Project) interleaver.

Keywords turbo codes, parallel turbo decoding, interleaver, vertex coloring, reordered first fit algorithm (RFFA), field programmable gate array (FPGA).

1 Introduction

During field programmable gate array (FPGA) implementation of a turbo decoder, a substantial amount of memory is assigned to store channel information and extrinsic information. A decoder using the parallel maximum a posteriori (MAP) algorithm contains multiple soft-input soft-output (SISO) modules[1], so parallel access to these memories is required. In hardware terms, this means that data required by different SISOs at the same time must not be stored in the same RAM block. Fig.1(a) and Fig.1(b) illustrate memory access during one iteration of turbo decoding, which is conceptually divided into 2 phases: (1) decoding against the 1st component code, and (2) decoding against the 2nd component code. During each phase, the trellis of the component code is divided into three segments, each handled by one SISO module. Suppose we avoid memory access contention in the 1st phase by storing the SISOs’ outputs in 3 physically different RAM blocks. During the 2nd phase, however, the previously separate write addresses from the SISOs are translated by the interleaver Π. They can end up in the same RAM block, so memory access contention still exists.

Designing the interleaving pattern wisely can prevent such collisions[1], and empirical results show that these contention-free interleavers can yield performance similar to that of conventional interleavers designed for serial implementation[2,3]. However, we notice that they are all constructed in a semi-random way so that at least

Fig.1 Memory access in parallel turbo decoder: (a) decoding the first component code; (b) decoding the second component code (a collision on one RAM block is marked)

one part of the interleaver pattern must be stored explicitly, which can be inconvenient when support for different code lengths is required. A solution is provided in [4], but it still requires explicit storage of the interleaver of the longest size; interleavers of other lengths are then obtained by pruning this longest interleaver pattern. Other works try to solve this problem without redesigning the interleaver pattern, partly because in some applications the interleaving pattern is predefined as part of the standard[5]. Also, many excellent variable-length real-time addressable interleavers exist, but they are not contention free[6]. In [7], an architecture was proposed which buffers memory access requests when they point to the same address, but it requires a special hardware structure unavailable in today’s FPGAs. The idea

Received Oct.25, 2005; Revised Feb.22, 2006 Project supported by the National High-Technology Research and Development Program of China (Grant No.2003AA123310), and the National Natural Science Foundation of China (Grant Nos.60332030, 60572157) Corresponding author ZHANG Le, PhD Candidate, E-mail: [email protected]

Vol. 11

No. 4

Aug. 2007

ZHANG L, et al. : A reordered first fit algorithm based novel storage scheme ...

of resolving memory access contention by storing data in an unnatural order was first introduced in [8], which proves that, in this way, P RAM blocks are adequate to support P concurrent memory accesses. It provides the lower bound on the memory requirement and an ‘anneal procedure’ which computes the memory storage order offline.

In this paper we follow the idea of unnatural-order storage proposed in [8] but try to make the calculation of the storage order simple enough to be implemented on-chip. The new storage order calculation borrows its idea from graph theory and is in essence a serial vertex coloring algorithm using greedy heuristics[9]. Simulation with the 3GPP interleaver pattern shows that the solution given by this method requires a few more memory blocks than [8], but the storage order calculation module can be implemented with simple logic, which reconfigures the decoder when the code length changes within O(10L) clock cycles, where L is the number of information bits. We also find that the major bottleneck of this unnatural-order storage scheme is interconnection, in that the number of required tri-state buffers increases significantly for high throughput. However, we verify that for moderate decoding throughput (40∼50 Mbps), the required number of tri-state buffers is still affordable for 5 iterations and an 80∼100 MHz system clock. For applications in a WCDMA system, the peak throughput can even reach 100 Mbps.

The rest of the paper is organized as follows. Section 2 models concurrent memory access as a vertex coloring problem. Section 3 explains the resultant architecture of the turbo decoder. Section 4 discusses the design of the vertex coloring algorithm in the light of the interconnection bottleneck. We also show how to implement this algorithm in hardware. Finally, to test the viability of our scheme, we implement it on a Xilinx Virtex II Pro xc2vp70 FPGA.

2 Memory access as a vertex-coloring problem

Graph coloring was first used to solve a register allocation problem in compiler design[10]. The same principle can be borrowed to solve the memory access problem of Section 1. It may be described as follows.

(1) Every stage of the trellis of the component code is modeled as a vertex. Any two vertices are connected with an (undirected) edge if and only if data related to these two stages are accessed simultaneously by different SISO processors[2] in the ‘parallel turbo decoding algorithm’.

(2) Let each color represent a RAM block. As long as any two adjacent vertices are labeled with different colors, memory access will be collision free. This is exactly the ‘vertex coloring’ problem in graph theory.


(3) To use as few RAM blocks as possible, the number of colors in use should be minimized.

Principles (2) and (3) should be obvious. The graph construction of principle (1) is demonstrated with an example in Fig.2, where there are 2 SISO processors and the trellis length is 8. Assume the interleaving pattern π(x) is 5, 3, 7, 8, 1, 4, 6, 2 for x = 1, 2, · · · , 8. During the 1st phase of one iteration, trellis stages requiring simultaneous access are paired as {1, 5}, {2, 6}, {3, 7}, {4, 8} according to Fig.2(a). During the 2nd phase, they are paired as {5, 1}, {8, 7}, {2, 3}, {6, 4}. In the following, we call edges resulting from the 1st phase, which are drawn below the numbers, ‘low edges’, and edges from the 2nd phase, which are drawn above the numbers, ‘high edges’.

Fig.2 Construction of the graph from the interleaving pattern when 2 SISO processors are used and the trellis has 8 stages: (a) before interleaving, stages 1 2 3 4 go to SISO1 and 5 6 7 8 to SISO2; (b) after interleaving, stages 5 8 2 6 go to SISO1 and 1 7 3 4 to SISO2; (c) the graph
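The construction in principle (1) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names `build_conflict_graph` and `edge_list`, the 0-based indexing and the assumption that P divides the trellis length are choices made here; the interleaver is the 8-stage example of Fig.2.

```python
def build_conflict_graph(pi, num_siso):
    """Return adjacency sets of the memory-conflict graph (0-based stages).

    pi[x] is the interleaved position of stage x; num_siso (P) is assumed
    to divide len(pi), as in the Fig.2 example.
    """
    length = len(pi)
    seg = length // num_siso              # stages handled by one SISO
    inv = [0] * length                    # inv[y] = stage found at interleaved position y
    for x, y in enumerate(pi):
        inv[y] = x

    adj = {v: set() for v in range(length)}
    for m in range(seg):
        # Stages touched in the same clock cycle by the P SISOs.
        low = [p * seg + m for p in range(num_siso)]          # 1st phase
        high = [inv[p * seg + m] for p in range(num_siso)]    # 2nd phase
        for group in (low, high):
            for a in group:
                for b in group:
                    if a != b:
                        adj[a].add(b)
    return adj


def edge_list(adj):
    """Sorted list of undirected edges, reported with 1-based stage numbers."""
    return sorted((a + 1, b + 1) for a in adj for b in adj[a] if a < b)


# Fig.2 example: pi(x) = 5, 3, 7, 8, 1, 4, 6, 2 for x = 1..8, shifted to 0-based.
PI_FIG2 = [y - 1 for y in (5, 3, 7, 8, 1, 4, 6, 2)]
print(edge_list(build_conflict_graph(PI_FIG2, 2)))
```

The pair {1, 5} appears in both phases, so the example graph has seven distinct edges rather than eight.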

3 Hardware design

For explanation purposes, the channel and extrinsic information of one trellis stage is simply called an ‘element’. Let the number of stages in the trellis be M and the number of SISO processors be P, and let the graph be colored with χ colors. Then access collisions can be avoided by storing the M elements in χ memory blocks, each with a capacity of $\lceil L/P \rceil$ elements, as follows: the data element of stage i (i = 0, · · · , M − 1) is stored in the cth RAM at position (i mod $\lceil L/P \rceil$), where c is the stage’s color.

Fig.3 shows the turbo decoder’s overall architecture. Only the extrinsic values of the information bits are stored with the new scheme, while the channel values of the parity check bits are still stored in natural order in the ordinary way. The channel values of the information bits, which remain unchanged over the decoding period, are stored twice, in natural and interleaved order respectively (see Fig.8 in [3]). In Fig.3, ‘x’ indicates the part of a RAM which is occupied by a data element. The P tables store the c values of the data elements used by each SISO. Since every SISO must support two kinds of input order, every table has a size of $2\lceil\log_2\chi\rceil \cdot 2^{\lceil\log_2\lceil L/P\rceil\rceil}$, which is 2K when χ = 10 and $\lceil L/P\rceil$ = 256. During the ‘learning period’[8], each SISO consults its neighbor’s table, so the connections between SISO processors and tables may switch to the ‘dash-dot’ lines in the figure.
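As a small check of the storage rule and the table size, the sketch below maps each stage to a (RAM block, position) pair and evaluates the table-size expression. The function names are hypothetical, the coloring used is one valid 2-coloring of the Fig.2 graph (0-based stages), and reading the table size as a number of bits is an assumption made here.

```python
def ceil_log2(x):
    """Smallest b with 2**b >= x, for integer x >= 1."""
    return (x - 1).bit_length()


def storage_location(stage, color, length, num_siso):
    """(RAM block, position) of a stage: block = color, position = stage mod ceil(L/P)."""
    seg = -(-length // num_siso)          # ceil(L / P)
    return color, stage % seg


def table_size(chi, seg):
    """2 * ceil(log2(chi)) * 2**ceil(log2(seg)); assumed here to count bits."""
    return 2 * ceil_log2(chi) * 2 ** ceil_log2(seg)


# One valid coloring of the Fig.2 example graph (L = 8, P = 2), 0-based stages.
colors = [0, 0, 1, 0, 1, 1, 0, 1]
locations = [storage_location(i, c, 8, 2) for i, c in enumerate(colors)]
assert len(set(locations)) == len(locations)   # no two stages share a RAM cell

print(locations)
print(table_size(10, 256))   # 2048, the "2K" quoted in the text
```

Stages with the same position (i mod ⌈L/P⌉) are exactly those accessed together in the 1st phase, so a proper coloring already guarantees that they land in different RAM blocks.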

Fig.3 Turbo decoder architecture (P = 4, χ = 6, L = 20): a ‘config’ module, four SISO processors with their tables (Table1 to Table4), a selection network, a RAM stack of χ = 6 blocks with $\lceil L/P \rceil$ elements per block for the extrinsic values of the information bits, and RAM blocks RAM1 to RAM4 for channel values

Data exchange between the SISO processors and the RAM stack is done via a ‘selection network’, whose internals are shown in Fig.4. The switches are implemented with tri-state buffers. We define $\chi_p$ as the number of output ports of the pth switch. The input is copied to the port indicated by the control signal (drawn in dots), and all other ports are left in the high-impedance state. The port width of switch$_p$ (for read) is $\lceil\log_2\lceil L/P\rceil\rceil$. The port width of switch$_p$ (for write) is $\lceil\log_2\lceil L/P\rceil\rceil + w$, where w is the word length of the data. Thus the overall number of tri-state buffers can be calculated as

$$\sum_{p=1}^{P}\left(2\lceil\log_2\lceil L/P\rceil\rceil + w\right)\chi_p .$$

Fig.4 Selection network with concurrent read and write support
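The buffer count can be evaluated directly. The following sketch is illustrative only: the function name and the choice $\chi_p = P$ for every switch are assumptions made for the example.

```python
def tristate_buffers(length, num_siso, word_len, chi_per_switch):
    """Sum over switches of (2 * ceil(log2(ceil(L/P))) + w) * chi_p."""
    seg = -(-length // num_siso)              # ceil(L / P)
    addr_bits = (seg - 1).bit_length()        # ceil(log2(seg))
    return sum((2 * addr_bits + word_len) * chi for chi in chi_per_switch)


# L = 2048, P = 8, w = 9, every switch connected to all P RAM blocks.
print(tristate_buffers(2048, 8, 9, [8] * 8))     # 1600

# Keeping L/P = 256 and chi_p = P, the count grows as 25 * P**2.
for p in (4, 8, 16):
    print(p, tristate_buffers(256 * p, p, 9, [p] * p))
```

With L/P and w fixed, each switch contributes a constant factor times $\chi_p$, so the quadratic growth in P comes entirely from having P switches with P ports each.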

In practice we find that the number of tri-state buffers can be overwhelming: it equals 1600 when L = 2048, P = 8, w = 9 and $\chi_p \equiv P$, which is about 10% of all available tri-state buffers on a Xilinx xc2vp70 FPGA with 7 million gates. We also notice that if the decoder’s latency is fixed, which means $2\lceil\log_2\lceil L/P\rceil\rceil + w$ remains constant, the number of tri-state buffers grows as $O(P^2)$. This rapid increase in hardware consumption is mainly due to the ‘time varying’ nature of the switches and is called the ‘interconnection bottleneck’[7]. Designing the coloring scheme properly can alleviate this problem. We do this by restricting the total number of colors seen by a SISO processor when it decodes the two component codes. If $\chi_p$ can be restricted, the total tri-state buffer consumption will decrease.

The resultant ‘reordered first fit’ algorithm is described as follows. Let $n = \lceil L/P\rceil$, and let π(x) be the interleaved index of x. For simplicity, we assume P divides L. Let $A_p$ be the set of indices of the elements processed by one SISO processor, defined as

$$A_p = \{pn + m,\ m = 0, 1, \cdots, n-1\} \cup \{\pi^{-1}(pn + m),\ m = 0, 1, \cdots, n-1\}.$$

For p = 0, 1, · · · , P − 1, color the vertices in set $A_p$ as follows. Look at every vertex adjacent to the current vertex and record all colors (if any) already used by them. Let the smallest color not used by the adjacent vertices be the vertex’s color.

The difference between our algorithm and the canonical first fit algorithm[9], which may also be used here, is that the latter colors vertices in order from index 0 to L − 1. Here we color the vertices in set $A_1$ first, then the vertices in set $A_2$, and so on. Every vertex is therefore colored twice, since it appears once through its natural index and once through $\pi^{-1}$. This, in theory, will not increase the coloring latency significantly if some additional complexity is introduced to jump over colored vertices.

The reordered first fit algorithm and its canonical version are compared in terms of the number of tri-state buffers and χ in Fig.5, in which L/P is kept constant at 256 and P is increased from 4 to 16. The number of tri-state buffers is calculated as $\sum_{p=1}^{P}(2\lceil\log_2\lceil L/P\rceil\rceil + w)\chi_p$ with w = 9 and L/P = 256. We also provide the estimated number of tri-state buffers for [8], whose results imply $\chi_p \equiv P$. According to Fig.5(a), the reordered first fit algorithm saves 12%∼20% of the tri-state buffers compared to the canonical first fit algorithm. It uses even fewer tri-state buffers than [8] for P = 16. According to Fig.5(b), the coloring scheme uses 2∼5 more RAM blocks than [8]. In practice, this additional cost can be alleviated by implementing the last RAM block with slices, as it usually hosts fewer than 16 elements.

The reordered first fit algorithm makes the connection network ‘irregular’, in that the $\chi_p$ differ from each other. When different code lengths are supported, we must make sure that connections saved under one code length are unlikely to reappear under another code length. This is ensured by the following 2 observations: (1) color indices are assigned to $A_p$ sequentially, that is, if color index i appears in $A_p$, all colors with index smaller than i must appear in $A_p$ as well; (2) $\chi_p$ generally increases monotonically with p because of the greedy nature of our algorithm. This is verified in Fig.6. Here the $\chi_p$ for each p is its maximum value over the code lengths listed in Fig.6. Calculation with $\sum_{p=1}^{P}(2\lceil\log_2\lceil L/P\rceil\rceil + w)\chi_p$ shows that 15% of the tri-state buffers are saved.
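A software sketch of the two vertex orders may help compare them. It is not the authors' hardware: the helper names are hypothetical, stages are 0-based, P is assumed to divide L, and within each $A_p$ the natural indices are visited before the de-interleaved ones (the text does not fix this inner order). Already colored vertices are skipped rather than recolored.

```python
def conflict_graph(pi, num_siso):
    """Adjacency sets: stages accessed in the same cycle by different SISOs."""
    length, seg = len(pi), len(pi) // num_siso
    inv = [0] * length
    for x, y in enumerate(pi):
        inv[y] = x
    adj = {v: set() for v in range(length)}
    for m in range(seg):
        for group in ([p * seg + m for p in range(num_siso)],
                      [inv[p * seg + m] for p in range(num_siso)]):
            for a in group:
                adj[a].update(b for b in group if b != a)
    return adj, inv


def first_fit(order, adj):
    """Greedy coloring: each vertex gets the smallest color unused by its neighbors."""
    color = {}
    for v in order:
        if v in color:                      # second visit of a vertex: keep its color
            continue
        used = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color


def siso_sets(inv, num_siso):
    """A_p as ordered lists: natural indices, then de-interleaved indices."""
    seg = len(inv) // num_siso
    return [[p * seg + m for m in range(seg)] + [inv[p * seg + m] for m in range(seg)]
            for p in range(num_siso)]


def colors_per_siso(color, sets):
    """chi_p: number of distinct colors (RAM blocks) seen by each SISO."""
    return [len({color[v] for v in a_p}) for a_p in sets]


pi = [y - 1 for y in (5, 3, 7, 8, 1, 4, 6, 2)]        # Fig.2 example, 0-based
adj, inv = conflict_graph(pi, 2)
sets = siso_sets(inv, 2)

canonical = first_fit(range(len(pi)), adj)
reordered = first_fit([v for a_p in sets for v in a_p], adj)

for name, col in (("canonical", canonical), ("reordered", reordered)):
    assert all(col[a] != col[b] for a in adj for b in adj[a])   # proper coloring
    print(name, max(col.values()) + 1, colors_per_siso(col, sets))
```

On this tiny example both orders need only two colors; any difference between the two schemes would show up only for larger L and P, such as those used in Fig.5.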


Fig.5 Tri-state buffers and RAM block simulation when different vertex coloring schemes are used (L/P = 256): (a) tri-state buffer consumption versus P; (b) RAM block consumption χ versus P

Fig.6 Comparison of $\chi_p$ of different coloring schemes when supporting code lengths L = 512, 1024, 2048, 4096 (P = 8)

The ‘config’ module in Fig.3 computes the data for the P tables when the code length changes. At this time, the ports of the tables are switched to the connections shown in Fig.7. Assume the configuration of one table entry takes N clock cycles. The first two cycles are used to read the colors of neighboring vertices from the tables, one for neighbors seen through ‘high edges’, the other for neighbors connected via ‘low edges’. The ‘read’ ports are then idle during the next (N − 2) cycles, because the calculation for a new vertex can only start after the previous result has been written back into the tables. The ‘color calculator’ picks the color for the vertex, which is written to the table indicated by the ‘write control’ module if necessary. For simplicity we just color every vertex twice, thus computation of all table entries takes O(2NL) clock cycles. The address buses addra and addrb are calculated as follows.

for p = 1, · · · , P
  for k ∈ $A_p$
    for t = 0, · · · , N − 1
      $$\mathrm{addra} = \begin{cases} k \bmod \lceil L/P\rceil, & t = 0,\\ \pi(k) \bmod \lceil L/P\rceil + 2^{\lceil\log_2\lceil L/P\rceil\rceil}, & \text{otherwise,}\end{cases}$$
      $$\mathrm{addrb} = \begin{cases} \lfloor k/\lceil L/P\rceil\rfloor, & t = 0,\\ \lfloor \pi(k)/\lceil L/P\rceil\rfloor, & \text{otherwise,}\end{cases}$$
    end for
  end for
end for

Fig.7 Configuration module
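The address rule and the cycle count can be mirrored in software as follows. This is a sketch under stated assumptions: the function names are hypothetical, indices are 0-based, and N = 10 is chosen only because it reproduces the 20L-cycle latency quoted in the implementation results.

```python
def config_addresses(k, pi, num_siso, cycles_per_entry):
    """Yield (t, addra, addrb) for vertex k during its N configuration cycles.

    At t = 0 the natural-order half of the tables is addressed; afterwards
    the interleaved-order half, offset by 2**ceil(log2(ceil(L/P))).
    """
    seg = -(-len(pi) // num_siso)               # ceil(L / P)
    offset = 1 << (seg - 1).bit_length()        # start of the interleaved half
    for t in range(cycles_per_entry):
        if t == 0:
            yield t, k % seg, k // seg
        else:
            yield t, pi[k] % seg + offset, pi[k] // seg


def config_cycles(cycles_per_entry, length):
    """Every vertex is visited twice, N cycles each: 2 * N * L."""
    return 2 * cycles_per_entry * length


pi = [y - 1 for y in (5, 3, 7, 8, 1, 4, 6, 2)]   # Fig.2 example, 0-based
print(list(config_addresses(1, pi, 2, 3)))

cycles = config_cycles(10, 2048)                 # N = 10 is an assumption
print(cycles, cycles / 80e6 * 1e6, "us at 80 MHz")
```

Under these assumptions the configuration of a 2048-bit code takes 40960 cycles, which matches the 512 µs figure at an 80 MHz clock.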

We notice that for the reordered first fit algorithm, both π(x) and $\pi^{-1}(x)$ are required to be real-time addressable, which can be inconvenient for some interleaver designs. If only one of $\pi^{-1}(x)$ and π(x) is real-time addressable, the canonical first fit algorithm can be used instead.

The ‘color calculator’ is shown in Fig.8. An additional bit is added to every table entry to indicate whether it has been initialized or not. The correct value of this bit for an initialized entry is indicated by the ‘flag’ signal, which flips after every configuration round. Data read from a table are first validated based on this bit (the flag bit) and then translated by the subsequent decoder into a color usage vector (1 ≪ x for input x) if the entry has been initialized, or 0 if it has not. These vectors are further combined via OR logic to reflect the color usage among neighbors. The ‘select color’ module outputs k if bits 0 ∼ (k − 1) of the ‘color usage vector’ are 1 and the kth bit is 0. This is exactly the smallest color not used by adjacent vertices if the current vertex has not been assigned any color. Otherwise, the output of the color selector should not be used. This is achieved by buffering the flag bits from the tables and feeding them to the ‘write control’ module, as shown in Fig.7.
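The validate, decode, OR and select steps can be modelled with integer bit vectors. The sketch below is an illustration, not the authors' logic design: table entries are represented as hypothetical (color, flag bit) pairs.

```python
def color_usage_vector(entries, flag):
    """OR together the one-hot codes (1 << color) of all initialized entries.

    entries: iterable of (color, flag_bit) pairs read from the tables.
    An entry counts as initialized only if its flag bit equals `flag`.
    """
    usage = 0
    for color, flag_bit in entries:
        if flag_bit == flag:          # validator
            usage |= 1 << color       # decoder output, combined by bitwise OR
    return usage


def select_color(usage):
    """Index of the lowest 0 bit: the smallest color unused by the neighbors."""
    k = 0
    while usage & (1 << k):
        k += 1
    return k


# Neighbors hold colors 0 and 1 (valid) and a stale entry with color 2.
neighbors = [(0, 1), (1, 1), (2, 0)]
usage = color_usage_vector(neighbors, flag=1)
print(bin(usage), select_color(usage))   # 0b11 2
```

Because the stale entry fails validation, color 2 remains available even though a leftover value from the previous configuration round is still in the table.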

Fig.8 Color calculator

4 Implementation results

To test the viability of our storage scheme, we implement the selection network (see Fig.4) with on-chip reconfiguration capability (see Figs.7 and 8) on a Xilinx Virtex II Pro FPGA (xc2vp70, 7 million gates). The cost summary after placement and routing for our method is shown in Table 1. Code lengths of 512, 1024, 1536, and 2048 are supported and P = 8. The latency of the coloring scheme is 20L clock cycles, which is 512 µs for an 80 MHz system clock and L = 2048. The decoding throughput at this clock and code length is 45 Mbps for 5 iterations and the total hardware consumption is 83%. If we increase P to 10 and decrease the number of iterations to 3, which is still applicable to most applications in a WCDMA system, the peak throughput reaches as high as 100 Mbps and the hardware consumption is about 78%, of which 26% is consumed by the storage scheme.

Table 1 Hardware cost of the storage scheme with reconfiguration capability on XC2VP70 FPGA

Hardware resource     Occupied / available     Occupied percentage (%)
RAMB16s               19 out of 328            5
SLICEs                653 out of 33088         2
TBUFs                 1976 out of 16544        11

5 Conclusions

We conclude that the new storage scheme is suitable for a turbo decoder with moderate to high throughput (40∼50 Mbps to as high as 100 Mbps) on a Xilinx Virtex II Pro FPGA. As in [8], the scheme supports arbitrary interleaving patterns, which may not be realizable with a parallel interleaver[1]. However, our method offers a simpler configuration algorithm convenient for ‘on-chip’ configuration, which does not require iterative adjustments as the annealing method in [8] does. Both the proposed method and that described in [8] suffer from the interconnection bottleneck. However, for 40∼100 Mbps throughput the hardware cost is still affordable.

References

[1] Nimbalker A, Blankenship T K, Classon B, Fuja T E, Costello D J, Jr. Contention-free interleavers [C]//Proceedings International Symposium on Information Theory, Chicago, USA. 2004: 54.
[2] Kwak J, Lee K. Design of dividable interleavers for parallel decoding in turbo codes [J]. IEE Electronics Letters, 2002, 38(22): 1362–1364.
[3] Dobkin R, Peleg M, Ginosar R. Parallel interleaver design and VLSI architecture for low-latency MAP turbo decoders [J]. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2005, 13(4): 427–438.
[4] Dinoi L, Benedetto S. Variable-size interleaver design for parallel turbo decoder architectures [C]//IEEE Global Telecommunications Conference GLOBECOM, Dallas, USA. 2004, 5: 3108–3112.
[5] TSGR1#6(99)927. Updated text proposal for Turbo code internal interleaver [S]. NTT DoCoMo, Nortel Networks, SAMSUNG Electronics Co., 1999.
[6] Crozier S, Guinand P. High-performance low-memory interleaver banks for turbo-codes [C]//IEEE 54th Vehicular Technology Conference, Atlantic, USA. 2001, 4: 2394–2398.
[7] Thul M J, Gilbert F, Wehn N. Optimized concurrent interleaving architecture for high-throughput turbo decoding [C]//9th International Conference on Electronics, Circuits and Systems, Kaiserslautern, Germany. 2002, 3: 1099–1102.
[8] Tarable A, Montorsi G, Benedetto S. Mapping of interleaving laws to parallel turbo decoder architectures [J]. IEEE Communications Letters, 2004, 8(3): 2002–2009.
[9] Assefaw H G, Fredrik M. Scalable parallel graph coloring algorithms [EB/OL]. [2005.10.25] http://www.ii.uib.no/~assefaw/pub/thesis/paper1.pdf.
[10] Mueller F. Register allocation by graph coloring: a review [EB/OL]. [2005.10.25] http://moss.csc.ncsu.edu/~mueller/ftp/pub/PART/color.ps.Z.

(Editor HONG Ou)
