FPGA Implementation of a Fully Digital CDR for Plesiochronous Clocking Systems

Eliyah Kilada, Ain Shams University, Mentor Graphics Egypt, [email protected]

Mohamed Dessouky, Ain Shams University, Mentor Graphics Egypt, [email protected]

Adel Elhennawy, Ain Shams University

A. NPhasesGen
This block takes the MasterClk as its input. The MasterClk frequency is four times the bit rate. NPhasesGen generates 16 different phases of the WindowClk (referred to as NPhases). The WindowClk frequency is half the bit rate, and the delay between two successive phases is half the MasterClk period (i.e., 1/8 of the bit period). The FD-CDR is assumed to lock to one of these 16 phases.
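The timing relations above can be checked with a short calculation. The following Python sketch is illustrative only: the function name `nphases_offsets` is hypothetical, and the 100 MHz MasterClk value is the one used in the measurements reported in this paper.

```python
from fractions import Fraction

def nphases_offsets(master_clk_hz, n_phases=16):
    """Return the WindowClk period and the delay of each phase, in seconds."""
    master_period = Fraction(1, master_clk_hz)
    bit_period = 4 * master_period      # MasterClk runs at 4x the bit rate
    window_period = 2 * bit_period      # WindowClk runs at half the bit rate
    step = master_period / 2            # successive phases are half a MasterClk period apart
    offsets = [k * step for k in range(n_phases)]
    return window_period, offsets

# Hypothetical usage with a 100 MHz MasterClk (25 Mbps data rate).
window_period, offsets = nphases_offsets(100_000_000)
```

With these numbers the phase step is 5 ns and the 16 phases span exactly one 80 ns WindowClk period, so each step corresponds to 1/8 of the 40 ns bit period.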

Abstract— This paper describes an FPGA implementation of a fully digital clock and data recovery system (FD-CDR) with plesiochronous clocking. The design uses only 51 flip-flops (FFs) and requires, at worst, two preamble bits to lock. The extracted clock is not shifted as long as the input data jitter is small (typically less than ±12.5% UI), which minimizes jitter in the extracted clock. For small input data jitter, the extracted clock typically shows an rms jitter of 51 ps, mainly due to the 35.6 ps rms jitter of the 100 MHz system master clock. The system can also withstand an input data cycle-to-cycle jitter of up to ±37.5% UI without losing lock. Data are recovered through digital correlation with the incoming symbol rather than by ordinary sampling at the middle of the eye pattern, which is expected to improve BER. The system is insensitive to long runs of transition-free data, and the extracted clock has a 50% duty cycle.

B. ClockSelector
This block implements the digital-to-phase converter function. It takes the 4-bit Sel_d (i.e., Select_delayed) signal and chooses the corresponding WindowClk phase as follows:

WindowClk <= NPhases(Sel_d)

Sel_d is a delayed version of the Sel (i.e., Select) signal. ClockSelector also generates NPhasesSamp, a set of 8 successive NPhases that are used to sample the input data.
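A behavioral sketch of this selection is given below in Python. Phases are represented by their indices only. The assumption that NPhasesSamp starts at the selected phase and wraps around modulo 16 is ours and is not stated in the text; the function name `clock_selector` is hypothetical.

```python
N_PHASES = 16  # number of WindowClk phases produced by NPhasesGen

def clock_selector(nphases, sel_d):
    """Pick the WindowClk phase and the 8 sampling phases for a 4-bit Sel_d."""
    if not 0 <= sel_d < N_PHASES:
        raise ValueError("Sel_d must be a 4-bit value")
    window_clk = nphases[sel_d]
    # Assumption: the 8 sampling phases follow the selected one and wrap around.
    nphases_samp = [nphases[(sel_d + k) % N_PHASES] for k in range(8)]
    return window_clk, nphases_samp

# Hypothetical usage: phases are stand-in indices 0..15.
selected, sampling = clock_selector(list(range(N_PHASES)), 12)
```

Because successive phases are 1/8 of a bit period apart, 8 successive phases cover one bit period.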

Index Terms—Clock and data recovery (CDR), clock multiplication, jitter filtering, phase locking.

I. INTRODUCTION

C. Clock2XGen
This block generates the DataClk (whose frequency equals the bit rate) from the selected WindowClk (whose frequency is half the bit rate). A synchronous delay block of half a bit period is employed to guarantee a 50% duty cycle of the extracted DataClk.
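One common way to double a clock frequency with such a delay is to XOR the clock with a copy of itself delayed by a quarter of its period, which here equals half a bit period. The text does not state how the delayed copy is combined, so the XOR in the following Python sketch is an assumption, as are the function names. Time is counted in steps of 1/8 of a bit period.

```python
STEPS_PER_BIT = 8  # one step = 1/8 bit period (half a MasterClk period)

def window_clk(t, phase=0):
    """WindowClk at step t: period of two bit periods, 50% duty cycle."""
    return 1 if (t - phase) % (2 * STEPS_PER_BIT) < STEPS_PER_BIT else 0

def data_clk(t, phase=0):
    """Assumed doubler: XOR WindowClk with a copy delayed by half a bit period."""
    delayed = window_clk(t - STEPS_PER_BIT // 2, phase)
    return window_clk(t, phase) ^ delayed

# Two WindowClk periods (four bit periods) of the resulting DataClk.
wave = [data_clk(t) for t in range(32)]
```

In this model the output repeats every 8 steps (one bit period) and is high for 4 of them, so the duty cycle is 50% as long as the delay is exactly half a bit period.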

Serial links are widely used at the periphery of many ASICs, especially since parallel buses fail at very high speeds. Semi-digital CDR implementations have been reported [1]-[6]. However, mixed-signal blocks could not be completely avoided, especially the Digital-to-Phase Converter (DPC). An architecture of a fully digital CDR (FD-CDR) has been suggested in [9]. This paper presents an FPGA implementation of the FD-CDR reported in [9]. Section II briefly reviews the theory of operation of the FD-CDR. Some design issues are discussed in Section III. Experimental results and a performance summary are given in Section IV.

D. Sampler
The 8 NPhasesSamp clocks sample the incoming data through 8 dual-edge flip-flops. Effectively, this block generates 8 successive samples of the input data during the bit period.

E. DigCorrelator
Digital correlation is performed between the 8 data samples of each bit and the data symbol coefficients. For NRZ line coding, this block reduces to a summing circuit. The Sum signal carries the number of HI samples (i.e., samples that are One). DigCorrelator also produces the UpDown signal, which indicates the location of the HI samples within this window: UpDown is HI if the sum of the first 4 samples is greater than that of the last 4 samples, and LO otherwise.
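For the NRZ case, the DigCorrelator outputs can be sketched as follows. This Python model only mirrors the Sum and UpDown definitions given above; the function name `dig_correlator` is hypothetical, and the thresholds that MotherControl applies to Sum are not modeled.

```python
def dig_correlator(samples):
    """Return (Sum, UpDown) for the 8 samples of one bit, NRZ coding."""
    if len(samples) != 8:
        raise ValueError("expected 8 samples per bit")
    total = sum(samples)                       # Sum: number of HI samples
    first_half = sum(samples[:4])
    second_half = sum(samples[4:])
    up_down = 1 if first_half > second_half else 0
    return total, up_down

# Hypothetical usage: a One whose trailing edge arrives two samples early.
early = dig_correlator([1, 1, 1, 1, 1, 1, 0, 0])
# A One whose leading edge arrives two samples late.
late = dig_correlator([0, 0, 1, 1, 1, 1, 1, 1])
```

Both windows give the same Sum, and UpDown distinguishes on which side of the window the missing samples fall.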

II. THEORY OF OPERATION
The theory of operation is described briefly in this section; a detailed description is given in [9]. Fig. 1 shows the CDR block diagram. It is composed of seven main building blocks:

Manuscript received June 30, 2007. E. Kilada is with Mentor Graphics Corp., Cairo, Egypt, and also with Ain Shams University, Cairo, Egypt (e-mail: [email protected]). M. Dessouky is with Mentor Graphics Corp., Cairo, Egypt, and also with Ain Shams University, Cairo, Egypt (e-mail: [email protected]). A. Elhennawy is with Ain Shams University, Cairo, Egypt.

F. MotherControl
This is the core of the FD-CDR. MotherControl is an FSM clocked by DataClk. It takes the Sum and UpDown signals as inputs and generates the Sel, DataOut and Lock signals. Sel, which carries the actual phase information in the system, changes according to the phase misalignment between the selected phase of WindowClk (or the extracted DataClk) and the transmitter clock. This phase misalignment information is provided by the Sum and UpDown signals. The Lock signal is asserted when the receiver is confident about the phase of its extracted clock relative to the transmitter clock. DataOut is determined from the Sum value.

G. NegSelBuffer
When a positive edge of DataClk occurs, the FSM is clocked and the Sel signal changes accordingly. The ClockSelector then changes the WindowClk based on the new value of Sel, which in turn modifies DataClk. This loop is inherently unstable and causes glitches in DataClk as well as undesired transitions in the MotherControl. To break the loop, a buffer is inserted to delay the Sel signal, so that ClockSelector shifts the WindowClk based on a delayed version of Sel (i.e., Sel_d). How much delay is required? The maximum shift forced by the MotherControl on DataClk is 4 sample periods [9] (i.e., half a bit period). Therefore, if the Sel signal is delayed by half a bit period before it takes effect on DataClk, no glitches can occur on DataClk. This is done by buffering the Sel signal on the negative edge of DataClk to produce Sel_d. ClockSelector, in turn, works on Sel_d.

Fig. 1. Block diagram view of the FD-CDR [9].

978-1-4244-1847-3/07/$25.00 ©2007 IEEE

IEEE ICM - December 2007

III. DESIGN ISSUES

A. MasterClk Frequency
The locking tolerance is determined by the MasterClk frequency. Specifically, a phase resolution of 1/8 of the bit period requires a MasterClk of four times the bit rate, which can be a very high frequency that causes implementation issues. However, this clock is used only in NPhasesGen, to generate the different phases of WindowClk, and in Clock2XGen, to guarantee a 50% duty cycle of the extracted DataClk. All other system blocks operate at the bit rate or even at half the bit rate.

B. ClockSelector Glitches
Because the intrinsic delays of the individual Sel bus bits through the ClockSelector block to its output (i.e., WindowClk) differ slightly, glitches were observed on WindowClk during certain switching events in post-place-and-route simulation, and were confirmed by measurements. To remedy this effect, a buffer has been added at the ClockSelector output.
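The origin of such glitches can be illustrated with a simple model in which the select bits that change do not settle simultaneously, so that the multiplexer briefly sees intermediate select values. The Python sketch below enumerates those values under the simplifying assumption that the changing bits settle one at a time in an arbitrary order. The function name `transient_selects` is hypothetical, and the model does not describe the actual routing delays on the FPGA.

```python
from itertools import permutations

def transient_selects(old, new, width=4):
    """Intermediate select values seen if the changed bits settle one at a time."""
    changed = [b for b in range(width) if ((old ^ new) >> b) & 1]
    seen = set()
    for order in permutations(changed):
        value = old
        # The last bit to settle produces the final value, so stop before it.
        for bit in order[:-1]:
            value ^= 1 << bit
            seen.add(value)
    return sorted(seen)

# Hypothetical usage: a single-bit change versus a change of all four bits.
single_bit = transient_selects(4, 5)
all_bits = transient_selects(7, 8)
```

A single-bit change produces no intermediate value, whereas the transition from 0111 to 1000 can momentarily select any of the other 14 phases in this model, which is consistent with glitches appearing only during certain switching events.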


(a) MasterClk input jitter is 35.6 ps (rms).

(b) Extracted DataClk jitter is 51 ps (rms).

Fig. 2. MasterClk input jitter and extracted DataClk jitter for an input data jitter of less than ±12.5% UI.

(a) Tracking a long One.

(b) Tracking a short Zero.

Fig. 3. Tracking an input data cycle-to-cycle jitter of ±12.5% UI.

(a) USB 2 standard compliance sequence spectrum. Center: 25 MHz, 4.95 MHz/div

(b) DataClk spectrum. Center: 25 MHz, 638 kHz/div.

Fig. 4. DataIn and DataClk spectra.

D. Extracted Clock Spectrum
Using a 100 MHz MasterClk (i.e., a data rate of 25 Mbps), the FD-CDR has been stimulated with the USB 2.0 standard compliance pattern [10]. Fig. 4 shows the spectra of DataIn and DataClk. The input NRZ data pattern has no peak at 25 MHz, whereas the extracted clock (i.e., DataClk) clearly locks at 25 MHz.

E. Transition-Free Data Pattern
The proposed FD-CDR is insensitive to transition-free data patterns. Fig. 5 shows a capture of a long run of Zeros followed by a One. The system recovers the data correctly in this case.

Fig. 5. FD-CDR does not lose lock even with transition-free data patterns.

F. Extracted Clock Duty Cycle
The proposed FD-CDR guarantees a 50% duty cycle of DataClk in lock, except at Sel switching times.

IV. EXPERIMENTAL RESULTS AND PERFORMANCE

A. Area
The system has been downloaded to a Virtex-4 XC4VFX12 FPGA. It uses 63 slice registers (51 FFs and 12 latches) and 186 4-input LUTs. This is a 1% utilization of the FPGA resources.

B. Tracking Jitter
1) Jitter Less Than ±12.5% UI: As described in the MotherControl operation [9], DataClk is not shifted as long as the input data jitter is less than ±12.5% UI. In this case, the DataClk jitter is mainly due to the MasterClk input jitter. This has been confirmed experimentally in Fig. 2. While the MasterClk jitter is 35.6 ps (rms), the DataClk jitter is only 51 ps (rms). The added jitter is mainly due to the inherent clock division and selection in the system.

2) Jitter More Than ±12.5% UI: The system is able to withstand cycle-to-cycle jitter of up to ±37.5% UI. Fig. 3 shows waveforms from a test bench that has been synthesized to verify the tracking range. Two flags, Long1Flag and Long0Flag, indicate the relative period of the current input data bit. A flag takes the value +1 or -1 when the current input bit is a One (Long1Flag) or a Zero (Long0Flag) that is longer or shorter, respectively, than the nominal UI by 25%. For the other signal definitions, refer to Section II. As seen in the waveforms, the extracted DataClk expands and shrinks as required by the amplitude and direction of the input data jitter. The system recovers these jittered input data patterns successfully without losing lock at switching times.

V. CONCLUSION

An FPGA implementation of a fully digital CDR system with plesiochronous clocking was presented. The FD-CDR shows good jitter performance, especially when the input data jitter is less than ±12.5% UI, in which case the extracted clock is not shifted. It can withstand an input data cycle-to-cycle jitter of up to ±37.5% UI. It needs, at worst, two preamble bits to lock. It is insensitive to long runs of transition-free data patterns, and the extracted clock has a 50% duty cycle. No specific test has been done to measure the BER. However, the authors believe that using digital correlation instead of a single mid-point sample should improve BER. Finally, the system features were verified by measurements.

REFERENCES
[1] J. L. Sonntag and J. Stonick, “A Digital Clock and Data Recovery Architecture for Multi-Gigabit/s Binary Links,” IEEE J. Solid-State Circuits, vol. 41, no. 8, Aug. 2006.
[2] K. K. Chang et al., “A 0.4–4-Gb/s CMOS quad transceiver cell using on-chip regulated dual-loop PLLs,” IEEE J. Solid-State Circuits, vol. 38, May 2003, pp. 747-753.
[3] S. Sidiropoulos et al., “A Semidigital Dual Delay-Locked Loop,” IEEE J. Solid-State Circuits, vol. 32, no. 11, Nov. 1997.
[4] H. Takauchi et al., “A CMOS Multichannel 10-Gb/s Transceiver,” IEEE J. Solid-State Circuits, vol. 38, no. 12, Dec. 2003.
[5] H. Tamura et al., “5Gb/s Bidirectional Balance-Line Link Compliant with Plesiochronous Clocking,” in IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers, 2001.
[6] R. Farjad-Rad et al., “A 33-mW 8-Gb/s CMOS clock multiplier and CDR for highly integrated I/Os,” IEEE J. Solid-State Circuits, vol. 39, no. 9, Sept. 2004, pp. 1553-1561.
[7] M. Ramezani et al., “A 10 Gb/s CDR with a half-rate bang-bang phase detector,” in Proc. Int. Symp. Circuits and Systems, May 2003, vol. 2, pp. 181-184.
[8] M. Ramezani et al., “An Improved Bang-Bang Phase Detector for Clock and Data Recovery Applications,” ISCAS, vol. 1, pp. 715-718, 2001.
[9] E. Kilada, M. Dessouky, and A. Elhennawy, “Architecture of a Fully Digital CDR for Plesiochronous Clocking Systems,” ICSPC, 2007, submitted for publication.
[10] USB 2.0 standard, April 27, 2000. Available at: www.usb.org/developers/docs/usb20.zip

C. Tracking Frequency Shift
A test bench has been developed (at the simulation level) to verify the system response to a 100 ppm drift of the receiver (or transmitter) MasterClk frequency. The extracted DataClk successfully tracks the drift, typically making one shift per 2500 bit periods. Even at the switching times, the data are correctly recognized, and the system never loses lock under these conditions.
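The reported shift rate can be related to the drift by simple arithmetic: a frequency offset of 100 ppm accumulates a phase error of 100e-6 UI per bit. The Python sketch below computes how many bit periods are needed to accumulate a given phase error. The function name `bits_per_shift` is hypothetical. Reading the 2500-bit figure as a correction of 1/4 UI (two sample periods) per shift is an inference from this arithmetic, not a statement made in the text.

```python
from fractions import Fraction

def bits_per_shift(drift_ppm, shift_ui):
    """Bit periods needed for a frequency drift to accumulate shift_ui of phase error."""
    drift_per_bit = Fraction(drift_ppm, 10**6)   # phase error added per bit, in UI
    return Fraction(shift_ui) / drift_per_bit

# Hypothetical usage at 100 ppm drift.
quarter_ui = bits_per_shift(100, Fraction(1, 4))   # two 1/8-UI sample periods
eighth_ui = bits_per_shift(100, Fraction(1, 8))    # one 1/8-UI sample period
```

At 100 ppm, a phase error of 1/4 UI accumulates in 2500 bit periods, which matches the observed shift rate, while a single 1/8 UI step would accumulate in 1250 bit periods.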
