$\ldots{}_i\,c_i + a_i c_i$, where $a_i$, $b_i$ are summands, $c_i$ is the input carry, and $\oplus$ stands for the XOR operation.
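As a minimal sketch of how these equations chain into a ripple-carry adder, the Python model below assumes the standard full-adder forms $s_i = a_i \oplus b_i \oplus c_i$ and $c_{i+1} = a_i b_i + b_i c_i + a_i c_i$. The function names `full_adder` and `ripple_carry_add` are illustrative and are not taken from the paper.

```python
def full_adder(a, b, c):
    """One-bit full adder (assumed standard equations).

    Returns (s, c_out) where s = a XOR b XOR c and
    c_out = a*b + b*c + a*c (majority of the three inputs).
    """
    s = a ^ b ^ c
    c_out = (a & b) | (b & c) | (a & c)
    return s, c_out


def ripple_carry_add(a_bits, b_bits, c0=0):
    """Add two equal-length bit lists given least significant bit first.

    The carry output of stage i feeds the carry input of stage i + 1,
    which is what makes the carry chain the timing-critical part.
    """
    carry = c0
    sums = []
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        sums.append(s)
    return sums, carry


if __name__ == "__main__":
    # 3 + 5 with 3-bit operands, least significant bit first.
    print(ripple_carry_add([1, 1, 0], [1, 0, 1]))
```

In this model only the `carry` variable links one stage to the next, while each sum bit is a purely local result.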

In 1955 Gilchrist et al. proposed a speed-independent RCA with a carry completion signal [18]. In the 1960s that circuit was carefully analyzed and improved [19-21]. In 1980 Seitz used the RCA to illustrate his concept of an equipotential region and his approach to self-timed system design [4].

We now use the RCA as a CL to illustrate our approach to SIM design.

As shown in Section 4.2, the turn-on and turn-off delays of the OVD circuit are proportional to the equivalent capacitance $C_{eq}$ associated with the OVD circuit input. $C_{eq}$ depends linearly on the number of gates $N$ in the CMOS CL, so speeding up a SIM requires reducing $N$. This can be achieved by structurally decomposing the CMOS CL into subcircuits CL1, CL2, etc. Each subcircuit CL$i$ is connected either to its own detecting circuit OVD$i$ or, if its transitions do not affect the transition duration of the CL as a whole, directly to the power supply. Each detecting circuit OVD$i$ generates its own $OV$ signal, which is combined with the output signals of the other OVDs by a multi-input OR (NOR) element. The output of that element serves as the $OV$ signal of the CMOS CL.

The computation time of a multi-bit RCA is determined by the length of the longest activated carry chain. Many papers have analyzed carry generation and carry propagation in the RCA [19-21], and many of them proposed their own methods for estimating or calculating the average length of the longest activated carry chain. We do not intend to add another one.

Let us look inside the RCA. As mentioned above, the RCA consists of one-bit full adders, and each full adder consists of two parts: a part forming the sum $s_i$ and a part forming the carry $c_{i+1}$ (Fig.16).

In a multi-bit RCA the sum-forming parts do not interact with each other and do not affect the transition duration of the RCA. Each carry-forming part receives the signal $c_i$ from the preceding carry-forming part and sends the signal $c_{i+1}$ to the next one.

To decompose the RCA we use three heuristics:

(i) We connect all sum-forming parts directly to the power supply.

(ii) We divide each carry-forming part into three subcircuits, denoted in Fig.16 by the numbers 1, 2 and 3. We connect all subcircuits 1 directly to the power supply because they do not have the input $c_i$ and so do not lie on the carry propagation path.

(iii) We connect all subcircuits 2 to OVD1 and all subcircuits 3 to OVD2. The outputs of OVD1 and OVD2 are connected to a two-input NOR gate that forms the RCA $OV$ signal in positive logic (Fig.17).

The curves of the OVD1 and OVD2 input currents $I_1$ and $I_2$ for a 6-bit RCA and the longest transition duration are shown in Fig.18.

Taking $V_{th1,2}$ = 400 mV, we calculated the OVD circuit parameters and obtained $R_{11}$ = 5 kΩ, $I_{th1}$ = 0.08 mA, $R_{12}$ = 3 kΩ, $I_{th2}$ = 0.13 mA. The dependence of the OVD1 and OVD2 delays on the number of bits in the RCA is shown in Fig.19.
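The reported numbers are consistent with a simple Ohm's-law relation between threshold voltage, sensing resistance and threshold current, $I_{th} = V_{th}/R$. That relation is an inference from the values above, not a formula quoted from the paper, and the resistances are assumed to be in kΩ. The helper name `threshold_current` is hypothetical. A short arithmetic check:

```python
def threshold_current(v_th, r):
    """Return I_th = V_th / R in amperes (assumed Ohm's-law relation)."""
    return v_th / r


V_TH = 0.4  # threshold voltage in volts (400 mV, as reported)

# Resistances assumed to be in ohms: 5 kOhm for OVD1, 3 kOhm for OVD2.
for name, r in (("OVD1", 5e3), ("OVD2", 3e3)):
    i_th_ma = threshold_current(V_TH, r) * 1e3  # convert A to mA
    print(f"{name}: R = {r / 1e3:.0f} kOhm, I_th = {i_th_ma:.2f} mA")
```

Rounded to two decimals, the computed thresholds are 0.08 mA and 0.13 mA, which matches the values given above.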

**4.5 Comparison of SIMs with synchronous counterparts**

The transition duration in a CL is a random variable. The probability of a transition with duration $D$ is determined by the implemented Boolean function and the distribution of input logical combinations. The possible values of $D$ lie in the interval $[0, D_{max}]$, where $D_{max}$ is the length of the critical path in the CL.

Let $D_{me} = \sum_i p_i D_i$ be the mathematical expectation of the transition duration in the CL, where $D_i$ is the length of the $i$-th SPP in the CL and $p_i$ is the probability that the $i$-th path is the longest activated SPP.
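For illustration, the expectation can be computed as a probability-weighted sum over paths, assuming the usual definition $D_{me} = \sum_i p_i D_i$ with the $p_i$ summing to one. The path lengths and probabilities below are invented for the example and do not describe any circuit in the paper; `expected_duration` is a hypothetical helper name.

```python
def expected_duration(paths):
    """Return D_me = sum(p_i * D_i) for a list of (p_i, D_i) pairs.

    p_i is the probability that path i is the longest activated path,
    D_i is its length. The probabilities must sum to 1.
    """
    total_p = sum(p for p, _ in paths)
    if abs(total_p - 1.0) > 1e-9:
        raise ValueError("path probabilities must sum to 1")
    return sum(p * d for p, d in paths)


# Invented example: short paths are activated far more often than
# the critical path.
paths = [(0.5, 2.0), (0.3, 4.0), (0.2, 10.0)]
d_me = expected_duration(paths)
d_max = max(d for _, d in paths)
print(f"D_me = {d_me:.2f}, D_max = {d_max:.2f}")
```

In this invented case the expected duration is well below the critical-path length, which is the gap a speed-independent mode can exploit.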

When the CL works in synchronous mode, the cycle duration $T_s$ is chosen with regard to the maximal transition duration $D_{max}$. A certain margin must be added to $D_{max}$ to ensure reliable operation of the CL under parameter variations: $T_s = kD_{max}$, where $k$ is a margin coefficient.

In a SIM the cycle duration is a random variable with expectation $T_{si} = gD_{me} + t_{off} + t_{if}$, where $g$ is the coefficient by which the CL delay increases due to the reduced power supply voltage, $t_{off}$ is the turn-off delay of the OVD circuit, and $t_{if}$ is the interface circuitry delay.

We define the efficiency $E$ of the speed-independent mode of CL operation as the relative increase in SIM performance over its synchronous counterpart.

Generally, the speed-independent mode is more efficient than the synchronous one if $T_s > T_{si}$ or, in other words, $kD_{max} > gD_{me} + t_{off} + t_{if}$.

In the case of the RCA, $D_{max} = nt_c$, where $t_c$ is the delay of the carry-forming part and $n$ is the number of full adders in the RCA.

It has been shown [19] that in an $n$-bit RCA $D_{me} \approx t_c\log_2(5n/4)$. Then, in the case of speed-independent operation, $T_{si} = gt_c\log_2(5n/4) + t_{off} + t_{if}$.
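The logarithmic growth of the average longest carry chain can be explored numerically. The sketch below is a Monte Carlo estimate under stated assumptions: operands are uniformly random, the carry-in is zero, and a chain is counted as one generating stage plus the consecutive propagating stages that follow it. This counting convention may differ from the one used in [19], so the printed numbers are only a rough comparison with $\log_2(5n/4)$. All function names are illustrative.

```python
import math
import random


def longest_carry_chain(a, b, n):
    """Length of the longest activated carry chain when adding a and b.

    Stage i generates a carry if a_i = b_i = 1, propagates an incoming
    carry if a_i XOR b_i = 1, and kills it if a_i = b_i = 0.
    """
    longest = run = 0
    for i in range(n):
        ai = (a >> i) & 1
        bi = (b >> i) & 1
        if ai & bi:
            run = 1            # generate: a new chain starts here
        elif (ai ^ bi) and run:
            run += 1           # propagate an already active carry
        else:
            run = 0            # kill, or nothing to propagate
        longest = max(longest, run)
    return longest


def mean_longest_chain(n, trials=2000, seed=1):
    """Monte Carlo estimate of the average longest chain for n bits."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        total += longest_carry_chain(rng.getrandbits(n), rng.getrandbits(n), n)
    return total / trials


if __name__ == "__main__":
    for n in (8, 16, 32):
        est = mean_longest_chain(n)
        print(f"n = {n:2d}: simulated {est:.2f}, log2(5n/4) = {math.log2(5 * n / 4):.2f}")
```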

We have obtained the dependencies of $T_s$ and $T_{si}$ on the number of bits in the RCA; they are shown in Fig.20. As can be seen, speed-independent operation of the RCA is more efficient for $n > 8$.
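The comparison can be sketched numerically from the two cycle-time expressions, assuming $T_s = knt_c$ (that is, $D_{max} = nt_c$) and $T_{si} = gt_c\log_2(5n/4) + t_{off} + t_{if}$. The parameter values in the example are placeholders chosen only for illustration; they are not taken from the paper and are not meant to reproduce Fig.20. The function names are hypothetical.

```python
import math


def t_sync(n, t_c, k):
    """Synchronous cycle time T_s = k * n * t_c (assumes D_max = n * t_c)."""
    return k * n * t_c


def t_speed_independent(n, t_c, g, t_off, t_if):
    """Expected SIM cycle time T_si = g*t_c*log2(5n/4) + t_off + t_if."""
    return g * t_c * math.log2(5 * n / 4) + t_off + t_if


def first_n_where_si_wins(t_c, k, g, t_off, t_if, n_max=64):
    """Smallest n in 1..n_max with T_s > T_si, or None if there is none."""
    for n in range(1, n_max + 1):
        if t_sync(n, t_c, k) > t_speed_independent(n, t_c, g, t_off, t_if):
            return n
    return None


if __name__ == "__main__":
    # Placeholder values (arbitrary time units), NOT from the paper.
    params = dict(t_c=1.0, k=1.2, g=1.5, t_off=3.0, t_if=2.0)
    print("first n with T_s > T_si:", first_n_where_si_wins(**params))
```

Because $T_s$ grows linearly in $n$ while $T_{si}$ grows logarithmically, a crossover exists for any positive parameter values; the fixed overhead $t_{off} + t_{if}$ determines how many bits are needed before the speed-independent mode pays off.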

**5. Conclusion**

**6. Acknowledgement**

I would like to thank Igor Shagurin and Vlad Tsylyov of the Moscow Physical Engineering Institute for helpful discussions of this work. I am also grateful to Chris Jesshope of the University of Surrey and Mark Josephs of Oxford University, who kindly provided the latest material on their research in the area of delay-insensitive circuit design.

**References**

[1] Miller, R.E., *Switching theory* (Wiley, New York, 1965), vol. 2, Chapter 10.

[2] Unger, S.H., *Asynchronous Sequential Switching Circuits* (Wiley, New York, 1969).

[3] Armstrong, D.B., A.D. Friedman, and P.R. Menon, Design of Asynchronous Circuits Assuming Unbounded Gate Delays, *IEEE Trans. on Computers* **C-18** (12) (1969) 1110-1120.

[4] Seitz, C.L., System timing, in: C.A. Mead and L.A. Conway, eds., *Introduction to VLSI Systems* (Addison-Wesley, New York, 1980), Chapter 7.

[5] Izosimov, O.A., I.I. Shagurin, and V.V. Tsylyov, Physical approach to CMOS module self-timing, *Electronics Letters* **26** (22) (1990) 1835-1836.

[6] Veendrick, H.J.M., Short-circuit dissipation of static CMOS circuit and its impact on the design of buffer circuits, *IEEE J. Solid-State Circuits* **SC-19** (4) (1984) 468-473.

[7] Chappell, B.A., T.I. Chappell, S.E. Schuster, H.M. Segmuller, J.W. Allan, R.L. Franch, and P.J. Restle, Fast CMOS ECL receivers with 100-mV worst-case sensitivity, *IEEE J. Solid-State Circuits* **SC-23** (1) (1988) 59-67.

[8] Chu, S.T., J. Dikken, C.D. Hartgring, F.J. List, J.G. Raemaekers, S.A. Bell, B. Walsh, and R.H.W. Salters, A 25-ns Low-Power Full-CMOS 1-Mbit (128K×8) SRAM, *IEEE J. Solid-State Circuits* **SC-23** (5) (1988) 1078-1084.

[9] Frank, E.H., and R.F. Sproull, A Self-Timed Static RAM, in: *Proc. Third Caltech VLSI Conference* (Springer-Verlag, Berlin, 1983) pp. 275-285.

[10] Donoghue, W.J., and G.E. Noufer, Circuit for address transition detection, US Patent 4563599, 1986.

[11] Huang, J.S.T., and J.W. Schrankler, Switching characteristics of scaled CMOS circuits at 77K, *IEEE Trans. on Electron Devices* **ED-34** (1) (1987) 101-106.

[12] Gilchrist, B., J.H. Pomerene, and S.Y. Wong, Fast Carry Logic for Digital Computers, *IRE Trans. on Electronic Computers* **EC-4** (4) (1955) 133-136.

[13] Hendrickson, H.C., Fast High-Accuracy Binary Parallel Addition, *IRE Trans. on Electronic Computers* **EC-9** (4) (1960) 465-469.

[14] Majerski, S., and M. Wiweger, NOR-Gate Binary Adder with Carry Completion Detection, *IEEE Trans. on Electronic Computers* **EC-16** (1) (1967) 90-92.

[15] Reitwiesner, G.W., The determination of carry propagation length for binary addition, *IRE Trans. on Electronic Computers* **EC-9** (1) (1960) 35-38.

**Appendix**

SPICE2G.6: MOSFET model parameters

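| No. | Name | Parameter | Units | VALUE (PMOS, NMOS) |
|---|---|---|---|---|
| 1 | level | model index | - | 3, 3 |
| 2 | VTO | ZERO-BIAS THRESHOLD VOLTAGE | V | -1.337, 1.161 |
| 3 | KP | TRANSCONDUCTANCE PARAMETER | A/V$^2$ | $2.3\times10^{-5}$, $4.6\times10^{-5}$ |
| 4 | GAMMA | BULK THRESHOLD PARAMETER | | 0.501, 0.354 |
| 5 | PHI | SURFACE POTENTIAL | V | 0.695, 0.660 |
| 6 | RD | DRAIN OHMIC RESISTANCE | OHM | 33385 |
| 7 | RS | SOURCE OHMIC RESISTANCE | OHM | 33385 |
| 8 | CBD | ZERO-BIAS B-D JUNCTION CAPACITANCE | F | $1.98\times10^{-14}$, $6.9\times10^{-15}$ |
| 9 | CBS | ZERO-BIAS B-S JUNCTION CAPACITANCE | F | $1.98\times10^{-14}$, $6.9\times10^{-15}$ |
| 10 | IS | BULK JUNCTION SATURATION CURRENT | A | $3.47\times10^{-15}$, $9.22\times10^{-15}$ |
| 11 | PB | BULK JUNCTION POTENTIAL | V | 0.8, 0.8 |
| 12 | CGSO | GATE-SOURCE OVERLAP CAPACITANCE PER METER CHANNEL WIDTH | F/M | $6.70\times10^{-10}$, $3.30\times10^{-10}$ |
| 13 | CGDO | GATE-DRAIN OVERLAP CAPACITANCE PER METER CHANNEL WIDTH | F/M | $6.70\times10^{-10}$, $3.30\times10^{-10}$ |
| 14 | CGBO | GATE-BULK OVERLAP CAPACITANCE PER METER CHANNEL LENGTH | F/M | $1.90\times10^{-9}$, $2.60\times10^{-9}$ |
| 15 | RSH | DRAIN AND SOURCE DIFFUSION SHEET RESISTANCE | OHM/SQ | 5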