ECE 558 / 658 VLSI Design
Lab 3: Design of a 4-Bit Accumulator
Due Monday, November 9, 11:59 PM
Objective: Design a CMOS circuit and layout for a 4-bit accumulator with four instances of the bitslice accumulator from Lab 2. Note that a fully functional Lab 2 is a prerequisite for this lab!
A 4-bit accumulator consists of a 4-bit full adder and a resettable 4-bit register. Its 7 inputs are phi, {A3, A2, A1, A0}, c_in, and reset. Its 5 outputs are {Q3, Q2, Q1, Q0} and c_out. The adder computes the sum of {A3, A2, A1, A0}, {Q3, Q2, Q1, Q0}, and c_in, and generates a sum {S3, S2, S1, S0} and a carry c_out. The register samples {S3, S2, S1, S0} on the rising edge of phi and stores the result on {Q3, Q2, Q1, Q0}. Note: The braces { } in the above description are for readability of the 4-bit vectors only and are NOT part of the signal names.
Be sure to include each required item (indicated by POST:) in your report. You must also explain what you did and why; images alone are not sufficient. Analyze your results, draw conclusions, and describe what you learned.
0. Block Diagram

* The design contains four 1-bit accumulators (bit slices) with Cin and Cout chained from slice to slice. {A3,A2,A1,A0} and Cin are the primary inputs. The second adder operand comes from the flip-flop outputs.
* The 4-bit adder is a ripple-carry adder. It can represent sums from 0000 to 1111. If the result exceeds 1111, the final Cout is set and the sum wraps around modulo 16 (for example, 1111 + 0001 gives 0000 with Cout = 1).
* The carry logic is domino style. It precharges while phi(clk) is low, so Cout is valid only while phi(clk) is high (the evaluation phase).
* {Q3,Q2,Q1,Q0} is the primary result vector of this design. Each Q output has a buffer so that it can drive a large following gate.
* The flip-flops sample on the rising edge.
1. Draw a schematic for the entire 4-bit accumulator using the schematic editor tool (Dsch2 for undergrads, Cadence for grads). Capture a screen image of the editor window.

I designed the circuit hierarchically. I made a symbol for a single carry logic block and a symbol for a single sum logic block, and built the adder symbol from these two. I then made a symbol for the flip-flop, and combined the adder and the flip-flop into a 1-bit accumulator symbol. Finally, I connected four 1-bit accumulators to form the 4-bit accumulator.
There are two kinds of load capacitors: one loads Cout (100fF) and the other loads Q (100fF). Because Q is a 4-bit vector, there are four of the latter, one per bit.
I also brought out the sum of each adder for testing. These outputs are not needed for the circuit itself. The sum is valid only while clk is low, because that is the evaluation phase.
{A3,A2,A1,A0} and Cin are the primary data inputs. Clk and Rst are connected in parallel to all carry logic blocks and flip-flops.
2. Use the logic simulator tool (schm2sim.pl + IRSIM) to validate your schematic.
The main idea of the reduced testbench is to test the identical modules in parallel.
The bit-slice accumulators have almost the same runtime behavior, with only a few exceptions.
The table below is the truth table for the 1-bit accumulator.
A | Q | Cin | S | Cout
0 | 0 | 0 | 0 | 0
0 | 0 | 1 | 1 | 0
0 | 1 | 0 | 1 | 0
0 | 1 | 1 | 0 | 1
1 | 0 | 0 | 1 | 0
1 | 0 | 1 | 0 | 1
1 | 1 | 0 | 0 | 1
1 | 1 | 1 | 1 | 1
The inputs of each adder are A, Q, and Cin. A and Q are independent of the other bit slices.
Cin depends on the other bit slices, because it is connected to the Cout of the previous slice.
Thus, any row in which Cin and Cout have the same value can be tested in all slices in parallel.
The color-marked rows, {A,Q,Cin}={0,0,1} and {1,1,0}, where Cin and Cout differ, must be tested sequentially.
Luckily, these two rows can be overlapped with the next bit slice. After the rising edge that follows the input vector {A,Q,Cin}={0,0,1}, Q stores the value 1. In the next cycle, applying A={1} gives the second colored row, and the carry ripples into the next accumulator (this is the first colored row condition for the second accumulator).
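The split between rows that can be tested in all slices at once and rows that need sequential testing can be checked with a short script. This is only a minimal sketch and not part of the original testbench; the function name `full_adder` and the list names are illustrative assumptions.

```python
from itertools import product

def full_adder(a, q, cin):
    """Return (sum, cout) of a 1-bit full adder with operands a and q."""
    s = a ^ q ^ cin
    cout = (a & q) | (a & cin) | (q & cin)
    return s, cout

parallel_rows = []    # Cin == Cout: all slices can see this row in the same cycle
sequential_rows = []  # Cin != Cout: the row has to ripple slice by slice
for a, q, cin in product((0, 1), repeat=3):
    _, cout = full_adder(a, q, cin)
    if cin == cout:
        parallel_rows.append((a, q, cin))
    else:
        sequential_rows.append((a, q, cin))

print("parallel:  ", parallel_rows)
print("sequential:", sequential_rows)
```

Only two of the eight rows end up in `sequential_rows`, which is why the overlap trick is enough to keep the sequence short.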
The parallel testing idea and the overlap idea together shorten the test sequence dramatically. The full-chip test sequence is given below.
This sequence has the same coverage as the exhaustive test sequence, which means every possible case is tested by this testbench.
Here is the test sequence.
The all-zero input is a redundant case, so I removed it.
I changed the timing relative to my other IRSIM tests: I apply the inputs 'just before' the clock edge rather than 'just after' it. This saves 1 cycle, and the values of {A3,A2,A1,A0} and {Q3,Q2,Q1,Q0} are visible in the same phase.
I did not spend 1 full cycle on the initial reset. When the next phase only sets up state, I do that setup during the reset phase.
Phase(Clk) | Rst | A3 | A2 | A1 | A0 | Q3_n | Q2_n | Q1_n | Q0_n | Cout2(Cin3) | Cout1(Cin2) | Cout0(Cin1) | Cin | Cout | Comment
1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | sequential test
2 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 |
3 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 |
4 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
5 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
6 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | reset
7 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | A,Q,Cin = 100
8 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | A,Q,Cin = 111
9 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | A,Q,Cin = 010
10 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | reset
11 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | A,Q,Cin = 101
12 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | reset
13 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | A,Q,Cin = 001
With this test, every state is visited and the combinational logic is validated completely.
The total is 13 cycles: 10 cycles for validation and 3 for setting up state.
Here is the IRSIM result.

3. Design a layout for your accumulator (Cadence).

* This is the layout for the 4-bit accumulator. The total size is 500(lambda)*500(lambda) = 600um*600um. The aspect ratio is 1:1.
* The inputs {A3,A2,A1,A0} and the outputs {S3,S2,S1,S0} and {Q3,Q2,Q1,Q0} are accessible from the top of the layout. Clk and Rst run across the whole design on horizontal metal1 lines. Cin and Cout of adjacent slices are connected to each other.
* I instantiated the 1-bit accumulator as a symbol and did not add any extra wiring. To do that I had to re-create the labels and pins, because the pins and labels are part of the symbol, which would otherwise give four pins named A rather than {A3,A2,A1,A0}.
* Almost every horizontal line is metal 2 and almost every vertical line is metal 1. I did not use poly for wiring, and I tried to minimize its length to limit delay.
4. Test your layout for functionality by executing the following algorithm. Sequentially add the last 4 digits of your student ID number, where each digit is represented as a 4-bit word.
(1) Hand Calculation
<Flow Table>
* The input vector is {5,3,5,1} = {0101,0011,0101,0001}.
* Cin is held at 0.
* Rst is asserted during the 1st phase only.
Phase(Clk) | Rst | Cin | A3 | A2 | A1 | A0 | S3 | S2 | S1 | S0 | Q3 | Q2 | Q1 | Q0 | Cout
1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
2 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0
3 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0
4 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 0
5 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 1 | 0
6 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 0 | 1
i) Phase 1: I asserted reset only. All other signals are held at 0. There is no unknown state here, because Rst is already 1 at the first rising clock edge.
ii) Phase 2: I applied 5=0101. The Sum is 0101 because Cin and Q are zero (the FF was reset in phase 1).
iii) Phase 3: I applied 3=0011. The Sum is 1000 (5+3) because the Sum of phase 2 (0101) is stored into the FF at the beginning of phase 3 (rising clock edge). The Sum is valid only while the clock is low.
iv) Phase 4: I applied 5=0101. The previous Sum (phase 3, 1000) was stored at the rising clock edge. The Sum is 1101 (8+5).
v) Phase 5: I applied 1=0001. The previous Sum (phase 4, 1101) was stored at the rising clock edge. The Sum is 1110 (13+1).
vi) Phase 6: I additionally applied 6=0110, because Cout could not be checked with the ID vector alone. The previous Sum (phase 5, 1110) was stored at the rising clock edge. The Sum is 0100 and Cout is 1 (14+6 = 20).
Final value of S = {0100} because the result is 20, which exceeds 1111 (15), so the sum wraps to 20 - 16 = 4.
Final value of Cout = {1} because the result is 20, which exceeds 1111 (15).
Final value of Q = {1110} because the final sum is not stored yet.
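The hand calculation can be cross-checked with a small behavioral model. The sketch below is an illustration only and is not the IRSIM or HSPICE testbench; the function name `accumulate` is an assumption, and the model ignores all timing (it only captures that S is stored into Q on the next rising edge and that Cin is held at 0).

```python
def accumulate(inputs, width=4):
    """Behavioral model: Q starts at 0 after reset, each phase adds A to Q."""
    mask = (1 << width) - 1
    q = 0                        # register contents after the reset phase
    rows = []
    for a in inputs:
        total = a + q            # Cin is held at 0
        s = total & mask         # adder output S (wraps modulo 16)
        cout = total >> width    # carry out of the last slice
        rows.append((a, s, q, cout))
        q = s                    # S is stored into Q on the next rising edge
    return rows

# Digits 5, 3, 5, 1 from the flow table, plus the extra 6 used to exercise Cout.
rows = accumulate([5, 3, 5, 1, 6])
for phase, (a, s, q, cout) in enumerate(rows, start=2):
    print(f"phase {phase}: A={a:04b} S={s:04b} Q={q:04b} Cout={cout}")
```

The printed rows correspond to phases 2 through 6 of the flow table.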
(2) IRSIM result for layout

* It shows the same result as the hand calculation.
* S is stored on the rising edge of the clock. The valid sum is present while clk is low, because the Cout logic is domino logic and I drive it with clk' instead of clk.
* Because the adder output is valid while the clock is low, the value stored into the FF on the rising edge is valid. Otherwise the FF would store the precharged value.
(3) IRSIM input vector - [Click Here]
(4) HSPICE Result.
i) Overall result
I applied exactly the same input sequence as in IRSIM and got the same result as IRSIM, except for the delays.
There are 2 glitches on S2, because the sum logic is combinational and its inputs (A from the primary input, Cin' from the carry logic, Q from the flip-flop) arrive at different times.
I used a generously long clock period, because this run only checks functionality.
ii) Power
* Overall power plot.

- Because this is a synchronous circuit, all activity occurs at the clock edges. The flip-flops sample on the rising edge. The carry logic starts precharging at the beginning of the high level of the clock and starts evaluating at the beginning of the low level. Thus, most of the power is consumed at the clock edges.
Near the rising edge of the clock, sampling and precharging happen together, so more power is consumed at the rising edge than at the falling edge.
The peak power is more than 10mA.
* Average power result.
$DATA1 SOURCE='HSPICE' VERSION='W-2005.03-SP1 '
.TITLE '************************** 4bit accumulator **************************'
pstath     pstatl     pdynavg    poverall   temper   alter#
2.653e-06  1.452e-06  6.537e-03  7.169e-04  25.0000  1.0000
* pstath is the static power while the clock is high.
* pstatl is the static power while the clock is low.
* pdynavg is the worst-case dynamic power consumption. The worst case is at the beginning of phase 4, where the overall result waveform shows every output changing except Cout, Q1, S1, and S3. The overall power plot shows the same.
* poverall is the average power from 0 to 48, that is, from the start of the scenario to its end.
- Power summary.
Power | Value
Peak power | 11mA
dynamic power (worst case) | 6.537mA
static power while clk is high | 2.653uA
static power while clk is low | 1.452uA
total average power for the testing | 0.717mA
iii) Hspice test file - [Click Here]
5. Identify the critical (slowest) timing path in your circuit which will limit the clock frequency of your accumulator.
The critical path is the maximum-delay path between synchronized units, which here are the primary inputs, the primary outputs, the flip-flops, and the carry logic.
We do not need to consider the primary inputs and primary outputs in this case, because there is no delay element on those paths.
The flip-flops act on the rising edge only, but the carry logic is active during both the high level and the low level.
To find the maximum clock rate, we should therefore tune the high-level period and the low-level period of the clock separately.
Thus, we can consider the critical path in two parts, rising and falling.
1) Rising edge or high level of the clock.
At the rising edge, the flip-flops sample their input values and the carry logic starts to precharge.
So the sampling time (including setup time) and the precharging time are the candidates.
Therefore, the critical path is max(FF sampling time, worst-case precharging time of the carry logic).
There are only two cases for the FF sampling time: sampling 1 while the old value is 0, or the opposite.
The worst case for the precharging time is:
Every input is 1 so that the output (Cout') is 0, which means every nMOS node was discharged.
Therefore, the capacitance of all nMOS nodes must be charged during the precharge phase.
The input sequence is:
Phase(Clk) | Rst | Cin | A3 | A2 | A1 | A0 | S3 | S2 | S1 | S0 | Q3 | Q2 | Q1 | Q0 | Cout(Cout')
1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0(1)
2 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0(1)
3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 0(1)
Just after phase 2 (at the beginning of phase 3), every nMOS node of the carry logic must be recharged while the flip-flops sample new values.
Phase 3 is for the clock-to-Q delay. If the clock-to-Q delay is longer than the precharge time, it dominates.
2) Falling edge or low level of the clock.
At the falling edge, the flip-flops do nothing.
A well-known problem of the ripple-carry adder is the delay of the rippling carry.
During the evaluation phase of the domino logic, the carry must ripple from the Cin of the first adder to the Cout of the last adder. The sum logic cannot produce a valid output until a valid Cout' has been produced.
Sum = ABCin + Cout'(A+B+Cin)
The sum logic feeds the FF, which is the end of the synchronized path. Thus, the critical path runs from the input of the carry logic of the first adder to the sum logic of the last adder.
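The dependence of Sum on Cout' can be confirmed by checking the sum equation against the usual XOR form over all eight input combinations. This is a minimal sketch; it assumes that B in the equation is the second adder operand (Q in this design) and that Cout is the majority function of A, B, and Cin.

```python
from itertools import product

mismatches = []
for a, b, cin in product((0, 1), repeat=3):
    cout = (a & b) | (a & cin) | (b & cin)     # carry: majority of the three inputs
    cout_n = 1 - cout                          # Cout'
    s_from_cout = (a & b & cin) | (cout_n & (a | b | cin))
    s_reference = a ^ b ^ cin                  # textbook full-adder sum
    if s_from_cout != s_reference:
        mismatches.append((a, b, cin))

print("mismatching rows:", mismatches)
```

An empty list means the sum logic built on Cout' is equivalent to the XOR sum, so a valid Cout' is indeed required before Sum settles.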
We can estimate the delay:
high-level delay for the critical path = max(precharging time, clock-to-Q delay)
low-level delay for the critical path = 4 carry logic propagation delays + 1 sum logic propagation delay + setup time
All of this computation must complete while the clock is low.
The input sequence must satisfy these conditions:
i) The carry must ripple to the Cout' of the last adder.
ii) The output of the last sum logic must change.
The input sequence is:
Phase(Clk) | Rst | Cin | A3 | A2 | A1 | A0 | S3 | S2 | S1 | S0 | Q3 | Q2 | Q1 | Q0 | Cout(Cout')
1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0(1)
2 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0(1)
3 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0(1)
4 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1(0)
I set every FF to 1. After phase 3, the circuit is ready to ripple.
Because A0 is 1 in phase 4, the carry ripples through all slices and S3 changes from 1 to 0.
* Test of the low-level period (10 ps precision):
1.60 ns : the high voltage no longer reaches 2.5V.
1.56 ns : the high voltage is 2.27V <= minimum period.
1.55 ns : the high voltage is 2.24V <= more than 10% below 2.5V.
* Test of the high-level period (10 ps precision):
0.3 ns : works well - Cout goes all the way down to 0.
0.28 ns : Cout goes down to almost 0.
0.26 ns : Cout goes down to almost 0.
0.24 ns : the low voltage is a little above 0.
0.23 ns : 10% point. The low voltage is 0.24V <= minimum period.
The setup time is obtained in the next problem.
worst setup time (falling) = 150ps
* The clock-to-Q delay.
While determining the low-level period, I kept the high-level period long enough to guarantee that every input of the carry logic arrives before the evaluation period. This means that 1.56 ns is the pure worst-case evaluation period.
If the clock-to-Q delay is more than 0.23ns, the Q input cannot arrive before carry evaluation starts.
<The result of Clock to Q delay>
The clock-to-Q delay is 1.05ns.
<Calculation of Maximum Clock rate>
In this case, we can reason as follows.
Clock period = Low level period + High level period
High level period = Max(Clock to Q delay, Precharging time)
Low level period = (Carry propagation delay from primary Cin to the output of the last carry logic) + (delay of the final sum logic) + (setup time).
The setup time is obtained in the next problem.
Thus, the period of 1 cycle = 1.05ns + 1.56ns + 0.15ns = 2.76ns
Maximum clock rate = 1/2.76ns = 362.319MHz
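The arithmetic above can be reproduced in a few lines. This is a sketch under the assumption that the measured 1.56 ns evaluation period already covers the carry ripple and the final sum logic, so only the setup time is added to it; the variable names are illustrative.

```python
# Measured values from this report, in nanoseconds.
t_clk_to_q = 1.05    # clock-to-Q delay
t_precharge = 0.23   # minimum high-level period for precharge
t_evaluate = 1.56    # minimum low-level period (carry ripple + sum logic)
t_setup = 0.15       # worst-case (falling) setup time

high_period = max(t_clk_to_q, t_precharge)
low_period = t_evaluate + t_setup
period_ns = high_period + low_period
f_max_mhz = 1e3 / period_ns          # 1 / ns = GHz, so scale by 1000 for MHz

print(f"high = {high_period:.2f} ns, low = {low_period:.2f} ns")
print(f"period = {period_ns:.2f} ns, f_max = {f_max_mhz:.3f} MHz")
```

Because the clock-to-Q delay is much larger than the precharge time, it is the clock-to-Q delay that sets the high-level period in this estimate.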
I reran Problem 4 with this result, and it works well. When I increased the clock rate slightly, the circuit started to malfunction, which means the value is quite accurate.
<The result of Problem 4 with these values>
6. Again using trial-and-error simulation (lo2spice.pl + HSPICE), determine the setup times for the A3, A2, A1, A0, and c_in inputs to the nearest 10 ps.
Setup time is the amount of time by which an input signal must arrive before the rising edge of the clock. Setup time is not a dynamic factor; it is intrinsic to the flip-flop. Thus, in this case the setup time is the same for every input (A3, A2, A1, A0, c_in).
Moreover, because the input of the flip-flop is the Sum from the adder logic, the input vector must make the Sum signal change at the appropriate time.
The test vectors are as follows.
1) Setup time for rising.
First of all, the FF must hold 0. Then, with every input except A0 held at 0, the Sum must become 1 just before the rising edge of the clock. The arrival time is then adjusted by trial and error.
- hspice test vector.
Vclk Clk gnd dc 0 pulse (0 2.5 0n 100p 100p 2n 4n)
Vrst Rst gnd dc 0 pulse (2.5 0 2.5n 100p 100p 3000n 6000n)
Va1 A1 gnd dc 0 pulse (0 0 2.5n 100p 100p 2.5n 4n )
Va2 A2 gnd dc 0 pulse (0 0 2.5n 100p 100p 2.5n 4n )
Va3 A3 gnd dc 0 pulse (0 0 2.5n 100p 100p 2.5n 4n )
Va0 A0 gnd dc 0 pulse (0 2.5 7.52n 100p 100p 4n 8n )
Vin Cin gnd dc 0 pulse (0 0 2.5n 100p 100p 1n 1n )
* Trial-and-error flow - the precision is 10ps.
A : 7.5ns
Interval between Sum and the rising edge 0.079ns (S0->1, clk0->1) : works fine.
[result]
A : 7.51ns
Interval between Sum and the rising edge 0.070ns (S0->1, clk0->1) : works fine.
[result]
A : 7.52 ns
Interval between Sum and the rising edge 0.058ns : doesn't work.
[result]
Thus, the setup time for a rising transition driven by A is 0.070ns.
C : 7.52 ns
Interval between Sum and the rising edge 0.180ns : works.
[result]
C : 7.62 ns
Interval between Sum and the rising edge 0.082ns : works.
[result]
C : 7.63 ns
Interval between Sum and the rising edge 0.072ns : works.
[result]
C : 7.64 ns
Interval between Sum and the rising edge 0.063ns : doesn't work.
[result]
The setup times measured from Cin and from A should be the same; the small difference comes from the measurement precision. The rise and fall times of S differ depending on which source (A or Cin) drives it, so the slope of the voltage plot differs.
Because the smallest number is 0.070ns, it is the more accurate value.
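The trial-and-error procedure amounts to taking the smallest Sum-to-clock interval that still works. The sketch below only post-processes the intervals reported above; it does not run HSPICE, and the helper name `setup_time` is an assumption made for illustration.

```python
def setup_time(trials):
    """Return the smallest passing Sum-to-clock interval in ns, or None."""
    passing = [interval for interval, works in trials if works]
    return min(passing) if passing else None

# (interval in ns, did the flip-flop capture the value?) for the rising case
trials_a = [(0.079, True), (0.070, True), (0.058, False)]
trials_cin = [(0.180, True), (0.082, True), (0.072, True), (0.063, False)]

print("setup (rising) via A:  ", setup_time(trials_a), "ns")
print("setup (rising) via Cin:", setup_time(trials_cin), "ns")
```

The two results differ by 2 ps, which is below the 10 ps step used for the input arrival time.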
2) Setup time for falling.
The input sequence must first set the FF to 1, and then the Sum must change 1->0 just before the rising edge of the clock. The arrival time is then adjusted by trial and error.
- hspice test vector.
Vclk Clk gnd dc 0 pulse (0 2.5 0n 100p 100p 2n 4n)
Vrst Rst gnd dc 0 pulse (2.5 0 2.5n 100p 100p 3000n 6000n)
Va1 A1 gnd dc 0 pulse (0 0 2.5n 100p 100p 2.5n 4n )
Va2 A2 gnd dc 0 pulse (0 0 2.5n 100p 100p 2.5n 4n )
Va3 A3 gnd dc 0 pulse (0 0 2.5n 100p 100p 2.5n 4n )
Va0 A0 gnd dc 0 pulse (0 2.5 8n 100p 100p 4n 7.25n )
Vin Cin gnd dc 0 pulse (0 0 2.5n 100p 100p 1n 1n )
A : 15.22 ns
Interval between Sum and the rising edge 0.18ns : works well.
[result]
A : 15.24 ns
Interval between Sum and the rising edge 0.16ns : works well.
[result]
A : 15.25 ns
Interval between Sum and the rising edge 0.15ns : works.
[result]
A : 15.26 ns
Interval between Sum and the rising edge 0.14ns : doesn't work.
[result]
Thus, the setup time for a falling transition is 0.15ns.
* Summary
The setup time is a function not of the primary inputs but of the flip-flop inputs, which in this case are S0, S1, S2, and S3.
The setup time is determined mainly by the flip-flop itself.
Input | Setup time for rising activity | Setup time for falling activity
A0(S0) | 70ps | 150ps
A1(S1) | 70ps | 150ps
A2(S2) | 70ps | 150ps
A3(S3) | 70ps | 150ps
Cin(Sn) | 72ps | 152ps
The measured setup time of Cin is larger. However, the slope of the voltage plot is also steeper than for An, which means the 10ps input step resolves the Cin case less accurately. This is visible at the failure point (0.063): there is a large gap between the last passing value 0.072 and the failing value 0.063.
Thus, the setup time for this FF is 150ps.
7. Again using trial-and-error simulation (lo2spice.pl + HSPICE), determine the propagation delay for the outputs Q3, Q2, Q1, Q0, and c_out to the nearest 10 ps.
(1) Delay for Q.
To measure this delay, we first need an input sequence.
I used a sufficiently long clock period.
The flip-flops sample their input on the rising edge.
i) tp from reset.
First set Q to 1, then assert reset at a time away from any clock edge. Because reset is an asynchronous signal, Q goes to zero independently of the clock.
ii) tpLH from the rising clock edge.
First clear Q to 0 and hold A at 0 for a while, then set A to 1. At the next rising clock edge, the new value propagates to Q.
The four accumulator slices are identical, so with the same inputs and state their delays are the same as well.
Reset always drives Q to 0, so tp from reset means tpHL.
Output | tpHL from clock rising | tpLH from clock rising | tp from reset
Q0 | 670ps | 885ps | 690ps
Q1 | 670ps | 885ps | 690ps
Q2 | 670ps | 885ps | 690ps
Q3 | 670ps | 885ps | 690ps
[Result . delay for Q]
(2) Delay for Cout.
There are 6 sources of this delay.
i) From Cin: This is a ripple-carry adder. If all other signals hold their current values, a change on Cin can cause a change on Cout.
First set {Q3,Q2,Q1,Q0}={1,1,1,1}. Then, with no other signal (including phi) changing during the evaluation phase (low clock), I changed only Cin. This isolates the effect of Cin on Cout.
[Result]
ii) From phi: This is domino logic, and so that the correct value is available at the rising edge of the FF clock, the phi of this logic is inverted.
The low level of phi is the evaluation period and the high level of phi is the precharge period. We can measure the delay from the falling edge of phi to Cout.
First set {Q3,Q2,Q1,Q0}={1,1,1,1}, and hold Cin asserted from the precharge phase into the evaluation phase. We can then measure the delay from the beginning of the evaluation phase, even though Cin arrived earlier.
[Result]
iii) From A3,A2,A1,A0: Each of A3,A2,A1,A0 is an input of its own accumulator slice, and each can cause a change on Cout directly.
First set {Q3,Q2,Q1,Q0}={1,1,1,1} and hold Cin at 0, because A must be the signal that causes the output change.
Then I set one of {A3,A2,A1,A0} to 1, which generates Cout.
The path to Cout is shortest from A3 and longest from A0. Thus the delay from A3 is the shortest and the delay from A0 is the longest.
[Result A1]
[Result A2]
[Result A3]
[Result A0]
* Summary of the Result
Source | Q | Cout
reset | 690ps | -
rising clock | tpLH: 885ps, tpHL: 670ps | -
Cin | - | 1.32ns
falling edge of phi | - | 1.54ns
A3 | - | 390ps
A2 | - | 700ps
A1 | - | 1.02ns
A0 | - | 1.33ns
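Since the A-to-Cout delay grows with the number of carry stages the signal has to pass through, the difference between neighboring entries gives a rough per-stage ripple delay. The sketch below is a back-of-the-envelope estimate from the measured values only, under the assumption that each step from Ai to A(i-1) adds exactly one carry stage; it is not an additional measurement.

```python
# Measured A-to-Cout delays from the summary table, in picoseconds.
delays_ps = {"A3": 390, "A2": 700, "A1": 1020, "A0": 1330}

order = ["A3", "A2", "A1", "A0"]   # shortest path first
steps_ps = [delays_ps[later] - delays_ps[earlier]
            for earlier, later in zip(order, order[1:])]
avg_step_ps = sum(steps_ps) / len(steps_ps)

print("extra delay per additional carry stage:", steps_ps, "ps")
print(f"average per-stage ripple delay: {avg_step_ps:.0f} ps")
```

The steps are nearly equal, which is consistent with four identical carry stages connected in series.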