



E-Mail: editor.ijasem@gmail.com editor@ijasem.org





# HIGH-SPEED MULTIPLIER DESIGN USING OPTIMIZED 8-4 AND 9-4 COMPRESSOR STRUCTURES

Mr. Mohammed Waheed Sohail \*1, Dr. Mohammad Iliyas \*2, Dr. Amairullah Khan Lodhi \*3
\*1 PG Student Dept. of ECE (VLSI System Design), Shadan College of Engineering and Technology
mohdwaheed952@gmail.com

\*2 Dean Academics & Professor, HOD of ECE Department, Shadan College of Engineering and Technology ilyas.ece@gmail.com

\*3 Professor, Dean (R & D), Dept. of ECE, Shadan College of Engineering and Technology lak resumes@yahoo.com

Abstract: This research focuses on the development of advanced compressor architectures optimized for high-speed and energy-efficient multiplication. The proposed compressor designs offer two primary benefits: reduced computational latency and improved spatial utilization. Although lower-order compressors exhibit slightly better Energy Delay Product (EDP), their performance trade-offs must be considered. To evaluate the effectiveness of the proposed compressors, the study implements and analyzes multipliers of sizes 8×8, 16×16, and 24×24. These are then compared against traditional multipliers that use lower-order compressor structures. Simulation results indicate that the newly designed compressors are highly effective for applications requiring fast multiplication operations. The modeling and synthesis of these compressors were conducted using the Cadence RTL Compiler under standard conditions, including a supply voltage of 1.2V and an ambient temperature of 25 °C.

**Keywords:** High-speed multiplication, Energy-efficient design, Compressor architecture, Energy Delay Product (EDP), Multiplier design, RTL synthesis.

#### INTRODUCTION TO VLSI

VLSI stands for "Very Large Scale Integration". This is the field that involves packing more and more logic devices into smaller and smaller areas. Thanks to VLSI, circuits that would once have filled entire boards can now be placed in a space a few millimeters across. This has opened up opportunities to do things that were not possible before. VLSI circuits are everywhere: your computer, your car, your brand new state-of-the-art digital camera, your cell phone, and so on. All of this involves a great deal of expertise on many fronts within the same field, which we will look at in later sections. VLSI has been around for a long time, but as a side effect of advances in the world of computers, there has been a rapid proliferation of tools that can be used to design VLSI circuits. Alongside this, in keeping with Moore's law, the capability of an IC has increased exponentially over the years in terms of computation power, utilization of available area, and yield. Integration allows us to build systems with many more transistors, so that much more computing power can be applied to solving a problem. Integrated circuits are also much easier to design and manufacture and are more reliable than discrete systems; this makes it possible to develop special-purpose systems that are more efficient than general-purpose computers for the task at hand.

### ADVANTAGES OF VLSI

While this work focuses on integrated circuits, the properties of integrated circuits (what we can and cannot efficiently put in an integrated circuit) largely determine the architecture of the entire system. Integrated circuits improve system characteristics in several critical ways. ICs have three key advantages over digital circuits built from discrete components:

- Size. Integrated circuits are much smaller: both transistors and wires are shrunk to micrometer sizes, compared with the millimeter or centimeter scales of discrete components. Small size leads to advantages in speed and power consumption, since smaller components have smaller parasitic resistances, capacitances, and inductances.
- Speed. Signals can be switched between logic 0 and logic 1 much faster within a chip than they can between chips. Communication within a chip can occur many times faster than communication between chips on a printed circuit board. The high speed of on-chip circuits is a direct consequence of their small size: smaller components and wires have smaller parasitic capacitances to slow down the signal.
- Power consumption. Logic operations within a chip also take much less power. Once again, the lower power consumption is largely due to the small size of circuits on the chip: smaller parasitic capacitances and resistances require less power to drive them.

#### LITERATURE REVIEW

A. Ultra Low-Voltage Low-Power CMOS 4-2 and 5-2 Compressors for Fast Arithmetic Circuits [2004]

In this paper, the authors present various architectures and designs of low-power 4-2 and 5-2 compressors capable of operating at ultra-low supply voltages. Each architecture has different configurations, which include several novel 4-2 and 5-2 compressor designs. They are prototyped and simulated to evaluate their performance in terms of speed, power dissipation and power-delay product. The proposed new circuit for the XOR-XNOR module eliminates the weak logic on the internal nodes of pass transistors by using a small number of NMOS-PMOS transistors. Driving capability has been considered in the design as well as in the simulation setup, so that these 4-2 and 5-2 compressor cells can operate reliably in any tree-structured parallel multiplier at low supply voltages. The authors conclude that an ultra-low-power circuit with good drivability is proposed for the complex XOR logic module that co-generates the XNOR-XOR outputs. A novel 5-2 compressor architecture aimed at lower delay is also proposed, and a new design of the carry generator cell has been developed on the basis of this architecture.

B. Novel and Efficient 4:2 and 5:2 Compressors with Minimum number of Transistors Designed for Low-Power Operations [2006]

This paper proposes efficient and optimal 4:2 and 5:2 compressors. The compressors are highly optimized in terms of transistor count. These designs have the particular advantage that, in addition to reduced switching activity, they have no direct connection to the power supply and are driven entirely by the input signals, leading to a noticeable reduction in power consumption. The proposed 4:2 and 5:2 compressor designs have been implemented with a minimum of 20 and 30 transistors respectively. The speed, area and power consumption of a multiplier are directly proportional to the efficiency of its compressors. The proposed designs are highly efficient in terms of small area, low power and high throughput. The authors first describe the existing structures of 4:2 and 5:2 compressors. In the proposed compressor design, the XOR gate and MUX blocks are replaced with minimum-transistor implementations so that the circuit has no direct supply from VDD. It can be inferred that the proposed 4:2 compressor design is an optimized structure, as the authors have optimized the entire design starting from the basic cell level.

Vol 19, Issue 3, 2025

C. A Novel High-Speed and Energy Efficient 10-Transistor Full Adder Design [2007]

In this paper, the authors propose a full adder design using only ten transistors per bit. The proposed design features lower operating voltage, higher computing speed and lower energy operation when compared with existing techniques. The design uses inverter-buffered XOR/XNOR circuits to eliminate the threshold voltage loss problem, which otherwise prevents the full adder from operating at low supply voltage. The proposed design embeds the buffering circuit in the full adder, and the transistor count is minimized. The improved buffering allows operation under a lower supply voltage than existing works. It also significantly improves the speed of cascaded operation while maintaining the performance edge in energy consumption. Both the DC behavior and the performance of the proposed design are evaluated against various full adder designs by means of extensive HSPICE simulations.

D. Novel Architectures for High-Speed and Low-Power 3-2, 4-2 and 5-2 Compressors [2007]

In this paper, the authors present novel architectures and designs of high-speed, low-power 3-2, 4-2 and 5-2 compressors capable of operating at ultra-low voltages. The power consumption, delay and area of these new compressor architectures are compared with existing and recently proposed compressor architectures and are shown to perform better. The proposed architecture lays emphasis on the use of multiplexers in arithmetic circuits, which results in high-speed and efficient designs. In the proposed architecture, the outputs of these blocks are utilized more efficiently than in existing designs to improve compressor performance. In the proposed novel architectures of 3-2, 4-2 and 5-2 compressors, the authors replace some XOR blocks with MUX blocks. Because the select bits are available before the input bits arrive, the switching of the transistors is already complete, so the overall delay in the critical path is reduced when MUX blocks are used. The paper concludes by comparing the proposed 3-2, 4-2 and 5-2 compressor architectures with the conventional architectures.
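The idea of swapping XOR blocks for MUX blocks can be illustrated with a small behavioral sketch. The code below is not the circuit from the cited paper; it only shows the well-known identities that an XOR can be formed by a 2:1 multiplexer and that a full adder's sum and carry can both be selected by the early signal `a XOR b`. The function names `mux2`, `xor_from_mux` and `full_adder_mux` are hypothetical names introduced for this illustration.

```python
from itertools import product


def mux2(sel: int, in0: int, in1: int) -> int:
    """2:1 multiplexer: returns in1 when sel is 1, otherwise in0."""
    return in1 if sel else in0


def xor_from_mux(a: int, b: int) -> int:
    """XOR realized with one 2:1 MUX: b selects between a and NOT a."""
    return mux2(b, a, 1 - a)


def full_adder_mux(a: int, b: int, cin: int) -> tuple[int, int]:
    """Full adder in which the second XOR and the carry are MUX-based.

    p = a XOR b is used as the select line, so it can settle before cin
    arrives (an illustrative assumption about signal arrival order).
    """
    p = a ^ b
    s = mux2(p, cin, 1 - cin)   # sum   = p XOR cin
    carry = mux2(p, a, cin)     # carry = a when a == b, else cin
    return s, carry


if __name__ == "__main__":
    for a, b, cin in product((0, 1), repeat=3):
        s, carry = full_adder_mux(a, b, cin)
        print(a, b, cin, "->", "sum", s, "carry", carry)
```

In this form the late-arriving `cin` only passes through a multiplexer data input, which is the intuition behind the reduced critical-path delay described above.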

#### **COMPRESSORS**

Multiplication is a basic operation in most signal processing algorithms. Multipliers have large area and long latency and consume considerable power, so the design of good multipliers is always a challenge for VLSI system designers. A good multiplier should be physically compact, fast and low in power consumption. Multiplication consists of three steps: (i) partial product generation, (ii) partial product reduction and (iii) final product computation. The partial product reduction stage strongly affects multiplier performance in terms of speed and power dissipation. Partial product reduction has high latency because of the long vertical path. Conventionally, adders are used to reduce the vertical critical path. However, adders cause problems such as glitches and uneven signal transitions, and they need more stages to complete the partial product reduction. To avoid these problems, compressors should be used in the multiplier design (Oklobdzija et al., 1996; Dandapat et al., 2007). The advantage of using compressors is that they give a regular structure to the partial product reduction stage.
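The three steps can be made concrete with a short behavioral sketch. It is an illustration only: step (ii) below uses an idealized counter of arbitrary width on every tall column, which is an assumption made for brevity and not the specific compressor arrangement used in this work. All function names are hypothetical.

```python
def partial_product_columns(x: int, y: int, n: int) -> list[list[int]]:
    """Step (i): AND each bit of x with each bit of y, grouped by column weight."""
    cols: list[list[int]] = [[] for _ in range(2 * n - 1)]
    for i in range(n):
        for j in range(n):
            cols[i + j].append((x >> i) & (y >> j) & 1)
    return cols


def reduce_columns(cols: list[list[int]]) -> list[list[int]]:
    """Step (ii): repeatedly replace every column taller than two bits by the
    binary count of its ones, spread over the following columns."""
    cols = [list(c) for c in cols]
    while any(len(c) > 2 for c in cols):
        nxt: list[list[int]] = [[] for _ in cols]
        for k, c in enumerate(cols):
            if len(c) <= 2:
                nxt[k].extend(c)          # short columns pass through unchanged
                continue
            count = sum(c)
            for b in range(len(c).bit_length()):
                while len(nxt) <= k + b:  # grow if a count bit lands past the end
                    nxt.append([])
                nxt[k + b].append((count >> b) & 1)
        cols = nxt
    return cols


def final_addition(cols: list[list[int]]) -> int:
    """Step (iii): add the remaining (at most two) rows with their weights."""
    return sum(bit << k for k, c in enumerate(cols) for bit in c)


def multiply(x: int, y: int, n: int) -> int:
    return final_addition(reduce_columns(partial_product_columns(x, y, n)))


if __name__ == "__main__":
    print(multiply(13, 11, 4))
```

The number of passes through the `while` loop plays the role of the number of reduction stages, which is the quantity that higher-order compressors are meant to shrink.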

Lower-order compressors such as 3-2, 4-2 and 5-2 have been studied by many researchers (Dandapat et al., 2010; Veeramachaneni et al., 2007). In high-speed multipliers, 4-2 compressors have been widely used to lower the latency of the partial product reduction stages. Most commercial designs in the various processors on the market use 4-2 compressors. Even so, the number of partial product reduction stages cannot be reduced much with lower-order compressors, and hence the delay of the multipliers has not been reduced much either. Higher-order compressors (5-3, 6-3 and 7-3) were used earlier to improve multiplier performance (Dandapat et al., 2010; Dadda, 1976). They incorporate the binary counter property into the higher-order compressors, which further reduces the number of partial product stages and the power consumption. In this study, we use a 7-3 compressor built from four full adders (Dadda, 1976) to improve multiplier performance. We also propose 8-4 and 9-4 compressors, which further reduce the number of partial product stages of multipliers compared with existing compressors. In addition, the proposed compressors use fewer gates, so the overall design area decreases. This approach offers lower delay, but its Energy Delay Product (EDP) is slightly higher than that of lower-order compressors.

The 4-2 compressor has five inputs and produces two outputs and one carry-out. It uses two stages of full adders connected in series. This straightforward implementation has four XOR gate delays (Oklobdzija et al., 1996). Various approaches have been proposed to improve its speed; for example, 4-2 compressors have been implemented with three XOR delays (Hsiao et al., 1998; Gu and Chang, 2003; Chang et al., 2004; Ma and Li, 2008).
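A minimal behavioral sketch of the straightforward 4-2 compressor, built from two full adders in series, is given below. The signal names (`x1` to `x4`, `cin`, `cout`) are assumptions chosen for the illustration; the check at the end confirms the defining arithmetic property, namely that the five input bits sum to `s + 2 * (carry + cout)`.

```python
from itertools import product


def full_adder(a: int, b: int, c: int) -> tuple[int, int]:
    """Return (sum, carry) of three input bits."""
    s = a ^ b ^ c
    carry = (a & b) | (a & c) | (b & c)
    return s, carry


def compressor_4_2(x1: int, x2: int, x3: int, x4: int, cin: int) -> tuple[int, int, int]:
    """4-2 compressor as two full adders in series.

    Returns (s, carry, cout). s has the weight of the inputs, while carry
    and cout both have twice that weight; cout feeds the cin of the
    neighbouring column.
    """
    s1, cout = full_adder(x1, x2, x3)    # first full adder
    s, carry = full_adder(s1, x4, cin)   # second full adder, in series
    return s, carry, cout


if __name__ == "__main__":
    ok = all(
        sum(bits) == s + 2 * (carry + cout)
        for bits in product((0, 1), repeat=5)
        for s, carry, cout in [compressor_4_2(*bits)]
    )
    print("4-2 compressor consistent:", ok)
```

Because each full adder's sum path contains two XOR gates, the series connection gives the four XOR gate delays mentioned above.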

### 3.1. Limitations of Lower-Order Compressor Design

The limitations of existing lower-order compressors should be noted:

- They require more adders to compute the correct binary-weighted output results.
- A half adder must be added to a 4-2 compressor, and a full adder to a 5-2 compressor, to obtain proper binary-weighted results.
- Uneven signal propagation into the adders leads to unwanted transitions, which increase dynamic power consumption.
- The third stage of full adders needs some additional time (say $\Delta$) to compute the final sum and carry-out. The time $\Delta$ will be larger for 6-2 and 7-2 compressors.

### 3.2. Proposed Compressor Design

The proposed higher-order compressors, 8-4 and 9-4, give better performance than the lower-order compressors in terms of speed and area. Some of the limitations mentioned above have been reduced in the 7-3 compressor (Dadda, 1976). However, the delay can be reduced further by using 8-4 and 9-4 compressors. We have developed 8-4 and 9-4 compressors for multipliers, using a suitable combination of adders to obtain efficient designs.

### 3.3. Structure of 8-4 and 9-4 Compressors Using Full and Half Adders

Fig. 1 shows that the 8-4 compressor has eight inputs (I0-I7) and four outputs (X1-X4). The compressor uses the counter property, so its output gives the number of 1s at the input. For example, if all input bits are 1, the output of the compressor is "1000". In this design, the compressor takes four stages of adders to compress the input bits into four output bits. In the first stage, two full adders and one half adder are used in parallel. Two full adders are used in the second stage: all "Sum" outputs from the first stage are fed to one full adder and all "Carry" outputs are fed to the other. One half adder is used in each of the third and fourth stages to produce the result. In total, we have used four full adders and three half adders.

By comparison, an implementation based on 4-2 compressors takes four stages and six full adders to compress 8 bits into 4 bits. The proposed compressor has more half adders, and a half adder typically uses fewer gates and occupies less area than a full adder.
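The adder usage described above can be tallied with a few lines of code. The sketch below rests on two assumptions that are not taken from this paper: a textbook full adder is costed at five gates (2 XOR, 2 AND, 1 OR) and two XOR delays, and a half adder at two gates (1 XOR, 1 AND) and one XOR delay. The names `GATES`, `XOR_DELAY` and `STAGES_8_4` are hypothetical.

```python
# Assumed per-cell costs (illustrative, not measured values from the paper).
GATES = {"FA": 5, "HA": 2}
XOR_DELAY = {"FA": 2, "HA": 1}

# Adders used in each of the four stages of the 8-4 compressor, as described
# in the text: (2 FA + 1 HA), (2 FA), (1 HA), (1 HA).
STAGES_8_4 = [
    {"FA": 2, "HA": 1},
    {"FA": 2, "HA": 0},
    {"FA": 0, "HA": 1},
    {"FA": 0, "HA": 1},
]


def adder_totals(stages: list[dict[str, int]]) -> dict[str, int]:
    """Total number of full and half adders over all stages."""
    return {kind: sum(stage[kind] for stage in stages) for kind in GATES}


def gate_count(stages: list[dict[str, int]]) -> int:
    """Gate count under the assumed per-cell costs."""
    return sum(GATES[kind] * n for stage in stages for kind, n in stage.items())


def critical_path_xor_delays(stages: list[dict[str, int]]) -> int:
    """Coarse estimate: the slowest cell type present in each stage, summed."""
    return sum(
        max(XOR_DELAY[kind] for kind, n in stage.items() if n > 0)
        for stage in stages
    )


if __name__ == "__main__":
    print("adders:", adder_totals(STAGES_8_4))
    print("gates (8-4 compressor):", gate_count(STAGES_8_4))
    print("gates (six full adders):", 6 * GATES["FA"])
    print("critical path in XOR delays:", critical_path_xor_delays(STAGES_8_4))
```

Under these assumed costs the four-full-adder, three-half-adder arrangement needs fewer gates than six full adders, which is the area argument made in the text; the actual saving depends on the cell library used.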

The critical path delay of the proposed implementation is 6 XOR gate delays. The equations governing the outputs of the proposed 8-4 architecture are given in Equations (1) to (4):

$$x_1 = a \oplus c \oplus e \qquad (1)$$

$$x_2 = (ac \lor ae \lor ce) \oplus (b \oplus d \oplus f) \qquad (2)$$

$$x_3 = \big((ac \lor ae \lor ce)\,(b \oplus d \oplus f)\big) \oplus (bd \lor bf \lor df) \qquad (3)$$

$$x_4 = \big((ac \lor ae \lor ce)\,(b \oplus d \oplus f)\big)\,(bd \lor bf \lor df) \qquad (4)$$

Where juxtaposition denotes AND, $\lor$ denotes OR, $\oplus$ denotes XOR, and

$$a = I_0 \oplus I_1; \quad b = I_0 I_1;$$

$$c = I_2 \oplus I_3 \oplus I_4; \quad d = I_2 I_3 \lor I_2 I_4 \lor I_3 I_4;$$

$$e = I_5 \oplus I_6 \oplus I_7; \quad f = I_5 I_6 \lor I_5 I_7 \lor I_6 I_7$$
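Equations (1) to (4) can be checked exhaustively against the counter property. The sketch below assumes that `e` is the XOR of I5, I6 and I7 (the sum output of the second first-stage full adder) and that the first two bracketed terms of Equations (3) and (4) are combined with AND, which is what the half adder in the third stage implies. The helper `maj` and the intermediate names `p`, `q`, `r` are hypothetical.

```python
from itertools import product


def maj(x: int, y: int, z: int) -> int:
    """Majority of three bits, i.e. the carry output of a full adder."""
    return (x & y) | (x & z) | (y & z)


def compressor_8_4(i: tuple[int, ...]) -> tuple[int, int, int, int]:
    """Return (x4, x3, x2, x1) for the eight input bits I0..I7."""
    a, b = i[0] ^ i[1], i[0] & i[1]                      # stage 1: half adder
    c, d = i[2] ^ i[3] ^ i[4], maj(i[2], i[3], i[4])     # stage 1: full adder
    e, f = i[5] ^ i[6] ^ i[7], maj(i[5], i[6], i[7])     # stage 1: full adder (assumed)

    p = maj(a, c, e)      # carry of the stage-2 adder on the sum outputs
    q = b ^ d ^ f         # sum of the stage-2 adder on the carry outputs
    r = maj(b, d, f)      # carry of that same adder

    x1 = a ^ c ^ e        # Equation (1)
    x2 = p ^ q            # Equation (2)
    x3 = (p & q) ^ r      # Equation (3)
    x4 = (p & q) & r      # Equation (4)
    return x4, x3, x2, x1


if __name__ == "__main__":
    ok = all(
        8 * x4 + 4 * x3 + 2 * x2 + x1 == sum(bits)
        for bits in product((0, 1), repeat=8)
        for x4, x3, x2, x1 in [compressor_8_4(bits)]
    )
    print("matches popcount for all 256 inputs:", ok)
    print("all ones ->", "".join(map(str, compressor_8_4((1,) * 8))))
```

With all eight inputs set to 1 the outputs read `1000`, matching the example given in the description of Fig. 1.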



Fig. 1. 8-4 Compressor design



Fig. 2. 9-4 Compressor design

The critical path delay of the proposed 9-4 compressor is 6 XOR gate delays and the number of reduction stages is 4. The main advantages of the proposed compressors over lower-order compressors are: (1) uniform XOR gate delay regardless of the input bits, (2) fewer reduction stages and (3) fewer gates.

### 3.4. Structure of 8-4 and 9-4 Compressors Using Multiplexer

This structure is realized with the help of multiplexers in order to obtain better results in terms of power dissipation and energy-delay product. Figures 3 and 4 and Table 1 show the implementation of the 8-4 and 9-4 compressors.

In a multiplexer, the selection lines keep only part of the structure active and leave the rest idle. This saves a significant amount of power and correspondingly reduces the energy-delay product several-fold.

### 3.5. Multiplier Architecture

We have designed three different multipliers (8×8, 16×16 and 24×24) using the Wallace tree architecture (Law et al., 1999). These multipliers use higher-order compressors. Figure 5 shows the design of a 16×16 multiplier. Different types of compressors are used to add the partial products, which are added in five stages. Suitable combinations of compressors/adders have been used in order to reduce the vertical critical path. Consider column fifteen of Fig. 5, which has fifteen partial product bits. We can apply one 7-3 compressor, one 6-3 compressor and a half adder to that column. This combination produces eight outputs. Alternatively, one 8-4 and one 7-3 compressor could be a better choice, since this combination produces only seven outputs. By choosing a suitable combination of compressors/adders, we can limit the critical path.
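The output counts in the column-fifteen example follow directly from the input and output widths of each cell, as the short tally below shows. The `CELLS` table and the function name `combine` are hypothetical helpers for this illustration.

```python
# (inputs, outputs) of each reduction cell mentioned in the text.
CELLS = {
    "HA": (2, 2),
    "FA": (3, 2),
    "6-3": (6, 3),
    "7-3": (7, 3),
    "8-4": (8, 4),
    "9-4": (9, 4),
}


def combine(cells: list[str]) -> tuple[int, int]:
    """Return (bits consumed, bits produced) for a set of cells on one column."""
    consumed = sum(CELLS[c][0] for c in cells)
    produced = sum(CELLS[c][1] for c in cells)
    return consumed, produced


if __name__ == "__main__":
    for option in (["7-3", "6-3", "HA"], ["8-4", "7-3"]):
        consumed, produced = combine(option)
        print(" + ".join(option), "->", consumed, "bits in,", produced, "bits out")
```

Both options absorb all fifteen bits of the column, but the second leaves one bit fewer for the following stage.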

In Fig. 5, a vertical box indicates a compressor/adder. If any bits are not enclosed in a box, they are passed on to the next stage. If a box is horizontal in orientation, it indicates that parallel adders have been used. We have used ripple carry adders as the parallel adders. All three multipliers are designed efficiently: each multiplier is built with as few adders/compressors as possible. Now consider column 31, which has two vertical dots, and column 32, which has one dot. Instead of using a half adder in column 31, we passed those two dots directly on to the following stages. This limits the number of adders in the multiplier.

# Multiplier Performance and Comparison

We have used the 3-2 compressor of (Hsiao et al., 1998) and the 4-2 compressors of (Chang et al., 2004; Ng and Lau, 1999; Prasad and Parhi, 2001; Baran et al., 2010; Ma and Li, 2008) in our 8-bit, 16-bit and 24-bit multipliers and found that our proposed higher-order compressors give higher speed and smaller area (Table 2). Figures 6-8 respectively show the speed, area and power comparison of multipliers using both low-order and high-order compressors. The speed advantage of the higher-order compressor multiplier increases as the multiplication bit width increases. For 8-bit multiplication the speed improvement is 4% over the low-order compressor design. Similarly, the speed improvement of the higher-order compressor design is 9.04% for the 16-bit multiplier and 9.3% for the 24-bit multiplier. Higher-order compressors have a lower gate count and occupy less area than standard compressors. However, the higher-order compressor design consumes more power than the low-order compressor design. For the 8-bit, 16-bit and 24-bit multipliers the power consumption is increased by 10, 22.6 and 26.7% respectively.
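The figures above can be combined into a rough estimate of how the Energy Delay Product changes. This is a back-of-the-envelope sketch under two explicit assumptions that are not stated in the paper: a "speed improvement" of x% is read as a delay reduction of x%, and EDP is taken as power × delay² (energy being power × delay). The name `relative_edp` and the `REPORTED` table are hypothetical.

```python
# bit width -> (speed improvement in %, power increase in %), as quoted in the text
REPORTED = {8: (4.0, 10.0), 16: (9.04, 22.6), 24: (9.3, 26.7)}


def relative_edp(delay_reduction_pct: float, power_increase_pct: float) -> float:
    """EDP of the higher-order design relative to the lower-order design.

    Assumes EDP = power * delay**2 and that the speed improvement is a
    straight percentage reduction in delay.
    """
    delay = 1.0 - delay_reduction_pct / 100.0
    power = 1.0 + power_increase_pct / 100.0
    return power * delay ** 2


if __name__ == "__main__":
    for bits, (speedup, extra_power) in REPORTED.items():
        ratio = relative_edp(speedup, extra_power)
        print(f"{bits}-bit multiplier: relative EDP = {ratio:.3f}")
```

Under these assumptions the ratio comes out a few percent above 1 for all three sizes, which is in line with the statement that the EDP of the higher-order design is slightly higher.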


Fig. 3. Implementation of 8-4 compressor using multiplexer







**Fig. 4.** Implementation of 9-4 compressor using multiplexer



Fig. 6. Speed comparison of different multiplier



Fig. 7. Area comparison of different multiplier



**Fig. 8.** Power comparison of different multiplier




## RESULTS AND DISCUSSION

The simulation results show that the 8-4 and 9-4 compressors built with the novel XOR cell can operate down to 0.6 V and feature high-speed and low-power characteristics. The 5-2 compressor proposed by the author outperforms the other architectures. The author concludes that a library of highly power-efficient 4-2 and 5-2 compressor cells based on CMOS process technology has been developed for implementing fast and low-power multipliers operable at ultra-low supply voltages. The simulation is carried out in Tanner SPICE, and the technology used is 0.35-µm CMOS with a 3.3-V supply voltage. The propagation delay is measured from the point at which the changing input reaches 50% of its transition to the point at which the output reaches 50% of its transition. Finally, the author concludes that the proposed designs have the advantage that, in addition to reduced switching activity, they have no direct connection to the power supply and are driven entirely by the input signals, leading to a noticeable reduction in power consumption.

The simulation results, based on TSMC 2P4M 0.35-µm process models, show that the proposed design has the lowest operating Vdd and the highest operating frequency among all designs using ten transistors (10T). They also show that its energy consumption per addition is low among these designs. Moreover, the performance edge of the proposed design in both speed and energy consumption becomes even more significant as the word length of the adder increases. The CLRCL design requires the lowest supply voltage among all the 10T designs and also achieves the highest operating frequency. All of the simulations have been done using Cadence tools with 0.18-µm technology. The calculation of power (including glitch power) and delay is done using the Virtual Analog Simulation tool integrated into the Cadence tools. The simulations are performed at various voltages ranging from 0.9 V to 3.3 V. The proposed architectures perform better than the existing ones in every aspect, i.e., area, power, delay and power-delay product, over the entire voltage range simulated.

# Results and analysis:

GDI MUX:

# Schematic:



### Simulation:



#### Characteristics:




# Making a Verilog file:



# Simulate:



# Verilog file:



# Layout of GDI MUX:

# **CONCLUSION**

In conventional multiplier architectures, lower-order compressors are typically used during the intermediate phase to reduce partial products. However, this often leads to non-uniform signal propagation across the multiplier structure. To address this, higher-order compressors have emerged as an effective solution, significantly minimizing the number of required stages and reducing delays along the vertical critical path. The carefully designed compressors presented here utilize fewer logic gates, thereby offering notable improvements in both processing speed and area optimization when compared to traditional designs. Despite these advantages, a critical trade-off is observed: higher-order compressors generally consume more power than their lower-order counterparts. Additionally, they tend to exhibit a slightly higher Energy Delay Product (EDP). Nevertheless, the proposed architecture excels in handling high-bit multiplications efficiently, enabling faster computation of higher-order product bits and improving overall system performance.

#### REFERENCES

- [1] N. H. E. Weste and D. M. Harris, CMOS VLSI Design, a Circuits and Systems Perspective, 4th ed. Boston, MA, USA: Addison-Wesley, 2011.
- [2] R. Zimmermann and W. Fichtner, "Low-power logic styles: CMOS versus pass-transistor logic," IEEE J. Solid-State Circuits, vol. 32, no. 7, pp. 1079–1090, Jul. 1997.
- [3] K. Yano et al., "A 3.8-ns CMOS 16 × 16-b multiplier using complementary pass-transistor logic," IEEE J. Solid-State Circuits, vol. 25, no. 2, pp. 388–393, Apr. 1990.
- [4] M. Suzuki et al., "A 1.5 ns 32b CMOS ALU in double pass-transistor logic," in Proc. IEEE Int. Solid-State Circuits Conf., 1993, pp. 90–91.
- [5] X. Wu, "Theory of transmission switches and its application to design of CMOS digital circuits," Int. J. Circuit Theory Appl., vol. 20, no. 4, pp. 349–356, 1992.
- [6] V. G. Oklobdzija and B. Duchene, "Pass-transistor dual value logic for low-power CMOS," in Proc. Int. Symp. VLSI Technol., 1995, pp. 341–344.
- [7] M. A. Turi and J. G. Delgado-Frias, "Decreasing energy consumption in address decoders by means of selective precharge schemes," Microelectron. J., vol. 40, no. 11, pp. 1590–1600, 2009.
- [8] V. Bhatnagar, A. Chandani, and S. Pandey, "Optimization of row decoder for 128 × 128 6T SRAMs," in Proc. IEEE Int. Conf. VLSI-SATA, 2015, pp. 1–4.
- [9] A. K. Mishra, D. P. Acharya, and P. K. Patra, "Novel design technique of address decoder for SRAM," in Proc. IEEE ICACCCT, 2014, pp. 1032–1035.
- [10] D. Markovic, B. Nikolić, and V. G. Oklobdžija, "A general method in synthesis of pass-transistor circuits," Microelectron. J., vol. 31, pp. 991–998, 2000.

# **Authors Profile**

- **Mr. Mohammed Waheed Sohail** is pursuing his M.Tech in VLSI at Shadan College of Engineering and Technology, a J.N.T.U.H affiliated college.
- **Dr. Mohammad Iliyas** is working as Dean Academics & Professor, HOD of the ECE Department at Shadan College of Engineering and Technology, Hyderabad. He received his Bachelor's degree in Electronics and Communication Engineering from a J.N.T.U.H affiliated college, Hyderabad, and his M.Tech. in V.L.S.I System Design from a J.N.T.U.H affiliated college. He has completed his Ph.D at Sunrise University. His research interest is low-power VLSI design.
- **Dr. Amairullah Khan Lodhi** is Professor and R & D Coordinator in the Dept. of ECE at Shadan College of Engineering and Technology, Peerancheru, Telangana.