



# INTERNATIONAL JOURNAL FOR RESEARCH

IN APPLIED SCIENCE & ENGINEERING TECHNOLOGY

Volume: 11 Issue: IX Month of publication: September 2023

DOI: https://doi.org/10.22214/ijraset.2023.55707

www.ijraset.com

Call: 08813907089 E-mail ID: ijraset@gmail.com

Volume 11 Issue IX Sep 2023- Available at www.ijraset.com

### High Performance VLSI Architecture for DC and Planar Modes of Intra Prediction in HEVC

Sunil B<sup>1</sup>, Vijaya Prakash A M<sup>2</sup>

<sup>1</sup>MTech Student, <sup>2</sup>Professor, Department of ECE, Dean R & D, Bangalore Institute of Technology, Bengaluru, India

Abstract: High Efficiency Video Coding (HEVC) achieves improved compression efficiency, but at the cost of higher computational complexity due to its intricate block partitioning and the larger number of angular modes in intra prediction. In this study, we propose a hardware architecture for the direct current (DC) and planar modes of intra prediction in the HEVC standard. To evaluate the architecture, we synthesized it using both TSMC 180nm and TSMC 90nm technologies and then carried out the physical implementation using Physical Cell 90nm technology. Moving from TSMC 180nm to TSMC 90nm reduces the chip area by a factor of about 3.2 and the power consumption by a factor of about 5.5. After the physical layout phase, we obtained a chip area of 112.19 mm² and a power consumption of 3.88 mW. Compared with the 90nm synthesis results, the chip area increased by a factor of about 1.5, while the power consumption decreased by about 24%. The proposed architecture achieves a throughput of 27 pixels per clock cycle and supports a block size of 16x16, with the potential for further extension.

Keywords: HEVC, Intra Prediction, Planar Mode, DC Mode, Video Compression, ASIC

### I. INTRODUCTION

In information theory, data compression, also known as source coding or bit-rate reduction, is the process of encoding information using fewer bits than its original representation. In the context of data transmission it is called source coding, because the encoding takes place at the data source before storage or transmission. Data compression reduces the resources needed for data storage and transmission, but the compression and decompression processes themselves consume computational resources. Effective data compression therefore involves a trade-off between memory usage and processing time.

Digital video data contains significant redundancy, which makes it well-suited for compression techniques. These techniques effectively address the challenges associated with large video file sizes. Lossy compression stands out among the other compression techniques because it provides higher compression ratios for video data. It is important to remember, though, that when the compression ratio rises and the file size decreases, the video quality could suffer.

HEVC is a video compression standard designed to offer improved performance compared to previous standards like H.264/AVC. The video compression process in HEVC involves employing an HEVC video encoder to encode a series of video frames, which constitute the source video. This encoding generates a compressed video bitstream. The compressed bitstream can be transmitted or stored efficiently. When the compressed video is received, a video decoder is employed to decompress the bitstream, reconstructing the original sequence of decoded frames. Through the utilization of HEVC's advanced compression techniques, video data can be effectively transmitted, stored, and decoded while maintaining a higher video quality level.



Fig. 1. HEVC block diagram






Fig. 1 shows the HEVC block diagram, which consists of the following stages. The video input stage captures or retrieves the source video frames, which form the basis for compression. The encoding stage uses an HEVC video encoder to compress the video frames through techniques such as partitioning, intra prediction or inter prediction, transform coding, quantization, and entropy coding. Intra prediction predicts pixel values within a frame, inter prediction uses information from previous frames, and transform coding converts spatial information into frequency-domain coefficients. After encoding, the compressed video data is organized into a compressed bitstream that contains the encoded information representing the compressed frames. The decoding stage uses an HEVC video decoder to reconstruct the video frames from the compressed bitstream through the inverse processes: entropy decoding, inverse quantization, inverse transform coding, and motion compensation. The reconstructed video frames are then presented as output, which can be displayed on a screen or used for further processing.

Fig. 2 depicts a partitioned picture with a highlighted slice. The slice consists of multiple Coding Tree Units (CTUs), each measuring 64x64 pixels. Each 64x64 CTU can be divided into four smaller Coding Units (CUs) of 32x32 pixels. Further subdivision is also possible: each 32x32 CU can be split into 16x16 CUs, and these in turn into 8x8 CUs. Each CU can be divided into Prediction Units (PUs) and Transform Units (TUs).



Fig. 2. Partitioning of an image into slices, Coding Tree Units (CTUs) and Coding Units (CUs)
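The quadtree subdivision of a CTU into CUs can be sketched in a few lines of Python. This is only an illustration: `split_ctu` and the `should_split` callback are hypothetical names, and a real encoder decides each split from rate-distortion cost rather than from a user-supplied rule.

```python
def split_ctu(x, y, size, should_split, min_size=8):
    """Recursively split a square block into four quadrants.

    Returns the leaf CUs as (x, y, size) tuples in raster order of quadrants.
    should_split(x, y, size) is a hypothetical decision callback.
    """
    if size > min_size and should_split(x, y, size):
        half = size // 2
        leaves = []
        for dy in (0, half):          # upper quadrants first, then lower
            for dx in (0, half):      # left quadrant first, then right
                leaves += split_ctu(x + dx, y + dy, half, should_split, min_size)
        return leaves
    return [(x, y, size)]             # block is kept whole as one CU


if __name__ == "__main__":
    # Split only the 64x64 CTU itself, leaving four 32x32 CUs.
    print(split_ctu(0, 0, 64, lambda x, y, s: s == 64))
```

With a rule that always splits, the same call yields sixty-four 8x8 CUs, the smallest CU size mentioned in the text.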

TUs carry the residual signal, which is the difference between the original pixel values and the predicted values. PUs and TUs vary in size from 4x4 to 32x32 pixels, as shown in Fig. 2. All CUs, PUs, and TUs contain corresponding luma (Y component) and chroma (Cb and Cr components) blocks, along with associated syntax elements similar to those found in the CTU.

HEVC incorporates two main categories of intra prediction methods. The first category comprises the angular prediction methods, which allow the codec to model the directional structures commonly found in images and thereby improve prediction accuracy. The second category comprises planar prediction and DC (Direct Current) prediction, which provide predictions for image areas with smooth and gradually changing content. Planar prediction estimates pixel values on the assumption of a flat plane, while DC prediction predicts pixel values from the average value of the neighboring pixels. In total, HEVC supports 35 intra prediction modes: 33 angular modes plus the planar and DC modes. This range of modes provides diverse options for predicting the pixel values within a given block, which allows HEVC to achieve improved compression efficiency while preserving image quality.

### II. RELATED WORK

[1] proposed two architectures, Parallel Pipelined Architecture (PPA) and Parallel Datapath Architecture (PDA), for the DC and planar modes of intra prediction in HEVC. For small block sizes such as 4, PDA consumed fewer resources, whereas for larger block sizes PPA consumed fewer resources. [2] proposed an algorithm for HEVC intra prediction designed with a high-level synthesis (HLS) method. [3] proposed three architectures for intra prediction in HEVC: Fully Sequential Architecture (FSA), Semi Parallel Architecture (SPA) and Fully Parallel Architecture (FPA). FSA used the fewest resources of the three, and FPA had a significantly shorter processing time. [4] proposed a computationally scalable algorithm and its architecture, which enable efficient intra coding of high-resolution video by scaling the computational performance to the demands of encoding at such resolutions.






[5] proposed an architecture with three main features. First, a novel buffer structure designed specifically for reference samples improves the handling and management of those samples and hence the efficiency of the encoding process. Second, a mode-dependent scanning order, tailored to the encoding mode in use, optimizes the arrangement of data and improves compression efficiency. Third, an inverse method for extending reference samples allows accurate extension of the reference samples, enabling more precise predictions and improving the quality of the encoded video.

The focus of [6] is on the VLSI implementation of Discrete Cosine Transform (DCT) algorithms specifically designed for HEVC applications. The research emphasizes the hardware realization of efficient DCT algorithms tailored to meet the requirements of HEVC video coding. By focusing on VLSI implementation, the study aims to develop optimized hardware architectures for DCT that can be seamlessly integrated into HEVC systems, enhancing their overall performance and efficiency.

[7] thoroughly examines the data dependency in HEVC intra-mode decision and proposes a set of fast algorithms aimed at eliminating data dependency and reducing computational complexity. These algorithms include Rough Mode Decision based on the source signal, a coarse-to-fine rough mode search, Prediction Mode Interlaced RDO mode decision, parallelized context adaption, and Chroma-free Coding Unit (CU)/Prediction Unit (PU) decision. To increase throughput, the work also proposes a parallelized VLSI architecture that uses scheduling for CU reordering and chroma reordering. With these architectural optimizations, the proposed solution demonstrates improved efficiency and performance in HEVC intra-mode decision making.

The architecture presented in [10] takes into account all the innovative features of HEVC Intra Prediction, including all modes and Prediction Unit (PU) sizes. Performance and memory access pose challenges in HEVC intra prediction, and hardware architecture designs offer promising solutions, particularly for achieving energy-efficient implementations. To address these challenges, the designed architecture incorporates buffers and internal memories, effectively reducing the reliance on external memory accesses. Additionally, the architecture features two independent data paths capable of processing eight samples in parallel. This parallel processing capability, combined with a deep and multiplierless pipeline, significantly increases the throughput of the system.

### III. PROPOSED METHODOLOGY

Fig. 3 illustrates the proposed high-performance architecture for the DC and planar modes of intra prediction in HEVC. The architecture is divided into two main parts.



Fig. 3. Block diagram of proposed architecture

The first part involves the extraction of pixel data from an input image, and this functionality is implemented using MATLAB. The second part of the architecture is dedicated to performing the DC and planar prediction using a reference buffer. This portion of the architecture is designed using Verilog, which is a hardware description language (HDL) commonly employed for the design of digital systems. By utilizing MATLAB for pixel data extraction and Verilog for DC and planar prediction, the proposed architecture leverages the strengths of both software and hardware design methodologies to achieve a high-performance solution for intra prediction in the DC and planar modes of HEVC.

ISSN: 2321-9653; IC Value: 45.98; SJ Impact Factor: 7.538 Volume 11 Issue IX Sep 2023- Available at www.ijraset.com

### A. Image Conversion into Input Data for Prediction

The system takes an input image as the initial source of data, which is used for predicting the values of the DC and planar modes in HEVC. HEVC commonly uses the 4:2:0 chroma-subsampled YCbCr color space for the input image. Initially, the input image is in the RGB color space. To convert the image to the YCbCr color space, each RGB pixel is transformed to obtain its corresponding YCbCr values. This conversion results in an image with three color components, Y (luma), Cb (blue-difference chroma), and Cr (red-difference chroma), each at full resolution. As in many other video codecs, 4:2:0 chroma subsampling is then applied: the chroma planes are halved in both the horizontal and vertical directions, so that every 2x2 group of four Y samples shares one Cb sample and one Cr sample. The Y component (luma) remains unchanged and retains its full resolution, while the Cb and Cr components are subsampled. After chroma subsampling, the Y, Cb, and Cr components are separated into individual image planes, each representing one color component. This color space conversion and chroma subsampling procedure prepares the input image in the appropriate format for further processing and prediction in the DC and planar modes of HEVC.



Fig. 4. Flowchart of image conversion into input data
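The conversion steps of this subsection can be illustrated with a short Python sketch. It is not the MATLAB code used in this work. It assumes the full-range BT.601 conversion matrix and 2x2 averaging for the chroma subsampling, neither of which is specified in the text, and the function names are chosen only for illustration.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to 8-bit YCbCr (full-range BT.601, assumed)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    # Round and clip each component to the 8-bit range.
    return tuple(max(0, min(255, int(round(v)))) for v in (y, cb, cr))


def subsample_420(plane):
    """Halve a chroma plane in both directions by averaging each 2x2 block.

    Averaging is an assumption; plain decimation is another common choice.
    Height and width are assumed to be even.
    """
    return [
        [(plane[y][x] + plane[y][x + 1]
          + plane[y + 1][x] + plane[y + 1][x + 1] + 2) >> 2
         for x in range(0, len(plane[0]), 2)]
        for y in range(0, len(plane), 2)
    ]


def image_to_planes(rgb_image):
    """Split an RGB image (rows of (r, g, b) tuples) into Y, Cb and Cr planes."""
    ycc = [[rgb_to_ycbcr(*px) for px in row] for row in rgb_image]
    y_plane = [[p[0] for p in row] for row in ycc]
    cb_full = [[p[1] for p in row] for row in ycc]
    cr_full = [[p[2] for p in row] for row in ycc]
    # Luma keeps full resolution; both chroma planes are subsampled.
    return y_plane, subsample_420(cb_full), subsample_420(cr_full)


if __name__ == "__main__":
    image = [[(255, 0, 0)] * 16 for _ in range(16)]   # 16x16 pure red test image
    y, cb, cr = image_to_planes(image)
    print(len(y), len(y[0]), len(cb), len(cb[0]))
```

For a 16x16 input, the sketch returns a 16x16 Y plane and 8x8 Cb and Cr planes, which matches the luma and chroma block sizes handled by the architecture.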


### B. Planar and DC Modes of Intra Prediction

The DC and planar predictions using a reference buffer are executed in the second part of the architecture, which is implemented in Verilog HDL. The flowchart in Fig. 5 illustrates the sequential steps involved in extracting the reference buffers from the Y, Cb, and Cr components and performing prediction using these buffers. For each component, namely Y, Cb and Cr, a corresponding reference buffer is extracted.



Fig. 5. Flowchart of HEVC intra prediction

These reference buffers contain the neighboring pixels required for prediction in the specific mode. The extraction process accesses the neighboring pixels surrounding the current block or unit being processed. These pixels are stored in the reference buffer, forming the spatial neighborhood that is essential for accurate prediction. The prediction algorithm uses these reference buffers to estimate the pixel values according to the DC or planar prediction mode.

A reference sample in the context of intra prediction is a specific pixel value extracted from the reference buffer. It serves as a basis for predicting the value of another pixel during the encoding process. In the intra prediction phase, neighboring pixels from the reference buffer are used as reference samples to estimate the value of the current pixel being encoded. The choice of reference samples depends on several factors, including the prediction mode and the size of the block being analyzed.



Fig. 6. Reference sample and block to be predicted




Different prediction modes and block sizes dictate the positioning of the reference samples relative to the current pixel. Typically, the reference samples can be located above, to the left, or diagonally above-left of the current pixel, among other possible positions, depending on the specific prediction mode being employed. By incorporating the appropriate reference samples from the reference buffer, the intra prediction algorithm can make informed estimations of pixel values, enhancing the efficiency and accuracy of compression in HEVC.
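As a rough illustration of the reference buffer extraction, the following Python sketch collects the samples needed by the DC and planar modes for one N x N block: the N samples above, the N samples to the left, and the single corner, top-right and left-bottom samples. The function name, the dictionary layout and the substitution of unavailable neighbors by the mid-gray value 128 are assumptions made for this sketch. The HEVC standard defines a more elaborate substitution process and reads neighbors from previously reconstructed blocks.

```python
def extract_reference(plane, x0, y0, n, default=128):
    """Collect reference samples for the n x n block with top-left pixel (x0, y0).

    plane is a list of rows of 8-bit samples. Neighbors that fall outside the
    plane are replaced by `default` (a simplifying assumption).
    """
    height, width = len(plane), len(plane[0])

    def px(x, y):
        inside = 0 <= x < width and 0 <= y < height
        return plane[y][x] if inside else default

    return {
        "corner": px(x0 - 1, y0 - 1),                        # ref[-1][-1]
        "top": [px(x0 + m, y0 - 1) for m in range(n)],       # ref[m][-1]
        "left": [px(x0 - 1, y0 + k) for k in range(n)],      # ref[-1][n]
        "top_right": px(x0 + n, y0 - 1),                     # ref[N][-1]
        "left_bottom": px(x0 - 1, y0 + n),                   # ref[-1][N]
    }


if __name__ == "__main__":
    plane = [[10 * y + x for x in range(6)] for y in range(6)]
    print(extract_reference(plane, 1, 1, 4))
```

The key names mirror the corner, top, left, top-right and left-bottom reference buffer signals visible in the simulation waveforms of Section IV.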

### C. Planar Prediction

Planar prediction is a method used in the HEVC intra prediction process. It assumes that the pixel values within a block exhibit a linear relationship in both the horizontal and vertical directions. The prediction is obtained by combining a horizontal prediction and a vertical prediction through weighted averaging, with weights that depend on the position of the pixel being predicted. The planar prediction, denoted PPre[m][n], is calculated using equation (1). The horizontal prediction, HPre[m][n], is determined by equation (2), and the vertical prediction, VPre[m][n], is determined by equation (3). The block size, represented by N, can take the values 4, 8, 16, or 32, matching the PU and TU sizes of 4x4 to 32x32 given in Section I.

$$PPre[m][n] = (HPre[m][n] + VPre[m][n] + N) \gg (\log_2 N + 1) \dots (1)$$

$$HPre[m][n] = (N-1-m) * ref[-1][n] + (m+1) * ref[N][-1] \dots (2)$$

$$VPre[m][n] = (N-1-n) * ref[m][-1] + (n+1) * ref[-1][N] \dots (3)$$

The reference sample is denoted ref[m][n], and m and n indicate the location of the prediction sample, where m is the column (horizontal position) and n is the row (vertical position). This is consistent with ref[m][-1] being a top reference sample and ref[-1][n] a left reference sample. Both m and n range from 0 to N-1. The notation ">>" represents the bitwise right-shift operation, which shifts the bits of a binary number to the right by the specified number of positions. Shifting right by k positions divides by 2<sup>k</sup>, so the normalization in equation (1) requires no hardware divider. The planar mode is recognized to have higher computational complexity than the other modes used in HEVC.

By utilizing planar prediction, HEVC can accurately estimate the pixel values within a block by considering the linear relationships along both the horizontal and vertical directions. This technique contributes to improved compression efficiency and higher quality video encoding.
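Equations (1) to (3) can be checked with a small software model. The Python sketch below is not the Verilog implementation. It assumes, as the reference labels in the DC section imply (ref[m][-1] lies on the top edge), that m indexes horizontally and n vertically, and it stores the result as `pred[row][col]`. The function and argument names are illustrative.

```python
def planar_predict(top, left, top_right, left_bottom):
    """Planar prediction of an N x N block following equations (1) to (3).

    top[m]      = ref[m][-1], m = 0..N-1
    left[n]     = ref[-1][n], n = 0..N-1
    top_right   = ref[N][-1]
    left_bottom = ref[-1][N]
    N must be a power of two. Returns pred[row][col].
    """
    size = len(top)
    shift = size.bit_length()          # equals log2(N) + 1 for a power of two
    pred = [[0] * size for _ in range(size)]
    for row in range(size):            # n in the equations
        for col in range(size):        # m in the equations
            h_pre = (size - 1 - col) * left[row] + (col + 1) * top_right      # eq. (2)
            v_pre = (size - 1 - row) * top[col] + (row + 1) * left_bottom     # eq. (3)
            pred[row][col] = (h_pre + v_pre + size) >> shift                  # eq. (1)
    return pred


if __name__ == "__main__":
    block = planar_predict([10, 20, 30, 40], [10, 20, 30, 40], 50, 50)
    for line in block:
        print(line)
```

When all reference samples are equal, every predicted sample equals that same value, which is a convenient sanity check for an RTL testbench.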

### D. DC Prediction

In the DC mode of intra prediction in HEVC, the pixel values within a block are estimated by calculating the average of the top and left reference samples. This average value, referred to as DC<sub>avg</sub>, is calculated by equation (4). It is obtained by summing the pixel values of the top reference samples ( $\Sigma ref[m][-1]$ ) and the left reference samples ( $\Sigma ref[-1][n]$ ), and dividing the sum by the total number of reference samples used, 2N, which is implemented as a right shift by $\log_2 N + 1$ bits.

$$DC_{avg} = (\sum_{m=0}^{N-1} ref[m][-1] + \sum_{n=0}^{N-1} ref[-1][n]) \gg (\log_2 N + 1) \dots (4)$$

The DC mode is commonly used for flat regions in a video frame where the brightness variations are minimal. However, block partitioning in HEVC can lead to discontinuities at block boundaries, resulting in visible artifacts. To mitigate this issue, boundary smoothing is applied to the corner sample, the top boundary samples, and the left boundary samples of the block. The corner sample smoothing is computed using equation (5), while the smoothing of the top horizontal row and the left vertical column boundary samples is computed by equations (6) and (7) respectively, where *m* and *n* take values from 1 to *N*-1. Boundary smoothing is generally applied only to the luma (Y) component of the video frame and not to the chroma (Cb and Cr) components, because smoothing the chroma components does not yield significant benefits and introduces unnecessary processing overhead.

$$DCPre[0][0] = (ref[-1][0] + 2 * DC_{avg} + ref[0][-1]) \gg 2 \dots (5)$$

$$DCPre[m][0] = (ref[m][-1] + 3 * DC_{avg} + 2) \gg 2 \dots (6)$$

$$DCPre[0][n] = (ref[-1][n] + 3 * DC_{avg} + 2) \gg 2 \dots (7)$$

Overall, the DC mode is considered to be the least computationally expensive mode among all the intra prediction modes in HEVC.
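A software model of equations (4) to (7) is sketched below in Python. It follows the equations exactly as printed here. Note that the HEVC specification additionally adds a rounding offset of N to the sum in equation (4) and of 2 to the sum in equation (5), which this sketch omits. The function name and the `smooth` flag are illustrative; `smooth=False` corresponds to the chroma case described above.

```python
def dc_predict(top, left, smooth=True):
    """DC prediction of an N x N block following equations (4) to (7).

    top[m]  = ref[m][-1], left[n] = ref[-1][n], N a power of two.
    smooth enables the boundary smoothing used for luma blocks.
    Returns pred[row][col].
    """
    size = len(top)
    dc_avg = (sum(top) + sum(left)) >> size.bit_length()      # eq. (4)
    pred = [[dc_avg] * size for _ in range(size)]
    if smooth:
        pred[0][0] = (left[0] + 2 * dc_avg + top[0]) >> 2     # eq. (5), corner
        for m in range(1, size):                              # eq. (6), top row
            pred[0][m] = (top[m] + 3 * dc_avg + 2) >> 2
        for n in range(1, size):                              # eq. (7), left column
            pred[n][0] = (left[n] + 3 * dc_avg + 2) >> 2
    return pred


if __name__ == "__main__":
    for line in dc_predict([120] * 4, [80] * 4):
        print(line)
```

In this example the interior samples take the average value 100, while the top row is pulled toward the brighter top neighbors and the left column toward the darker left neighbors, which is the intended smoothing effect at the block boundary.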

### IV. RESULTS AND DISCUSSION

The proposed architecture for the DC and planar modes of intra prediction in HEVC is implemented using Verilog HDL, and its functionality is verified through simulation using the Cadence SimVision tool. The architecture is designed to handle the Y luma component with a block size of 16x16, while the Cb and Cr chroma components, which undergo 4:2:0 chroma subsampling, have a block size of 8x8. To evaluate the performance of the architecture, synthesis is conducted using the Cadence Genus Synthesis Solution, utilizing both TSMC 180nm and TSMC 90nm technologies. Additionally, the physical implementation of the architecture is carried out using the Cadence Innovus Implementation System, specifically targeting the physical cell 90nm technology.




Figs. 7, 9 and 10 show the planar prediction output for the Y luma, Cb chroma and Cr chroma components respectively at the RTL design stage. Fig. 8 shows the DC prediction output for the Y luma component at the RTL design stage. After synthesis using TSMC 180nm technology, the architecture occupied a chip area of 246.41 mm<sup>2</sup> and consumed 27.86 mW of power, with a gate count of 24,692. After synthesis using TSMC 90nm technology, the architecture occupied a chip area of 75.97 mm<sup>2</sup> and consumed 5.10 mW of power, with a gate count of 20,075.

(Simulation waveform image. Signals shown for the Y component: `clk`, the 12-bit planar prediction row outputs up to row 16, and the 8-bit reference buffer inputs for the corner and left samples.)

Fig. 7. Planar prediction output of Y component



Fig. 8. DC prediction output of Y component



Fig. 9. Planar prediction output of Cb component



(Simulation waveform image. Signals shown: the 8-bit inputs `Cr_ref_buffer_corner`, `Cr_ref_buffer_left_bottom`, `Cr_ref_buffer_top_right`, `Cr_ref_buffer_left_0` to `Cr_ref_buffer_left_7` and `Cr_ref_buffer_top_0` to `Cr_ref_buffer_top_7`, and the 12-bit outputs `Cr_Planar_Prediction_r1_and_r2` to `Cr_Planar_Prediction_r7_and_r8`.)

Fig. 10. Planar prediction output of Cr component




Fig. 11. Post routing in physical design

Fig. 11 shows the design after the routing step of the physical design. After physical design using physical cell 90nm technology, the chip area increased to 112.19 mm<sup>2</sup>, mainly due to the net area, while the power consumption decreased to 3.88 mW compared with the TSMC 90nm synthesis results. Figs. 12, 14 and 15 show the planar prediction output for the Y, Cb and Cr components after the physical layout process. Fig. 13 shows the DC prediction output for the Y component after the physical layout process. The pixel values for planar and DC prediction of the Y, Cb and Cr components after physical layout are identical to the values observed during functional verification at the RTL design stage.




Fig. 12. Post physical layout planar mode output of Y component



Fig. 13. Post physical layout DC mode output of Y component



Fig. 14. Post physical layout planar mode output of Cb component



(Post-layout simulation waveform image. Signals shown: the 8-bit inputs `Cr_ref_buffer_corner`, `Cr_ref_buffer_left_bottom`, `Cr_ref_buffer_top_right`, `Cr_ref_buffer_left_0` to `Cr_ref_buffer_left_7` and `Cr_ref_buffer_top_0` to `Cr_ref_buffer_top_7`, and the 12-bit outputs `Cr_Planar_Prediction_r1_and_r2` to `Cr_Planar_Prediction_r7_and_r8`.)

Fig. 15. Post physical layout planar mode output of Cr component

Table 1. Results comparison of TSMC 180nm, TSMC 90nm and Physical cell 90nm technologies

| Technology                    | TSMC 180nm | TSMC 90nm | Physical Cell 90nm |
|-------------------------------|------------|-----------|--------------------|
| Cell Area (mm<sup>2</sup>)    | 246.41     | 75.97     | 82.36              |
| Net Area (mm<sup>2</sup>)     | 0          | 0         | 29.83              |
| Total Area (mm<sup>2</sup>)   | 246.41     | 75.97     | 112.19             |
| Leakage Power (µW)            | 9.73       | 342.33    | 442.55             |
| Internal Power (mW)           | 23.13      | 3.54      | 1.91               |
| Switching Power (mW)          | 4.71       | 1.21      | 1.52               |
| Total Power (mW)              | 27.86      | 5.10      | 3.88               |
| Gate Count                    | 24692      | 20075     | 29645              |

Table 2. Comparison of results

|                                 | Proposed Work | Proposed Work | Proposed Work      | Ref [4]   | Ref [7]   | Ref [6]    |
|---------------------------------|---------------|---------------|--------------------|-----------|-----------|------------|
| Technology                      | TSMC 180nm    | TSMC 90nm     | Physical Cell 90nm | TSMC 90nm | TSMC 55nm | TSMC 180nm |
| Year                            | 2023          | 2023          | 2023               | 2016      | 2016      | 2014       |
| Chip Area (mm<sup>2</sup>)      | 246.41        | 75.97         | 112.19             | -         | -         | 3011       |
| Gate Count (K gates)            | 24.69         | 20.07         | 29.64              | 1086      | 1572      | -          |
| Power Consumption (mW)          | 27.86         | 5.10          | 3.88               | 273       | 194       | 643        |
| Throughput (Pixels/clock cycle) | 27            | 27            | 27                 | -         | 124       | -          |
| Maximum Block Size              | 16            | 16            | 16                 | 32        | 32        | 16         |

Table 1 compares the results obtained after synthesis using TSMC 180nm and TSMC 90nm technologies with those obtained after physical design using physical cell 90nm technology. Table 2 compares the results of the proposed work with previous works.
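The scaling factors between the three design points follow directly from the total area and total power rows of Table 1. The short Python sketch below recomputes them; the dictionary layout and the helper name `ratio` are chosen only for this illustration.

```python
# Total area (mm^2) and total power (mW) copied from Table 1.
TABLE1 = {
    "TSMC 180nm": {"area": 246.41, "power": 27.86},
    "TSMC 90nm": {"area": 75.97, "power": 5.10},
    "Physical Cell 90nm": {"area": 112.19, "power": 3.88},
}


def ratio(metric, numerator, denominator):
    """Return TABLE1[numerator][metric] divided by TABLE1[denominator][metric]."""
    return TABLE1[numerator][metric] / TABLE1[denominator][metric]


if __name__ == "__main__":
    # Synthesis: 180nm relative to 90nm.
    print("area reduction  :", round(ratio("area", "TSMC 180nm", "TSMC 90nm"), 2))
    print("power reduction :", round(ratio("power", "TSMC 180nm", "TSMC 90nm"), 2))
    # Physical design relative to 90nm synthesis.
    print("area growth     :", round(ratio("area", "Physical Cell 90nm", "TSMC 90nm"), 2))
    drop = 1 - ratio("power", "Physical Cell 90nm", "TSMC 90nm")
    print("power drop (%)  :", round(100 * drop, 1))
```

Evaluated on the Table 1 totals, this gives an area reduction of about 3.24 times and a power reduction of about 5.46 times from 180nm to 90nm synthesis, followed by an area growth of about 1.48 times and a power drop of about 23.9% after physical design.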

### V. CONCLUSION

In this project, a high-performance architecture for intra prediction in HEVC was designed and synthesized using Cadence Genus Synthesis Solution. The architecture supports both DC and planar modes and operates at a throughput of 27 pixels per clock cycle.




It takes the Y luma and the Cb and Cr chroma reference buffers as input and can predict all 256 pixel values of a 16x16 Y block in 16 clock cycles. The synthesis results indicate that moving from TSMC 180nm to TSMC 90nm technology reduces the chip area by a factor of about 3.2 and the power consumption by a factor of about 5.5. After the physical design phase, the chip area increases by a factor of about 1.5, primarily due to the net area introduced during routing, while the power consumption decreases by about 24%. Ultimately, the physical design of the architecture results in a chip area of 112.19 mm<sup>2</sup> and a power consumption of 3.88 mW. These values reflect the trade-off between area and power achieved during the design process.

### REFERENCES

- [1] Lakshmi and P. Aparna, "Efficient Architectures for Planar and DC modes of Intra Prediction in HEVC", IEEE 7th Int. Conference on Signal Processing and Integrated Networks, Noida, India, 2020.
- [2] A. B. Atitallah and M. Kammoun, "High-level design of HEVC intra prediction algorithm", 5th International Conference on Advanced Technologies for Signal and Image Processing, Sousse, Tunisia, 2020.
- [3] S. Shwetha, Lakshmi and P. Aparna, "Complexity Analysis of Hardware Architectures for Intra Prediction unit of High Efficiency Video Coding (HEVC)", IEEE International Conference on Electronics, Computing and Communication Technologies, Bangalore, India, 2020.
- [4] G. Pastuszak and A. Abramowski, "Algorithm and architecture design of the H.265/HEVC intra encoder," IEEE Transactions on Circuits and Systems for Video Technology, vol. 26, no. 1, pp. 210–222, Jan 2016.
- [5] B. Min, Z. Xu and R. C. C. Cheung, "A Fully Pipelined Hardware Architecture for Intra Prediction of HEVC," IEEE Transactions on Circuits and Systems for Video Technology, vol. 27, no. 12, 2017.
- [6] P. T. Vanishree and A. M. Vijaya Prakash, "VLSI Implementation of Discrete Cosine Transform and Intra prediction", IEEE International Conference on Advances in Electronics, Computers and Communications, Bangalore, India, 2014.
- [7] X. Huang, H. Jia, B. Cai, C. Zhu, J. Liu, M. Yang, D. Xie and W. Gao, "Fast algorithms and VLSI architecture design for HEVC intra-mode decision", Springer, Journal of Real-Time Image Processing, 2016.
- [8] V. Sze, M. Budagavi, and G. J. Sullivan, High efficiency video coding (HEVC). Berlin, Germany: Springer-Verlag, Jul. 2014.
- [9] G. J. Sullivan, J. R. Ohm, W. J. Han, and T. Wiegand, "Overview of the High Efficiency Video Coding (HEVC) Standard," IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, no. 12, 2012.
- [10] D. Palomino, F. Sampaio, L. Agostini, S. Bampi, and A. Susin, "A memory aware and multiplierless VLSI architecture for the complete Intra Prediction of the HEVC emerging standard", IEEE Int. Conf. on Image Processing, pp, 2012.
- [11] M. Abeydeera, M. Karunaratne, G. Karunaratne, "4K Real-Time HEVC Decoder on an FPGA", IEEE Transactions on Circuits and Systems for Video Technology, Volume: 26, Issue: 1, 2016.
- [12] D. Patel, T. Lad and D. Shah, "Review on Intra-prediction in High Efficiency Video Coding (HEVC) Standard", International Journal of Computer Applications, Volume 132 No.13, 2015.
- [13] T. Nguyen, D. Marpe, "Performance Analysis of HEVC-Based Intra Coding for Still Image Compression", 2012 Picture Coding Symposium, Krakow, Poland, 2012
- [14] M. Perleberg, V. Borges, V. Afonso, "6WR: A Hardware Friendly 3D-HEVC DMM-1 Algorithm and Its Energy-Aware and High-Throughput Design", IEEE Trans. on Circuits and Systems II, 2020.
- [15] K. Singh, S. R. Ahamed, "Scalable VLSI Architecture for Hadamard Transforms of HEVC/H.265 Video Coding Standard", 2020 24th International Symposium on VLSI Design and Test, Bhubaneswar, India, 2020.








