## A GENERALIZED PARETO FRONT APPROACH FOR PERFORMANCE ESTIMATION IN ANALOG DESIGN AUTOMATION SYSTEMS

by

Engin Deniz

B.S. in Electronics and Telecom. Eng., Yıldız Technical University, 2001

M.Sc. in Electrical and Electronics Eng., Boğaziçi University, 2004

Submitted to the Institute for Graduate Studies in Science and Engineering in partial fulfillment of the requirements for the degree of Doctor of Philosophy

Graduate Program in Electrical and Electronics Engineering

Boğaziçi University

2010

### ACKNOWLEDGEMENTS

I would like to express my sincere gratitude to my thesis supervisor, Prof. Dr. Günhan Dündar, for his patience, guidance and support during my education and the development of my thesis.

I would like to thank Prof. Dr. Cem Göknar for supporting me during my thesis development and for serving as a jury member. I have learned a lot about academic life from his experience.

I would also like to thank Prof. Dr. Helmut Graeb for giving me valuable feedback on my thesis.

I am grateful to all jury members for evaluating the thesis.

Finally, I would especially like to thank my family and my dearest Sinem for their continuous support and love.

### ABSTRACT

# A GENERALIZED PARETO FRONT APPROACH FOR PERFORMANCE ESTIMATION IN ANALOG DESIGN AUTOMATION SYSTEMS

The design of analog circuits requires deep insight into the interactions of both physical and technological parameters. In addition, every application of analog circuits has specific constraints and requirements with a large number of design equations, and the design has no unique solution. At the same time, analog circuits are key components of mixed-signal systems. Nowadays, not only design issues but also the tradeoff analysis of competing performances is considered a significant issue in circuit modeling, so analog design automation tools, which increase efficiency and productivity, have become an attractive solution for integrated circuit (IC) providers. Furthermore, a performance estimation tool becomes a requirement in order to speed up the automation system by eliminating infeasible circuits and circuits that cannot meet the specifications.

In this thesis, a general methodology for the performance estimation of mixed-signal systems is proposed that exploits the Pareto Front concept. Performance estimation requires a well-determined performance design space (PDS) exploration for a given technology. Since the complexity of mixed-signal systems grows progressively, performance estimation of the system blocks requires the exploration of a huge design space, and the exploration time increases dramatically. Therefore, a Matlab-based library is presented for fast and accurate PDS exploration. Then, the Pareto Front approach is applied to the system blocks. In addition, not only are optimum solution sets extracted, but an approximate design of the blocks is also obtained. Finally, Pareto Front composition is discussed, with the dominance rule supported by algebraic representations.

## ÖZET

# A GENERALIZED PARETO FRONT APPROACH FOR PERFORMANCE ESTIMATION IN ANALOG DESIGN AUTOMATION SYSTEMS

Analog circuit design requires a deep understanding of the interactions between both physical and technological parameters. Moreover, the applications of analog circuits have their own specific requirements and constraints, along with a large number of design equations, and the design has no unique solution. At the same time, analog circuits are the key components of mixed-signal systems. Today, in addition to design issues, the relationships and interactions among performance parameters have become an important topic in circuit modeling, so much so that integrated circuit manufacturers regard analog design automation tools as a solution for increasing productivity. In addition, to speed up automation tools, performance estimation tools have come into use; they prevent time from being wasted on circuits that are infeasible or whose performance is insufficient. In this study, a general method is proposed for the performance estimation of mixed-signal systems. This method exploits the definition of the Pareto Front without using optimization algorithms or Spice-like simulators. Performance estimation requires a well-determined performance design-space exploration for a given technology. As the complexity of mixed-signal systems grows, the time needed for large-scale design-space exploration of these systems also increases rapidly. For this reason, a Matlab-based library has been prepared for faster and more accurate design-space exploration. The Pareto Front approach is then applied to each block of the system. In addition to extracting the optimum solution sets (Pareto Fronts), an approximate design of each block of the system is also obtained. Finally, the composition of the obtained Pareto Fronts is discussed, supported by algebraic expressions.

## TABLE OF CONTENTS

| Section | Title |
|---------|-------|
| | ACKNOWLEDGEMENTS |
| | ABSTRACT |
| | ÖZET |
| | LIST OF FIGURES |
| | LIST OF TABLES |
| | LIST OF SYMBOLS/ABBREVIATIONS |
| 1. | INTRODUCTION |
| 1.1. | Purpose of the Thesis |
| 1.2. | Background of the Problem |
| 1.2.1. | Performance Estimation Model |
| 1.2.2. | Pareto Front Approach in PE |
| 1.3. | Contributions of the Thesis |
| 1.4. | Overview of the Thesis |
| 2. | THE PROPOSED APPROACH |
| 3. | DESIGN PLANS OF ANALOG BLOCKS |
| 3.1. | Topology Divider Block |
| 3.2. | The Library: Generation of EKV-Based Analog Designs |
| 3.2.1. | EKV MOSFET Model |
| 3.2.2. | $g_{ds}$ Interpolation |
| 3.3. | Design Examples |
| 3.3.1. | Differential Input Stage |
| 3.3.2. | Miller op-amp |
| 3.3.3. | Telescopic (Cascode) op-amp |
| 3.3.4. | Folded (Cascode) op-amp |
| 3.3.5. | Latched Comparator |
| 3.3.6. | An interface for the Library |
| 4. | PARETO FRONT EXTRACTION PART |
| 4.1. | Pareto Front Extraction for 2-D |
| 4.2. | Pareto Front Extraction for 3-D |
| 4.2.1. | Calculation of Inner points |
| 5. | PARETO FRONT COMPOSITION PART |
| 6. | APPLICATIONS OF THE PROPOSED APPROACH |
| 6.1. | A Sigma-Delta ADC design automation tool with embedded PE |
| 6.2. | Designing F-ADC system with proposed PE |
| 6.3. | Pipelined ADC automation system with proposed PE |
| 7. | CONCLUSION |
| | REFERENCES |

## LIST OF FIGURES

| Figure | Caption |
|--------|---------|
| Figure 1.1. | Flowchart of analog design automation system |
| Figure 1.2. | Simplified flowcharts of WATSON and PRISM |
| Figure 1.3. | Simplified flowcharts of $[7]$ and $[8]$ |
| Figure 1.4. | Sections of proposed PE tool |
| Figure 1.5. | Demonstration of PF extraction |
| Figure 1.6. | Demonstration of PF composition |
| Figure 2.1. | Flowchart of the proposed PE |
| Figure 3.1. | Flowchart of topology divider algorithm |
| Figure 3.2. | Analog design loop |
| Figure 3.3. | $g_{ds}$ estimation procedure |
| Figure 3.4. | Schematic of differential input stage |
| Figure 3.5. | Schematic of Miller op-amp |
| Figure 3.6. | Flowchart of Miller op-amp design plan |
| Figure 3.7. | Schematic of Cascode op-amp |
| Figure 3.8. | Flowchart of the Telescopic op-amp |
| Figure 3.9. | Schematic of Folded Cascode op-amp |
| Figure 3.10. | Flowchart of Folded cascode op-amp |
| Figure 3.11. | Schematic of latched comparator |
| Figure 3.12. | Flowchart of the comparator |
| Figure 3.13. | Graphical user interface for design space exploration |
| Figure 4.1. | Pareto Front & Design Space for N=2 |
| Figure 4.2. | PFs for performance goals: (a) maximize X, minimize Y; (b) minimize X, maximize Y; (c) maximize X, maximize Y |
| Figure 4.3. | Geometrical constraints for PF extraction |
| Figure 4.4. | PF must be located in the shaded region where set-T is defined |
| Figure 4.5. | Normalized PDS of area-power pair for Miller op-amp |
| Figure 4.6. | set-T and Pareto points |
| Figure 4.7. | PF of Miller op-amp for area-power pair |
| Figure 4.8. | Normalized PDS of gain-power pair for Miller op-amp |
| Figure 4.9. | PF of Miller op-amp for gain-power pair |
| Figure 4.10. | PF of the comparator: (a) input resolution-delay, (b) latch frequency-offset voltage |
| Figure 4.11. | Individual minima and trade-off boundaries for 3-dimensional PF |
| Figure 4.12. | Boundary of PF and the inner points for 3-dimensional space |
| Figure 4.13. | Illustration of inner Pareto points in 2-D space |
| Figure 4.14. | Projection of boundary points on X-Y plane |
| Figure 4.15. | PDS of Miller op-amp for gain, power and area |
| Figure 4.16. | Discrete PFs of performance pairs: (a) PF of Gain-Power, (b) PF of Gain-Area, (c) PF of Power-Area |
| Figure 4.17. | Projection of boundary points and data set on X-Z plane |
| Figure 4.18. | Boundaries of 3D PF and inner points |
| Figure 4.19. | PF of comparator for offset voltage-area-power |
| Figure 5.1. | Bottom-up composition of PFs |
| Figure 5.2. | PF composition example for area-power pair |
| Figure 5.3. | Schematic of Miller op-amp and its subblock |
| Figure 5.4. | Net-list of simple output stage |
| Figure 5.5. | $1^{st}$ order LPF, Miller op-amp is used |
| Figure 5.6. | Hierarchical top-down decomposition bottom-up composition |
| Figure 5.7. | PF composition process of LPF: (a) PF(Miller) for gain-power, (b) PF(BC-part) for gain-power, (c) PF(LPF-composition) for gain-power |
| Figure 5.8. | (a) Net-list file of LPF, (b) Output file of LPF |
| Figure 5.9. | Equivalent circuit of LPF |
| Figure 6.1. | Communication diagram of blocks |
| Figure 6.2. | Flowchart of the algorithm |
| Figure 6.3. | SNDR versus power |
| Figure 6.4. | PDS of op-amps for gain-power pair |
| Figure 6.5. | PF of op-amps for gain-power pair |
| Figure 6.6. | Block diagram of F-ADC |
| Figure 6.7. | Flash ADC system blocks |
| Figure 6.8. | Elmore delay assumption |
| Figure 6.9. | PF of the subblocks: (a) Comparator, (b) R-string |
| Figure 6.10. | Searching PDS of the resistor string |
| Figure 6.11. | Modeling diagram of F-ADC |
| Figure 6.12. | PF of F-ADC for BW vs. Power |
| Figure 6.13. | (a) PF of comparator for Area-Power, (b) PF of R-string for Area-Power, (c) PF of F-ADC for Area-Power |
| Figure 6.14. | General architecture of a NoS-stage P-ADC |
| Figure 6.15. | Operations of MDAC |
| Figure 6.16. | Block diagram of P-ADC as ML-PE application |
| Figure 6.17. | Flowchart of P-ADC design automation |
| Figure 6.18. | PFs of different configurations for 10-bits |
| Figure 6.19. | PFs of different configurations for 14-bits |
| Figure 6.20. | PF of the proposed approach and PF of the reference model including scaled solutions |
| Figure 6.21. | A comparison to different models |
| Figure 6.22. | A sample of transient analysis for 8-bit P-ADC: (a) Output of MDAC, residue amplification, (b) Output of F-ADC, digital output |
| Figure 6.23. | Expected values of each stage for 8-bit conversion |
| Figure 6.24. | Output of the F-ADCs: (a) $1^{st}$ stage: 0111111, (b) $2^{nd}$ stage: 0000111, (c) $3^{rd}$ stage: 001 |

## LIST OF TABLES

| Table | Caption |
|-------|---------|
| Table 1.1. | Comparison of PE approaches |
| Table 3.1. | Input set and variable design parameters |
| Table 3.2. | Input set and variable design parameters of the Miller op-amp |
| Table 3.3. | Input space of the op-amp |
| Table 3.4. | Estimated circuit parameters for H-Spice |
| Table 3.5. | Comparison of estimation and simulation results |
| Table 3.6. | Input set and solution of the reference design for the high speed op-amp |
| Table 3.7. | Selected solution of the high speed op-amp and its verification |
| Table 3.8. | Input set and solution of the reference design for low speed op-amp |
| Table 3.9. | Selected solution of the high speed op-amp and its verification |
| Table 3.10. | Input space and estimated circuit parameters for H-Spice |
| Table 3.11. | Comparison of estimation and simulation results of Cascode op-amp |
| Table 3.12. | Circuit parameters of the selected candidates and their comparison results for the latched-comparator |
| Table 4.1. | Dominance rule and geometrical elimination for 2-dimensional space |
| Table 4.2. | Input set of case study-1 for PF extraction |
| Table 4.3. | Design parameters of IM and comparison of estimation vs. simulation results |
| Table 4.4. | Input set of case study-1 for PF extraction for gain-power |
| Table 4.5. | Comparison in execution time for both algorithms |
| Table 4.6. | Individual Minima- Limits of the design |
| Table 4.7. | Number of boundary points |
| Table 4.8. | Verification of the approach |
| Table 5.1. | Sample of a Look-up table for Miller op-amp |
| Table 5.2. | A sample of numerical examples from $1^{st}$ and $2^{nd}$ approach |
| Table 5.3. | A sample of look-up table for differential pair stage |
| Table 5.4. | Verification of the results collected from subblocks and Miller op-amp |
| Table 5.5. | AC performance comparison of LPF |
| Table 6.1. | Input parameters of the Sigma-Delta automation tool |
| Table 6.2. | Input parameters of the op-amp topologies |
| Table 6.3. | Input parameters for F-ADC system |
| Table 6.4. | Other performance parameters for PF of BW vs. Power pair |
| Table 6.5. | Main error sources |
| Table 6.6. | Input parameters for the blocks of P-ADC |
| Table 6.7. | Different configurations for 10-bits and their power-area values |
| Table 6.8. | Different reference models and their solutions |
| Table 6.9. | A sample of DC parameters of the op-amp and the comparator |

## LIST OF SYMBOLS/ABBREVIATIONS

| $A_{comp}$           | Area of Comparator                                                                                                                                                                                                                                                                  |
|----------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| $A_{res}$            | Area of Resistor String                                                                                                                                                                                                                                                             |
| $A_V$                | Open loop gain                                                                                                                                                                                                                                                                      |
| BW                   | Bandwidth                                                                                                                                                                                                                                                                           |
| C                    | Coulomb                                                                                                                                                                                                                                                                             |
| $C_C$                | Compensation Capacitance                                                                                                                                                                                                                                                            |
| $C_f$                | Feedback Capacitance                                                                                                                                                                                                                                                                |
| $C_i$                | Input Capacitance                                                                                                                                                                                                                                                                   |
| $C_L$                | Output Capacitance                                                                                                                                                                                                                                                                  |
| $C_{MDAC}$           | MDAC-input Capacitance                                                                                                                                                                                                                                                              |
| $C_{OX}$             | Oxide Capacitance                                                                                                                                                                                                                                                                   |
| $C_P$                | Parasitic Capacitance                                                                                                                                                                                                                                                               |
| $C_{Unit}$           | Unit Capacitance |
| D                    | Diffusion Coefficient                                                                                                                                                                                                                                                               |
| $E_i$                | AC Gain of the $i^{th}$ stage                                                                                                                                                                                                                                                       |
| $e_m$                | Mismatch Error                                                                                                                                                                                                                                                                      |
| $e_t$                | Thermal Error                                                                                                                                                                                                                                                                       |
| $ec{E}_x$            | Electric Field                                                                                                                                                                                                                                                                      |
| F                    | Farad                                                                                                                                                                                                                                                                               |
| $f_{3dB}$            | Cutoff Frequency |
| $f_i$                | Nonlinear functions                                                                                                                                                                                                                                                                 |
| $f_{nd}$             | Frequency of Non-dominant Pole |
| $g_{ds}, \; g_{out}$ | Output Transconductance                                                                                                                                                                                                                                                             |
| $G_i$                | Closed Loop Gain of the $i^{th}$ stage                                                                                                                                                                                                                                              |
| $g_m$                | Input transconductance                                                                                                                                                                                                                                                              |
| $I_D$                | Drain Current                                                                                                                                                                                                                                                                       |
| $I_F$                | Forward Current                                                                                                                                                                                                                                                                     |
| $I_i$                | Circuit Parameters                                                                                                                                                                                                                                                                  |
| $I_R$                | Reverse Current |

| $I_S$        | Specific Current                       |
|--------------|----------------------------------------|
| $J_x$        | Current Density                        |
| K            | Boltzmann's Constant                   |
| $K_{Ci}$     | Parasitic Capacitances of the Switches |
| $KP_{n,p}$   | Technology Parameter                   |
| L            | Length of a Transistor                 |
| $m_n^x$      | Effective Mass of Electrons            |
| N            | Bit number                             |
| $N_{S/H}^2$  | Quantization Noise of Sample and Hold  |
| $N_{MDAC}^2$ | Quantization Noise of Multiplying DAC  |
| $N_A$        | Acceptor                               |
| $N_D$        | Donor                                  |
| $p^*$        | Individual Minimum                     |
| $P_{comp}$   | Power of Comparator                    |
| $P_{MDAC}$   | Power of Multiplying DAC               |
| poly         | polynomial                             |
| $P_{res}$    | Power of Resistor String               |
| $R_C$        | Compensation Resistor                  |
| $R_{ON}$     | Switch on Resistance                   |
| $R_S$        | Sheet Resistance                       |
| T            | Temperature                            |
| $V_A$        | Early Voltage                          |
| $V_{CH}$     | Channel Voltage                        |
| $V_{D,S}$    | Drain, Source Voltage                  |
| $V_{gs}$     | Gate-Source Voltage                    |
| $V_{in}$     | Input Voltage                          |
| $V_o$        | Output Voltage                         |
| $V_{OL}$     | Maximum Output Low Voltage             |
| $V_{OH}$     | Minimum Output High Voltage            |
| $V_{os}$     | Offset Voltage                         |
| $V_P$        | Pinch-off Voltage                      |

| $V_{RA}$   | Residue Amplification Voltage                    |
|------------|--------------------------------------------------|
| $V_{ref}$  | Reference Voltage                                |
| $V_t$      | Threshold Voltage                                |
| $v_x$      | Net Velocity of the Electrons                    |
| W          | Width of a Transistor                            |
| $x_d$      | Design Parameters                                |
|            |                                                  |
| $\epsilon$ | Dielectric Constant                              |
| $\gamma$   | GAMMA, Technology Parameter                      |
| $\mu_n$    | Electron Mobility                                |
| Ω          | Unit of Resistance                               |
| $\phi$     | PHI, Technology Parameter                        |
| ρ          | Volumetric Charge Density                        |
| θ          | 1 for single-ended op-amp                        |
|            |                                                  |
| ADA        | Analog Design Automation                         |
| AMS        | Austria Micro Systems                            |
| ANN        | Artificial Neural Network                        |
| ASIC       | Application Specific Integrated Circuits         |
| BSIM       | Berkeley Short-channel IGFET Model               |
| BTS        | Basic Two Stage                                  |
| CAD        | Computer Aided Design                            |
| CLG        | Closed Loop Gain                                 |
| CMOS       | Complementary Metal-Oxide-Semiconductor          |
| CPPLL      | Charge-Pump Phase-Locked Loop                    |
| CPU        | Central Processing Unit                          |
| DAC        | Digital to Analog Converter                      |
| EKV        | C.C. Enz, F. Krummenacher and E.A. Vittoz        |
| ENOB       | Effective Number of Bits                         |
| EPFL       | École Polytechnique Fédérale de Lausanne         |
| GA         | Genetic Algorithm                                |

| GBW    | Gain Bandwidth Product                            |
|--------|---------------------------------------------------|
| GUI    | Graphical User Interface                          |
| IC     | Inversion Coefficient                             |
| IM     | Individual Minima                                 |
| LPF    | Low Pass Filter                                   |
| LS-SVM | Least-Squares Support Vector Machines             |
| MDAC   | Multiplying Digital to Analog Converter           |
| ML-PE  | Multi-Level Performance Estimation                |
| MOSFET | Metal-Oxide-Semiconductor Field-Effect Transistor |
| NBI    | Normal Boundary Intersection                      |
| NoS    | Number of Stages                                  |
| OSR    | Over-Sampling Ratio                               |
| OTA    | Operational Transconductance Amplifier            |
| P-ADC  | Pipelined Analog to Digital Converter             |
| PDS    | Performance Design Space                          |
| PE     | Performance Estimation                            |
| PF     | Pareto Front                                      |
| RAM    | Random Access Memory                              |
| SL-PE  | Single-Level Performance Estimation               |
| SNDR   | Signal to Noise and Distortion Ratio              |
| SNR    | Signal to Noise Ratio                             |
| SQP    | Sequential Quadratic Programming                  |
| SR     | Slew Rate                                         |
| sub    | subblock                                          |
| SVM    | Support Vector Machines                           |
| TCP    | Transmission Control Protocol                     |
| TTN    | Total Thermal Noise                               |
| UMS    | United Monolithic Semiconductors                  |
| WMIN   | Minimum Width                                     |

### 1. INTRODUCTION

The signals in the real world around us are analog signals and the connection between the real world and the electronic system is realized by analog circuits while sensing, amplifying or processing these signals in a continuous domain. In other words, most electronic systems must interface with the real world via analog circuits. At the same time, analog circuits are the key components of mixed-signal systems. Besides, the total design time is generally dominated by analog circuits in these systems because design of analog circuits requires a deep insight into both physical and technological parameter interactions. Moreover, every application of analog circuits has specific constraints and requirements with a large number of design equations, and also there is no unique solution of the design. Furthermore, analog circuits are sensitive to supplies, noise, loads, temperature etc. Consequently, analog design has traditionally been considered as a difficult discipline.

In the past decades, CMOS (Complementary Metal Oxide Semiconductor) technology was the dominant semiconductor technology for analog systems because it offers low-cost, high-performance solutions. In addition, low-voltage, low-current circuits (both analog and digital) and application specific integrated circuits (ASICs) have risen to dominate the market. However, design in the analog domain requires more creativity in evaluating the large number of free parameters and the interactions between them. For this reason, the design of analog circuits is much more complicated than that of digital systems, and analog designs have been carried out by experienced designers. Also, many performance parameters such as gain, speed, power dissipation, bias voltages etc. have to be considered at the same time. Hence, designing an analog circuit requires many trade-offs between specifications.

In the past, analog design lacked Computer Aided Design (CAD) tools, which improve the efficiency and productivity of the design. Nowadays, however, Analog Design Automation (ADA) has become an important research field that aims to overcome the design problems of mixed-signal systems, such as complexity, the need for faster and more accurate modeling, topology selection etc.

An ADA system, which is illustrated in Figure 1.1, generally has three main levels, namely system level synthesis, circuit level synthesis and layout level synthesis, and five intermediate blocks: the performance estimator (PE), library, simulator, layout advisor and circuit extractor. At the top, there is a system level synthesis tool, which takes behavioral specifications as its inputs and provides a block diagram solution while interacting with the library block. The circuit level synthesis tool is at the second level. It takes the detailed specifications and the block diagram from the system level tool and prepares a fully sized device level circuit schematic in the selected technology. A simulator block is used for calculating DC solutions, solving user-defined equations or finding AC behavior. The circuit level tool is also responsible for fabrication considerations (such as mismatch and tolerances). At the bottom, layout level synthesis translates the electrical schematic of the circuit into a geometrical representation, the layout. The circuit extractor block is crucial for detailed circuit simulation that accounts for layout parasitic effects. Before the layout is generated, the layout advisor block speeds up the verification process by providing the necessary information on the different parasitic effects of the circuit.

Figure 1.1. Flowchart of analog design automation system.

One of the significant blocks of an ADA system is the PE tool, which is generally based on optimization techniques supported by Spice-like circuit simulators and behavioral modeling, and on the interactions between system level synthesis and circuit level synthesis [1]. At the circuit level, sizing of analog cells requires consideration of all variations of the parameters for each transistor. Assume that the selected technology provides a lower bound of $0.5 \ \mu m$ and an upper bound of 1 mm for a transistor dimension, and that the sizes are swept in $1 \ \mu m$ steps; then 1000 data points are needed for a single transistor. As a result, $1000^{10}$ data points must be calculated for a simple analog block of 10 transistors. Without a PE, numerous possibilities, most of which do not meet the performance specifications, are evaluated at the circuit level, and too much time is wasted before a desired result is obtained. Therefore, a block that makes fast approximations for the circuit level and tolerates errors within certain bounds has to be constructed, so that the circuit level does not have to evaluate infeasible candidates. Consequently, the use of a PE speeds up the overall ADA system. The PE gives hints for optimization or sizing to the circuit level. Thus, the PE avoids unwanted sections of the design space and exposes the tradeoff between competing performance parameters.
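The size of the sweep quoted above can be checked with a few lines of Python. This is only an illustrative sketch: the helper name `sweep_points` and the assumption of one swept dimension per transistor are introduced here to mirror the numbers in the text.

```python
import math

def sweep_points(lower_um: float, upper_um: float, step_um: float) -> int:
    """Number of grid points when sweeping from lower to upper in fixed steps."""
    return math.floor((upper_um - lower_um) / step_um) + 1

# Bounds taken from the text: 0.5 um to 1 mm (1000 um), swept in 1 um steps.
points_per_transistor = sweep_points(0.5, 1000.0, 1.0)

# Assumption: one swept dimension per transistor, 10 transistors in the block.
num_transistors = 10
total_points = points_per_transistor ** num_transistors

print(points_per_transistor)      # grid points for one transistor
print(f"{total_points:.0e}")      # size of the full brute-force sweep
```

Even at an optimistic rate of many evaluations per second, a sweep of this size is out of reach, which is the motivation for pruning the design space before circuit level synthesis.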

#### 1.1. Purpose of the Thesis

The main objectives of this thesis are summarized as follows.

PE techniques are widely used by designers to speed up the automation process, and they have to satisfy two significant requirements: reasonable accuracy and low execution time. The PE problem can be defined as follows: a PE must be able to handle any circuit topology, accepting some performance parameters $p_n$ (gain, bandwidth, slew rate etc.) as variables and estimating the remaining performance parameters. The problem is to find an expression, or at least an estimate, for the parameters $p_n$ in terms of each other without having to calculate the circuit parameters $i_m$ (width and length of the transistors, currents, bias voltages etc.). There are several modeling approaches for analog blocks, such as behavioral, simulation-based, knowledge-based, and manually-generated modeling (design planning). Each approach has its own advantages and disadvantages. As mentioned before, the main tradeoff is between the accuracy of the model and its execution time. Table 1.1 summarizes the comparison of PE approaches.

| PE Approach | Advantage | Disadvantage |
|---|---|---|
| Table-Based Approach | Simple algorithm, useful for small blocks | Requires large memory, huge tables |
| Auto-Generated Modeling | Intelligent design, decreased modeling time | May not be valid in all regions of operation |
| Manually-Generated Modeling | Better model accuracy, less execution time | Topology specific |
| Simulation-Based Template Fitting or Knowledge-Based | The best model accuracy, data fitting, optimization, large information in literature | Execution time, local min/max problems |

Table 1.1. Comparison of PE approaches

Although forming a look-up table (brute force) is the simplest solution, the size of the table can be a problem. Nevertheless, such a simple algorithm is adequate for simple circuits such as output stages, R-C circuitry etc. Using the analytical equations of MOSFET models can be a good compromise between execution time and accuracy. If the equations are specific to a topology, analog designers have to derive new equations manually for each new analog block, which becomes impractical for complex blocks; however, this approach consumes less execution time and has acceptable accuracy compared to the other approaches. On the other hand, auto-generated modeling with analytical equations can be applied to any topology. Execution time can be reduced by applying intelligent design methods; nevertheless, the accuracy cannot be as good as that of the topology-specific models because the auto-generated equations may not be valid in all regions of operation (from weak to strong inversion). The best model accuracy can be achieved by simulation-based or knowledge-based approaches since they are backed by powerful Spice-like simulators and/or optimizers. In addition, a designer can find extensive information about different optimization and data fitting techniques for PE, but these approaches suffer from long execution times.

In this thesis, the manually-generated PE approach is selected in order to exploit its short execution time. Its accuracy is also adequate within certain error bounds. The disadvantage of manually-generated models (also called design planning) is, as noted above, that they are specific to the selected topologies and become impractical for complex blocks. Therefore, hierarchical decomposition of a complex system is applied so that the advantage of manual design of simple subblocks can still be exploited by determining the performance of each subblock. Moreover, the analog designer remains the key element and keeps control of the design process.

The PE tool helps circuit level synthesis by giving hints about the design plan of the circuit and by eliminating infeasible solutions. Since this elimination operates on the performance design space (PDS) of the analog blocks, an approach is required for obtaining the optimum solution set from that PDS. In the literature, PE tools widely use optimization techniques to obtain optimum solution sets. In the past, single-objective optimization algorithms were used; however, they may get stuck at local minima/maxima. Therefore, multi-objective optimization is preferred, and the Pareto Front (PF) approach is the most popular one. The concept of Pareto optimality evaluates the optimization of one performance feature in relation to the other performance features, i.e. a solution is called Pareto optimal if one performance feature can only be improved at the expense of deteriorating another performance feature [2]. In our study, there is no optimizer or simulator; only the definition of the PF is used, and an elimination method called the dominance rule is applied to the PDS. It will be shown that execution time is reduced compared to other studies. This approach can be applied to all performance parameters such as bandwidth, delay, input resolution, gain etc. In other words, it is not limited to the area-power tradeoff as proposed in some publications [7, 53, 54].
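The dominance rule mentioned above can be illustrated with a minimal Python sketch. It assumes that every performance parameter is to be minimized (a parameter to be maximized, such as gain, can be negated first); the function names and the toy data are hypothetical and are not taken from the thesis implementation.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]

def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every objective and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points: Sequence[Point]) -> List[Point]:
    """Keep only the points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Toy PDS with made-up values: (power in mW, delay in ns), both minimized.
pds = [(1.0, 9.0), (2.0, 5.0), (3.0, 6.0), (4.0, 2.0), (5.0, 2.5)]
front = pareto_front(pds)
print(front)  # [(1.0, 9.0), (2.0, 5.0), (4.0, 2.0)]
```

Here (3.0, 6.0) is eliminated because (2.0, 5.0) is better in both parameters, and (5.0, 2.5) is eliminated by (4.0, 2.0). The pairwise comparison is quadratic in the number of points, which is acceptable for a sketch but would need a smarter sweep for large sets.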

The PE approach together with the PF can also be used as a standalone topology selector. In the literature, there are several applications for that purpose. In this thesis, the approach can advise which topology should be preferred in an automation process. PE modeling and PF approaches use different algorithms and simulators, and most of them are only practical for small blocks. However, today's circuitry requires the PDS of complex mixed-signal circuits. Therefore, researchers have decided to cope with the complexity problem by hierarchical decomposition of large systems, as mentioned before. This decomposition can be done either manually or automatically. Topology-specific design automation systems whose subblocks are already extracted, such as the sigma-delta ADC, the P-ADC, the charge-pump phase-locked loop (CPPLL) etc., dominate the literature. For example, a sigma-delta ADC has three main subblocks: integrator, comparator and digital to analog converter (DAC). In this thesis, the PE tool can be applied to specific systems whose subblocks are already known or whose subblocks are determined by an algorithm called *topology divider*.

The designer needs a fast way to gain insight into the capability of the whole system after the decomposition process. Once the PFs of the subblocks are obtained, a composition algorithm must be applied to the subblocks in order to obtain the PF of the main block. In the literature, composition is again based on an optimization procedure with powerful simulators, which is very time consuming. Determining the PE from the full design space sets is a costly process. Instead, the Pareto points can be used for that purpose. This is the advantage of PFs, especially as systems become more and more complex. The approach suggested here, for the first time, is simple and fast: any operation which preserves the *domination rule* can be used in connecting the performance parameters.
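As a rough illustration of composing subblock fronts, the sketch below combines the Pareto points of two subblocks and re-applies the dominance rule. It assumes, purely for the example, that both parameters (power and area) are minimized and that the system value is the component-wise sum of the subblock values; the names, the data and the choice of addition as the connecting operation are hypothetical and not the thesis's own composition algorithm.

```python
from itertools import product

def dominates(a, b):
    """True if a is no worse than b everywhere and strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Keep only the non-dominated points (all objectives minimized)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

def compose(front_a, front_b):
    """Add every pair of subblock Pareto points, then filter by dominance."""
    combined = {tuple(x + y for x, y in zip(a, b))
                for a, b in product(front_a, front_b)}
    return sorted(pareto_front(list(combined)))

# Made-up subblock fronts: (power in mW, area in mm^2).
front_a = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0)]
front_b = [(1.0, 3.0), (3.0, 1.0)]
system_front = compose(front_a, front_b)
print(system_front)  # [(2.0, 7.0), (3.0, 5.0), (5.0, 3.0), (7.0, 2.0)]
```

Because addition is monotonically increasing in each argument, a dominated subblock point can only yield combinations that are no better than those of the point dominating it, so working with 3 x 2 Pareto points instead of the full design space sets loses nothing in this example.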

#### 1.2. Background of the Problem

The literature survey is divided into two parts: modeling approaches of analog blocks and the PF approach in PE tools. First, the definition of PE is given and the modeling approaches are reviewed. After that, the approaches that use the PF concept are investigated.

#### 1.2.1. Performance Estimation Model

Let us assume  $P_i$  is a performance parameter of any analog circuit such as gain, bandwidth, slew rate, output resistance etc. and  $I_i$  is an independent circuit parameter influencing the performance such as current, width, length of MOSFET etc. The performance of the circuit can be defined by the following expressions:

$$
\begin{aligned}
P_{1} &= f_{1} (I_{1}, I_{2}, \dots, I_{k}) \\
P_{2} &= f_{2} (I_{1}, I_{2}, \dots, I_{m}) \\
&\;\;\vdots \\
P_{n} &= f_{n} (I_{1}, I_{2}, \dots, I_{t})
\end{aligned}
$$

The problem is to find an expression, or at least an estimate, for each $P_i$ in terms of the other performance parameters without having to calculate the $I_j$. In general, the functions $f_i$ are non-linear, and most of the time it is very complicated to express them analytically. From this problem statement, it is clear that developing a PE tool is a very difficult task.

One simple solution would be to form a look-up table for all combinations of the parameters $I_j$ once and to search the table for the required combination of performance criteria $P_i$ each time. Although forming a look-up table seems to be the simplest solution, the huge size of the table can be a problem. This method can be called the brute force approach. There is an important point to take into account: the performance of analog blocks is strongly tied to the behavior of the transistors. Therefore, accurate modeling has to be used; otherwise the performance or feasibility of the design cannot be evaluated correctly.
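The look-up table idea can be sketched in Python as follows. This is a toy illustration only: it assumes a single transistor described by the textbook square-law expression $g_m = \sqrt{2 \, KP \, (W/L) \, I_D}$ in strong inversion, and the technology value `KP`, the supply `VDD`, the grid and the function names are all hypothetical choices made for the example.

```python
import math
from itertools import product

KP = 100e-6   # A/V^2, assumed technology parameter (illustrative value)
VDD = 3.3     # V, assumed supply voltage
L_UM = 1.0    # um, channel length held fixed in this toy example

def performances(w_um, i_d):
    """Toy square-law estimates for one transistor: (gm in S, power in W)."""
    gm = math.sqrt(2.0 * KP * (w_um / L_UM) * i_d)
    power = VDD * i_d
    return gm, power

# Build the table once over a coarse grid of circuit parameters (W, I_D).
widths_um = [1.0, 10.0, 100.0]
currents_a = [1e-6, 10e-6, 100e-6]
table = {(w, i): performances(w, i) for w, i in product(widths_um, currents_a)}

def lookup(gm_min):
    """Search the table for the lowest-power entry reaching at least gm_min."""
    feasible = [(p, key) for key, (gm, p) in table.items() if gm >= gm_min]
    return min(feasible)[1] if feasible else None

best = lookup(1e-4)   # (W in um, I_D in A) of the cheapest feasible entry
print(best)
```

The table has one entry per grid combination, so its size is the product of the grid sizes of all circuit parameters; this is what makes the approach impractical beyond small blocks.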

PE modeling can be grouped into two main areas: generating macromodels and design space exploration. Generally speaking, macromodels are extracted for one specific circuit that is fully designed at cell-level; as design variables are changed, the model must be re-extracted. Moreover, at system level synthesis, CPU time and engineering process may be costly. Unfortunately, extracting a macromodel is necessary but not a sufficient solution to the problem. Therefore, exploring the design space plays an important role in system-level analog synthesis. For instance, a method which gives the designer access to the design space boundaries of a circuit topology is proposed in [3]; multi-objective genetic optimization is employed in that study. The authors proposed a performance evaluation tool which uses a given set of design variables and calculates a set of circuit performances then combines them into an overall cost function. Optimization method plays an important role in minimizing the cost function while satisfying the given performance constraints. It is claimed that multi-objective optimization method is superior to the single-objective optimization since it, for instance, reduces the number of cost function evaluations. After gathering the data points, multivariate regression method is applied in order to construct the Pareto-optimal hyper surface. To automate this procedure, a framework called WATSON was developed. The main advantage of this approach is that no restriction is imposed on the shape of the design boundary space which can include non-convex portions and discontinuities. In addition to this, the algorithm finds the global optimum rather than getting stuck at a local optimum. On the other hand, computational effort can still be a problem since the Miller op-amp, for example, is modeled in five and a half hours on a Sun Blade 1000 machine.

The authors of [4] proposed a method that mixes simulation-based design space exploration with macromodeling techniques. The posynomial format was used in modeling because the authors aimed to guarantee convexity and thus avoid local optima. The authors claimed that no automatic posynomial recasting approach had been reported by the end of 2003. The proposed automation method is a simulation-based generation of posynomial performance models for the sizing of analog circuits. Basically, their approach generates numerical performance data samples for all circuit characteristics from numerical SPICE simulations that use accurate device models. After the SPICE simulation, the actual performance data are extracted from the simulation results by a set of performance extractor scripts. Then, a posynomial template is fitted to these data. The fitting was applied in two ways: an indirect-fitting method and a direct-fitting method. The signomial and posynomial fitting techniques were implemented in the Posynomial Response Surface Modeling (PRISM) prototype. The reported model generation performance is as follows. A CMOS OTA was taken as an example. PRISM was run on an Intel Celeron 466 MHz machine running Linux. The analysis servers ran in parallel on 16 UNIX workstations, ranging from a SUN Ultra SPARC I to an HP B-1000, using their native operating systems. The simulations needed to obtain a full orthogonal hypercube of sampling points took 3 minutes. Although the execution time seems reasonable, a TCP-based client-server system with sixteen workstations was needed for the analysis computation. Figure 1.2 illustrates simplified flowcharts of [3] and [4] to make these approaches easier to follow. As mentioned before, both approaches rely on a simulator and/or an optimizer.
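For reference, the standard definition of a posynomial shows why this format is associated with convexity. The symbols $c_k$, $a_{ik}$, $x_i$, $y_i$ and $K$ are notation introduced here for illustration and are not taken from [4]:

$$f(x_1, \dots, x_n) = \sum_{k=1}^{K} c_k \prod_{i=1}^{n} x_i^{a_{ik}}, \qquad c_k > 0, \quad a_{ik} \in \mathbb{R}, \quad x_i > 0$$

With the change of variables $y_i = \ln x_i$, the logarithm of a posynomial becomes a log-sum-exp of affine functions, which is convex in $y$:

$$\ln f = \ln \sum_{k=1}^{K} \exp\left( \ln c_k + \sum_{i=1}^{n} a_{ik} \, y_i \right)$$

A signomial has the same form but allows negative coefficients $c_k$, so this convexity property is no longer guaranteed.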

Figure 1.2. Simplified flowcharts of WATSON and PRISM.

A different performance modeling approach, based on support vector machines (SVM), is proposed in [5, 6]. The authors of [5] studied SVMs as a way of approximating the performance relation and compared two-class and one-class SVM solutions. On the other hand, least-squares support vector machine (LS-SVM) regression was applied in [6]. It was shown that LS-SVM is advantageous over least-squares regression and regression using polynomial models in terms of accuracy, prediction and reduction of free model tuning parameters. One can even build models after optimization with all the simulation data from the run.

The authors of [7] implemented a system which generates data points with a tool called EsteMate and calculates power and area requirements using an optimal sizing performed by another tool called OPTIMAN (Optimization tool for analog system). EsteMate has two responsibilities: it uses a grid to sample the parameter specification space of an analog block, and then it sorts the selected samples. The samples are used as a training set for an ANN, which represents a model between the block parameters and the power/area constraints. For each selected sample, OPTIMAN determines the optimal device sizes for the given set of specifications and optimization targets, using a global optimization algorithm (Simulated Annealing) followed by a local optimization algorithm. The actual optimal power/area values, which serve as the reference for the ANN, are derived from the optimal device sizes. By evaluating the ANN, the samples are grouped into accepted and rejected ones, and the rejected samples are eliminated automatically. Finally, the power and area estimates are obtained. However, for a comprehensive estimation, the number of useful samples obtained from the parameter space is not enough, because a much higher number of samples is required to cover all performance parameters. This usually leads to a time consuming procedure. For instance, 2500 samples were selected for a class AB op-amp, and calculation of the training set took 48 hours of CPU time (Sun Ultra1-170).

Symbolic analysis is also an attractive solution to the PE problem. The authors of [8] stated that they had proposed the first tool for template-free symbolic modeling, called Canonical Functional Form Expressions in Evolution (CAFFEINE). This method automatically generates compact symbolic performance models of analog circuits without any equation template. SPICE simulation data are taken as input, and a genetic algorithm is applied as a means of traversing the space of possible symbolic expressions, which are constrained to the form of canonical functions. The symbolic models, which are the output of the system, provide a tradeoff between error and model complexity. In addition, it is claimed that the method demonstrates better prediction quality than posynomials. There have been proposals to leverage numerical performance modeling to improve efficiency. A performance model is a mathematical model relating the performance characteristics of a circuit to the design variables. Accuracy is maintained by constructing the performance model from sample sets of SPICE simulation data. The authors of [8] compared several modeling approaches: polynomials, posynomials, genetic programming, feed-forward neural networks (FFNN), multivariate adaptive regression splines and support vector machines. According to the authors of [8], the best prediction results are obtained by genetic programming (the CAFFEINE tool); however, it is costly to construct. Multivariate adaptive regression splines are next best. SVM and FFNN are very close to each other. The polynomial reference model is the worst approach. On the other hand, posynomials are easy to optimize on; however, the important point is reducing circuit simulation time. Figure 1.3 illustrates simplified flowcharts of [7] and [8].

Figure 1.3. Simplified flowcharts of [7] and [8]

It is a well-known fact that there is a tradeoff between accuracy and speed in PE modeling. Although the approaches based on Spice-like simulators are the best solution to the accuracy problem, simulating an entire system makes the execution time so high that they are not applicable to analog sub-blocks that have more than three or four transistors, such as differential pair stages, advanced current mirror stages etc. Therefore, behavioral macromodeling techniques, which represent the mathematical equivalent of the original block, are deployed in order to simulate analog blocks in practical amounts of time. As mentioned before, macromodels are usually created by the same person who designs the original block. Many complex blocks are modeled by this approach. It is also possible to build models automatically before sizing, for example using neural networks and splines. In this method, the model aims at a target error level over the whole mapping, even in the infeasible regions of the performance space. Therefore, many more samples than necessary are taken.

Harjani and Shao [9] proposed a general numerical macromodeling solution for the accurate prediction of the feasibility and performance of a given topology. They also noted that the simple solution is to try out all the possibilities; however, the design time then increases exponentially. Their methodology, which uses general basis functions to perform macromodeling and general techniques for experimental design, consists of two macromodels: a feasibility macromodel and a performance macromodel. The behavioral specifications include the domain of the input variables and the constraints on the output responses. Experiment design techniques reduce the number of experimental runs and thus save execution time. First, a vertical binary search method is applied to find boundary points of the feasibility region. Once an adequate number of data points has been collected, a macromodel can be built for the feasibility region. Then, a static factorial experiment design technique is applied in order to measure variable significance. While constructing a macromodel, the input variables which have more influence on the output response are taken into account, whereas the rest are discarded. Even among the selected significant input variables, the degree of influence on the response differs. Therefore, in order to obtain savings in the regression analysis, variable screening and variable grouping, called the dynamic volume slicing technique, are implemented. The result is a set of significant input variables grouped into layers. Finally, the macromodel is constructed by employing a radial basis function model for the regression analysis. The authors claimed that the performance macromodel of an OTA was obtained in 524.34 seconds of CPU time with 196 experiments. One major disadvantage of this method arises while obtaining the data points from the feasibility surface, since it is a non-convex curve.

Wavelet basis functions have also been used for behavioral modeling of analog circuits. Approximating the input-output function of any topology by the wavelet collocation method with a multi-companding algorithm is proposed in [10]. The crucial part of the method is to determine the companding function automatically for any circuit. The multi-companding algorithm consists of four steps. First, the original modeling error distribution is obtained by estimating the original input-output function of an analog block. This procedure is called the adaptive scheme. Second, the nonlinear companding function is generated automatically from the modeling error distribution. Third, the nonlinear companding algorithm is used to compand the original input-output surface in order to control the error distribution continuously. Finally, the adaptive scheme is applied to the model again in order to decrease the number of wavelets used. As a result, modeling is achieved with a reasonably low error and fewer wavelets. However, the execution time of the companding function generation is high because of several optimization procedures. Therefore, this model seems to be useful when a large number of repeated analog blocks are present.

An alternative modeling methodology, black box modeling, is proposed in [11]. Only the input-output behavior is needed to construct the model; however, in order to generate the data required to fit the model equations, the circuit has to be simulated with a compact model before the modeling procedure starts. An Artificial Neural Network (ANN) is used for fitting. The theory behind black box modeling is straightforward. First, the circuit is simulated and several data points are obtained. Then, the output is written as a function of the states of the system and of an external drive signal in the form of a two-port network representation. Finally, those functions build the model by fitting the measured or simulated output variables to the measured or simulated independent variables. This simple method could be useful as a replacement for a huge model. On the other hand, it is not good enough here, since the desired estimator tool should accept any topology without any knowledge of its behavioral model, and simulating the given topology to obtain the data points for the ANN would waste time. In addition, ANN models reduce the error (the difference between measured and simulated outputs) only to an approximate level. The error in some regions of the model can be very high, and if the model operates in these regions, undesired results can be obtained. That is the main problem of ANN models.

A transistor modeling tool [12] has been proposed that combines parameter extraction with simple analytical equations. Extraction of the Early voltage is implemented in the X-Ray tool, which is based on inversion level modeling. Blue Print, the second tool, combines the data points from X-Ray with analytical equations to size the transistor. One drawback is that Blue Print takes the first result found among multiple ones. An improved equation arrangement formulation could be the subject of another study. Moreover, the analytical equations are topology-specific, and for each transistor the Early voltage has to be extracted to account for second-order effects.

There is another analog PE tool which is not a PE in our definition of the concept, but which tries to predict the nonlinear performance equations. APE [13] has interesting features, the most significant of which is the hierarchical decomposition of circuits into their analog subblocks. In APE, basic circuit elements such as resistors and capacitors are modeled analytically at the lowest level. Then, basic analog building blocks such as current mirrors and differential amplifiers are constructed from these elements. At the top of the system, op-amps are modeled.

Using analytical equations which are specific to the topology can be a way to construct the circuit model in reasonable execution time. However, those equations are derived manually for a given analog block, and for each new circuit analog designers have to derive new equations. This is not a good solution for complex analog blocks, because deriving analytical equations from the MOSFET models increases the set-up time of the system.

The methods proposed in [14]-[15] use the analytical equations of the EKV model to estimate the behavior of some basic blocks and then utilize the block estimations to arrive at the final circuit PE. The estimator is coded in the C++ programming language. Design equations are implemented in the code, which takes the performance parameters as input and gives the estimation results, solution sets, or optimum results as outputs. The main advantage of this method is its combination of execution time and accuracy.

#### 1.2.2. Pareto Front Approach in PE

The PE tool helps circuit level synthesis by giving hints about optimization and by eliminating infeasible solutions. Since extracting the PDS of analog blocks covers the elimination part, an approach is also required for obtaining the optimum solution sets that can be derived from that PDS. The PF approach can be preferred because a PF can handle all interacting parameters and constraints of an analog design. Assume that the optimization can be formulated as follows:

 $\min f(x_d)$  subject to  $c(x_d) \ge 0$ 

where  $x_d = [x_{d,1}, \ldots, x_{d,k}]$  is the vector of design parameters that is subject to the sizing process. Appropriate design parameter values have to be calculated such that the performance  $f$  is optimal. The constraints  $c(x_d) \ge 0$  describe the AC/DC behavior required of the transistors for proper operation, stability and robustness.

Multi-objective optimization techniques for the systematic generation of Pareto points, as commonly found in the literature, can be grouped into stochastic and deterministic methods. In the past, only a single Pareto point was calculated, but today designers focus on methods for extracting many Pareto points.

Two different deterministic simulation-based methods for PDS are described in [16]. One is based on the normal boundary intersection (NBI) method and the other on a linearized circuit model. The authors note that exploring the design space is a key factor for hierarchical sizing in terms of topology selection and the feasible system parameter space. Details of the NBI method are given in [17]. It is based on circuit simulations that include sizing, technological and topological constraints. A sequential quadratic programming (SQP) algorithm is used to solve the NBI problem, so that a designer can explore the performance capability of a circuit block and analyze trade-offs by evaluating the feasible performance space and the Pareto fronts of competing performances. Moreover, in [18] the optimization problems are formulated with the goal attainment (GA) method instead of NBI. Conventionally, an SQP algorithm is used to minimize each of these optimization problems independently. The new approach, called the Wave Front SQP algorithm, instead solves the GA optimization problems simultaneously. As a result, a significant improvement in the efficiency of Pareto front extraction has been achieved.

In addition, the authors of [19]-[20] incorporated technological variations into the Pareto optimization problem. This approach uses the worst-case performance values of the Pareto points; in other words, it is a combination of Pareto front extraction and worst-case analysis. Variations of the given technology are modeled by statistical distributions, so that a realistic worst-case analysis is done at each Pareto point.

Pareto optimization is also very popular in industrial applications. In real world applications, genetic programming is usually used in developing nonlinear models. However, high accuracy and high fitness models are very complex and very sensitive to minor changes. Furthermore, lower complexity requires time-consuming experiments. The solution is Pareto Front optimization, which is a significant approach for generating high-performance models. An example of this can be seen in [21, 22, 23].

PE modeling and PF approaches use different algorithms and simulators. Most of them are only practical for small blocks. However, today's circuitry requires the PDS of complex mixed-signal circuits. Therefore, researchers have decided to cope with the complexity problem by hierarchical decomposition of large systems, as mentioned before. For example, in [24]-[25], a charge-pump phase-locked loop system and a continuous-time delta-sigma modulator are taken as case studies.

PE tools can also be used as a topology selector. For example, the authors of [26]-[27] create thousands of analog circuit topologies using simple blocks and obtain Pareto-optimal sized topologies from them. Experimental results show that 15 topologies were returned as optimal solutions among a couple of thousand topologies in approximately 5 days on a 2 GHz Linux machine, covering 100,000 search points. On the other hand, they reach the conclusion that a dozen op-amp topologies serve most purposes. Moreover, in [28], Liu et al. use PFs to search for the best solution among different topologies in a topology library of analog cells.

Furthermore, in [29], hierarchical synthesis of PF is studied considering reconfigurable circuits, but it has not been applied to any systems yet. PF composition is studied in [30]; however, only three possible operations are presented. Consequently, it should be observed that there is an increasing interest in PFs for PE of mixed-signal systems.

### 1.3. Contributions of the Thesis

In this thesis, the main goal is to develop a PE tool for ADA systems. The proposed approach has three main sections: Library, PF extraction and PF composition, which are illustrated in Figure 1.4. There is also an optional block, namely the topology divider, which will be described in Section 3.1. The approach is composed of different methods but uses their most advantageous points. The novel contributions of this thesis are summarized below.

Figure 1.4. Sections of proposed PE tool

• The Library: It provides PE design plans of analog blocks, which are generated with the EKV MOSFET model. Analytical equations of the EKV model are preferred over the BSIM3 model for generating the design plan of an analog block. Nowadays, the MOSFET is modeled in the moderate inversion region in most of the design process. The advantage of the EKV model is its continuity in all regions [31]. On the other hand, BSIM3 has some problems in the moderate inversion region. In addition, the EKV model has fewer parameters than the BSIM3 model, so EKV is more suitable for hand analysis.

In our work, it has been observed that a maximum of three hours is adequate for collecting the PDS estimation, and the estimation results are verified within an acceptable error rate. The library has already been used as an embedded PE tool in the sigma-delta ADC design automation system proposed in [32], where it also enables the selection of the best op-amp topology during the automation process. It serves most purposes for mixed-signal systems.

If the given system does not exist in the library, it is partitioned into subblocks, so that the designer has the advantage of observing the behavior of the subblocks that may affect the performance of the whole system. In other words, performance tradeoffs of subblocks can be defined by PFs. The designer not only evaluates the PE of subblocks but also obtains an approximate design of each subblock, including its circuit and design parameters. The PE tool can be applied to a specific system whose subblocks are already known or whose subblocks are determined by an algorithm called *topology divider*.

• PF extraction: PF extraction requires less execution time than complex optimization algorithms. Since the PDS is already explored, only a simple algorithm is applied to that data set, and the Pareto set is extracted quickly. Each Pareto point refers to circuit and design parameter sets, which are saved by the library for circuit level synthesis. An illustration of PF extraction is given in Figure 1.5. The PDS of any block in n-dimensional space is the input of the PF extraction, and the PF of the given block in n-dimensional space is extracted from this input by an algorithm.



Figure 1.5. Demonstration of PF extraction

• PF composition: A designer needs a fast way to get insight into the capability of the whole system after the PFs of the subblocks have been generated, as shown in Figure 1.6. Therefore, a fast PF composition process is required. However, in the literature, this process is based on an optimization procedure with powerful simulators, which is very time-consuming. Estimating the overall performance using the design space sets is a costly process; instead, Pareto points are used for that purpose. This is the advantage of PFs, especially as systems become more and more complex. The approach suggested here for the first time is simple and fast: any operation which preserves the domination rule can be used in connecting the performance parameters. The composition process will be explained in detail in Chapter 5.



Figure 1.6. Demonstration of PF composition

• Applications: Utilization of the PE tool can be classified into two groups: single-level performance estimation (SL-PE) and multi-level performance estimation (ML-PE). If the given system is partitioned into its subblocks, each subblock constitutes a new level and the system is evaluated as ML-PE, which strongly requires a composition process. The PE tool can also be used as a topology selector in both SL-PE and ML-PE. For example, the design automation system in [32] provides this property by comparing different op-amp performances in relation to the given input set for an optimum solution.
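
The PF extraction step described above can be illustrated with a minimal sketch. It is not the algorithm of Chapter 4; it simply applies the standard definition of domination to an already explored data set, assuming that all objectives are to be minimized. The function names and the sample points are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every objective and strictly
    better in at least one (all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep the points of an explored design space that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical two-dimensional sample, e.g. (power, noise), both minimized.
pds = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (2.5, 3.5)]
front = pareto_front(pds)
```

This brute-force filter compares every pair of points, so its cost grows quadratically with the size of the data set, but it works unchanged in n dimensions.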

### 1.4. Overview of the Thesis

The thesis is organized as follows: Chapter 2 provides a short overview of the proposed methodology, which has three main parts: the design plan library, PF extraction and PF composition. Chapter 3 covers the design part. In this chapter, the early stages of the tool are described, starting from the topology divider algorithm. Then, the EKV MOSFET model, which is used in the design-based PE modeling of analog blocks, is briefly explained. Details of the design procedure are introduced and some design examples are examined. Verification of the models is provided with case studies. In Chapter 4, the concept of Pareto optimality is introduced. Next, an algorithm derived from the definition of the PF is developed for n-dimensional space, and it is shown by different examples that PF extraction with this algorithm is faster than with the approaches proposed in the literature. Chapter 5 continues the Pareto concept by introducing the composition process. PF composition is applied to the subblocks of a given system, and the effectiveness of the proposed method is verified by examples. Chapter 6 applies the proposed library, PF extraction and composition methods to different mixed-signal systems. The library is used in a sigma-delta automation system as an embedded PE tool. Then, a Flash ADC system is considered. The last example is the automation of a Pipeline ADC system. The performance of the proposed approach is discussed in this chapter. Finally, in Chapter 7, the conclusions and contributions of this thesis are summarized, and some recommendations for future work are given.

# 2. THE PROPOSED APPROACH

In this thesis, the main goal is to find a suitable tradeoff between the accuracy and the execution time of the approach. Therefore, a hybrid approach to the problem is proposed. The reasonable execution time and high accuracy of topology-specific analytical equations and/or the topology-free nature of design space exploration will be used together. The flowchart showing the main blocks of the PE tool is given in Figure 2.1.

Any given topology (system), which can be either a well-known analog block or an application-specific analog block, is taken as input to our system, and its existence in the library is checked. If the topology exists in the library, the performance model is readily available and there is no need to spend time on modeling; if it does not, any approach summarized in the literature survey can be applied to that topology. However, the execution time is still a problem. Therefore, the topology is divided into its sub-blocks, such as differential input stages, current mirrors and op-amps, and the existence of each sub-block in the library is checked. If a sub-block exists, the predefined performance model in the library can be used; if not, any behavioral modeling approach can be applied to the subblock.

If the given block has no subblocks, SL-PE is evaluated. The PDS of the block is generated according to the input specifications, and then PF extraction is performed. If the given system has subblocks, the procedure is called ML-PE. The subblocks can be determined manually or automatically. If a system's subblocks are already determined, there is no need to partition the system; in other words, the topology divider block is bypassed. However, if there is no knowledge about the given system, the "divide and conquer" method can be applied by the topology divider block. The topology divider can therefore be considered an optional block (block 1). If a subblock does not exist in the library, any PE approach from the literature can be applied to that subblock (block 2). In our research group, an optimizer, which was also developed for ADA systems, can be used for that purpose. As a result, execution time is reduced, since the approach is applied to a sub-block rather than to the main analog block. It should not be overlooked that the model construction time increases exponentially with the number of elements in an analog block. Finally, the composition of PFs is carried out: the PFs of the subblocks are combined in order to obtain the PF of the main block. The outputs of the PE tool are two- or three-dimensional graphical representations of PFs, look-up tables for N-dimensional optimum solution sets, and an approximate design of each subblock.

Figure 2.1. Flowchart of the proposed PE.
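
The final composition step can be sketched as follows. This is only an illustration under a stated assumption: the performances of two subblocks are taken to combine by addition (as total power or total area would), which is one example of an operation that preserves the domination rule. The actual composition operations are the subject of Chapter 5; the function names and the numbers here are hypothetical.

```python
from itertools import product
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """a dominates b when it is no worse everywhere and better somewhere (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points: List[Point]) -> List[Point]:
    """Discard every point that is dominated by another point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def compose(front_a: List[Point], front_b: List[Point]) -> List[Point]:
    """Combine every pair of subblock Pareto points by adding their objectives
    (assumed additive), then keep only the non-dominated system points."""
    combined = [tuple(x + y for x, y in zip(a, b)) for a, b in product(front_a, front_b)]
    return pareto_front(combined)


# Hypothetical PFs of two subblocks, objectives (power, area), both minimized.
stage1 = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0)]
stage2 = [(1.0, 3.0), (3.0, 1.0)]
system_front = compose(stage1, stage2)
```

Because addition preserves domination, a dominated subblock point can never contribute to the system PF, so only the Pareto points of the subblocks need to be combined rather than their full design spaces.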

# 3. DESIGN PLANS OF ANALOG BLOCKS

This chapter covers the strategy of design plans. Early stages of the tool are described starting from the topology divider algorithm. Then, EKV MOSFET model, which will be used in PE design-based models of analog blocks, is briefly explained. Details of the design procedure are introduced and some design examples are examined. Verification of the design-based models is provided with some case studies.

### 3.1. Topology Divider Block

The topology divider block (block 1) is implemented in the C++ programming language. The details of the division process are illustrated in Figure 3.1. To start the process, a Spice-like net-list file of the topology is constructed and used as the source file of the code. The code reads the transistor information, such as the drain, gate and source connections. Then, basic analog components, for example current sources and current mirrors, and differential input stages are searched. To give an example, assume that the code finds two nMOS transistors. If the gates of both transistors are connected together and the drain and gate of one transistor are tied to each other, these two transistors are saved as a simple nMOS current mirror. After that, all simple basic components are displayed so that initial information about the topology can be evaluated. At the second stage, cascode-type connection possibilities, such as cascode current mirrors, low-power current mirrors and cascode differential input pairs, are scanned. Next, using the basic components and the cascode-type connected elements, more complicated sub-blocks, such as differential input stages consisting of a current source, a differential input pair and a current mirror, are explored. At the third stage, the pre-defined blocks obtained in the previous stage are evaluated to see whether their connections yield main analog blocks such as a Miller op-amp, a folded cascode op-amp or a cascode op-amp. Finally, the number of transistors in the net-list is compared with the number found by the code to check for inconsistencies.
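
The simple current mirror rule of the first stage can be sketched as below. The thesis tool is written in C++; this Python fragment is only an illustration, and the net-list format (`name drain gate source bulk model ...`), the function names and the example circuit are assumptions. The rule checks only what the text states (same device type, common gate, one diode-connected device) and does not check the source connections.

```python
from itertools import combinations
from typing import List, NamedTuple, Tuple


class Mos(NamedTuple):
    name: str
    drain: str
    gate: str
    source: str
    kind: str  # model name, e.g. "nmos" or "pmos"


def parse_netlist(text: str) -> List[Mos]:
    """Read MOS lines of the assumed form: Mname drain gate source bulk model ..."""
    devices = []
    for line in text.splitlines():
        tok = line.split()
        if len(tok) >= 6 and tok[0].upper().startswith("M"):
            devices.append(Mos(tok[0], tok[1], tok[2], tok[3], tok[5].lower()))
    return devices


def find_simple_mirrors(devices: List[Mos]) -> List[Tuple[str, str]]:
    """Return (reference, output) pairs: same type, common gate,
    and exactly one of the two devices has its drain tied to its gate."""
    mirrors = []
    for a, b in combinations(devices, 2):
        if a.kind != b.kind or a.gate != b.gate:
            continue
        a_diode, b_diode = a.drain == a.gate, b.drain == b.gate
        if a_diode != b_diode:
            ref, out = (a, b) if a_diode else (b, a)
            mirrors.append((ref.name, out.name))
    return mirrors


# Hypothetical differential stage with a pMOS mirror load and an nMOS bias mirror.
NETLIST = """\
* hypothetical net-list fragment
M1 n1 inp tail 0 nmos W=10u L=1u
M2 n2 inm tail 0 nmos W=10u L=1u
M3 n1 n1 vdd vdd pmos W=20u L=1u
M4 n2 n1 vdd vdd pmos W=20u L=1u
M5 tail bias 0 0 nmos W=20u L=1u
M6 bias bias 0 0 nmos W=20u L=1u
"""
mirrors = find_simple_mirrors(parse_netlist(NETLIST))
```

In this example the differential pair M1/M2 is not reported, because the two gates are on different nets.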



Figure 3.1. Flowchart of topology divider algorithm.

### 3.2. The Library: Generation of EKV-Based Analog Designs

The library is based on analytical equations of the EKV MOSFET model to estimate the behavior of analog blocks such as current mirrors, differential input stages, output stages, op-amps and comparators. EKV-based analog designs are coded in the C++ programming language. Model equations are implemented in code that takes the performance parameters as input and gives the estimation results, solution sets, or optimum results as outputs. Moreover, the analog designer is still the key element and retains control of the design process.

In this approach, analytical equations are applied to the analog blocks. The 'divide and conquer' method, which means partitioning the main analog circuit into its analog sub-blocks according to its device library, can be utilized to this end. When each analog sub-block is modeled analytically with its simple equations, it becomes easier to translate the input specifications of the main analog system to its sub-blocks, so the partition is carried down to its lowest level. For example, a typical Miller (BTS) op-amp is composed of four analog building blocks, namely a differential input pair, a current mirror, a common source driver and a current source. One level higher, a differential input pair and a push-pull output stage might be present. Accordingly, the estimator was coded in the C++ programming language. Model equations were implemented in code that takes the performance parameters as input and gives the estimation results, solution sets, or optimum results as outputs.

PE design plans in [32] have been improved by increasing the accuracy of solutions to analytical equations which result in  $g_m/I_D$  and  $g_{ds}$  parameters. Before going into details of the design algorithms of different analog circuit examples, EKV MOSFET model will be described briefly.

There are several MOSFET models in the literature; however, the fundamentals of the modeling concept stem from solid-state physics. In analog design, the expression given in Eq. 3.1 is quite simple but accurate for older technologies. When submicron parameters are considered, this expression is not adequate. The analog design and verification loop can be seen in Figure 3.2. Here, the problem is to define a model that minimizes the trial-and-error simulations and the mismatch between the spaces.



Figure 3.2. Analog design loop.

$$I = \frac{1}{2} \cdot \frac{W}{L} \cdot \mu \cdot C_{ox} \cdot \left(V_{GS} - V_t\right)^2 \tag{3.1}$$

Most researchers believe that the expression given in (3.2) is the root of all MOSFET modeling approaches. In other words, the technique by which the total electron charge density  $(Q_n)$  at the junction and the voltage variation are derived determines the modeling type. In the expression,  $x_t$  represents the depth of the surface space-charge region and J is the current density. Although the formulation seems easy to derive, the strong relation between  $Q_n$  and the voltage variation has to be taken into account.

$$I = W \int_0^{x_t} J \cdot dx \equiv W \cdot \mu_n \cdot Q_n \cdot \frac{dV}{dy} \tag{3.2}$$

#### 3.2.1. EKV MOSFET Model

The EKV MOSFET model is a fully analytical model dedicated to the design and analysis of low-voltage, low-current analog circuits. EKV is an abbreviation of the surnames of Christian C. Enz, François Krummenacher and Eric A. Vittoz of the Swiss Federal Institute of Technology of Lausanne (EPFL). Continuity is the significant point of the model. All large and small signal variables, such as currents, intrinsic capacitances and transconductances, are continuous in all regions of operation, including weak, moderate and strong inversion. All equations of the model are derived by the same approach: the asymptotes of the inversion regions are derived first, and then the relevant large or small signal variables are normalized and linked using an interpolation function. The model is based on the inversion charge  $Q_i$ , which is controlled by the voltage difference  $V_P - V_{CH}$ . The channel voltage  $V_{CH}$  is defined as the difference between the quasi-Fermi potentials of the carriers. The pinch-off voltage  $V_P$  is defined as the particular value of  $V_{CH}$  at which the inversion charge becomes zero for a given gate voltage. The various operating points of transistors are then expressed in terms of the voltages  $V_P - V_S$  and  $V_P - V_D$  [33]. The pinch-off voltage and gate voltage equations are given in (3.3) and (3.4) with model parameters  $\psi$ ,  $\gamma$  and  $V_{TO}$ .

$$V_P = V_G - V_{TO} - \gamma \left[ \sqrt{V_G - V_{TO} + \left(\sqrt{\psi} + \frac{\gamma}{2}\right)^2} - \left(\sqrt{\psi} + \frac{\gamma}{2}\right) \right] \tag{3.3}$$

$$V_G = V_{TO} + V_P + \gamma \left[\sqrt{\psi + V_P} - \sqrt{\psi}\right] \tag{3.4}$$
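
As a quick check of (3.3) and (3.4), the two expressions can be coded directly and verified to be inverses of each other. The sketch below is illustrative only; the parameter values for $V_{TO}$, $\gamma$ and $\psi$ are hypothetical and do not belong to any particular technology.

```python
import math


def pinch_off_voltage(v_g: float, v_to: float, gamma: float, psi: float) -> float:
    """Pinch-off voltage V_P as a function of gate voltage, Eq. (3.3)."""
    half = math.sqrt(psi) + gamma / 2.0
    return v_g - v_to - gamma * (math.sqrt(v_g - v_to + half * half) - half)


def gate_voltage(v_p: float, v_to: float, gamma: float, psi: float) -> float:
    """Gate voltage V_G as a function of pinch-off voltage, Eq. (3.4)."""
    return v_to + v_p + gamma * (math.sqrt(psi + v_p) - math.sqrt(psi))


# Hypothetical model parameters: V_TO in V, gamma in sqrt(V), psi in V.
V_TO, GAMMA, PSI = 0.5, 0.7, 0.8

v_p = pinch_off_voltage(1.2, V_TO, GAMMA, PSI)
v_g_back = gate_voltage(v_p, V_TO, GAMMA, PSI)  # should return the original 1.2 V
```

With these values $V_P$ is smaller than $V_G - V_{TO}$, and $V_P$ is zero when $V_G = V_{TO}$, as expected from (3.3).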

 $I_D$  is derived using the charge sheet model with the assumption of constant doping in the channel and is expressed as the difference between a forward current  $I_F$  and a reverse component  $I_R$ . As mentioned before,  $I_D$  depends on the voltages  $V_P - V_S$  and  $V_P - V_D$  and is proportional to a specific current  $I_S$ ; these are given in (3.5) and (3.6), respectively. In expression (3.6), *n* represents the substrate factor.

$$I_D = I_S \cdot \left[ \ln^2 \left( 1 + e^{\frac{V_P - V_S}{2U_t}} \right) - \ln^2 \left( 1 + e^{\frac{V_P - V_D}{2U_t}} \right) \right] = I_S \cdot \left[ I_F - I_R \right] \tag{3.5}$$

$$I_S = 2 \cdot n \cdot \mu \cdot C_{ox} \cdot \frac{W}{L} \cdot U_t^2 \tag{3.6}$$

The inversion coefficient  $I_C$ , which is a numerical measure of MOS inversion and whose ranges are given below, is a key parameter for design-based modeling. By selecting the inversion coefficient and the channel length, the operating plane of a MOSFET can be formed. By adding the drain current, the performance tradeoffs of a MOSFET can be evaluated easily.

$$\begin{cases} \text{Strong inversion:} & I_{C} > 10\\ \text{Moderate inversion:} & 0.1 < I_{C} < 10\\ \text{Weak inversion:} & I_{C} < 0.1 \end{cases}$$

The effective gate-source voltage  $(V_{GS} - V_T)$  and the transconductance efficiency, which is given in (3.7), depend on the inversion coefficient, the substrate factor (typically 1.3-1.5 for bulk CMOS at  $I_C > 0.1$ ) and the thermal voltage. For example, at room temperature (T = 300 K) with n = 1.4, the inversion coefficient gives an idea of the effective gate-source voltage: strong inversion starts at  $V_{GS} - V_T$  = 225 mV, where  $I_C$  is 10.

$$\frac{g_m}{I_D} = \frac{1 - e^{-\sqrt{I_C}}}{nU_t \sqrt{I_C}} \tag{3.7}$$

Moreover, the width of the transistor can be calculated from the following expression:

$$W = \frac{I_D \cdot L}{2 \cdot n \cdot I_C \cdot \mu \cdot C_{ox} \cdot U_t^2} \tag{3.8}$$
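
The design-oriented use of (3.6) to (3.8) can be illustrated with a short sketch. It assumes that the inversion coefficient is the drain current normalized to the specific current, $I_C = I_D / I_S$, which is what (3.6) and (3.8) together imply. The function names, the assignment of the boundary values 0.1 and 10 to the moderate region, and the numerical values of $\mu C_{ox}$, $I_D$ and $L$ are hypothetical.

```python
import math

K_B_OVER_Q = 8.617333e-5  # Boltzmann constant over electron charge, in V/K


def thermal_voltage(temp_k: float = 300.0) -> float:
    """Thermal voltage U_t = kT/q in volts."""
    return K_B_OVER_Q * temp_k


def inversion_region(ic: float) -> str:
    """Classify the inversion level from I_C (boundaries counted as moderate)."""
    if ic > 10.0:
        return "strong"
    if ic < 0.1:
        return "weak"
    return "moderate"


def gm_over_id(ic: float, n: float, u_t: float) -> float:
    """Transconductance efficiency in 1/V, Eq. (3.7)."""
    root = math.sqrt(ic)
    return (1.0 - math.exp(-root)) / (n * u_t * root)


def width(i_d: float, length: float, ic: float, n: float, mu_cox: float, u_t: float) -> float:
    """Transistor width in meters, Eq. (3.8)."""
    return i_d * length / (2.0 * n * ic * mu_cox * u_t ** 2)


# Hypothetical design point: 10 uA, L = 1 um, I_C = 1, n = 1.4, mu*Cox = 100 uA/V^2.
U_T = thermal_voltage(300.0)
N, MU_COX, I_D, LENGTH, IC = 1.4, 100e-6, 10e-6, 1e-6, 1.0
w = width(I_D, LENGTH, IC, N, MU_COX, U_T)
efficiency = gm_over_id(IC, N, U_T)

# Consistency check: recompute I_S from (3.6) and recover I_C = I_D / I_S.
i_s = 2.0 * N * MU_COX * (w / LENGTH) * U_T ** 2
ic_back = I_D / i_s
```

For a fixed drain current and length, (3.8) shows that moving toward weak inversion (smaller $I_C$) requires a proportionally wider device, while (3.7) shows that the transconductance efficiency rises toward its weak inversion limit of $1/(nU_t)$.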

The EKV MOSFET model is based on a surface potential model combined with inversion charge linearization. In order to construct a model of an analog circuit block, the parameters of different technologies and foundries, such as AMS 0.35  $\mu$m and UMS 0.18  $\mu$m, must be expressible as EKV MOSFET model parameters. Although EKV has advantages over BSIM models, such as a strong dependence on device physics and a small number of parameters with very good accuracy, BSIM is widely used by the foundries. EKV parameters consist of intrinsic model parameters and temperature, noise, overlap and junction capacitance parameters.

PE may use any technology defined by a technology file which contains many EKV parameters. BSIM models are much more popular than EKV models; hence, it is sometimes difficult to find EKV models for every technology. This bottleneck of EKV-based designs may be overcome by approaches available in the literature [34, 35]. If the foundry does not provide EKV models, they can be derived from various measurements of fabricated transistors. However, this approach is not suitable most of the time, since it is time and budget consuming. A more practical way is to use BSIM models to derive EKV models [34]. BSIM models provide accurate results under certain conditions, for which certain transistor sizes and inversion levels should be selected. If these conditions can be met in the simulations, the results may be utilized like real measurement results. The results will clearly not be as accurate as foundry-provided parameters, but the important point is that the tolerance of PE is large enough to accept these variations. Since PE is not a synthesizer but an "estimator", the error coming from the model can be tolerated. In order to test the PE, two technologies were used. The first one is the 0.5  $\mu$m technology which is delivered with EKV 2.6 [36]. The error between the estimated values and the simulation results is small, which indicates that PE provides accurate calculations. However, real technology parameters would better demonstrate the effectiveness of the developed methodology. Thus, BSIM AMS 0.35  $\mu$m parameters were converted to EKV parameters. In order to convert the models, the approach given in [34]-[39] can be utilized.

In modeling, one of the difficult and significant tasks is to estimate the output conductance, especially at the output stages. The MOS intrinsic voltage gain depends on  $g_{ds}$ , and  $g_{ds}$  depends on the Early voltage  $V_A$ , which is a measure of the normalized drain-source conductance, as given in (3.9).

$$g_{ds} = \frac{I_D}{V_A + V_{DS}} = \frac{\partial I_D}{\partial V_{DS}} \tag{3.9}$$

Early voltage is not a constant value for the process but depends upon the channel length, inversion level and drain-source voltage.

#### 3.2.2. $g_{ds}$ Interpolation

In this thesis, output conductance is estimated using the expression (3.9). Taking the derivative of (3.5) with respect to  $V_D$  gives the expression in (3.10):

$$g_{ds} = \frac{I_S \cdot k \cdot \ln(1+k)}{U_t \cdot (k+1)} \text{ where } k = e^{\frac{V_P - V_D}{2U_t}} \tag{3.10}$$
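
The relation between (3.5) and (3.10) can be checked numerically. The sketch below evaluates the closed form of (3.10) and compares it with a central finite difference of the drain current, using $V_P - V_S$ and $V_P - V_D$ in the exponents of (3.5) and the natural logarithm throughout. The bias point and the value of $I_S$ are hypothetical.

```python
import math


def drain_current(v_p: float, v_s: float, v_d: float, i_s: float, u_t: float) -> float:
    """Drain current from Eq. (3.5): forward minus reverse component."""
    forward = math.log(1.0 + math.exp((v_p - v_s) / (2.0 * u_t))) ** 2
    reverse = math.log(1.0 + math.exp((v_p - v_d) / (2.0 * u_t))) ** 2
    return i_s * (forward - reverse)


def g_ds(v_p: float, v_d: float, i_s: float, u_t: float) -> float:
    """Output conductance from Eq. (3.10), the derivative of (3.5) w.r.t. V_D."""
    k = math.exp((v_p - v_d) / (2.0 * u_t))
    return i_s * k * math.log(1.0 + k) / (u_t * (k + 1.0))


# Hypothetical bias point (volts) and specific current (amperes).
V_P, V_S, V_D, I_S, U_T = 0.30, 0.0, 0.35, 1e-6, 0.02585

analytic = g_ds(V_P, V_D, I_S, U_T)

# Central finite difference of I_D with respect to V_D.
H = 1e-6
numeric = (drain_current(V_P, V_S, V_D + H, I_S, U_T)
           - drain_current(V_P, V_S, V_D - H, I_S, U_T)) / (2.0 * H)
```

The forward term does not depend on $V_D$, so only the reverse term contributes to the derivative, which is why $V_S$ does not appear in (3.10).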

In the modeling procedure, it is observed that the difference between the estimated and the simulated output conductance can increase unpredictably. According to several simulation results, unpredictable errors are encountered especially when  $g_{ds}$  is calculated to be smaller than  $10^{-6}\ \Omega^{-1}$ , because the exponential and logarithmic terms can deviate rapidly for small changes in the difference between the drain and pinch-off voltages. A more complicated  $g_{ds}$  expression can be evaluated instead of (3.10), including channel length modulation and Early voltage for large-geometry processes, and vertical dimensions and the electric field components of gate and drain for small-geometry processes. Such a  $g_{ds}$  approach gives accurate solutions, but it takes too much time to calculate each candidate's response, whereas the main goal is to construct a fast and accurate estimation tool. Therefore, a new output conductance formulation based on Spice simulation and curve fitting is formed. The advantage of this approach is that the expressions involve only simple mathematics. It is known that  $g_{ds}$  depends on the drain current and the Early voltage. Moreover, the Early voltage is a function of the transistor length (L). It is proposed that the modified  $g_{ds}$  formulation is a function of both L and the drain current. The  $g_{ds}$  estimation procedure is given in Figure 3.3. First, a single nMOS circuit is set up. The bias voltage (which strongly depends on  $V_P$  and  $I_C$ ), L and W are defined as parametric variables for Spice simulations. Bias voltages are arranged such that the effective gate-source voltage sweeps through all inversion regions from weak to strong. In addition, L is varied from Lmin to 8Lmin. This input space is automatically generated by a C++ based program called *lis\_file\_generator*. After the simulation, an output file (.lis file for H-spice) is obtained. This output file is given as input to a C++ based parser-like program called *mos\_reader*, which reads the necessary information from it, such as the drain current, L value and output conductance. The Early voltage for each solution is calculated using (3.9) (neglecting the drain-source voltage), so that a solution matrix, illustrated in (3.11), is obtained.

Figure 3.3.  $g_{ds}$  estimation procedure.

$$S = [g_{ds}, I_D, L, V_A] \tag{3.11}$$

The matrix is divided into four regions according to the inversion level: weak, moderate, strong and very strong inversion. In each inversion region, a fourth-degree polynomial in L, denoted $P_4(L)$, is fitted using the current and Early voltage values, which gives $g_{ds}$ as in (3.12).

$$g_{ds} = \frac{I_D}{P_4(L)} \tag{3.12}$$

The maximum error of the fitting procedure is 15 percent and is generally observed near the borders of the inversion regions. Finally, this modified output conductance formulation is embedded into our estimation tool. The same approach is applied to the PMOS circuit. Details of the EKV MOSFET model can be found in [40]-[46].
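The fitting step can be sketched as follows, under the assumption that the polynomial in (3.12) models the Early voltage $V_A = I_D / g_{ds}$ as a function of L within one inversion region. The thesis does not state which fitting routine is used, so a plain least-squares fit through the normal equations is assumed here. The helper names and the sample rows are hypothetical; the rows are synthetic and are not simulation data.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial fit; returns c with y ~ c[0] + c[1]*x + ..."""
    n = degree + 1
    # Normal equations A c = b with A[i][j] = sum x^(i+j), b[i] = sum y*x^i
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    for col in range(n):  # Gaussian elimination with partial pivoting
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= f * a[col][c]
            b[r] -= f * b[col]
    coeffs = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - s) / a[i][i]
    return coeffs

def polyval(coeffs, x):
    return sum(c * x ** i for i, c in enumerate(coeffs))

def fit_region(rows, degree=4):
    """rows: (g_ds, I_D, L) tuples from one inversion region."""
    lengths = [length for _, _, length in rows]
    early = [i_d / g for g, i_d, _ in rows]  # V_A = I_D / g_ds
    return polyfit(lengths, early, degree)

def gds_estimate(coeffs, i_d, length):
    return i_d / polyval(coeffs, length)  # form of (3.12)

# Synthetic rows: V_A = 2 + 5*L with L in micrometres, L = Lmin .. 8*Lmin.
L_MIN = 0.35
rows = [(1e-5 / (2.0 + 5.0 * n * L_MIN), 1e-5, n * L_MIN) for n in range(1, 9)]
coeffs = fit_region(rows)
max_err = max(abs(gds_estimate(coeffs, i_d, l) - g) / g for g, i_d, l in rows)
print(f"maximum relative fitting error on synthetic data: {max_err:.2e}")
```

Lengths are kept in micrometres so that the powers of L stay well scaled; with real simulation rows the same error measure would show where the fit degrades near region borders.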

# 3.3. Design Examples

The library contains PE design plans for basic analog subblocks such as a differential input stage, an output stage and a current mirror; for op-amp topologies such as Miller, Folded-Cascode and Telescopic; and for a comparator, a low-pass filter and an F-ADC, as mentioned before. In this section, the design-based PE models of some analog blocks are detailed together with the validation and verification of the designs.

#### 3.3.1. Differential Input Stage

First, a differential input subblock, shown in Figure 3.4, is considered as an example. EKV-based analytical equations are used to construct the design plan of the subblock. The design procedure takes some performance parameters as input and estimates the remaining performance parameters together with the design and circuit parameters. Assume that gain, bandwidth and slew rate are given as the input set. This input set determines some constraints; for example, the minimum current flowing at the output of the block is determined by the slew rate and the output capacitance. The input set and the variable design parameters are given in Table 3.1. The design space is explored with the following parameter setup: the inversion coefficients of $M_3$, $M_5$ and $M_7$ are swept from moderate to strong inversion, and the L values of $M_3$ and $M_5$ are varied from 0.35 $\mu$m to 2.4 $\mu$m.

Figure 3.4. Schematic of differential input stage.

Table 3.1. Input set and variable design parameters

| Input Set                  | Gain                        | BW    | SR                                     |
|----------------------------|-----------------------------|-------|----------------------------------------|
| Variable Design Parameters | L values of $M_3$ and $M_7$ | $C_L$ | $I_C$ values of $M_3$, $M_5$ and $M_7$ |

First, the minimum (output) current is determined from the given SR and output capacitance ($C_L$) values, as mentioned before. For instance, SR is given as 5 V/$\mu$s and $C_L$ is varied from 0.2 pF to 2 pF, so multiple minimum current values are obtained. Next, the minimum transconductance of $M_5$ is calculated using the selected minimum current value and the inversion coefficient of $M_5$. Again, multiple transconductance values are obtained, since $I_{C5}$ is also varied from weak to strong inversion. Each pair of transconductance ($g_m$) and current values determines the bandwidth of the subblock for the given gain. The bandwidth is calculated at the output node because the output pole (drain of $M_3$) is often dominant. The calculated bandwidth must be higher than the given bandwidth; if it is not, the minimum (output) current is slightly increased until the desired bandwidth is achieved. After that, multiple gate voltages of $M_3$ (output voltage) and $M_5$ are calculated using the different $I_C$ values of both transistors. The width values of the transistors are obtained from the currents, inversion coefficients and L values as given in (3.8). Subsequently, the output conductance ($g_{ds}$) values of $M_3$ and $M_5$ are calculated. Once the $g_m$ and $g_{ds}$ values are known, the gain of the differential pair can be obtained. This gain must be higher than or equal to the given input gain. In addition, all W values are checked against WMIN, the minimum width specified by the selected technology. The remaining performance parameters, such as power and area, can be calculated from the estimated circuit parameters. Consequently, the PDS of the subblock can be extracted for the given input set.

Assume that the gain is 40 V/V, the bandwidth is 10 kHz, the slew rate is 4 V/$\mu$s and the output load is 1 pF. After half an hour of computation, all circuit parameters and the remaining performance parameters are extracted and the PDS is formed. The PDS has 137984 candidates which meet the given specifications. This solution set can be extended by enlarging the search space of the inversion coefficients (from sub-threshold to very strong inversion) and/or by allowing L values higher than 2.4 $\mu$m. A designer can select a subset of the design space for different operating regions, and optimum solution sets can be extracted for the desired performance parameters.
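The way a PDS is built up by sweeping the design parameters and discarding infeasible candidates can be sketched as below. This is a deliberately reduced illustration for a single transistor, not the thesis design plan: it assumes that (3.8) has the usual inversion-coefficient form $W = L \cdot I_D / (I_C \cdot I_0)$, and the grids, the technology current `i0` and the minimum width `w_min` are hypothetical values.

```python
from itertools import product

def enumerate_pds(sr, c_loads, lengths, ics, i0, w_min):
    """Return feasible (C_L, L, I_C, I_D, W) tuples for one transistor."""
    pds = []
    for c_l, length, ic in product(c_loads, lengths, ics):
        i_d = sr * c_l                    # minimum current from the slew rate
        width = length * i_d / (ic * i0)  # assumed form of (3.8)
        if width >= w_min:                # geometric constraint on W
            pds.append((c_l, length, ic, i_d, width))
    return pds

candidates = enumerate_pds(
    sr=4e6,                               # 4 V/us expressed in V/s
    c_loads=[1e-12],                      # 1 pF load
    lengths=[0.35e-6, 0.7e-6, 1.4e-6],    # hypothetical L grid
    ics=[0.1, 1.0, 10.0, 100.0],          # weak to strong inversion
    i0=0.5e-6,                            # hypothetical technology current
    w_min=0.35e-6,                        # hypothetical minimum width
)
print(len(candidates), "of 12 grid points are feasible")
```

In the full design plan, further checks (gain, bandwidth, bias voltages) would be applied inside the same loop, which is why finer grids quickly produce very large candidate sets.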

#### 3.3.2. Miller op-amp

The lead-compensated Miller op-amp illustrated in Figure 3.5 is considered next. The block, which is divided into two subblocks (a differential input stage and an output stage), is also called the basic two-stage (BTS) op-amp.

The locations of the dominant and non-dominant poles are significant for the frequency calculations. The non-dominant pole and the dominant pole are located at the output node (drain of $M_1$) and at the drain of $M_3$, respectively. In addition, the overall gain of the op-amp consists of the gain of the input stage, $A_{V1}$, and the gain of the output stage, $A_{V2}$. The overall gain can be distributed between the subblocks as long as the total gain is kept constant. For instance, assume the overall gain is 2000 V/V. If $A_{V1}$ is selected as 40 V/V, $A_{V2}$ must be 50 V/V, or vice versa. By varying the gain values of the subblocks, different solutions satisfying the given input specifications can be found. In other words, the gain pairs are analyzed to see which of them give the optimum solution. The input space consists of $A_V$, BW, $C_L$ and SR. The design space exploration strongly depends on the inversion coefficients, the lengths of the transistors, the compensation capacitance and the drain current. Electrical constraints (e.g. stability control) and geometrical constraints (e.g. W values) are checked during the design process.

Figure 3.5. Schematic of Miller op-amp.

Moreover, it is crucial to know the position of the non-dominant pole for stability. Generally, the non-dominant pole ($f_{nd}$) is forced to be at least three times larger than the gain-bandwidth product (GBW). Also, the compensation capacitor ($C_C$) is chosen to be at least three times larger than the gate capacitance of $M_1$ ($C_{n1}$), and it has to be smaller than the output capacitance ($C_L$). The input set and design parameters of the op-amp are given in Table 3.2.

| Input Set                  | Gain                        | BW    | SR                              | $C_L$                 |
|----------------------------|-----------------------------|-------|---------------------------------|-----------------------|
| Variable Design Parameters | L values of $M_1$ and $M_5$ | $C_L$ | $I_C$ values of $M_1$ and $M_5$ | $A_{V1}$ and $A_{V2}$ |

Table 3.2. Input set and variable design parameters of the Miller op-amp

The design procedure starts from the output stage. First, the GBW is calculated from the given performance parameters, and the non-dominant pole is placed at three times the GBW. The slew rate conditions for the op-amp can be written as in (3.13), where the currents flowing through $M_2$ and $M_5$ are named $I_{DOUT}$ and $I_{DIFF}$, respectively.

$$SR_{ext} = \frac{I_{DOUT}}{C_L} \quad \text{and} \quad SR_{int} = \frac{I_{DIFF}}{C_C} \tag{3.13}$$
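Rearranging (3.13) gives the minimum currents that satisfy a required slew rate. The short sketch below evaluates them for a sweep of compensation capacitances; the function name and the grid of $C_C$ values are illustrative assumptions, and the slew rate and load are example values only.

```python
def slew_rate_currents(sr, c_load, c_comp_values):
    """Minimum currents implied by (3.13): I_DOUT = SR*C_L, I_DIFF = SR*C_C."""
    i_dout = sr * c_load
    i_diff = [sr * c_c for c_c in c_comp_values]
    return i_dout, i_diff

SR = 4e6                                      # 4 V/us expressed in V/s
C_L = 2e-12                                   # 2 pF load
c_c_sweep = [C_L / d for d in (10, 5, 3, 2)]  # hypothetical C_C grid
i_dout, i_diff = slew_rate_currents(SR, C_L, c_c_sweep)

print(f"I_DOUT >= {i_dout * 1e6:.2f} uA")
for c_c, i in zip(c_c_sweep, i_diff):
    print(f"C_C = {c_c * 1e12:.2f} pF  ->  I_DIFF >= {i * 1e6:.2f} uA")
```

These are lower bounds only; the stability and bandwidth checks that follow may push the currents well above them.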

The minimum output current is calculated for the given slew rate and output capacitance so that the external slew rate is satisfied. Then, multiple transconductance values of $M_1$ are obtained, since $I_{C1}$ is also varied from weak to strong inversion. Next, the non-dominant pole frequency is calculated using (3.14), assuming that $C_{n1}$ is three times smaller than $C_C$.

$$f_{nd} = \frac{g_{m1}}{2\pi C_L} \cdot \frac{1}{1 + \frac{C_{n1}}{C_C}} \tag{3.14}$$
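A minimal numerical sketch of (3.14) and of the stability rule $f_{nd} \geq 3 \cdot GBW$ is given below. It assumes that the GBW can be approximated as gain times bandwidth (single dominant pole) and uses $C_{n1}/C_C = 1/3$ as stated above; the function names and the example specification are illustrative.

```python
import math

def f_nondominant(gm1, c_load, cn1_over_cc=1.0 / 3.0):
    """Non-dominant pole frequency from (3.14)."""
    return gm1 / (2.0 * math.pi * c_load) / (1.0 + cn1_over_cc)

def min_gm1(gbw, c_load, margin=3.0, cn1_over_cc=1.0 / 3.0):
    """Smallest g_m1 that places f_nd at margin * GBW, from (3.14) inverted."""
    return margin * gbw * 2.0 * math.pi * c_load * (1.0 + cn1_over_cc)

GBW = 2000 * 10e3   # assumption: GBW = gain * bandwidth
gm1 = min_gm1(GBW, c_load=2e-12)
print(f"g_m1 >= {gm1 * 1e3:.3f} mS")
print(f"f_nd / GBW = {f_nondominant(gm1, 2e-12) / GBW:.2f}")
```

Any candidate whose $g_{m1}$ falls below this bound would need a higher output current, which is the adjustment described in the next paragraph.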

The calculated $g_{m1}$ value must be checked in order to verify the position of the non-dominant pole; if it is not satisfactory, the current is increased above the calculated value until the stability of the op-amp is ensured. After that, the input (gate-source) voltages of $M_1$ are calculated using the inversion coefficient values. The width values of $M_1$ are obtained from the currents, inversion coefficients and L values of the transistor as given in (3.8). The output voltage is expected to be at the midpoint of the power supplies for maximum output swing. As a result, the width values of $M_2$ are obtained from the ratio given in (3.15).

$$W_2 = W_1 \cdot \frac{KP_N}{KP_P} \tag{3.15}$$

The gate-source voltage of $M_2$ is obtained since $I_{DOUT}$, W and L are known, which also gives the inversion coefficient of $M_2$. The last design step of the output stage is to calculate the output conductance of both output transistors. The internal slew rate condition is then used to calculate the current of the differential pair. Multiple minimum current values are obtained for the input stage since the compensation capacitance takes multiple values; for instance, $C_C$ is varied from $C_L/10$ to $C_L/2$. The transconductance values of $M_5$ are determined from the calculated current. Then, the dominant pole is considered. The 3 dB frequency is calculated using (3.16) and checked against the specification. If the calculated frequency is lower than the given 3 dB frequency, the current of the input stage is increased until the expected frequency is obtained.

$$|A_V| = 1 \Rightarrow GBW = \frac{g_{m1}}{2\pi C_C} \tag{3.16}$$
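The bandwidth check based on (3.16) can be sketched as follows. The sketch assumes that the 3 dB frequency equals GBW divided by the overall gain, and it scales the transconductance directly as a stand-in for raising the input-stage current, because the $g_m(I_D)$ relation of the EKV model is not reproduced here. The function names, step size and starting value are hypothetical.

```python
import math

def gbw_from_gm(gm, c_comp):
    """Gain-bandwidth product from (3.16)."""
    return gm / (2.0 * math.pi * c_comp)

def f3db(gm, c_comp, gain):
    """Dominant-pole frequency, assuming f3dB = GBW / A_V."""
    return gbw_from_gm(gm, c_comp) / gain

def raise_gm_until_bw(gm, c_comp, gain, bw_target, step=1.05):
    """Scale g_m up in small steps until the 3 dB frequency meets the target."""
    while f3db(gm, c_comp, gain) < bw_target:
        gm *= step
    return gm

# Example: 2000 V/V gain, 10 kHz bandwidth, C_C = 0.3 pF, starting at 10 uS.
gm = raise_gm_until_bw(gm=1e-5, c_comp=0.3e-12, gain=2000, bw_target=10e3)
print(f"g_m = {gm * 1e6:.1f} uS, f3dB = {f3db(gm, 0.3e-12, 2000) / 1e3:.2f} kHz")
```

A small multiplicative step keeps the final value close to the minimum that meets the specification, which matters because every increase costs power.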

The gate-source voltage and width values of $M_5$ are calculated using the new current value of the input stage together with the inversion coefficient of $M_5$. The output voltage of the input stage is equal to the input voltage of the output stage; in other words, the drain-source voltage of $M_3$ is equal to the gate-source voltage of $M_1$. Since the input voltages of $M_3$ are known, the inversion coefficient and width values of the transistor can be calculated. The current of $M_7$ is double the current of $M_5$, and the gate-source voltage of $M_7$ is equal to the gate-source voltage of $M_2$. Since the current, $I_C$ and L values of $M_7$ are known, its W values can be obtained. Finally, the output conductance values of $M_5$ and $M_3$ are calculated, which gives the gain values of the input stage. All transistor widths are checked against WMIN, the minimum width defined by the selected technology, and the calculated gain values are compared with the given value. Infeasible candidates are eliminated. The flowchart of the modeling procedure of the Miller op-amp is illustrated in Figure 3.6. All feasible solutions are collected in a lookup table, and power and area are obtained from the current and transistor width values. At this point, all candidates that satisfy the input space requirements have been obtained; however, the optimum solution set, or the Pareto curves that are most useful to the end user, still needs to be constructed, which is the subject of the following chapter.

Figure 3.6. Flowchart of Miller op-amp design plan.

In order to validate the design, the estimation results are simulated in H-Spice: the estimated values of W, L, biasing voltages, capacitances and resistors are written into a net-list file (the simulator input), and the design and performance parameters are then read from the simulator output file for comparison. Assume the input space of the op-amp is as given in Table 3.3. The PDS of the op-amp was obtained in

Table 3.3. Input space of the op-amp

| Input Set      | Gain                        | BW                  | SR                              | $C_L$                                                                |
|----------------|-----------------------------|---------------------|---------------------------------|----------------------------------------------------------------------|
| Value          | 2000 V/V                    | 10 kHz              | 4 V/$\mu$s                      | 2 pF                                                                 |
| Key Parameters | L values of $M_1$ and $M_5$ | $C_C$               | $I_C$ values of $M_1$ and $M_5$ | $A_{V1}$ and $A_{V2}$                                                |
| Range          | 0.35 $\mu$m to 2.8 $\mu$m   | $C_L/10$ to $C_L/3$ | 0.1 to 1000                     | $A_{V1} = r \cdot A_V$, $A_{V2} = A_V / A_{V1}$, $r = 0.02$ to $0.1$ |

40 minutes on a Pentium 2.5 GHz machine for the given input space. Three arbitrary candidates were chosen from the PDS, each from a different region of operation, to verify the accuracy and continuity of the design plan. The generated solutions were examined by H-Spice simulations and the results were compared with the estimated values. Circuit parameters of the selected candidates are given in Table 3.4 and their comparison is shown in Table 3.5. It was observed that gain errors remained below 3% and bandwidth errors below 10%. In terms of

| Parameters         | $C_C$ (pF) | $R_C$ (k$\Omega$) | L values                                         | $W_1$ ($\mu$m) | $W_2$ ($\mu$m) | $W_3$ ($\mu$m) | $W_5$ ($\mu$m) | $W_7$ ($\mu$m) | $V_{BIAS}$ (V) |
|--------------------|------------|-------------------|--------------------------------------------------|----------------|----------------|----------------|----------------|----------------|----------------|
| $1^{st}$ candidate | 0.3        | 3.5               | $L = L_{min}$                                    | 16.45          | 59.5           | 1.05           | 9.45           | 5.95           | 2.5            |
| $2^{nd}$ candidate | 0.5        | 2.7               | $L_{1,2,3,7} = L_{min}$, $L_5 = 6 \cdot L_{min}$ | 13.30          | 47.95          | 2.45           | 8.50           | 18.55          | 2.48           |
| $3^{rd}$ candidate | 0.9        | 1.7               | $L_{1,2,3,7} = L_{min}$, $L_5 = 8 \cdot L_{min}$ | 13.30          | 47.95          | 5.60           | 15.05          | 41.30          | 2.48           |

Table 3.4. Estimated circuit parameters for H-Spice

Table 3.5. Comparison of estimation and simulation results

| $1^{st}$ candidate | $I_{dout}$ | $I_{dif}$ | $g_{out1}$         | $g_{out2}$         | $V_{DS}$ | f3db  | Gain |
|--------------------|------------|-----------|--------------------|--------------------|----------|-------|------|
|                    | $(\mu A)$  | $(\mu A)$ | $(\mu\Omega)^{-1}$ | $(\mu\Omega)^{-1}$ | (V)      | (kHz) | (dB) |
| Estimation         | 73.7       | 3.2       | 15.5               | 0.9                | 1.65     | 10    | 71   |
| Simulation         | 67         | 1.2       | 18.8               | 0.94               | 2.1      | 11    | 69   |
| $2^{nd}$ candidate | $I_{dout}$ | $I_{dif}$ | $g_{out1}$         | $g_{out2}$         | $V_{DS}$ | f3db  | Gain |
|                    | $(\mu A)$  | $(\mu A)$ | $(\mu\Omega)^{-1}$ | $(\mu\Omega)^{-1}$ | (V)      | (kHz) | (dB) |
| Estimation         | 81         | 14        | 17                 | 1.53               | 1.65     | 10    | 68.5 |
| Simulation         | 77         | 12        | 17.3               | 1.3                | 1.76     | 9     | 67   |
| $3^{rd}$ candidate | $I_{dout}$ | $I_{dif}$ | $g_{out1}$         | $g_{out2}$         | $V_{DS}$ | f3db  | Gain |
|                    | $(\mu A)$  | $(\mu A)$ | $(\mu\Omega)^{-1}$ | $(\mu\Omega)^{-1}$ | (V)      | (kHz) | (dB) |
| Estimation         | 81         | 31        | 17                 | 3.2                | 1.65     | 10    | 66.7 |
| Simulation         | 72         | 27        | 17.4               | 2.7                | 2        | 10    | 65   |

DC parameters, the simulation results show that the output node voltage, which was desired to be 1.65 V, varies within $\pm 350$ mV of this value. In addition, current errors remained below 10%.

A different design case, whose parameters were taken from a reference design in [47], was then considered. In this example, the PDS of a high speed Miller op-amp was explored. The aim was to find design parameters that were approximately equal to or better than the values of the reference design. The input set and the expected values are shown in Table 3.6. The PDS was obtained in 30 minutes on a Pentium 2.5 GHz

| Input Set         | Gain               | BW                 | $S\!R$                 | $C_L$            |
|-------------------|--------------------|--------------------|------------------------|------------------|
|                   | 4000  V/V          | 100 kHz            | $220~{ m V}/\mu{ m s}$ | $5 \mathrm{pF}$  |
| Design Parameters | $I_{DIF}$          | $I_{DOUT}$         | $C_C$                  | $W_1$            |
| Expected Values   | $0.63 \mathrm{mA}$ | $2.3 \mathrm{~mA}$ | $2.5 \ \mathrm{pF}$    | $417~\mu{\rm m}$ |

Table 3.6. Input set and solution of the reference design for the high speed op-amp

machine. The expected current values were searched for in the design space, and approximate ones with smaller compensation capacitance and W values were selected. The selected solution was examined by H-Spice simulation and verified by comparing the simulation results with the selected values. The selected parameters and their comparison are given in Table 3.7. The results in Table 3.7 show accuracy similar to that of the previous

Table 3.7. Selected solution of the high speed op-amp and its verification

| Circuit Parameters | $C_C$ (pF) | $R_C$ ($\Omega$) | L ($\mu$m)                                | $W_1$ ($\mu$m) | $W_2$ ($\mu$m) | $W_3$ ($\mu$m) | $W_5$ ($\mu$m) | $W_7$ ($\mu$m) | $V_{BIAS}$ (V) |
|--------------------|------------|------------------|-------------------------------------------|----------------|----------------|----------------|----------------|----------------|----------------|
| Selected Solution  | 2.3        | 50               | $L_{1,2,3,7} = L_{min}$, $L_5 = 2L_{min}$ | 368            | 1365           | 85             | 512            | 675            | 2.48           |

| Design Parameters | $I_{DOUT}$ (mA) | $I_{DIF}$ (mA) | $g_{out1}$ $(\mu\Omega)^{-1}$ | $g_{out2}$ $(\mu\Omega)^{-1}$ | $V_{DS}$ (V) | f3db (kHz) | Gain (dB) |
|-------------------|-----------------|----------------|-------------------------------|-------------------------------|--------------|------------|-----------|
| Estimation        | 2.25            | 0.53           | 423                           | 64.1                          | 1.65         | 100        | 73.5      |

example. Another design example, which is significant for validating the design-based model in the weak inversion region, was again taken from a reference design in [47]. The input parameters were given to the design procedure and the PDS of the low speed op-amp was obtained. The input set and the expected values are shown in Table 3.8. A better solution was selected from the PDS and simulated with H-Spice. The estimated parameters and comparison results are given in Table 3.9. The results show that the design strategy can be used in our library for PE in ADA systems. The execution time and accuracy of the design plan are suitable for this purpose.

| Input Set         | Gain                | BW                  | SR                      | $C_L$           |
|-------------------|---------------------|---------------------|-------------------------|-----------------|
|                   | $2000 \mathrm{V/V}$ | $500 \mathrm{~kHz}$ | $0.1  { m V}/\mu{ m s}$ | $5 \mathrm{pF}$ |
| Design Parameters | $I_{DIF}$           | $I_{DOUT}$          | $C_C$                   | $W_1$           |
| Expected Values   | $1.6~\mu\mathrm{A}$ | $2.7~\mu\mathrm{A}$ | $2.5 \ \mathrm{pF}$     | $5~\mu{ m m}$   |

Table 3.8. Input set and solution of the reference design for low speed op-amp

Table 3.9. Selected solution of the low speed op-amp and its verification

| Circuit Parameters | $C_C$ (pF) | $R_C$ ($\Omega$) | L ($\mu$m)                                | $W_1$ ($\mu$m) | $W_2$ ($\mu$m) | $W_3$ ($\mu$m) | $W_5$ ($\mu$m) | $W_7$ ($\mu$m) | $V_{BIAS}$ (V) |
|--------------------|------------|------------------|-------------------------------------------|----------------|----------------|----------------|----------------|----------------|----------------|
| Selected Solution  | 2.5        | 13.5             | $L_{1,2,3,7} = L_{min}$, $L_5 = 2L_{min}$ | 3.85           | 13.65          | 2.1            | 10.85          | 13.65          | 2.58           |

| Design Parameters | $I_{DOUT}$ ($\mu$A) | $I_{DIF}$ ($\mu$A) | $g_{out1}$ $(\mu\Omega)^{-1}$ | $g_{out2}$ $(\mu\Omega)^{-1}$ | $V_{DS}$ (V) | f3db (kHz) | Gain (dB) |
|-------------------|---------------------|--------------------|-------------------------------|-------------------------------|--------------|------------|-----------|
| Estimation        | 3.6                 | 1.6                | 1.6                           | 0.55                          | 1.65         | 100        | 66        |
| Simulation        | 3.7                 | 1.6                | 1.3                           | 0.41                          | 1.84         | 80         | 70        |

## 3.3.3. Telescopic (Cascode) op-amp

Another common structure is the Telescopic (Cascode) op-amp, which is given in Figure 3.7. The inversion coefficients of transistors $M_5$-$M_7$ and the L values of the PMOS and NMOS transistors are taken as the key parameters for the design procedure. The input performance parameters are gain, bandwidth, output capacitance and slew rate. Constraints are determined by the gate-source voltage, drain-source voltage, W and L values. In addition, the output estimations must fulfill the given gain and bandwidth requirements. First, the minimum current is calculated from the given SR and the output capacitance. Then, the minimum transconductance of $M_7$ is calculated. The 3 dB frequency (f3dB) is calculated using the minimum $g_m$ value and compared with the given BW. If the given BW is not satisfied, the minimum current is increased until the desired BW is reached; the corresponding current value is then used for the rest of the design. Next, multiple values of $W_7$, $W_9$ and the bias voltages of the current source and differential pair transistors are calculated with respect to the inversion coefficients. The output swing is not as good as that of the Miller op-amp because of the cascode structure; therefore, the output voltage is set to around 1.2 V. The $V_{DS}$ values of the cascode current mirrors are set to be equal. Then, the W and bias voltage values of the NMOS transistors are calculated. After that, the output conductance values are calculated. Finally, the output resistance of the op-amp can be obtained. The calculated gain must be higher than or equal to the given input gain. All gate-source voltages are checked to ensure that they are higher than the threshold voltages, and all W values are checked to ensure that they are not smaller than the minimum W value given by the technology. The flowchart of the design procedure is illustrated in Figure 3.8.

Figure 3.7. Schematic of Cascode op-amp.

Assume that the gain is 2000 V/V, the bandwidth is 10 kHz, the slew rate is 4 V/$\mu$s and the output capacitance is 2 pF. These input parameters were given to the model and the PDS of the op-amp was obtained in 45 minutes on a Pentium 2.5 GHz machine. Three candidates were chosen from different regions of the PDS to verify the accuracy of the model. The generated solutions were examined by H-Spice simulations and the results were compared with the estimated values. Circuit parameters of the selected candidates are given in Table 3.10 and their comparison is shown in Table 3.11.

It was observed that the bandwidth values of the estimated designs deviated from the desired values. However, it should be noted that they always stay on the safe side. In other words,



Figure 3.8. Flowchart of the Telescopic op-amp.

Table 3.10. Input space and estimated circuit parameters for H-Spice

| Parameters         | L values                             | $W_1$ ($\mu$m) | $W_3$ ($\mu$m) | $W_5$ ($\mu$m) | $W_7$ ($\mu$m) | $W_9$ ($\mu$m) | $V_{BIAS1}$ (V) | $V_{BIAS2}$ (V) |
|--------------------|--------------------------------------|----------------|----------------|----------------|----------------|----------------|-----------------|-----------------|
| $1^{st}$ candidate | $L_n = 2L_{min}$, $L_p = 2L_{min}$   | 1.05           | 4.9            | 121.4          | 121.4          | 249.5          | 2.57            | 1.47            |
| $2^{nd}$ candidate | $L_n = 4L_{min}$, $L_p = 2L_{min}$   | 3.85           | 22             | 3.85           | 23.8           | 47.25          | 2.42            | 0.97            |
| $3^{rd}$ candidate | $L_n = 7L_{min}$, $L_p = 2L_{min}$   | 4.2            | 23.1           | 2.45           | 85             | 4.55           | 2.47            | 0.9             |

an acceptable gain-bandwidth product is obtained. In terms of DC parameters, current errors remained below 4%.
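The two observations above can be checked directly against the entries of Table 3.11. The sketch below uses the estimated and simulated values transcribed from that table; the error is taken relative to the simulated value, and the gain-bandwidth product is approximated as linear gain times f3dB, both of which are assumptions of this illustration.

```python
# (estimated, simulated) pairs transcribed from Table 3.11
candidates = [
    {"i_d_uA": (15.0, 15.5), "f3db_kHz": (10.0, 12.0), "gain_dB": (84.3, 84.5)},
    {"i_d_uA": (20.0, 20.6), "f3db_kHz": (10.0, 7.0), "gain_dB": (69.2, 73.5)},
    {"i_d_uA": (21.0, 20.5), "f3db_kHz": (10.0, 7.0), "gain_dB": (66.8, 72.3)},
]

def rel_error(est, sim):
    """Relative error of the estimate with respect to the simulation."""
    return abs(est - sim) / sim

def gbw_hz(gain_db, f3db_khz):
    """Gain-bandwidth product: linear gain times 3 dB frequency."""
    return 10 ** (gain_db / 20.0) * f3db_khz * 1e3

current_errors = [rel_error(*c["i_d_uA"]) for c in candidates]
gbw_pairs = [
    (gbw_hz(c["gain_dB"][0], c["f3db_kHz"][0]),
     gbw_hz(c["gain_dB"][1], c["f3db_kHz"][1]))
    for c in candidates
]

for n, (err, (est, sim)) in enumerate(zip(current_errors, gbw_pairs), start=1):
    print(f"candidate {n}: current error {err:.1%}, "
          f"GBW est {est / 1e6:.1f} MHz, sim {sim / 1e6:.1f} MHz")
```

For all three candidates the simulated gain-bandwidth product is at least as large as the estimated one, even where the simulated 3 dB frequency falls below 10 kHz.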

## 3.3.4. Folded (Cascode) op-amp

The schematic of the op-amp is given in Figure 3.9. The inversion coefficients of the transistors $M_3$-$M_7$-$M_9$-$M_{11}$ and the L values of the PMOS and NMOS transistors are taken as key parameters for the design procedure. Constraints are again determined by the gate-source voltage, drain-source voltage, W and L values. In addition, output estimations

| $1^{st}$ candidate | $I_D$     | $g_{out1}$         | $g_{out3}$         | $g_{out5}$         | f3dB  | Gain |
|--------------------|-----------|--------------------|--------------------|--------------------|-------|------|
|                    | $(\mu A)$ | $(\mu\Omega)^{-1}$ | $(\mu\Omega)^{-1}$ | $(\mu\Omega)^{-1}$ | (kHz) | (dB) |
| Estimation         | 15        | 0.32               | 0.64               | 2.34               | 10    | 84.3 |
| Simulation         | 15.5      | 0.3                | 0.61               | 2.37               | 12    | 84.5 |
| $2^{nd}$ candidate | $I_D$     | $g_{out1}$         | $g_{out3}$         | $g_{out5}$         | f3dB  | Gain |
|                    | $(\mu A)$ | $(\mu\Omega)^{-1}$ | $(\mu\Omega)^{-1}$ | $(\mu\Omega)^{-1}$ | (kHz) | (dB) |
| Estimation         | 20        | 0.11               | 0.23               | 3.99               | 10    | 69.2 |
| Simulation         | 20.6      | 0.12               | 0.25               | 2.2                | 7     | 73.5 |
| $3^{rd}$ candidate | $I_D$     | $g_{out1}$         | $g_{out3}$         | $g_{out5}$         | f3dB  | Gain |
|                    | $(\mu A)$ | $(\mu\Omega)^{-1}$ | $(\mu\Omega)^{-1}$ | $(\mu\Omega)^{-1}$ | (kHz) | (dB) |
| Estimation         | 21        | 0.12               | 0.24               | 2.66               | 10    | 66.8 |
| Simulation         | 20.5      | 0.12               | 0.25               | 1.77               | 7     | 72.3 |

Table 3.11. Comparison of estimation and simulation results of Cascode op-amp

must fulfill the given gain and bandwidth values. The non-dominant pole is evaluated during the design procedure for stability control. First, the minimum current of $M_2$ is calculated using the given slew rate and output capacitance. Since the GBW is known, the minimum transconductance of $M_2$ is calculated, because the dominant pole is located at the differential input pair. Then, the current values are varied from the minimum value to a maximum value which can be determined by the designer. Next, the gate-source voltages of $M_2$ are calculated using the $I_C$ values, and then the W values of the transistor are obtained. The input voltage ($V_{IN}$) is connected to ground. This kind of ground rail is widely used in applications where only a single supply line is required.

Figure 3.9. Schematic of Folded Cascode op-amp.

Moreover, a p-type and an n-type folded cascode can be connected in parallel in order to achieve a full rail-to-rail range. Although it is not a necessity, all currents in the input and cascode devices are set to the same value in order to avoid artifacts. The gate-source voltage of $M_9$ is set to a specific value, 2.4 V. As a result, the inversion coefficient of the transistor is obtained and used to calculate the W values of $M_9$. The current of $M_2$ is set equal to the cascode block current, as mentioned before. Since the current is mirrored, the DC current has no other path to flow through; therefore, the current sources carry twice this current. Next, the DC analysis of the current sources is completed for different $I_C$ values. After that, the non-dominant pole is calculated using the $M_3$ parameters and placed at a suitable frequency for stability. The drain voltages of the current sources and the output voltage are limited by the minimum and maximum output voltage values.

Furthermore, the DC analysis of the cascode block is completed. Subsequently, the output conductance values of all transistors are calculated. Since the $g_m$ and $g_{ds}$ values are known, the gain values of the op-amp can be found. These gain values must be higher than or equal to the given input gain. In addition, all gate-source voltages are checked to ensure that they are higher than the threshold voltages, and the W values of all transistors are checked to ensure that they are not smaller than the minimum W value given by the technology. The same input parameters are used as for the other op-amp topologies, and the estimated results show that the model can be used in our library for PE in ADA systems. The execution time and accuracy of the design plan are suitable for this purpose. The flowchart of the design procedure is given in Figure 3.10.



Figure 3.10. Flowchart of Folded cascode op-amp.

## 3.3.5. Latched Comparator

Comparators are among the most important building blocks in an ADC system. They compare a signal with a reference signal and produce an appropriate output, defined as follows:

$$V_{O} = \begin{cases} V_{OH} & \text{if } V_{in+} - V_{in-} > 0 \\ V_{OL} & \text{if } V_{in+} - V_{in-} < 0 \end{cases}$$
(3.17)

The comparator has two inverters connected back to back with each other forming a latch, and an NMOS transistor is placed between two differential outputs. The schematic of the comparator is given in Figure 3.11.

The comparator has two main operating phases: the reset phase and the regenerative phase. In the first phase (reset phase), the clock is set high so that the switch transistor closes. As a result, the differential outputs of the comparator are shorted together and are set to a DC voltage level around the midpoint of the supply. In the second phase, the switch transistors become open circuits, which means that the input is disconnected. The charge imbalance is amplified to a digital voltage level on the differential nodes of the comparator. The cross-coupled inverter pair structure (transistors $M_1$-$M_4$) forms the regenerative loop for the latch operation. The transistor sizing is adjusted in order to set the output nodes to the midpoint of the supply voltage. The sizing of $M_9$ is also important. Increasing its width brings the voltages of both nodes closer to each other; however, the charge injection also increases. As a result, the sensitivity of the latch decreases.

Figure 3.11. Schematic of latched comparator.

The flowchart of the design procedure is illustrated in Figure 3.12. First of all, the inverter pairs are taken into consideration. While their sizes are varied, the output nodes are set to the midpoint of the supply. Then, all DC voltages and currents are calculated, including the small-signal parameters. Biasing conditions are enforced in order to eliminate unwanted results. Then, the kickback elimination part of the comparator is designed. An n-type MOSFET is used instead of an ideal current source. The bias voltage and the tail current are calculated. After that, the input transistors are sized with the appropriate reference voltage, and then the switch transistors are taken into account. Next, the gain values and bandwidth are compared to the given specifications, including slew rate and input resolution. Finally, the solution set is obtained, in which each candidate satisfies the given specifications.

The design plan is implemented in C++. Generally, two performance parameters, gain and bandwidth, are applied as the input set. As a result, a feasible design space is obtained, including performance, design and circuit parameters for the given input set. In order to verify the accuracy of the design procedure, arbitrary candidates are chosen from different regions of the PDS and simulated in the H-spice simulator. 17904 candidates were found out of 5.76 million trials in 45 minutes on a 2.5 GHz Pentium machine. Two candidates were chosen from different regions of the PDS to verify the accuracy of the design plan. The generated solutions were examined by H-spice simulations, and the simulated values were compared with the estimated values. Circuit parameters of the selected candidates and their comparison results are given in Table 3.12, together with the index numbers of the selected candidates. It was observed that the maximum error was obtained in the output transconductance values. On the other hand, input resolution and clock frequencies were estimated within an acceptable error range.

Figure 3.12. Flowchart of the comparator.

Table 3.12. Circuit parameters of the selected candidates and their comparison results for the latched comparator

| Design parameters | $W_1/L_1$ ($\mu$m/$\mu$m) | $W_3/L_3$ ($\mu$m/$\mu$m) | $W_5/L_5$ ($\mu$m/$\mu$m) | $W_7/L_7$ ($\mu$m/$\mu$m) | $W_9/L_9$ ($\mu$m/$\mu$m) | $W_{10}/L_{10}$ ($\mu$m/$\mu$m) | $V_{bias}$ (V) |
|---|---|---|---|---|---|---|---|
| # 752221 | 3.5/0.35 | 9.45/0.35 | 3.85/1.4 | 2.45/0.7 | 2.45/0.35 | 2.8/0.35 | 1 |
| Circuit parameters | $I_{d1}$ ($\mu$A) | $g_{ds1}$, $g_{ds3}$ ($\mu$S) | $g_{ds5}$, $g_{ds7}$ ($\mu$S) | $g_{m1}$, $g_{m3}$ ($\mu$S) | $g_{m5}$, $g_{m7}$ ($\mu$S) | $V_{S,M5}$ (V) | $I_{bias}$ ($\mu$A) |
| Estimation | 530 | 8, 63.9 | 0.41, 60.6 | 883, 1210 | 148, 24 | 0.5 | 80 |
| Simulation | 482 | 7.2, 50 | 0.4, 80 | 700, 900 | 169, 31 | 0.52 | 88 |
| Performance parameters | Area $(\mu m)^2$ | Power (mW) | Input resolution (mV) | Clock freq. (MHz) | Latch time constant (ns) | Gain of kickback | Gain of latch |
| | 32.4 | 3.7 | 1 | 58 | 87 | 30 | 0.12 |
| Design parameters | $W_1/L_1$ ($\mu$m/$\mu$m) | $W_3/L_3$ ($\mu$m/$\mu$m) | $W_5/L_5$ ($\mu$m/$\mu$m) | $W_7/L_7$ ($\mu$m/$\mu$m) | $W_9/L_9$ ($\mu$m/$\mu$m) | $W_{10}/L_{10}$ ($\mu$m/$\mu$m) | $V_{bias}$ (V) |
| # 5747436 | 5.95/1.05 | 17.5/1.05 | 4.55/1.4 | 4.55/0.35 | 3.15/0.35 | 2.45/0.35 | 1.1 |
| Circuit parameters | $I_{d1}$ ($\mu$A) | $g_{ds1}$, $g_{ds3}$ ($\mu$S) | $g_{ds5}$, $g_{ds7}$ ($\mu$S) | $g_{m1}$, $g_{m3}$ ($\mu$S) | $g_{m5}$, $g_{m7}$ ($\mu$S) | $V_{S,M5}$ (V) | $I_{bias}$ ($\mu$A) |
| Estimation | 353 | 1.65, 13.8 | 0.52 | 589, 807 | 179, 32 | 0.58 | 110 |
| Simulation | 375 | 2, 12.5 | 0.51 | 578, 670 | 200, 50 | 0.51 | 107 |
| Performance parameters | Area $(\mu m)^2$ | Power (mW) | Input resolution (mV) | Clock freq. (MHz) | Latch time constant (ns) | Gain of kickback | Gain of latch |
| | 88.4 | 2.6 | 1 | 50 | 40 | 90 | 0.13 |

PE design plans of all analog blocks and subblocks were implemented into the library, which is one of the most important parts of the PE tool. Therefore, the following example considers an interface for the library.

### 3.3.6. An interface for the Library

The tool presented in this example enables one to search the design space of analog blocks such as op-amp configurations and main sub-blocks. An automatic engine built in Matlab estimates the design space boundaries and calculates the Pareto-optimal curves for the end users. In other words, the tool collects EKV-based design plans of different op-amp configurations (such as the Miller op-amp, telescopic op-amp and folded cascode op-amp) and basic analog sub-blocks (such as current mirror types, differential stages and output stages) and then embeds them into a Matlab-based program (running C in the background), which illustrates 2D/3D Pareto curves of the selected analog block. One can select any analog block from the library attached to the graphical user interface (GUI), which is shown in Figure 3.13, and set the performance parameters as required by the application.

Before running the program, it is possible to apply a specific search mode, such as limiting key transistors to the sub-threshold region only (weak inversion search). By default, the design space is searched through all regions from sub-threshold to strong inversion (full search). Furthermore, there is a technology popup menu on the GUI. EKV model parameters are extracted from AMS 0.35 $\mu$m model parameters, as mentioned before. When the user runs the program, data points of the selected topology are gathered in the background using the pre-defined design-based PE models. Then, any tradeoff analysis between circuit and performance parameters can be illustrated graphically, together with the number of solutions given in the output information. For example, a designer may examine the relation between area-power and gain in the Miller op-amp, or try to find the minimum power or area values for a given gain value, by pressing the *showgraph* button on the interface [15].

This GUI can be improved for different applications. For example, it can be used as a standalone topology selector or can be integrated into an automation system. Assume that an automatic design system has to consider the performance response of an op-amp for given specifications. The optimum performance response can be searched among different op-amp types and the best one can be selected for the given purpose, as described in [32].



Figure 3.13. Graphical user interface for design space exploration.

# 4. PARETO FRONT EXTRACTION PART

The PE tool supports circuit-level synthesis by giving hints about optimization and by eliminating infeasible solutions. Since extracting the PDS of analog blocks covers the elimination part, an approach is required for obtaining the optimum solution sets which can be derived from that PDS. The PF approach is preferred because a PF can handle all interacting parameters and constraints of an analog design. In this chapter, the extraction of the PF is explained, starting from 2-dimensional space.

#### 4.1. Pareto Front Extraction for 2-D

The tradeoff boundaries of the performance design space are limited by the PF, because one performance cannot be improved without limit at the cost of the others. In Figure 4.1, a feasible design space and its Pareto Front are shown in two-dimensional space; $p_1$ and $p_2$ are performance parameters such as area, power, gain and bandwidth. Assume that the performance parameters in Figure 4.1 are area and power. In other words, a designer wants low power with small area. $p^{*1}$ and $p^{*2}$ are the individual minima; that is, the minimum power and area values that can be found in the PDS.

No optimization runs such as genetic algorithms, goal attainment or sequential quadratic programming are used. Instead, the inequality operator and the *dominance rule* obtained from the definition of the PF are used. Pareto points are extracted from the performance design space which has already been explored in the library. The details can be found in the previous section.

In our work, the PF set is formulated as follows. Let $\mathcal{D}$ be the feasible design space, a set of $k$ points in $N$-dimensional space, where each component $x_i$ of a point $X$ is a performance parameter.

$$\mathcal{D} = \{X_1, X_2, X_3, \dots, X_k\}, \quad X_i \in \mathbb{R}^N \text{ for } i = 1, 2, \dots, k.$$
(4.1)

Let $\mathcal{P}(\mathcal{D})$ be the set of Pareto points. The dominance rule is defined by an inequality operator applied to the components $x_i$ of $X$ and $y_i$ of $Y$:

$$Y \neq X \text{ is dominated by } X \Leftrightarrow x_i \leq y_i \ \forall i \quad (\text{dominance rule})$$

$$X \in \mathcal{P}(\mathcal{D}) \Leftrightarrow \nexists\, Y \in \mathcal{D},\ Y \neq X, \text{ such that } Y \text{ dominates } X$$
(4.2)

For example, $(x_3, y_3)$ is dominated by $(x_1, y_1)$ because $x_1 < x_3$ and $y_1 < y_3$. Neither $(x_1, y_1)$ nor $(x_2, y_2)$ dominates the other, since $x_1 > x_2$ and $y_1 < y_2$. Thus, if the design space consists only of $\mathcal{D} = \{(x_1, y_1), (x_2, y_2), (x_3, y_3)\}$, then the Pareto set is $\mathcal{P}(\mathcal{D}) = \{(x_1, y_1), (x_2, y_2)\}$.

Figure 4.1. Pareto Front & Design Space for N=2

The dominance rule is embedded with a simple algorithm into Matlab code and applied to a performance design set. This can be expressed in two lines of Matlab-style operations:

$$\mathcal{D} = \text{load}(\text{'Performance Design Space Set of a block'})$$

$$\mathcal{P}(\mathcal{D}) = \text{dominance operator}(\mathcal{D})$$
(4.3)
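The two operations in (4.3) can be sketched in a few lines. The following Python example is only an illustration and not the Matlab code used in this work: the function names `dominates` and `pareto_front` and the three sample points are assumptions, chosen so that the points reproduce the relations of the example above ($x_1 > x_2$, $y_1 < y_2$, and $(x_3, y_3)$ dominated by $(x_1, y_1)$).

```python
def dominates(x, y):
    """Return True if x dominates y when every component is minimized.

    Follows the dominance rule: x != y and x_i <= y_i for all i.
    """
    return x != y and all(a <= b for a, b in zip(x, y))


def pareto_front(points):
    """Brute-force filter: keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical design space with three points (x1, y1), (x2, y2), (x3, y3).
design_space = [(2.0, 1.0), (1.0, 3.0), (3.0, 2.0)]

front = pareto_front(design_space)
print(front)  # [(2.0, 1.0), (1.0, 3.0)]
```

Every point is compared against every other point, so the cost of this sketch grows quadratically with the number of points in the design space.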

The PF set includes not only the performance parameters of the block but also the circuit parameters such as W, L, biasing and external circuit elements (R, C). In other words, the designer can gain insight into the capability of the system and obtain an approximate design of each block which may affect the overall performance of the system. Two- and three-dimensional PFs can be illustrated graphically, which can be more useful than searching lookup tables for tracing the tradeoff limits. In Figure 4.1, it is assumed that the performance parameters are area and power. In other words, both dimensions are to be minimized. Therefore, (4.2) is valid only when a designer wants both parameters to be small. In Figure 4.2, other possibilities are illustrated, such as the gain-power pair, where maximum gain is desired with small power. For example, in Figure 4.2a, $(x_3, y_3)$ is not dominated by $(x_1, y_1)$ because $x_3 > x_1$ and $y_3 > y_1$. However, since $x_3 < x_2$ and $y_3 > y_2$, $(x_3, y_3)$ is dominated by $(x_2, y_2)$, so $(x_3, y_3)$ is eliminated from the PDS.

Figure 4.2. PFs for performance goals: (a) maximize X, minimize Y; (b) minimize X, maximize Y; (c) maximize X, maximize Y
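One possible way to handle the mixed goals of Figure 4.2 is to negate every maximized parameter, so that the minimization form of the dominance rule can be reused unchanged. The Python sketch below assumes this sign-flip approach; it is not taken from the Matlab implementation, and the `goals` argument and the sample points are hypothetical. The points are chosen to match the Figure 4.2a example ($x_3 > x_1$, $y_3 > y_1$, $x_3 < x_2$, $y_3 > y_2$).

```python
def to_minimization(point, goals):
    """Negate the components that should be maximized."""
    return tuple(v if g == "min" else -v for v, g in zip(point, goals))


def dominates(x, y, goals):
    """Dominance rule for mixed goals, evaluated in minimization form."""
    xm, ym = to_minimization(x, goals), to_minimization(y, goals)
    return xm != ym and all(a <= b for a, b in zip(xm, ym))


def pareto_front(points, goals):
    """Keep the points that are not dominated by any other point."""
    return [p for p in points
            if not any(dominates(q, p, goals) for q in points)]


# Goal of Figure 4.2a: maximize X (e.g. gain), minimize Y (e.g. power).
goals = ("max", "min")
p1, p2, p3 = (1.0, 1.0), (3.0, 2.0), (2.0, 3.0)

print(dominates(p1, p3, goals))            # False
print(dominates(p2, p3, goals))            # True
print(pareto_front([p1, p2, p3], goals))   # [(1.0, 1.0), (3.0, 2.0)]
```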

The dominance rule is applied to the PDS and dominated points are eliminated while searching all points in the PDS, as described in (4.3). However, the number of points in the PDS can vary from a couple of thousand to several million. Therefore, comparing each point in the PDS according to the dominance rule is time-consuming, especially when the number of points in the PDS is large. As a result, in our work, a simple geometrical constraint is added and, thus, the proposed algorithm is called the *geometrical elimination rule*.

In Figure 4.3, the geometrical constraint is illustrated for a performance pair in which both $X_{ix}$ $(p_1)$ and $X_{iy}$ $(p_2)$ are minimized. Individual minima are found for both performance parameters. Their coordinates are shown as $p^{*1}$ and $p^{*2}$. The PF must remain between the three coordinates which form a triangle. First of all, points which are outside the triangle are eliminated, and then the dominance rule is applied to the points which are inside the triangle, which are represented by cross signs. The geometrical dominance algorithm is embedded into Matlab code and (4.3) is modified as given in (4.4).

Figure 4.3. Geometrical constraints for PF extraction

$$\mathcal{D} = \text{load}(\text{'Performance Design Space Set of a block'})$$

$$T = \text{eliminate}(\mathcal{D}) \quad \text{(eliminate points which are inside the shaded region in Figure 4.3)}$$

$$\mathcal{P}(\mathcal{D}) = \text{dominance operator}(T)$$
(4.4)
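The three operations in (4.4) can be sketched as follows. This Python example is an assumption-laden illustration rather than the Matlab code of this work: it uses the rectangular form of set-T from Table 4.1 (the box spanned by the two individual minima) instead of the triangle, both parameters are minimized, and the design space is random data generated with a fixed seed. The names `geometric_filter`, `set_t` and `design_space` are hypothetical.

```python
import random


def dominates(x, y):
    """Dominance rule for minimization of every component."""
    return x != y and all(a <= b for a, b in zip(x, y))


def pareto_front(points):
    """Brute-force dominance filter."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def geometric_filter(points):
    """Keep only the points inside the box spanned by the individual minima.

    im_x is the point with the smallest x (ties broken by smaller y),
    im_y is the point with the smallest y (ties broken by smaller x).
    """
    im_x = min(points, key=lambda p: (p[0], p[1]))
    im_y = min(points, key=lambda p: (p[1], p[0]))
    return [p for p in points
            if im_x[0] <= p[0] <= im_y[0] and im_y[1] <= p[1] <= im_x[1]]


# Hypothetical design space: 500 random points in the unit square.
rng = random.Random(0)
design_space = [(rng.random(), rng.random()) for _ in range(500)]

set_t = geometric_filter(design_space)      # T = eliminate(D)
front_fast = pareto_front(set_t)            # P(D) = dominance operator(T)
front_slow = pareto_front(design_space)     # reference: dominance rule only

print(len(design_space), len(set_t), len(front_fast))
print(sorted(front_fast) == sorted(front_slow))
```

The filter cannot remove a Pareto point: any point to the right of `im_y` or above `im_x` is dominated by that individual minimum, so the pairwise comparison only needs to run on the much smaller set-T.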

Consequently, most of the solutions are eliminated and the dominance rule is applied only to set-T. However, the triangular region can only be defined if it is known that the PF is a convex curve. In Table 4.1, both the dominance rule and the geometrical constraints are shown in general form. Figure 4.4 illustrates the third column of Table 4.1. In other words, the PF must be located in the defined rectangular region. Simulation results show that adding geometrical elimination to the PF approach reduces the processing time. The following example shows that the execution time of PF extraction decreases from 67 seconds to 1.4 seconds. In order to illustrate the 2-dimensional PF extraction methodology, some examples were taken into account. Firstly, a Miller op-amp was considered as a case study.

Table 4.1. Dominance rule and geometrical elimination for 2-dimensional space

| Performance parameter goals | Dominance rule | Geometrical elimination (definition of set T) |
|-----------------------------|----------------|-----------------------------------------------|
| Min X-axis, Min Y-axis | $Y \neq X$ is dominated by $X \Leftrightarrow x_i \leq y_i \ \forall i$ | $\{(X_{ix}, X_{iy}) \mid (p_x^{*1} \le X_{ix} \le p_x^{*2}) \wedge (p_y^{*2} \le X_{iy} \le p_y^{*1})\}$ |



Figure 4.4. PF must be located in the shaded region where set-T is defined.

Case Study 1: The input parameters are given in Table 4.2. The library found 1.2 million candidates out of 56.6 million trials in 45 minutes on a 1.6 GHz Pentium machine with 1 GB of memory. Assume that a designer wants to analyze the power-area and gain bandwidth-power tradeoffs. Figure 4.5 illustrates the PDS of the area-power pair, represented by discrete points. All values were normalized between 0 and 1. The individual minima are represented by a circle and a cross sign. After gathering the PDS, (4.4) was applied together with the constraints given in Table 4.1. The PDS had 1231087 solutions. After geometrical elimination, the number of solutions was reduced to 9743 (set-T). Finally, the dominance rule was applied and 9 Pareto points were obtained. In Figure 4.6, most of the points have been eliminated, and set-T and the Pareto points are shown. Cross symbols represent elements of set-T and blue circles are the Pareto points. Curve fitting was applied to the Pareto points and the PF was obtained as shown in Figure 4.7. The extraction procedure took 3 minutes on the same machine. As a result, PDS exploration and PF extraction of the Miller op-amp required 48 minutes for the given input set. However, the PE model of the Miller op-amp does not need to be constructed again for the same input specification. Different performance analyses can be evaluated using the pre-defined PE model.

Table 4.2. Input set of case study-1 for PF extraction

| Input Set      | Gain (V/V) | BW (kHz) | SR (V/$\mu$s) | CL (pF) |
|----------------|------------|----------|---------------|---------|
| Minimum values | 3000       | 1        | 10            | 5       |

Figure 4.5. Normalized PDS of area-power pair for Miller op-amp.

Figure 4.6. set-T and Pareto points.

A designer can thus obtain the performance limits of the Miller op-amp for the given input set. For example, the PF shows that no power value lower than the individual minimum of power can be reached. In addition, an area value lower than the individual minimum of area cannot be achieved by this design. Each Pareto point carries its design parameters. In other words, an estimated design which gives the corresponding area-power value is obtained. Therefore, Pareto points are verified by H-spice simulation in order to validate the estimated design. In Table 4.3, the comparison of estimation and simulation results is tabulated for the individual minima.

Table 4.3 shows the agreement between the design procedure, which is based on analytical equations, and simulation for the PF extraction. In other words, the comparison indicates that the design plan together with its equations works properly in a specific region (the IM points) of the PF.
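The agreement reported in Table 4.3 can be quantified as a relative error per parameter. The short Python sketch below is an illustration only: the estimation and simulation values are copied from Table 4.3, while the error definition (absolute difference divided by the simulated value) and all variable names are assumptions made here.

```python
# Column order of the circuit and performance parameters in Table 4.3.
labels = ["I_dout", "I_dif", "g_out1", "g_out2", "V_DS", "f3db", "Gain"]

# Estimation and simulation rows copied from Table 4.3.
table_4_3 = {
    "minimum power": {
        "estimation": [50, 4, 5, 0.75, 1.65, 3.6, 69.5],
        "simulation": [47, 4.4, 4.5, 0.7, 1.8, 3.4, 71.2],
    },
    "minimum area": {
        "estimation": [65, 11, 6.67, 2.1, 1.65, 5.7, 69.1],
        "simulation": [58, 12, 4.5, 2.1, 1.64, 4, 71.3],
    },
}


def relative_error_percent(estimated, simulated):
    """Relative deviation of the estimate from the simulated value, in %."""
    return abs(estimated - simulated) / abs(simulated) * 100.0


errors = {
    case: {
        name: relative_error_percent(est, sim)
        for name, est, sim in zip(labels, rows["estimation"], rows["simulation"])
    }
    for case, rows in table_4_3.items()
}

for case, per_parameter in errors.items():
    worst = max(per_parameter, key=per_parameter.get)
    print(f"{case}: largest deviation in {worst} "
          f"({per_parameter[worst]:.1f} %)")
```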



Figure 4.7. PF of Miller op-amp for area-power pair.

Another performance analysis of the op-amp is the gain-bandwidth product versus power pair. In this example, a different input set was used, as given in Table 4.4. The PDS had 2588 solutions and, after the elimination, 325 solutions remained. In Figure 4.8, cross symbols in the rectangle represent set-T. After that, the dominance rule was applied to set-T and three Pareto points were extracted, as shown in Figure 4.9. The benefit of geometrical elimination was verified on the PDS given in Figure 4.8. Firstly, the PF was extracted using (4.3), and then (4.4) was applied for extraction. Table 4.5 shows the comparison of execution time for both algorithms, which are expected to find the same Pareto points. The execution time is reduced dramatically, from 67 seconds to 1.4 seconds.
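The effect of the elimination step in the two analyses can be put into numbers with simple arithmetic. In the Python snippet below, the counts and times are those reported in this section and in Table 4.5; the variable names and the derived ratios are added here for illustration. The ratios come out at roughly 126 for the area-power set, roughly 8 for the gain bandwidth-power set, and a speed-up of roughly 48.

```python
# Reported solution counts (area-power pair, Case Study 1).
pds_area_power = 1231087
set_t_area_power = 9743

# Reported solution counts (gain bandwidth-power pair).
pds_gbw_power = 2588
set_t_gbw_power = 325

# Reported execution times from Table 4.5, in seconds.
t_dominance_only = 67.26
t_geometrical = 1.4

reduction_area_power = pds_area_power / set_t_area_power
reduction_gbw_power = pds_gbw_power / set_t_gbw_power
speedup = t_dominance_only / t_geometrical

print(f"area-power: PDS is {reduction_area_power:.1f}x larger than set-T")
print(f"GBW-power:  PDS is {reduction_gbw_power:.1f}x larger than set-T")
print(f"execution time ratio: {speedup:.1f}x")
```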

Secondly, a comparator was examined as a case study.

Table 4.3. Design parameters of IM and comparison of estimation vs. simulation results

| Parameters    | $C_C$ (pF) | $R_C$ (k$\Omega$) | L values                                    | $W_1$ ($\mu$m) | $W_2$ ($\mu$m) | $W_3$ ($\mu$m) | $W_5$ ($\mu$m) | $W_7$ ($\mu$m) | $V_{BIAS}$ (V) |
|---------------|------------|-------------------|---------------------------------------------|----------------|----------------|----------------|----------------|----------------|----------------|
| Minimum Power | 0.35       | 7.2               | $L_{1,2,3} = 2L_{min}$, $L_{5,7} = L_{min}$ | 10.85          | 40.25          | 1.1            | 1.05           | 3.85           | 3.42           |
| Minimum Area  | 1.05       | 2.7               | $L_{1,2,3} = 2L_{min}$, $L_{5,7} = L_{min}$ | 4.55           | 17.55          | 1.05           | 4.55           | 3.85           | 2.28           |

| Minimum Power      | $I_{dout}$ ($\mu$A) | $I_{dif}$ ($\mu$A) | $g_{out1}$ $(\mu\Omega)^{-1}$ | $g_{out2}$ $(\mu\Omega)^{-1}$ | $V_{DS}$ (V) | f3db (kHz) | Gain (dB) |
|--------------------|---------------------|--------------------|-------------------------------|-------------------------------|--------------|------------|-----------|
| Estimation Results | 50                  | 4                  | 5                             | 0.75                          | 1.65         | 3.6        | 69.5      |
| Simulation Results | 47                  | 4.4                | 4.5                           | 0.7                           | 1.8          | 3.4        | 71.2      |

| Minimum Area       | $I_{dout}$ ($\mu$A) | $I_{dif}$ ($\mu$A) | $g_{out1}$ $(\mu\Omega)^{-1}$ | $g_{out2}$ $(\mu\Omega)^{-1}$ | $V_{DS}$ (V) | f3db (kHz) | Gain (dB) |
|--------------------|---------------------|--------------------|-------------------------------|-------------------------------|--------------|------------|-----------|
| Estimation Results | 65                  | 11                 | 6.67                          | 2.1                           | 1.65         | 5.7        | 69.1      |
| Simulation Results | 58                  | 12                 | 4.5                           | 2.1                           | 1.64         | 4          | 71.3      |

Table 4.4. Input set of case study-1 for PF extraction for gain-power

| Input Set      | Gain (V/V) | BW (kHz) | SR (V/$\mu$s) | CL (pF) |
|----------------|------------|----------|---------------|---------|
| Minimum values | 2500       | 2.5      | 5             | 4       |

Table 4.5. Comparison in execution time for both algorithms

| Applied algorithm | Only dominance rule | Geometrical elimination rule |
|-------------------|---------------------|------------------------------|
| Execution Time    | 67.26 seconds       | 1.4 seconds                  |

Case Study 2: Only two performance parameters were taken into account as inputs; the minimum gain value was selected as 5 V/V and the minimum bandwidth value as 50 MHz. The comparator has further important performance parameters such as delay, input resolution, power consumption and clock frequency. In Figure 4.10, the delay-input resolution and latch frequency-offset voltage trade-off analyses are shown. It was observed that the delay of the comparator increases as the input resolution gets smaller, and that a smaller offset voltage requires a lower latch frequency.



Figure 4.8. Normalized PDS of gain-power pair for Miller op-amp.



Figure 4.9. PF of Miller op-amp for gain-power pair.



Figure 4.10. PF of the comparator (a) input resolution-delay, (b) latch frequency-offset voltage.

### 4.2. Pareto Front Extraction for 3-D

PF extraction for 2-dimensional space was described in the previous section. All possible performance parameter relations were considered as illustrated in Table 4.5. These examples can be reproduced using different performance parameters. However, mixed-signal systems involve more than two performance parameters. Therefore, PF extraction in three or more dimensions has to be investigated.

It is a well-known fact that, on a PF, one performance can only be improved at the cost of the others. For example, the tradeoff can only be changed along the PF by sacrificing $p^{*1}$ to improve $p^{*2}$, as shown in Figures 4.1 and 4.2. Therefore, $p^{*1}$ is determined by improving $p_1$ (minimizing $p_1$) at the cost of $p_2$. $p^{*1}$ and $p^{*2}$ are the trade-off limits of the 2-dimensional PF. The same logic is applied to the 3-dimensional PF. However, this time the trade-off limits are determined by three curved edges instead of individual minima: the three PFs of the performance pairs form the boundary.

The generation of the PF for three performances can be summarized in four steps. This methodology is a common, widely used procedure in the literature up to step 4. In step 4, different approaches are applied to find the inner part of the 3-dimensional PF. In our work, an approach similar to the one used for the 2-dimensional PF is taken.

- Step 1: Find all individual minima as given in Figure 4.11.
  Improve p<sub>1</sub> at the cost of p<sub>2</sub> and p<sub>3</sub> ⇒ p<sup>\*1</sup>
  Improve p<sub>2</sub> at the cost of p<sub>1</sub> and p<sub>3</sub> ⇒ p<sup>\*2</sup>
  Improve p<sub>3</sub> at the cost of p<sub>1</sub> and p<sub>2</sub> ⇒ p<sup>\*3</sup>
- Step 2: Generate tradeoff limits for the 3-dimensional PF.
  Improve p<sub>1</sub> and p<sub>2</sub> at the cost of p<sub>3</sub> ⇒ PF<sub>12</sub>
  Improve p<sub>1</sub> and p<sub>3</sub> at the cost of p<sub>2</sub> ⇒ PF<sub>13</sub>
  Improve p<sub>2</sub> and p<sub>3</sub> at the cost of p<sub>1</sub> ⇒ PF<sub>23</sub>
  Tradeoff limits are curved edges which are determined by the PFs of performance pairs.
- Step 3: Trade-off limits (Pareto curves) compose the boundary of the 3-D PF.

$$\text{Boundary of } PF_{123} = PF_{12} \cup PF_{13} \cup PF_{23}$$
(4.5)

- Step 4: The inner part of the PF is selected from the PDS.

Figure 4.11. Individual minima and trade-off boundaries for 3-dimensional PF.

Inner points and boundaries together construct the 3-dimensional PF. In Figure 4.12, the boundary and inner points of the three-dimensional PF are shown. The generation of the inner part will be described later.
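Steps 2 and 3 can be sketched as follows. The Python example below is a minimal illustration under stated assumptions, not the implementation of this work: all three performances are minimized, each pairwise PF is found by applying the dominance rule to the projection of the points onto two axes, and the six sample points and the names `pair_front` and `boundary_3d` are hypothetical. The selection of inner points (step 4) is not covered.

```python
from itertools import combinations


def dominates(x, y):
    """Dominance rule for minimization of every component."""
    return x != y and all(a <= b for a, b in zip(x, y))


def pair_front(points, i, j):
    """3-D points whose projection on axes (i, j) is non-dominated."""
    projected = [(p[i], p[j]) for p in points]
    return {p for p, a in zip(points, projected)
            if not any(dominates(b, a) for b in projected)}


def boundary_3d(points):
    """Boundary of PF_123 = PF_12 U PF_13 U PF_23, as in (4.5)."""
    boundary = set()
    for i, j in combinations(range(3), 2):
        boundary |= pair_front(points, i, j)
    return boundary


# Hypothetical 3-D performance design space (p1, p2, p3).
pds = [(1, 5, 5), (5, 1, 5), (5, 5, 1), (3, 3, 3), (4, 4, 4), (2, 2, 6)]

print(sorted(pair_front(pds, 0, 1)))  # PF_12: [(1, 5, 5), (2, 2, 6), (5, 1, 5)]
print(sorted(boundary_3d(pds)))
# [(1, 5, 5), (2, 2, 6), (3, 3, 3), (5, 1, 5), (5, 5, 1)]
```

In this sample set the point (4, 4, 4) is dominated in every projection and therefore does not appear on the boundary.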



Figure 4.12. Boundary of PF and the inner points for 3-dimensional space.

This approach can be extended to n-dimensional performance spaces as described below.

- Step 1: Individual minima are generated (trade-off limits for two dimensions).
- Step 2: Individual minima define the boundaries of the PFs of all performance pairs.
- Step 3: As in the second step, the boundaries of the PFs of performance triples are given by the PFs of performance pairs. Inner points are selected.
- Step 4: The PFs of performance triples define the boundary of the PF of each group of four performances. Inner points are selected.
- Step 5: The same approach proceeds until the PFs of N-1 performances define the boundary of the final N-dimensional PF.

For example, consider a four-dimensional performance space. Applying the rules step by step:

- The individual minima $p^{*1}$, $p^{*2}$, $p^{*3}$ and $p^{*4}$ are found.
- $PF_{12}$, $PF_{13}$, $PF_{14}$, $PF_{23}$, $PF_{24}$ and $PF_{34}$ are determined, e.g. $PF_{12} = p^{*1} \cup p^{*2} \cup \text{inner points}$ (explained in the previous section).
- The PFs of the performance triples are formed:

$$PF_{123} = PF_{12} \cup PF_{13} \cup PF_{23} \cup \text{inner points}$$
$$PF_{124} = PF_{12} \cup PF_{14} \cup PF_{24} \cup \text{inner points}$$
$$PF_{134} = PF_{13} \cup PF_{14} \cup PF_{34} \cup \text{inner points}$$
$$PF_{234} = PF_{23} \cup PF_{24} \cup PF_{34} \cup \text{inner points}$$

- $PF_{1234} = PF_{123} \cup PF_{124} \cup PF_{134} \cup PF_{234} \cup \text{inner points}$
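The bookkeeping of this recursion, that is, which lower-dimensional PFs bound which higher-dimensional PF, can be generated mechanically. The Python sketch below only enumerates the index sets and does not compute any front; the function name `boundary_hierarchy` is a hypothetical helper introduced here.

```python
from itertools import combinations


def boundary_hierarchy(n):
    """Map every PF index set of size k >= 2 to the (k-1)-sized sets bounding it.

    For k = 2 the bounding sets are single indices, i.e. individual minima.
    """
    hierarchy = {}
    for k in range(2, n + 1):
        for combo in combinations(range(1, n + 1), k):
            hierarchy[combo] = list(combinations(combo, k - 1))
    return hierarchy


hierarchy = boundary_hierarchy(4)

print(hierarchy[(1, 2)])        # [(1,), (2,)]
print(hierarchy[(1, 2, 3)])     # [(1, 2), (1, 3), (2, 3)]
print(hierarchy[(1, 2, 3, 4)])  # [(1, 2, 3), (1, 2, 4), (1, 3, 4), (2, 3, 4)]
print(len(hierarchy))           # 11
```

For four performances this gives six pairwise PFs, four PFs of triples and one four-dimensional PF, matching the list above.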

### 4.2.1. Calculation of Inner points

In two-dimensional space, the boundaries of the PF are the individual minima. The elimination method is first applied to the data set and then the dominance rule is applied. The same method is considered in 3-dimensional space. The aim is to obtain the following:

$$P_{xyz} = \{(x, y, z) \mid \nexists\, (x_k, y_k, z_k) \text{ such that } (x_k, y_k, z_k) \text{ dominates } (x, y, z)\}$$

$$P_{xyz} = PF_{xy} \cup PF_{xz} \cup PF_{yz} \cup \text{inner Pareto points}$$
(4.6)

This can be easily checked against the 2-dimensional PF, as given in (4.7). An individual minimum can be regarded as the Pareto front of a single parameter, because one parameter is improved at the cost of the other.

$$P_{xy} = IM_x \cup IM_y \cup \text{inner Pareto points}$$
(4.7)

In Figure 4.13, the above expression is illustrated.

Figure 4.13. Illustration of inner Pareto points in 2-D space.

As mentioned before, performance pairs form the boundary of the three-dimensional PF, and each performance pair is expanded by the third coordinate in order to find the points in 3-D. Thus, each of them corresponds to a curve formed by the boundary points in 3-D. The boundary points can be expressed as follows:

$$P = P_{xy}(x, y, z) \cup P_{xz}(x, y, z) \cup P_{yz}(x, y, z)$$
(4.8)

The next step is to find the inner points from the surface surrounded by the boundary points in 3-D. To this end, the problem is reduced to 2-D by the projection method. Expression (4.8) is projected onto the X-Y plane together with the data set, as illustrated in Figure 4.14. Circles represent the data set of 3-D performance parameters. Crosses are the inner Pareto points and black circles are the boundary points in the X-Y plane. However, when the z values of the inner part are considered, some of these points might lie outside the region defined by the boundary points in 3-D. Therefore, the projection must also be applied onto the other planes, X-Z and Y-Z. Then, the common points are selected as in (4.9).

Figure 4.14. Projection of boundary points on X-Y plane.

$$IN(x, y, z) = PP_{xy}(x, y, z) \cup PP_{xz}(x, y, z) \cup PP_{yz}(x, y, z)$$
(4.9)

where $PP_{xy}(x, y, z)$ denotes the projected inner points together with their z values.

Finally, the dominance rule is applied to the set $IN$, and the inner Pareto points are extracted:

$$Inner_{xyz}(x, y, z) = \{(x, y, z) \mid \nexists (x_k, y_k, z_k) \ni (x, y, z) \text{ dominates } (x_k, y_k, z_k)\}$$

Up to now, only two-dimensional PFs have been extracted. However, the relation between gain, power and area is also an attractive performance analysis. Three-dimensional PF extraction is carried out step by step as illustrated in Figures 4.11 and 4.12. First of all, the PDS is illustrated with its individual minima in Figure 4.15, where 2600 points represent the PDS solutions and $p^{*1}$, $p^{*2}$ and $p^{*3}$ are the individual minima of gain, power and area, respectively. In Table 4.6, the maximum gain, minimum power and minimum area estimations of the design are given for the given input set.

Table 4.6. Individual minima: limits of the design

| IM            | Gain (V/V) | Power ($\mu W$) | Area ($\mu m^2$) |
|---------------|------------|-----------------|------------------|
| Maximum Gain  | 3870       | 237.6           | 60.39            |
| Minimum Power | 2620       | 184.8           | 18.49            |
| Minimum Area  | 1965       | 237.3           | 14.82            |

The dominance rule was first applied to the data set before calculation of the boundary and inner points in order to show the efficiency of the proposed method. The dominance rule is given in (4.10) for gain, power and area.

$$Y \neq X \text{ is dominated by } X \Leftrightarrow (x_{ix} > y_{ix}) \land (x_{iy} \le y_{iy}) \land (x_{iz} \le y_{iz}) \ \forall i$$
(4.10)
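As a small illustration of (4.10), the rule can be written as a predicate and applied directly to a data set. The function names are illustrative, the first three designs are the individual minima from Table 4.6, and the fourth design is a hypothetical point added only to show a dominated case.

```python
def is_dominated(y, x):
    """True if design y is dominated by design x in the sense of (4.10).
    Each design is (gain, power, area): gain is maximised,
    power and area are minimised."""
    return y != x and x[0] > y[0] and x[1] <= y[1] and x[2] <= y[2]

def pareto_points(designs):
    """Direct application of the rule: every design is compared with every
    other design, so the cost grows quadratically with the data set size."""
    return [y for y in designs if not any(is_dominated(y, x) for x in designs)]

designs = [
    (3870.0, 237.6, 60.39),  # maximum gain (Table 4.6)
    (2620.0, 184.8, 18.49),  # minimum power (Table 4.6)
    (1965.0, 237.3, 14.82),  # minimum area (Table 4.6)
    (1900.0, 240.0, 20.00),  # hypothetical design, not from the thesis
]

front = pareto_points(designs)
print(front)
```

The three individual minima do not dominate each other, whereas the hypothetical design is dominated by the minimum-power design and is removed. The quadratic number of comparisons is the reason why applying the rule to the whole PDS becomes slow as the number of solutions grows.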

Execution took 6.5 minutes when (4.3) was considered. Moreover, the execution time rises dramatically as the number of PDS solutions increases. Hence, expression (4.4), which refers to the calculation of the boundary and the inner points in 3-D, was taken into account.

Discrete PFs of the performance pairs were generated for the three-dimensional performance space as illustrated in Figure 4.16. After that, the boundary was obtained, and the numbers of boundary points are tabulated in Table 4.7.



Figure 4.15. PDS of Miller op-amp for gain, power and area.

Table 4.7. Number of boundary points

| PF pairs                 | $PF_{XY}$ | $PF_{XZ}$ | $PF_{YZ}$ |
|--------------------------|-----------|-----------|-----------|
| Calculated in 40 seconds | 3 points  | 20 points | 4 points  |

The projection method of the boundary points is demonstrated in Figure 4.17 for the X-Z plane. The boundary points and the data set were both projected, and the inner points were extracted. Dots represent data points that lie outside the region. The region is formed by the boundary points, which are represented by circles connected by lines. Inside that region, crosses represent the inner points. The same projection was applied to each plane, and the inner points were obtained as in (4.9). In total, over 400 inner solutions were obtained. After the application of the dominance rule, the inner Pareto points were collected. As a result, 41 solutions, including the boundary points, were found in only 20 seconds. The total execution time was less than 1 minute. Consequently, applying (4.10) directly to the data set is time-consuming (6.5 minutes), whereas after the elimination all Pareto points can be obtained quickly (1 minute). In Figure 4.18, the Pareto points are visualized together with the IMs. Table 4.8 verifies the efficiency of the proposed approach.



Figure 4.16. Discrete PFs of performance pairs. (a) PF of Gain-Power, (b) PF of Gain-Area, (c) PF of Power-Area

Figure 4.17. Projection of boundary points and data set on X-Z plane.

Figure 4.18. Boundaries of 3D PF and inner points.

Table 4.8. Verification of the approach

|                                       | $P_{xyz}(x, y, z)$       | $PF_{xy} \cup PF_{xz} \cup PF_{yz} \cup$ inner Pareto points |
|---------------------------------------|--------------------------|--------------------------------------------------------------|
| Number of points after dominance rule | 41 points in 6.5 minutes | 41 points in 1 minute                                        |

In addition, the design of the comparator can be evaluated by analyzing additional performance parameters. For example, an optimum solution set could be found for the offset voltage-area-power parameters as given in Figure 4.19. The limits of the design are summarized as follows:

- The minimum offset voltage that could be obtained from the comparator was 5.7 mV.
- The minimum area of the comparator was $10.9\ \mu m^2$.
- The minimum power that the comparator could reach was 0.7 mW.
- 31 Pareto boundary points and 18 inner Pareto points were extracted among 65685 data points.

In other words, 49 Pareto points were obtained in 30 minutes from the PDS of the comparator, which had 65685 solutions for the given performance parameters.



Figure 4.19. PF of comparator for offset voltage-area-power.

## 5. PARETO FRONT COMPOSITION PART

A designer needs a fast way to gain insight into the capability of the whole system once the PFs of the subblocks have been generated. Therefore, a fast composition process for PFs is required. It is best to start with an example, as given in Figure 5.1. Assume a system which does not exist in the library. The system is partitioned into its subblocks, say two: sub1 (op-amp) and sub2 (comparator). From the library, the op-amp is selected as a Miller op-amp (two stages). Moreover, the op-amp can also be divided into subblocks, sub3 (differential input stage) and sub4 (output stage), so that the example becomes a three-level hierarchical system. Then, from the bottom up, PF composition proceeds level by level until the PF of the whole system is obtained. The PFs of sub3 and sub4, which are the elements of the third level, are obtained by (4.4). Composition is applied to sub3 and sub4, and the result is the PF of sub1 at the second level.

Figure 5.1. Bottom-up composition of PFs.

For example, assume that the designer wants to evaluate the area-power pair of *sub*1. The PFs of *sub*3 and *sub*4 are calculated, and four points are extracted for each. The total area and power values are obtained by summing the area and power values of the subblocks. The summation is performed component-wise (area with area, power with power) for every pair of points, one from each subblock. Therefore, the number of points in the composed set will be sixteen. After that, the dominance rule is applied to that set and the composed PF is obtained. The composition procedure of sub3 and sub4 is illustrated in Figure 5.2.

Figure 5.2. PF composition example for area-power pair.

In our work, considering (4.1) and (4.2), the validity of the above procedure is based on the following statement: the PF of a composition $\odot$ of data points is a subset of the composition of the PF points when the composition preserves the dominance rule;

$$PF\{subX \odot subY\} \subseteq PF(subX) \odot PF(subY)$$
(5.1)

where $PF(subX) \odot PF(subY)$ is the set obtained by applying the operation $\odot$ to all pairs of elements of $PF(subX)$ and $PF(subY)$.

Assume the dominance rule is preserved under the operation $\odot$. Then, the following chain of implications is obtained.

$$X \in PF(subX \odot subY) \Longleftrightarrow \begin{cases} X = X_{subX} \odot X_{subY} \\ X \text{ dominates all } Y \text{ in } D(subX \odot subY) \end{cases}$$
(5.2)

$$X = X_{subX} \odot X_{subY} \le Y_{subX} \odot Y_{subY}, \quad \forall Y \in D(subX \odot subY)$$
(5.3)

In order to prove (5.1), the following must be shown.

$$X_{subX} \in PF(subX) \land X_{subY} \in PF(subY)$$
(5.4)

Contradiction will be used, so assume, without loss of generality, that

$$X_{subX} \notin PF(subX) \Rightarrow \exists Z_{subX} \text{ which dominates } X_{subX} \Rightarrow Z_{subX} \leq X_{subX}, \text{ where } Z_{subX} \neq X_{subX}$$
(5.5)

Rewriting (5.5) under the operation $\odot$ gives

$$Z_{subX} \odot Y_{subY} \leq X_{subX} \odot Y_{subY}, \ \forall \ Y_{subY} \in D \ (subY) \Rightarrow$$
$$X_{subX} \odot Y_{subY} \notin PF \ (subX \ \odot \ subY), \ \forall \ Y_{subY} \in D \ (subY) \Rightarrow$$
$$X = X_{subX} \odot X_{subY} \notin PF \ (subX \ \odot \ subY)$$
(5.6)

Hence, the contradiction with (5.3) implies $X_{subX} \in PF(subX)$, and the same procedure applies to $X_{subY} \in PF(subY)$. Finally, it can be stated that the dominance relation does not change under the $\odot$ operation if

$$X_{subX} \text{ dominates } Z_{subX} \Rightarrow X_{subX} \odot Y \text{ dominates } Z_{subX} \odot Y$$
 (5.7)

Performance parameters could be power, area, bandwidth, gain etc. In Figure 5.1, assume  $p_1$  and  $p_2$  are area and power for *sub3* and *sub4*, respectively. Power and area values of *sub3* and *sub4* are added in order to calculate the power and area values for *sub1*. In the literature, when performance estimation is evaluated, area-power parameters are frequently considered. However, gain and bandwidth are also important and multiplication operation is usually required. Consequently, it has to be shown that addition and multiplication operations preserve the dominance rule. To give an example, say  $p_1$  and  $p_2$  are gain and bandwidth. In both dimensions, assume that component-wise multiplication  $\otimes$  is required.

$$X_{sub3} = \begin{bmatrix} X_{sub3_1} \\ X_{sub3_2} \\ \vdots \\ X_{sub3_N} \end{bmatrix}, X_{sub4} = \begin{bmatrix} X_{sub4_1} \\ X_{sub4_2} \\ \vdots \\ X_{sub4_K} \end{bmatrix}$$

$$X_{sub3} \otimes X_{sub4} = \begin{bmatrix} X_{sub3_1} \otimes X_{sub4_1} \\ X_{sub3_1} \otimes X_{sub4_2} \\ \vdots \\ X_{sub3_L} \otimes X_{sub4_1} \\ \vdots \\ X_{sub3_N} \otimes X_{sub4_K} \end{bmatrix}$$

If $X$ dominates $Y$, i.e.,

$$\begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_N \end{bmatrix} \ge \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_N \end{bmatrix},$$

then (5.7) must be shown:

$$X \otimes Z \text{ dominates } Y \otimes Z \Rightarrow \begin{bmatrix} X_1 \otimes Z_1 \\ X_2 \otimes Z_2 \\ \vdots \\ X_N \otimes Z_N \end{bmatrix} \ge \begin{bmatrix} Y_1 \otimes Z_1 \\ Y_2 \otimes Z_2 \\ \vdots \\ Y_N \otimes Z_N \end{bmatrix}$$

Since all the points in design data set are positive values, component-wise multiplication preserves dominance. In addition, same verification can be applied to vector addition and division. Finally, it can be said that with positive values the dominance rule does not change under multiplication, addition and division operations. Consequently, PF algebra can be applied to various performance parameters which can be combined by composition operations as necessitated by system requirements.
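The composition step and the inclusion (5.1) can be checked numerically with a short sketch. It assumes the area-power case, where $\odot$ is component-wise addition and both quantities are minimised; the data sets `sub3` and `sub4` are hypothetical, and the helper names are illustrative rather than taken from the tool described here.

```python
from itertools import product
import random

def pareto_front(points):
    """Non-dominated points when every coordinate is minimised."""
    return {
        p for p in points
        if not any(all(a <= b for a, b in zip(q, p)) and q != p for q in points)
    }

def compose(set_x, set_y):
    """Apply the operation to every pair of points:
    here component-wise addition of (area, power)."""
    return {(ax + ay, px + py) for (ax, px), (ay, py) in product(set_x, set_y)}

random.seed(1)
# Hypothetical design data; integer values keep the sums exact.
sub3 = {(random.randint(1, 50), random.randint(1, 50)) for _ in range(25)}
sub4 = {(random.randint(1, 50), random.randint(1, 50)) for _ in range(25)}

pf_of_composition = pareto_front(compose(sub3, sub4))
composition_of_pfs = compose(pareto_front(sub3), pareto_front(sub4))

print(len(compose(sub3, sub4)), "composed data points")
print(len(composition_of_pfs), "points from composing the two PFs")
print(pf_of_composition <= composition_of_pfs)
```

Composing only the two fronts produces far fewer candidates than composing the full data sets, and applying the dominance rule to that smaller set yields the same front, which is the practical benefit of (5.1).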

In order to validate the composition procedure, two examples were considered. First, the composition approach was applied to a Miller op-amp, and then a first-order low-pass filter was taken into account.

Case Study 1: It was assumed that the Miller op-amp did not exist in the library. Therefore, it was divided into its subblocks, namely the differential pair stage and the output stage. The differential pair stage was assumed to be in the library, which means that the EKV-based approach had already been applied and a macromodel of the differential pair stage existed in the library. The performance parameters for the differential pair were then given as inputs to the macromodel. The EKV-based macromodel calculated all feasible solutions and formed different lookup tables for the end users. On the other hand, it was assumed that the second subblock, the output stage, did not exist in the library. Therefore, the table-based method, which calculates all solutions including the infeasible ones, was applied to this subblock, and a huge lookup table was formed for the output stage. This approach takes an enormous amount of time compared to EKV-based modeling, even though the differential stage has five transistors whereas the output stage has only two. Finally, the two subblocks, which were modeled by different approaches, had to be merged in order to obtain the performance estimation results of the Miller op-amp. The aim was to show the accuracy of the proposed hybrid approach when two different PE modeling methods were used. This example was evaluated as ML-PE, where the system had two subblocks; hence, the composition process was also required. The approach could be verified in five steps:

- Extract the PF of the Miller op-amp using the library, e.g., for the area-power pair.
- Extract the PF of the output stage using the Brute-Force approach.
- Extract the PF of the differential stage using the library.
- Compose the PFs from step 2 and step 3.
- Compare the PFs from step 4 and step 1.

The same values were expected in the comparison (step 5).

First of all, let us start with the analysis of the Miller op-amp. The PDS was obtained using the following performance parameters: minimum gain and bandwidth were set to 2000 V/V and 10 kHz, respectively, the lower bound of the slew rate was 4 V/$\mu$s, and the output load was 2 pF. A sample of the lookup table, which lists biasing, gain, total power consumption and area, is given in Table 5.1. $V_{BIAS}$ is shown in Figure 5.3, and $V_{gn}$ is the gate-source voltage of $M_1$.

Table 5.1. Sample of a look-up table for Miller op-amp

| $V_{BIAS}$ (V) | $V_{gn}$ (V) | Gain of first stage | Power ($\mu$W) | Area ($\mu m^2$) |
|----------------|--------------|---------------------|----------------|------------------|
| 2.5            | 0.63-0.65    | 38.95               | 161.7          | 124.85           |
| 2.5            | 0.63-0.65    | 37.24               | 161.7          | 150.56           |

The estimation results collected from each subblock of the Miller op-amp must satisfy the overall performance of the main block. In other words, the results of the differential stage modeled by the EKV-based method and the results of the output stage modeled by the Brute-Force method must agree with the results given in Table 5.1. For example, the area solutions gathered from the differential part plus the area solutions collected from the output stage must be equal to the Miller op-amp area solutions. Figure 5.3 illustrates the verification process.

Figure 5.3. Schematic of Miller op-amp and its subblock.

Secondly, the analysis of the output stage was considered using the Brute-Force approach. As mentioned in the literature survey, this method is the simplest solution to the problem. All input possibilities were considered, their corresponding outputs were obtained, and the performance parameters were calculated. Considering the input space and gathering output and performance information from SPICE, a look-up table was formed. This table provides all information about the circuit block, but it requires a large amount of memory, and the execution time inevitably increases exponentially as the number of elements in the analog block increases. However, the simplicity of the approach may be useful for analog blocks which have one or two elements. Two different methods could be applied to form the performance model:

- A lookup table is obtained from the input space and the corresponding output and performance values. After that, infeasible solutions, which do not meet the given performance specifications, are eliminated, and a simplified table is obtained.
- A lookup table is formed by checking the performance criteria at each step. If the given input value results in a feasible output value, the result is written into the table; otherwise, the next input value is considered.

H-spice was used to obtain the simulation results for constructing the table. In the net-list, which is given in Figure 5.4, the initial W values of the p-type and n-type transistors and the initial value of the input voltage are set to 0.5 $\mu$m and 0.8 V, respectively. After each simulation, different W and bias voltage values were assigned to the net-list, because the design space of the circuit is obtained by applying all possible W values and bias voltages. Therefore, the sizing parameters and bias voltages were defined parametrically. In addition, the net-list had to be re-run many times, with the W and bias voltage values altered each time. This problem was solved by a simple program which generates alter statements that vary W from 0.7 $\mu$m to 50 $\mu$m (200 $\mu$m for p-type) in 0.7 $\mu$m steps and the bias voltage from 0.6 V to 0.8 V in 0.05 V steps. The L values were set to 0.7 $\mu$m for this example. The output of this program could be appended to the original net-list file. When the H-spice simulation was completed, a huge lis-file (the output file of H-spice) was formed, including all possible output values of the circuit. Spice-like simulators automatically calculate and list voltages, currents and some small-signal parameters in their output files. Using the appropriate ones, the voltage gain, slew rate and bandwidth values were calculated.

Another problem could occur in reading the necessary parameters from the "lis-file" for each different input value. A parser-like program which read the circuit parameters from lis-file such as  $g_m, g_{ds}$  etc., was designed. Finally, a lookup table was formed and using this table an optimum solution space or Pareto-optimum curves were obtained.

```
vbias 2 0 vbias
vin 4 0 vgs
m1 3 4 0 0 nfet L=l_m W=w_m1
m2 3 2 1 1 pfet L=l_m W=w_m2
.include c35_ekv.lib
.param w_m1=0.5 w_m2=0.5 l_m=0.35 vbias=2.35 vgs=0.8
.op
.OPTIONS SCALE=1e-6 LIMPTS = 30000 post=1 co=132
{
ALTER STATEMENTS
}
.end
```

Figure 5.4. Net-list of simple output stage.
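The generation of the alter statements can be sketched as follows. This is only an illustration of the sweep described in the text, not the lisfile.cpp program itself: it assumes that each run is expressed as an `.ALTER` block that redefines the parameters `w_m1`, `w_m2` and `vgs` of Figure 5.4 with `.PARAM`, and that the 0.6 V to 0.8 V sweep applies to `vgs`. The helper names `grid` and `alter_statements` are hypothetical.

```python
import math
from itertools import product

def grid(start, stop, step, ndigits=2):
    """Inclusive grid start, start+step, ... that does not exceed stop.
    Values are built from an integer index to avoid accumulated float error."""
    count = math.floor((stop - start) / step + 1e-9) + 1
    return [round(start + k * step, ndigits) for k in range(count)]

def alter_statements(w_n, w_p, v_in):
    """Yield one .ALTER block per (W_n, W_p, V_in) combination."""
    for run, (wn, wp, v) in enumerate(product(w_n, w_p, v_in), start=1):
        yield f".ALTER run{run}\n.PARAM w_m1={wn} w_m2={wp} vgs={v}\n"

w_n = grid(0.7, 50.0, 0.7)     # n-type widths in um
w_p = grid(0.7, 200.0, 0.7)    # p-type widths in um
v_in = grid(0.6, 0.8, 0.05)    # input voltage in V (assumed to map to vgs)

first = next(alter_statements(w_n, w_p, v_in))
print(len(w_n), len(w_p), len(v_in))
print(first)
```

Because the statements are produced by a generator, they can be written to a file one at a time and appended to the net-list without holding the whole sweep in memory.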

The Brute-Force approach can be outlined as follows:

- The lisfile.cpp program, which creates the alter statements for H-spice, was modified according to the given input parameters. When the W and L values are known, the program also calculates the area and forms an *area lookup table* for the given topology. The program prepared a 30 MB text file containing the alter statements in less than 1 minute.
- The output file of the above program was appended to the original net-list, and the modified net-list was run. H-spice took 2 days to complete the entire simulation (Intel Pentium 4 2.0 GHz, 512 MB of RAM).
- The *Mosreader.cpp* program, which reads the necessary information from the lis-file, created a lookup table including the circuit parameters of each transistor. Using the appropriate ones, gain and power consumption were also calculated and included in the table.
- A Matlab program was used to form the final lookup table and to process it for graphical representation.

The Brute-Force approach produced 404701 candidates for the output stage. This number could increase to one million if different L values were taken into account. However, some of the candidates gave infeasible results which were meaningless for the circuit behavior, such as output voltages below 1 V. The 404701 candidates were reduced to 4324 solutions when the inequality $1 < V_{DS} < 2$ and Gain > 10 was added to the Matlab code. Because the whole design space of the output stage was explored, the entire simulation took two days, which had to be reduced. Therefore, a simple rule relating the W values of the nMOS and pMOS transistors was applied in order to reduce the number of candidates for faster simulation: the W value of the pMOS was taken as 2, 3 and 4 times the W value of the nMOS. Thus, some infeasible W pairs could be eliminated by the second approach, which is a kind of simple intelligent-design technique.

$$W_1 = 0.7\,\mu m, \dots, 50\,\mu m, \qquad W_2 = i \cdot W_1 \text{ where } i \in \{2, 3, 4\}$$

As a result, 4260 candidates were obtained in only 5 minutes. When the same inequality ($1 < V_{DS} < 2$ and gain higher than 10) was added to the Matlab code, the 4260 candidates were reduced to 211. The accuracy of the second approach could be verified by showing that its solution set is a subset of the solution set obtained from the first approach, as expressed in (5.8).

$$PF_{Approach2} \subset PF_{Approach1}$$
 (5.8)
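The relation between the two sweeps can be illustrated with a short sketch. It assumes that both approaches use the same 0.7 $\mu$m width grid, so each width is stored as an integer multiple of the step; the variable names and the `feasible` helper are illustrative, and the bias voltage dimension is left out.

```python
STEP_UM = 0.7   # width step from the text
N_W1 = 71       # nMOS widths: 0.7 um ... 49.7 um
N_W2 = 285      # pMOS widths: 0.7 um ... 199.5 um

# Widths are stored as integer multiples of the step so that set membership is exact.
full_grid = {(k1, k2) for k1 in range(1, N_W1 + 1) for k2 in range(1, N_W2 + 1)}
reduced_grid = {(k1, i * k1) for k1 in range(1, N_W1 + 1) for i in (2, 3, 4)}

def feasible(v_ds, gain):
    """Feasibility filter quoted in the text: 1 < V_DS < 2 and gain > 10."""
    return 1.0 < v_ds < 2.0 and gain > 10.0

def to_um(pair):
    """Convert a pair of grid indices back to widths in um."""
    return tuple(round(k * STEP_UM, 1) for k in pair)

print(len(full_grid), len(reduced_grid), reduced_grid <= full_grid)
print(to_um((1, 2)), feasible(1.5, 12.0), feasible(0.9, 50.0))
```

Every width pair of the second approach also appears in the first one, so any feasible solution found by the reduced sweep must also be present in the table of the full sweep, which is the inclusion expressed in (5.8).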

Numerical examples are shown in Table 5.2. The same area values were found for the same bias voltages and power values. A subset was taken from the first approach and its values were compared with those of the second approach.

Table 5.2. A sample of numerical examples from the $1^{st}$ and $2^{nd}$ approaches

| Random Solution | Index of the Solution | $V_{BIAS}$ | $V_{IN}$ | Power       | Area            |
|-----------------|-----------------------|------------|----------|-------------|-----------------|
| 1st Approach    | $306^{th}$ of 4324    | 2.5 V      | 0.65 V   | 21 $\mu$W   | 12.2 $\mu m^2$  |
| 2nd Approach    | $13^{th}$ of 211      | 2.5 V      | 0.65 V   | 21 $\mu$W   | 12.2 $\mu m^2$  |
| 1st Approach    | $26^{th}$ of 4324     | 2.5 V      | 0.6 V    | 3.89 $\mu$W | 29.4 $\mu m^2$  |
| 2nd Approach    | $4^{th}$ of 211       | 2.5 V      | 0.6 V    | 3.89 $\mu$W | 29.4 $\mu m^2$  |

Thirdly, the analysis of the differential stage was carried out with the following specifications: minimum gain 30, 3 dB frequency 10 kHz, slew rate 4 V/$\mu$s and maximum output capacitance 1 pF. Here, the output load was taken as 1 pF because, in the Miller op-amp, the compensation capacitance, whose maximum value is half of the output capacitance ($C_L$), is the output capacitance of the differential stage. The EKV-based approach found 30 solutions when the gain was limited between 35 and 40. The tool estimated the power consumption as 26.4 $\mu$W; thus, the current was 8 $\mu$A for $V_{BIAS}$ = 2.5 V and $V_{OUT}$ = 0.65 V. In Table 5.3, a sample of the lookup table with 7 elements out of 58 is given, because the same biasing voltages ($V_{BIAS}$, $V_{gn}$) have to be selected as determined in Table 5.1. It should not be overlooked that $V_{BIAS}$ is common to both subblocks and that the output voltage of the differential stage is equal to the input voltage of the output stage. Hence, the biasing values of the subblocks must be equal to each other for a robust op-amp design.

Table 5.3. A sample of look-up table for differential pair stage

| $V_{BIAS}$ (V) | $V_{OUT}$ (V) | Gain   | Power ($\mu$W) | Area ($\mu m^2$) |
|----------------|---------------|--------|----------------|------------------|
| 2.5            | 0.65          | 38.826 | 26.4           | 4.77             |
| 2.5            | 0.65          | 38.826 | 26.4           | 4.29             |
| 2.5            | 0.65          | 34.278 | 26.4           | 4.45             |
| 2.5            | 0.65          | 39.975 | 26.4           | 5.73             |
| 2.5            | 0.65          | 39.975 | 26.4           | 5.25             |
| 2.5            | 0.65          | 39.365 | 26.4           | 7.64             |
| 2.5            | 0.65          | 39.784 | 26.4           | 14.49            |

Thus, the first three steps were examined: analysis of the Miller op-amp (by library), analysis of the differential stage (by library) and analysis of the output stage (by Brute-Force with two approaches). Finally, the last two steps were taken into account. The solution sets from step 2 and step 3 were composed and then compared to step 1. In Table 5.4, the solutions from the subblocks and the Miller op-amp are shown. It was observed that the solutions verified the hybrid approach using (5.9).

$$Area_{Miller} = Area_{Diff} + Area_{Out}$$

$$Power_{Miller} = (Current_{Diff} + Current_{Out}) \times Power \ Supply$$

$$Power_{Miller} = (Power_{Diff} + Power_{Out})$$
(5.9)
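As a worked example of (5.9), the subblock values quoted in Tables 5.3 and 5.4 can be composed directly. The sketch below is illustrative only: the 3.3 V supply is an assumption inferred from 26.4 $\mu$W / 8 $\mu$A and is not stated explicitly in the text, and the function names are hypothetical.

```python
SUPPLY_V = 3.3  # assumption: inferred from 26.4 uW / 8 uA, not stated in the text

def compose_area(area_diff_um2, area_out_um2):
    """(5.9): the op-amp area is the sum of the subblock areas."""
    return area_diff_um2 + area_out_um2

def compose_power(i_diff_ua, i_out_ua, supply_v=SUPPLY_V):
    """(5.9): total power in uW from the branch currents in uA."""
    return (i_diff_ua + i_out_ua) * supply_v

# Area ranges and currents of the subblocks as listed in Table 5.4.
area_min = compose_area(4.77, 120.0)
area_max = compose_area(14.49, 135.0)
power = compose_power(8.0, 40.0)

print(f"area range: {area_min:.2f}-{area_max:.2f} um^2, power: {power:.1f} uW")
```

Under these assumptions the composed area range reproduces the upper bound of 149.49 $\mu m^2$ listed for the op-amp in Table 5.4. The composed power of 158.4 $\mu$W is close to, but not identical with, the 161.7 $\mu$W listed in Table 5.1; the text does not itemise the difference.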

Table 5.4. Verification of the results collected from subblocks and Miller op-amp

| Blocks                                           | Differential Pair Stage | Output Stage | BTS OPAMP                                    |
|--------------------------------------------------|-------------------------|--------------|----------------------------------------------|
| Output voltage of $M_3$ / Input voltage of $M_1$ | 0.65 V                  | 0.65 V       | 0.63-0.65 V                                  |
| Gate voltage of $M_7$ and $M_2$ (bias voltage)   | 2.5 V                   | 2.5 V        | 2.48-2.5 V                                   |
| Current ($\mu$A)                                 | 8                       | 40           | 8 for diff. input stage, 40 for output stage |
| Area ($\mu m^2$)                                 | 4.77-14.49              | 120-135      | 124-149.49                                   |

In this example, it was shown that, instead of extracting the PF of the Miller op-amp directly, its subblocks could be used to obtain the optimum solution set of the op-amp. In other words, the subblocks of the system (the op-amp) were adequate for obtaining the overall performance. This statement was verified by comparing the solution set of the op-amp with the composed set of the subblocks. The relation between the subblocks is given in (5.9) for the area-power pair, and the numerical results are given in Table 5.4. Consequently, it can be suggested that the subblocks of a system can be used for PDS exploration instead of exploring the performance space of the main system. In addition, different approaches can be applied to the subblocks in order to speed up the automation process.

Case Study 2: A first-order low-pass filter (LPF) was taken as a case study and divided into its subblocks to demonstrate ML-PE. A Miller op-amp was used instead of an ideal op-amp, as shown in Figure 5.5, so that the LPF was considered as a three-layer hierarchical system, illustrated in Figure 5.6. In many applications, feedback loops can be observed, as in the Miller op-amp. In our example, the compensation capacitance ($C_C$) creates a feedback path inside the Miller op-amp. Therefore, performance information must be allocated to the subsystems considering the feedback path and the relation between subblocks as given in Figure 5.6.

Figure 5.5. $1^{st}$ order LPF, Miller op-amp is used.

Firstly, the PFs of the differential input stage and the output stage (subblocks of the Miller op-amp) were composed for the next upper layer (Miller and RC part), since the PFs of these subblocks had already been extracted in the previous example. In this example, performance was examined for the gain-power pair. However, the gain parameter was considered in two forms: open-loop gain (*Gain*) and closed-loop gain (*CLG*). The open-loop gain was used for the Miller op-amp and the closed-loop gain was used for the LPF. Hence, the x-axis of Figure 5.7a is the open-loop gain, and the closed-loop gain is used for the RC part as in Figure 5.7b. The relation between the subblocks is defined in (5.10). After that, the PF of the RC part was extracted using the Brute-Force approach, as for the output stage of the Miller op-amp. However, there is no need for H-spice simulation to generate the RC part. Instead, the resistor and capacitor values were varied while considering the dc-gain and bandwidth constraints. The PF of the RC part is illustrated in Figure 5.7b. Then, the composition method proceeded, and Figure 5.7a and Figure 5.7b were combined in order to obtain the PF of the LPF for the gain-power pair.

Figure 5.6. Hierarchical top-down decomposition bottom-up composition.

$$PF_{power}(LPF) = PF_{power}(Miller) \oplus PF_{power}(RCpart)$$

$$PF_{gain}(LPF) = PF_{gain}(Miller) \otimes PF_{gain}(RCpart)$$

The $\oplus$ and $\otimes$ operations can be defined as:

$$\oplus \Rightarrow \text{composed\_set\_of\_power} = power\_set\_of\_Miller + power\_set\_of\_RCpart$$

$$PF_{power}(LPF) = \text{dominance}(\text{composed\_set\_of\_power})$$

$$\otimes \Rightarrow \text{composed\_set\_of\_gain} = \frac{gain\_set\_of\_Miller \times gain\_set\_of\_RCpart}{gain\_set\_of\_Miller + 1 + gain\_set\_of\_RCpart}$$

$$PF_{gain}(LPF) = \text{dominance}(\text{composed\_set\_of\_gain})$$

where the Cartesian product and addition were used.

(5.10)
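The composition in (5.10) can be sketched for the gain-power pair as follows. The two input fronts are hypothetical values chosen for illustration only, and the function names are not taken from the tool; gain is maximised and power is minimised.

```python
from itertools import product

def closed_loop_gain(a_miller, g_rc):
    """Gain composition as written in (5.10)."""
    return a_miller * g_rc / (a_miller + 1.0 + g_rc)

def dominance(points):
    """Keep the (gain, power) points that no other point beats
    with a gain at least as high and a power at least as low."""
    return sorted(
        p for p in points
        if not any(q != p and q[0] >= p[0] and q[1] <= p[1] for q in points)
    )

def compose_lpf(pf_miller, pf_rc):
    """Cartesian composition of two (gain, power) fronts followed by dominance."""
    composed = {
        (closed_loop_gain(gm, gr), pm + pr)
        for (gm, pm), (gr, pr) in product(pf_miller, pf_rc)
    }
    return dominance(composed)

# Hypothetical (gain, power in uW) fronts, for illustration only.
pf_miller = [(2000.0, 90.0), (3000.0, 120.0), (3800.0, 160.0)]
pf_rc = [(100.0, 5.0), (120.0, 8.0)]

pf_lpf = compose_lpf(pf_miller, pf_rc)
for gain, power in pf_lpf:
    print(f"closed-loop gain {gain:7.2f}, power {power:6.1f} uW")
```

For positive gains the composed gain increases with either input gain, so the dominance relation is preserved and only the two fronts, rather than the full data sets, need to be combined.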

The PF of the main system and a comparison are given in Figure 5.7c. The PDS of the LPF was explored by symbolic analysis and compared with the result from ML-PE. In addition, the estimated results from ML-PE were simulated in H-spice, and their accuracy was also verified. In other words, the PF of the main block was verified in two ways:

- The composition process was verified by a comparison between the simulation results (H-spice) and the estimation results gathered from the composed PFs. The net-list of the LPF was extracted and is given in Figure 5.8a for the minimum power value of the LPF (98 $\mu$W, which can be seen in Figure 5.7c). The output file is given in Figure 5.8b. In Table 5.5, the performance parameters of the composed PF are compared with the simulation results. The comparison showed that the approach produced good estimation results. Consequently, both the performance of the LPF and its design with subblocks were obtained and verified.
- In addition, the approach was validated by comparing the results with a different method, symbolic analysis. The equivalent circuit of the LPF was derived considering a two-stage op-amp (like the Miller op-amp), as given in Figure 5.9.

Figure 5.7. PF composition process of LPF a) PF(Miller) for gain-power b) PF(RC-part) for gain-power c) PF(LPF-composition) for gain-power

Figure 5.8. (a) Net-list file of LPF, (b) Output file of LPF.

Table 5.5. AC performance comparison of LPF

| Performance        | Composed PF | H-spice |
|--------------------|-------------|---------|
| Power ($\mu$W)     | 96.3        | 97.75   |
| Gain (Closed Loop) | 98          | 114     |
| BW (Closed Loop)   | 20 kHz      | 24 kHz  |

The transfer function and gain expressions are given in (5.11), where $E_i$ and $Rout_i$ are the gain and output resistance of stage $i$, respectively ($i = 1, 2$). The symbolic expressions were embedded into a Matlab code and, while the variables of the expressions were altered, the performance values were calculated considering the constraints given in (5.12).

$$\frac{V_{out}}{V_{in}} = -\frac{a_0 + a_1 s + a_2 s^2}{b_0 + b_1 s + b_2 s^2 + b_3 s^3}$$

$$DC_{Gain} = \frac{a_0}{b_0} = -\frac{E_1 \times E_2 \times R_2 - R_{out2}}{E_2 \times E_1 \times R_1 + R_{out2} + R_2 + R_1}$$
(5.11)

In other words, the PDS was explored by symbolic analysis and then the PF was obtained. However, obtaining the design space of the LPF from symbolic analysis took 15 hours, whereas obtaining the PFs of the subblocks of the LPF and composing them took half an hour. Approximately similar results were obtained, as shown in Figure 5.7c.



Figure 5.9. Equivalent circuit of LPF.

$$E_1 \times E_2 \ge 1000$$

$$R_2/R_1 \ge 100$$

$$1/(2 \times \pi \times R_2 \times C_1) \ge 10kHz$$

$$DC_{current} \ge SR \times C_L$$
(5.12)
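The symbolic exploration can be illustrated with a small sketch that evaluates the DC gain of (5.11) over a grid and keeps only the combinations that satisfy (5.12). The sweep values, the fixed component values and the function names are hypothetical and serve only to show the procedure; SI units are used throughout.

```python
import math
from itertools import product

def dc_gain(e1, e2, r1, r2, rout2):
    """DC gain expression from (5.11)."""
    return -(e1 * e2 * r2 - rout2) / (e2 * e1 * r1 + rout2 + r2 + r1)

def satisfies_constraints(e1, e2, r1, r2, c1, i_dc, sr, c_load):
    """Constraints (5.12): stage gains, resistor ratio, bandwidth and slew rate."""
    return (
        e1 * e2 >= 1000
        and r2 / r1 >= 100
        and 1.0 / (2 * math.pi * r2 * c1) >= 10e3
        and i_dc >= sr * c_load
    )

# Hypothetical sweep: stage gains E1, E2 and resistors R1, R2 (ohm).
candidates = product([20.0, 40.0], [50.0, 100.0], [1e3, 2e3], [1e5, 2e5])
table = [
    (e1, e2, r1, r2, dc_gain(e1, e2, r1, r2, rout2=1e4))
    for e1, e2, r1, r2 in candidates
    if satisfies_constraints(e1, e2, r1, r2, c1=50e-12,
                             i_dc=10e-6, sr=4e6, c_load=2e-12)
]

for row in table:
    print(row)
```

Each surviving row is one point of the design space; applying the dominance rule to the resulting performance values would then give the PF that is compared with the composed one.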

# 6. APPLICATIONS OF THE PROPOSED APPROACH

In this chapter, the main purpose is to show the utilization of the proposed approach in ADA systems. Therefore, different applications were taken into account. In the first example, the PE library was embedded into a sigma-delta ADC design automation tool. The subblocks of the system were already defined and only the op-amps were designed by the library, so this example was considered as SL-PE with a topology selector [32]. Then, the proposed approach was applied to a flash ADC system, which is a two-level system; PF extraction and the composition procedure were also examined in this second example. In the last example, a pipelined ADC, which is a four-level system, was taken into account. Details can be found in the following sections.

## 6.1. A Sigma-Delta ADC design automation tool with embedded PE

The design of a sigma-delta ADC system, carried out at the system level, is automated by utilizing the PE feature. The library part of the proposed approach was used in this example as an embedded PE. Two different op-amp topologies and a comparator were selected from the library and embedded into the system. As mentioned before, PE takes performance parameters such as gain, bandwidth and slew rate from the ADC designer part and estimates the area and power consumption values for architecture selection. Hence, this tool takes advantage of PE models, which are crucial for effective ADC design where a more accurate search of the design space is required. Communication between the blocks is illustrated in Figure 6.1. In this example, the architecture was determined before the design could proceed further. A coarse SNDR calculation was carried out, as well as area and power consumption estimations, in order to evaluate the performance of the ADC. The area and power values were estimates provided by the PE module of the system. On the other hand, SNDR could be calculated for ideal elements in the earliest stage and then modified by including non-linearity. SNDR of the ideal system can be calculated as given in [32].



Figure 6.1. Communication diagram of blocks.

The flowchart of the design automation system is given in Figure 6.2. The blocks marked with '\*' in Figure 6.2 are optional and may be skipped by the user. There are three different modes of operation in the tool. Details of the operation modes can be found in [48]. Here, the focus is on the usage of the PE tool.

In the second mode, semi-custom operation, the user specifies a library in which the characteristics of the blocks (which may have multiple versions) in an ADC are given. However, in some cases only some of the blocks are suitable for the desired performances, and the system searches the design space to calculate the required parameters for the other blocks. For example, if only comparators are available in the library, the system searches for amplifier parameters which satisfy the desired ADC specifications. However, to estimate amplifier parameters such as area and power, a PE is required. PE is again utilized in the last mode of operation, which is useful when there is no solution with the given libraries. One of the most important features of a design automation tool is the ability to find the best architecture for the provided specifications, so topology selection is significant in design automation systems. In this example, the performance of the topologies can be determined by SNDR, area and power consumption. In our approach, a performance estimation tool which works at the system level and communicates with system-level tools was developed. This tool provides device-level information to system-level tools and can work quite fast. With the utilization of PE, candidate solutions may be evaluated at the system level and accurate architecture selection becomes possible.



Figure 6.2. Flowchart of the algorithm.

Area and power estimates are the outputs of the PE tool, and the calculations are as follows:

• The power consumption sources can be separated into four parts: comparator, integrator, DAC, and digital circuit. Most of the power will be dissipated by the amplifier in the switched-capacitor integrator. The power consumed by the digital part can be estimated in a coarse manner. The order of the system and the structure of the cascade can be used to estimate the number of latches and blocks of the filter. Thus, a macromodel can be formed which defines the power consumption of filters of different orders. This macromodel was developed by performing simulations, and the measured data were then fitted to the macromodel. The power of the other blocks is estimated via the PE.

• The widths and lengths of all transistors are calculated and the total area of the modules can be estimated by PE. In order to calculate the area of the remaining components, the information in the technology file such as unit capacitance per area or areas of digital standard cells is used. The area of capacitors, value of which is already defined by the tool, is calculated by utilizing the unit capacitance per area data. If any module is used from the library, the area value of the module is taken directly from the library data. Total area can be calculated by adding the areas of the amplifiers, comparators, digital circuitry, capacitors, and switches.

The accuracy of the PE models was verified in the previous sections. The models of both the op-amp and the comparator were validated with H-spice simulations. Furthermore, the utilization of PE was examined with three examples.

Example 1: The second operation mode of the Sigma-Delta ADC design tool was chosen because it contains both library-oriented design and automated design. The integrator was designed automatically with the amplifiers which are already in PE, and the comparators were chosen from the library. The amplifier types utilized by the PE throughout this example are BTS and cascode op-amps. The desired specifications of the design example are tabulated in Table 6.1. One major constraint was that the solutions were restricted to a single bit. The algorithm searches for candidate architectures with the given specifications. The candidate architectures were 2-1, 2-2 and 2-2-1. Then, the candidate architectures which satisfy the desired SNDR performance for an OSR of 64 were sent to the PE, and area and power values were estimated. For this example, it was assumed initially that only Miller (BTS) op-amps were available.

Table 6.1. Input parameters of the Sigma-Delta automation tool

| Parameters     | SNDR (dB) | Supply Voltage (V) | OSR | Resolution                                |
|----------------|-----------|--------------------|-----|-------------------------------------------|
| Desired values | 86        | 5                  | 64  | 14 bits, calculated from the given SNDR   |

The algorithm uses an array structure called a *worm*, which represents one solution for the ADC. The *worm* contains all information such as *OSR*, supply voltage, configuration, noise values, *SNDR*, capacitor values, and block parameters such as amplifier gain and comparator offset. These *worms* are passed to functions which modify their relevant values, such as quantization noise or *SNDR*. For the semi-custom operation mode that is used in this example, *worms* were generated for the 2-1, 2-2 and 2-2-1 configurations. The initial number of worms depends on the number of elements in the library. The amplifier library was not used in the example, thus the comparator library defines the number of initial worms. Then, the user may select the range of parameters for the amplifier and other blocks. For our run, ranges for the gain of the amplifier and the capacitor values in the integrator were set.

The worms were sent to noise estimation functions, which calculate the SNDR of each worm. As a result, the user has many solutions which satisfy the specified SNDR performance. The algorithm generated 846 candidate solutions, and PE could find a solution for 567 of them. Thus, 67% of the candidate solutions were feasible. The solutions are listed according to their SNDR values by the software. Several solutions with different structures may exist for the same SNDR, in which case the optimum solution (in terms of area, power, or a combination of these two) should be selected by the user.
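The *worm* record and the SNDR-based listing described above can be pictured with the following Python sketch. It is only an illustration under stated assumptions: the field names, the descending sort order and the helper names are hypothetical and do not reproduce the data layout of the actual tool.

```python
from dataclasses import dataclass, field

@dataclass
class Worm:
    """One candidate ADC solution (field names are illustrative assumptions)."""
    configuration: str           # e.g. "2-1", "2-2", "2-2-1"
    osr: int
    supply_voltage: float
    sndr_db: float = 0.0         # filled in by the noise estimation step
    block_params: dict = field(default_factory=dict)  # e.g. amplifier gain

def select_by_sndr(worms, min_sndr_db):
    """Keep worms meeting the SNDR target; descending order is an assumption."""
    kept = [w for w in worms if w.sndr_db >= min_sndr_db]
    return sorted(kept, key=lambda w: w.sndr_db, reverse=True)

def feasible_fraction(n_feasible, n_candidates):
    """Share of candidates for which the PE found a solution."""
    return n_feasible / n_candidates

# Feasibility ratio reported in the text: 567 of 846 candidates
print(round(100 * feasible_fraction(567, 846)))  # -> 67
```

Keeping all block parameters inside one record is what allows the same worm to be passed unchanged from the noise functions to the PE.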

Example 2: In this example run, the candidate architectures may contain two types of op-amps for different stages because PE had both cascode and BTS modules. In the first stages, higher gain solutions were obtained; therefore, power consumption for the first stage was higher than for the following ones. On the other hand, lower gain solutions were selected for the other stages. In addition, slew-rate limitation is very important in amplifier selection. Since the current charging the load capacitor is calculated from the minimum slew rate, the power dissipation is directly proportional to this current, as mentioned in Section 3.3. Moreover, design and circuit parameters of the op-amps, such as W and L of the transistors and biasing values, were obtained, whereas most behavioral PE models are not applicable at the device level and their implementation as such is not simple [49].

In Figure 6.3a, SNDR vs. power results for some cascode solutions are presented. In these solutions, the OSR is 128. Here, PE only uses cascode op-amp structure. Two candidate structures were compared and it was observed that 2-1 structure had the same SNDR specs with lower power as opposed to the 2-2 structure for low SNDR. For more demanding SNDR values, the op-amp was pushed to its limits in the 2-1 structure and power increased sharply. PE tool can also generate PF for performance parameters



Figure 6.3. SNDR versus power.

such as the *SNDR*-power pair, which is illustrated in Figure 6.3b. In this example, *OSR* was limited to between 20 and 64 and both Miller and cascode op-amps were used. Values of the worm were generated for different configurations, and the lowest power consumption value was achieved for the 2-1 structure with an *OSR* value of 20. The tool selected the cascode op-amp for the first stage and BTS for the rest. The highest *SNDR* value was achieved for the 2-1-1 cascaded configuration.

The examples show that the developed tool may be used to generate Pareto-optimal curves [50] for desired specifications. The design space from which these curves are generated has many dimensions, and the PE connects performance values to real circuit parameters. Hence, not only is optimization through the SNDR vs. power curve possible, but area can also be incorporated to select the most appropriate configuration.

Example 3: In the library, there are three different op-amp topologies: Miller op-amp, telescopic op-amp and folded cascode op-amp. However, in this example, only the Miller op-amp and the folded cascode op-amp were used. As mentioned before, the selected candidates had large gain solutions in the first stage of the structure for the given specifications. Therefore, it was assumed that a designer evaluated which op-amp provided the best performance tradeoff, i.e. the gain-power pair, using Pareto optimum analysis. The PDS of each op-amp was obtained for the same input specifications, given in Table 6.2. In Figure 6.4, the PDS of each op-amp is illustrated for the gain-power pair.

Table 6.2. Input parameters of the op-amp topologies

| Input Set      | Gain     | BW      | SR           | $C_L$ |
|----------------|----------|---------|--------------|-------|
| Minimum values | 2000 V/V | 2.5 kHz | $5\ V/\mu s$ | 4 pF  |

The solid-lined rectangle represents the PDS of the folded cascode op-amp and the dashed-lined rectangle represents the PDS of the Miller op-amp. PF extraction was applied to Figure 6.4 and the resulting curves are shown in Figure 6.5. It is observed that the cascode op-amp should be selected for gain values higher than 3800, as found in the previous example. On the other hand, the Miller op-amp can be preferred up to a gain of 3800 because its power consumption is lower than that of the cascode one. In addition, for gain values higher than 5000, neither op-amp is applicable since power consumption increases sharply. As a result, an optimum topology can be selected among the competing ones according to the desired performance evaluations of the given system.

Consequently, a Sigma-Delta ADC design automation system which operates at the system level and efficiently selects the most appropriate architecture for the desired specifications is given. PE is utilized not only for the selection of the best architecture but also for obtaining accurate estimates of power and area together with the circuit and design parameters of the blocks that are used in the ADC. Hence, PE provides end-users with an insight into the capability of the system. The estimator uses EKV models in order to produce very fast estimates while keeping the accuracy needed for adequate architecture selection. As a result, a suitable tradeoff between execution time and accuracy is achieved.



Figure 6.4. PDS of op-amps for gain-power pair.



Figure 6.5. PF of op-amps for gain-power pair.
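The PF extraction and topology comparison used in Example 3 can be sketched in Python as follows. This is a generic non-dominated filter for a (gain, power) pair in which gain is maximized and power is minimized; it is given only as an illustration and is not claimed to be the extraction routine of the tool. The function names and the sample points are hypothetical.

```python
def pareto_front(points):
    """Return the non-dominated (gain, power) points.

    A point is kept if no other point has higher-or-equal gain
    together with lower-or-equal power.
    """
    front, best_power = [], float("inf")
    # Sweep from the highest gain down, tracking the lowest power seen so far
    for gain, power in sorted(points, key=lambda p: (-p[0], p[1])):
        if power < best_power:
            front.append((gain, power))
            best_power = power
    return sorted(front)

def cheapest_topology(fronts, min_gain):
    """Pick the topology whose front meets min_gain with the lowest power."""
    best = None
    for name, front in fronts.items():
        for gain, power in front:
            if gain >= min_gain and (best is None or power < best[1]):
                best = (name, power)
    return best

# Hypothetical PDS samples: (gain in V/V, power in mW)
fronts = {
    "miller": pareto_front([(2000, 1.0), (2500, 3.0), (3000, 1.5), (4500, 6.0)]),
    "cascode": pareto_front([(2000, 2.0), (3800, 2.4), (4500, 3.0)]),
}
print(cheapest_topology(fronts, min_gain=2500))
print(cheapest_topology(fronts, min_gain=4000))
```

With these made-up samples the first query selects the Miller op-amp and the second selects the cascode op-amp, which mirrors the kind of crossover discussed above without reproducing the actual curves of Figure 6.5.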

## 6.2. Designing F-ADC system with proposed PE

F-ADC is known as the fastest ADC because the conversion takes place in a single cycle; hence the name *Flash*. The block diagram of an n-bit F-ADC is shown in Figure 6.6. In this example, a two-level PE is considered since the Flash ADC consists of only comparators and a resistor string. The analog input signal is applied simultaneously to the $2^n - 1$ latching comparators, which are already in the library; n represents the number of bits. The individual reference voltages are derived from a resistive voltage divider string. With the analog signal ($V_{in}$) applied to the F-ADC, each comparator compares the signal level to its reference level. The resolution of F-ADCs tends to be limited to eight bits because the amount of circuitry doubles every time the resolution is increased by one bit. Even though the design of F-ADCs is highly repetitive in its structure, it demands a high level of matching between the parallel comparators. One of the major contributors to the nonlinearities is the comparator offset voltage. The offset should be less than $\pm 1/2$ LSB in order not to degrade the monotonicity of the converter.

First of all, the power and area calculations of the ADC are given. The power of the resistor string and of the comparators is calculated as follows:



Figure 6.6. Block diagram of F-ADC.

$$P_R = V_{ref}^2 / (2^n \cdot R) \tag{6.1}$$

$$P_{comp} = (2^n - 1) \cdot f \left( I_{sup}, V_{sup} \right)_{single \ comp} \tag{6.2}$$

Area estimation of the resistor string requires technology parameters. $R_S$ is the sheet resistance of the hpoly (high-resistance poly) layer, read from the technology file. w and l are defined by the guidelines of the given technology file.

$$R = (w/l) \cdot R_S \text{ where } A_{res} = w \cdot l \cdot 2^n \tag{6.3}$$

Area estimation of the comparators is  $2^n - 1$  times area of a single comparator.

$$A_{comp} = f(W, L)_{single\ comp} \cdot \left(2^n - 1\right) \tag{6.4}$$
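The power and comparator area estimates of (6.1), (6.2) and (6.4) can be evaluated with a few lines of Python. The sketch below is an illustration only. It assumes that $f(I_{sup}, V_{sup})$ is the static product $I_{sup} \cdot V_{sup}$ and that the area of a single comparator is already available from the library; the function and argument names are hypothetical.

```python
def resistor_string_power(v_ref, n_bits, r_unit):
    """(6.1): power of a string of 2^n unit resistors of value r_unit."""
    return v_ref**2 / (2**n_bits * r_unit)

def comparator_bank_power(n_bits, i_sup, v_sup):
    """(6.2), assuming f(I_sup, V_sup) = I_sup * V_sup for one comparator."""
    return (2**n_bits - 1) * i_sup * v_sup

def comparator_bank_area(n_bits, single_comp_area):
    """(6.4): 2^n - 1 times the area of a single comparator."""
    return (2**n_bits - 1) * single_comp_area

# Illustrative 8-bit case (supply current and voltage are assumptions)
n = 8
print(2**n - 1)                                  # number of comparators
print(comparator_bank_power(n, 100e-6, 3.3))     # total comparator power, W
print(resistor_string_power(1.0, n, 500.0))      # resistor string power, W
```

Because the comparator count grows as $2^n - 1$, the comparator terms quickly dominate both estimates as the resolution increases.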

In addition, there are error sources which have to be taken into account. One of them is the mismatch between resistors [51]. It is approximately defined in (6.5) where m is the mismatch parameter.

$$e_m = (V_{ref} \cdot 2^n) \cdot 0.03 \cdot m \tag{6.5}$$

The second one is the thermal noise in the resistor string, defined in (6.6), which generates a voltage $e_t$. K is the Boltzmann constant and T is the temperature. In addition, the offset voltage of the comparator, $V_{os}$, has to be considered; it is defined in (6.9). Finally, the sum of these voltages should not exceed half of the LSB, and thus a constraint is defined in (6.7).

$$e_t = \sqrt{4 \cdot K \cdot T \cdot R \cdot BW_{comp}} \tag{6.6}$$

$$e_m + e_t + V_{os} < V_{ref}/2^{n+1}$$
(6.7)

The SNR value should be checked for different n values in order to verify whether or not n satisfies the given SNR with respect to (6.8). In other words, the given SNR limits the number of bits [52].

$$SNR = n \cdot 6.02 + 1.76 \ dB$$
 (6.8)
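Inverting (6.8) gives the smallest number of bits that can satisfy a given SNR. The short Python sketch below illustrates this step; the function names are hypothetical and the rounding up to the next integer is the only operation added to (6.8).

```python
import math

def ideal_snr_db(n_bits):
    """(6.8): ideal SNR of an n-bit converter in dB."""
    return 6.02 * n_bits + 1.76

def min_bits_for_snr(snr_db):
    """Smallest integer n for which (6.8) meets the required SNR."""
    return math.ceil((snr_db - 1.76) / 6.02)

for snr in (20, 65):
    n = min_bits_for_snr(snr)
    print(snr, n, ideal_snr_db(n))
```

For an SNR of 20 dB this gives n = 4, and for 65 dB it gives n = 11, so the required SNR directly fixes the lower bound of the resolution sweep.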

F-ADC is fast enough to give the response in one clock cycle. Thus, the delay is one clock cycle for this topology and the clock frequency, which is defined by a latch at the output of the comparator, is the only factor in speed estimation. The F-ADC example can be considered as an ML-PE, as shown in Figure 6.7. Since the model of the comparator is available in the library, only the resistor string is taken into account. Brute-Force modeling can be used for the resistor string. Resistor values are varied from an initial value to a maximum value. In addition, $V_{ref}$ is swept from a minimum level to a maximum voltage level. For instance, the supply voltage can be selected as the maximum voltage level. Furthermore, a relation between the resistor string and the



Figure 6.7. Flash ADC system blocks.

comparator is required for different performance parameter evaluations because only area and power (6.1-6.4) can be calculated using resistor and voltage values. Therefore, offset voltage of the comparator is used as given in (6.9).

$$V_{os} = f(W, L, biasing, mismatch \ parameters)$$
(6.9)

Substituting (6.9) into the constraint given in (6.7) yields a relation between the comparator and the resistor string, as given in (6.10).

$$V_{ref} = g\left(e_m, e_t, f\left(W, L, biasing, mismatch \ parameters\right)\right)$$
(6.10)

Moreover, the bandwidth of the comparator is limited by the resistor string and the input capacitance of the comparator, as illustrated in Figure 6.8. An Elmore delay calculation is required for the bandwidth estimation of the resistor string part, as given in (6.11). The input capacitance of the comparator $(C_{in})$ depends on the W and L values of the input transistors. This is significant because input voltages have to be applied to a comparator without any decrease in their amplitude. Therefore, the bandwidth (clock frequency) of the comparator should not exceed (6.11). If the clock frequency of a comparator is higher than the f3dB value given in Figure 6.8, the amplitude of the input voltage decreases. As a result, the comparator may give incorrect results.

$$f3dB = \frac{1}{2 \cdot \pi \cdot \frac{n \cdot (n-1)}{2} \cdot R \cdot C_{in}}$$
(6.11)
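The bandwidth limit of (6.11) can be evaluated directly once the unit resistor and the comparator input capacitance are known. The following Python sketch follows (6.11) term by term; the function names and the sample values of R and $C_{in}$ are assumptions made for illustration.

```python
import math

def r_string_f3db(n, r_unit, c_in):
    """(6.11): Elmore-delay based -3 dB frequency of the resistor string."""
    tau = n * (n - 1) / 2 * r_unit * c_in
    return 1.0 / (2 * math.pi * tau)

def clock_is_safe(f_clk, n, r_unit, c_in):
    """True if the comparator clock does not exceed the limit of (6.11)."""
    return f_clk <= r_string_f3db(n, r_unit, c_in)

# Illustrative values: n = 4, R = 500 ohm, C_in = 20 fF (assumed)
print(r_string_f3db(4, 500.0, 20e-15))
print(clock_is_safe(50e6, 4, 500.0, 20e-15))
```

Since the time constant grows with n(n-1)/2, raising n or the unit resistance lowers the admissible clock frequency.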

Assumed input parameters to the system are listed in Table 6.3. In this example, subblocks of the system are determined before the design can proceed further. Most of the parameters are related to comparator model. *SNR*, gain and bandwidth parameters



Figure 6.8. Elmore delay assumption.

Table 6.3. Input parameters for F-ADC system

| Input parameters | SNR (dB) | Gain (V/V) | BW (MHz) | Input resolution (mV) | Number of bits |
|------------------|----------|------------|----------|-----------------------|----------------|
| F-ADC            | 20       | 3          | 50       | 2 - 14                | 4 - 8          |

| ENOB  | Comparator: Gain (V/V) | Comparator: BW (MHz) | Comparator: Input resolution (mV) | Resistor string: Resistor value $(\Omega)$ | Resistor string: Reference voltage (V) |
|-------|------------------------|----------------------|-----------------------------------|--------------------------------------------|----------------------------------------|
| (6.7) | 3                      | 50                   | 2 - 14                            | 500 : 100 : 2000                           | $V_{ref} < 3.3$                        |

are the minimum values for F-ADC. While extracting the model, the performance design space is extended with the input resolution and number of bits data set. Then, the input set is distributed to the subblocks. The comparator model uses W, L, inversion coefficient, biasing voltages and current as key parameters. The minimum value of the effective number of bits (ENOB) is calculated for the given SNR value. In other words, n is lower bounded by the given SNR. The upper bound can be 8 or 10 bits. In Figure 6.9, the PFs of the subblocks are illustrated for the bandwidth-power performance pair. It is assumed that there is a 20 fF capacitance at the output node. PFs were obtained in 3.4 and 1.2 minutes for the comparator and the resistor string, respectively. First of all, the comparator was modeled



Figure 6.9. PF of the subblocks (a) Comparator (b) R-string

and PDS was obtained. Secondly, according to the estimated $V_{os}$ and bandwidth values, the minimum $V_{ref}$ was determined for the Brute-Force approach of the R-string, as given in (6.12).

$$V_{ref} = 2 \cdot (V_{os} + e_m + e_t) \cdot 2^n \tag{6.12}$$

$$\min(V_{ref}) = 2 \cdot \min(V_{os} + e_m + e_t) \cdot 2^n = 2 \cdot \left[\min(V_{os}) + \min(e_m) + \min(e_t)\right] \cdot 2^n$$
(6.13)
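A small numeric sketch may help to see how (6.6) and (6.12) fix the starting point of the $V_{ref}$ sweep. The Python fragment below is an illustration under stated assumptions: a temperature of 300 K is assumed, and the offset, mismatch and bandwidth values are hypothetical rather than taken from the PDS of the comparator.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K

def thermal_noise_voltage(r_ohm, bw_hz, temp_k=300.0):
    """(6.6): RMS thermal noise voltage of the resistor string."""
    return math.sqrt(4 * K_BOLTZMANN * temp_k * r_ohm * bw_hz)

def min_reference_voltage(v_os, e_m, e_t, n_bits):
    """(6.12): reference voltage for which the error sum equals half an LSB."""
    return 2 * (v_os + e_m + e_t) * 2**n_bits

e_t = thermal_noise_voltage(500.0, 50e6)      # R = 500 ohm, BW = 50 MHz
v_ref = min_reference_voltage(v_os=5e-3, e_m=2e-3, e_t=e_t, n_bits=4)
print(e_t, v_ref)
```

In this made-up case the thermal noise is in the microvolt range, so the offset and mismatch terms set the minimum reference voltage almost entirely.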

Initial reference voltage value for the resistor string can be obtained using minimum offset voltage and minimum bandwidth values which are extracted from PDS of the comparator as defined in (6.13). Also, bandwidth calculation of the resistor string requires input capacitance values and number of effective bits of the comparator as given in (6.11). PDS exploration of resistor string can be illustrated as given in Figure 6.10. Moreover, other performance parameters of the optimum design can be extracted

Figure 6.10. Searching PDS of the resistor string.

as it is given in Table 6.4. In other words, the resolution of F-ADC, delay, input resolution and gain of the comparator are obtained for the Pareto points of the design. Minimum n and maximum input resolution values are obtained, as expected. When n is increased, power consumption rises and bandwidth values decrease if a small input resolution is desired. In order to obtain the PF of F-ADC, the PFs of the subblocks have to be composed.

Table 6.4. Other performance parameters for PF of BW vs. Power pair

| Number of Pareto points | Gain (V/V) | Reference voltage, min. values (mV) | Input resolution (mV) | Number of bits |
|-------------------------|------------|-------------------------------------|-----------------------|----------------|
| 8 Points                | 5 - 10     | 250 - 360                           | 14                    | 4              |

Therefore, we have to define composition operations for both performance parameters as follows:

$$PF_{power} (F - ADC) = PF_{power} (Comparator) \oplus PF_{power} (R - string)$$
$$PF_{BW} (F - ADC) = min (PF_{BW} (Comparator), PF_{BW} (R - string))$$

Power values of the comparator dominate the power consumption of the system, as seen in Figure 6.9. Furthermore, bandwidth values of the comparator are smaller than those of the resistor string, and thus the x-axis of Figure 6.9a can be used as the bandwidth of the system due to (6.11). In other words:

$$PF_{BW}(F - ADC) = min(PF_{BW}(Comparator), PF_{BW}(R - string)) = PF_{BW}(Comparator)$$
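The two composition operations can be sketched in Python as follows. The sketch assumes that $\oplus$ denotes the addition of the power values of the two subblocks and that every Pareto point of the comparator may be paired with every Pareto point of the R-string; both are assumptions made for illustration, as are the function names and the sample points.

```python
from itertools import product

def pareto_bw_power(points):
    """Non-dominated (bw, power) points: maximize bw, minimize power."""
    front, best_power = [], float("inf")
    for bw, power in sorted(points, key=lambda p: (-p[0], p[1])):
        if power < best_power:
            front.append((bw, power))
            best_power = power
    return sorted(front)

def compose_fadc(pf_comparator, pf_rstring):
    """Compose subblock fronts: BW = min of the pair, power = sum of the pair."""
    combined = [
        (min(bw_c, bw_r), p_c + p_r)
        for (bw_c, p_c), (bw_r, p_r) in product(pf_comparator, pf_rstring)
    ]
    return pareto_bw_power(combined)

# Hypothetical fronts: (BW in MHz, power in uW)
pf_comp = [(100, 5000), (300, 15000)]
pf_rstr = [(1000, 100), (2000, 400)]
print(compose_fadc(pf_comp, pf_rstr))
```

In this made-up case the composed front keeps the comparator bandwidths unchanged, in line with the observation that the comparator limits the bandwidth of the system.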

Moreover, F-ADC can also be modeled and implemented into the library. PE of F-ADC is straightforward since the comparator model is already present. Only a resistor string is added into the model with some additional constraints. The block diagram of the F-ADC model is illustrated in Figure 6.11. The input file consists of gain, SNR, clock frequency and resistor value. Inversion coefficient, length of transistors, number of bits, biasing values, input resolution, etc. are swept for feasible design space exploration subject to the constraints given in Figure 6.11. All performance, design and circuit parameters are presented in different look-up tables as output files. PDS



Figure 6.11. Modeling diagram of F-ADC.

of F-ADC is obtained in 50 minutes on a Pentium 2.5 GHz machine and the PF for the bandwidth-power pair is extracted in 3 minutes using the input parameters given in Table 6.3. The composition process can be verified by comparing the PF of F-ADC obtained from the model in Figure 6.11 with the PF of the composition obtained from Figure 6.9. Both of them are illustrated on the same graph in Figure 6.12. The composition process takes 50 seconds.

It can be concluded that both PFs are similar to each other. Only a small difference is observed at around 15 mW power value. This is because, while extracting the Pareto points of the R-string using the table-based method, the incremental steps of the resistor and gain values are larger than the pre-defined values of the design plan of F-ADC. Therefore, a point is missing in the composition procedure because the missing point may lie between the sampled values of the resistor. Power consumption increases dramatically when bandwidth values higher than 300 MHz are demanded.

The number of comparators in F-ADC depends directly on the number of bits. For instance, 255 comparators are required for an F-ADC with 8-bit resolution. As a result, both power and area increase according to (6.2) and (6.4). Therefore, an optimum solution is required. In Figure 6.13, the PFs of the subblocks and the PF of F-ADC are given. The PDS and PF of the subblocks are illustrated on the same graph. The composition process is again compared to the F-ADC model in order to verify the proposed approach. Consequently, similar PFs are observed, as expected from the approach.



Figure 6.12. PF of F-ADC for BW vs. Power.



Figure 6.13. (a) PF of comparator for Area-Power, (b) PF of R-string for Area-Power, (c) PF of F-ADC for Area-Power

## 6.3. Pipelined ADC automation system with proposed PE

The conceptual block diagram of a generic pipeline ADC is shown in Figure 6.14, consisting of a number of consecutive stages which are similar in their function. Each stage contains an F-ADC and a multiplying DAC (MDAC), except the last one, which only contains an F-ADC. The first stage takes a sample of the input voltage, performs a coarse A/D conversion, and generates the residue signal for the following stages. There are NoS stages, each of which quantizes N bits. As a result, the overall resolution is NoS times N. Each stage samples the output from the previous stage and converts it



Figure 6.14. General architecture of a NoS-stage P-ADC.

to digital codes using F-ADC. Then, the digital codes are converted to analog signals by DAC and subtracted from the sampled signal. Subsequently, the residue is amplified by a gain of G. Next, the output register combines the output bits from each stage starting from the first one as MSB and gives the final digital codes.

MDAC is used instead of three components; DAC, sample and hold, and residue amplifier. A typical switched-capacitor MDAC, which has two clock phases per conversion, is shown in Figure 6.15. In the first phase, the capacitor array is connected to the input and is charged for sampling. Then, the feedback capacitor is switched to the amplifier output making a negative feedback path. This time, capacitor array is connected to the reference voltages which are controlled by the output of F-ADC. In other words, the charge stored in the capacitor array represents the analog voltage created by the output code of the F-ADC. The residue signal is transferred to the feedback capacitor (doubles the unit capacitor) through the op-amp for the following stage to process. Although there are many error sources in P-ADC, correction circuit compensates these errors efficiently. Therefore, redundant bits may be added to each stage. On the other hand, the error sources are used as constraints for selecting optimum stage resolutions and system configuration. Table 6.5 shows main error sources of the system. In the first column of Table 6.5, error sources of F-ADC, which are already discussed in previous example, are tabulated. In MDAC, thermal noise and



Figure 6.15. Operations of MDAC.

Table 6.5. Main error sources

| F-ADC                 | MDAC           |
|-----------------------|----------------|
| Offset voltage (6.9)  | Thermal noise  |
| Mismatch error (6.5)  | Mismatch error |
| Thermal noise (6.6)   | kT/C noise     |

TTN, are defined as given in (6.14).

$$TTN^{2} = \frac{N_{S/H}^{2}}{1} + \frac{N_{MDAC1}^{2}}{G_{1}^{2}} + \frac{N_{MDAC2}^{2}}{G_{1}^{2} \cdot G_{2}^{2}} + \frac{N_{MDAC3}^{2}}{G_{1}^{2} \cdot G_{2}^{2} \cdot G_{3}^{2}} + \dots + \frac{N_{MDAC(NoS-1)}^{2}}{G_{1}^{2} \cdots G_{(NoS-1)}^{2}}$$
(6.14)

$N_{S/H}$ and $N_{MDACi}$ are the output-referred RMS noise values of the S/H and the i-th MDAC, respectively. $G_i$ is the closed-loop gain of stage i. An approximate expression of the noise value for the MDAC can be given as:

$$N_{MDACi}^{2} = \theta \cdot \left[ \frac{K \cdot T}{C_{MDACi}} + (4 \cdot K \cdot T \cdot R_{on}) \cdot BW \right] \cdot G_{i}^{2}$$
(6.15)

In (6.15), K is Boltzmann's constant, T is the absolute temperature and $C_{MDACi}$ is the total input capacitance of the MDAC. $\theta$ takes the value 1 or 2 depending on whether or not the circuit is single-ended. $R_{ON}$ ($R_{on}$ in (6.15)) is the on-resistance of the CMOS switches. Assume that an nMOS switch with an aspect ratio of 10 is used. $KP_n$ is the technology parameter for AMS $0.35\mu m$.

$$R_{ON} = \frac{1}{KP_n \cdot \frac{W}{L} \cdot (V_{gs} - V_t)} = \frac{1}{261.4 \cdot 10^{-6} \cdot 10 \cdot 2} \approx 200\Omega$$

As a result, a constraint can be defined for n effective bits of resolution for P-ADC.

$$TTN < \frac{\theta \cdot V_{ref}}{2^n \cdot \sqrt{12}} \tag{6.16}$$
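The noise budget formed by (6.14), (6.15) and (6.16), together with the switch on-resistance estimate, can be evaluated numerically. The Python sketch below follows these expressions term by term and is given only as an illustration; the function names, the default temperature of 300 K and the sample values are assumptions.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K

def switch_on_resistance(kp, w_over_l, v_overdrive):
    """R_ON = 1 / (KP * (W/L) * (Vgs - Vt))."""
    return 1.0 / (kp * w_over_l * v_overdrive)

def mdac_noise_sq(c_mdac, r_on, bw, gain, theta=1, temp_k=300.0):
    """(6.15): squared output-referred noise of one MDAC."""
    kt = K_BOLTZMANN * temp_k
    return theta * (kt / c_mdac + 4 * kt * r_on * bw) * gain**2

def total_thermal_noise(n_sh_sq, mdac_noise_sqs, gains):
    """(6.14): each MDAC term is divided by the squared gains up to its stage."""
    total, gain_prod_sq = n_sh_sq, 1.0
    for n_sq, g in zip(mdac_noise_sqs, gains):
        gain_prod_sq *= g**2
        total += n_sq / gain_prod_sq
    return math.sqrt(total)

def meets_noise_budget(ttn, v_ref, n_bits, theta=1):
    """(6.16): TTN must stay below theta * V_ref / (2^n * sqrt(12))."""
    return ttn < theta * v_ref / (2**n_bits * math.sqrt(12))

# KP_n = 261.4 uA/V^2, W/L = 10, Vgs - Vt = 2 V, as in the text
print(switch_on_resistance(261.4e-6, 10, 2.0))
```

The on-resistance evaluates to about 191 $\Omega$, which is rounded to 200 $\Omega$ in the text.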

The other main error source of MDAC is the mismatch error of the capacitor array. The effect of the mismatch is derived as follows, starting from the ideal residue:

$$V_{RA} = 2^{N-1} \cdot \left( V_{in} - V_{ref} \cdot \sum_{i=1}^{N} 2^{i-N-1} \cdot b_i \right)$$
(6.17)

Expression (6.17) represents the ideal residue of an MDAC. The values of the capacitors determine the voltage which will be subtracted from the input. Considering the mismatch $\epsilon$, the capacitor values given in Figure 6.15 are modified as:

$$C_f = (1+\epsilon) \cdot C_{unit}, \quad C_i = (1+\epsilon_i) \cdot C_{unit}, \quad i = 1, 2, \cdots, N$$
(6.18)

As a result, (6.17) is rewritten with the contributions of the mismatch error.

$$V_{RA} = 2^{N-1} \cdot \left[ V_{in} \left( 1 + \frac{1}{2^N} \cdot \left[ \epsilon + \sum_{i=1}^N \epsilon_i \right] \right) - \frac{V_{ref}}{2^N} \cdot \sum_{i=1}^N \left[ 2^{i-1} + \epsilon_i \right] \right]$$
(6.19)

The first term of (6.19) represents the gain error and the second term shows the nonlinearity of the MDAC. $b_i$ is the digit value (0 or 1). Not only the mismatch but also the parasitic capacitances affect the gain error. This error must be smaller than half of the LSB.

$$gain_{error} = \frac{2 \cdot C_P}{2^N \cdot C_{unit}} \text{ where } C_P \text{ is the parasitic capacitance}$$
(6.20)

Moreover, kT/C noise sets the lower bound for the unit capacitor. The minimum value of the unit capacitor can be calculated from (6.21).

$$\sqrt{\frac{kT}{C}} = \frac{LSB}{2} = \frac{V_{ref-min}}{2^{N+1}} \tag{6.21}$$
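Solving (6.21) for C gives the smallest unit capacitor allowed by the kT/C noise. The following Python sketch shows this rearrangement; the function name, the default temperature of 300 K and the sample arguments are illustrative assumptions.

```python
K_BOLTZMANN = 1.380649e-23  # J/K

def min_unit_capacitor(v_ref_min, n_stage_bits, temp_k=300.0):
    """(6.21) solved for C: C = kT / (V_ref_min / 2^(N+1))^2."""
    half_lsb = v_ref_min / 2**(n_stage_bits + 1)
    return K_BOLTZMANN * temp_k / half_lsb**2

# Illustrative sweep over the stage resolution N for V_ref_min = 1 V
for n_stage in (2, 3, 4):
    print(n_stage, min_unit_capacitor(1.0, n_stage))
```

Each additional bit in a stage quadruples this lower bound, since the half-LSB voltage is halved and enters the expression squared.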

In order to select optimum configurations, power and area calculations are also taken into account. Both can be calculated in a straightforward way. The power consumption of MDAC and F-ADC dominates the total power estimation. In MDAC, the static power dissipation of the op-amp and the dynamic power of the switching activities can be approximately given as:

$$P_{MDACi} = (I_{sup} \cdot V_{sup}) + \theta \cdot C_{MDACi} \cdot (1 + K_{Ci}) \cdot V_{ref}^2 \cdot f$$
(6.22)
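The two contributions in (6.22) can be compared with a short Python sketch. It is an illustration only; the function name and all numeric values are assumptions and do not come from the designs of this work.

```python
def mdac_power(i_sup, v_sup, c_mdac, k_c, v_ref, f_clk, theta=1):
    """(6.22): static op-amp power plus dynamic switching power."""
    static = i_sup * v_sup
    dynamic = theta * c_mdac * (1 + k_c) * v_ref**2 * f_clk
    return static + dynamic

# Illustrative stage: 1 mA at 3.3 V, 1 pF array, 10% parasitics, 20 MHz clock
p = mdac_power(i_sup=1e-3, v_sup=3.3, c_mdac=1e-12, k_c=0.1,
               v_ref=1.0, f_clk=20e6, theta=2)
print(p)
```

For values of this order the static term of the op-amp is much larger than the switching term, so the op-amp bias current is the main lever on MDAC power.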

where  $K_{Ci}$  represents the parasitic capacitors of the switches. Besides, area estimation of MDAC consists of the op-amp, the capacitor array and the switches. Moreover, power consumption and area of F-ADC is already given in the previous example.

The delay of a pipeline ADC is estimated as NoS clock periods. The speed limitation is determined by the settling time of the amplifier. In other words, the speed is determined by the slew rate of the amplifier.

Most of the errors, especially from F-ADC, can be corrected by the digital correction module. On the other hand, self-calibration methods are widely used. They generally correct the errors from DAC levels and gain errors. The design of P-ADC is more complex than the F-ADC. It has many stages, which differ from each other with their resolutions. Thus, exploration of a huge design space is required. However, this space has to be limited due to the error sources and given parameters.

P-ADC is considered as a multi-level PE, which means that top-down decomposition of the system is applied and PDS exploration of each subblock is carried out with respect to the constraints. Then, bottom-up composition is applied; the PF of each subblock is composed in order to achieve the PF of the P-ADC at the top level. As a result, exploration of a huge design space is partitioned into small pieces for a faster execution time. In addition, an approximate design of each subblock, such as the comparator or the op-amp, can be obtained, which most behavioral models cannot provide. In Figure 6.16, the proposed approach is illustrated.

Figure 6.16. Block diagram of P-ADC as ML-PE application.

Input performance parameters of the P-ADC have to be distributed to the subblocks for PDS exploration. In Table 6.6, the input parameter set of each block is shown for each level. *SNR* determines the lower bound of the resolution as mentioned in (6.8). Maximum resolution is the upper bound of the system. Clock frequency is a lower bound of the delay estimation. According to the effective number of bits, all available design configurations are generated. Input parameters of the blocks can be altered by the designer or new parameters can be added. The methodology starts from obtaining the information about resolution. Exploration of PDS for all subblocks begins after the generation of all possible configurations. However, each configuration has a different number of bits, which creates a huge design space. Therefore, some constraints are considered in order to reject some configurations. For example, high resolutions at the latter stages and lower ones at the early stages are eliminated because of the thermal noise, which dominates

| P - ADC                         |           |                  |           |          |  |  |
|---------------------------------|-----------|------------------|-----------|----------|--|--|
| SNR                             | BW        | Max              | Numł      | per of   |  |  |
| (dB)                            | (MHz)     | Resolution       | Redund    | ant bits |  |  |
| 65                              | > 20      | 12               | ]         | L        |  |  |
| Co                              | omparator | R-string         | Op-amp    | C-array  |  |  |
| Gain (V/V)                      | 5         | -                | $>\!2000$ | -        |  |  |
| BW (MHz)                        | 50        | -                | > 0.1     | -        |  |  |
| Input resolution $(mV)$         | 2 - 14    | -                | -         | -        |  |  |
| BW closed-loop(MHz)             |           |                  | $>\!25$   |          |  |  |
| Resistor Value $(\Omega)$       | -         | 500 - 2000       | -         | -        |  |  |
| Reference voltage $(V)$         | _         | $< 3.3 - V_{tn}$ | -         | -        |  |  |
| Slew Rate $(V/\mu s)$           | > 100     | -                | > 100     | -        |  |  |
| Output Capacitance $(pF)$       | ) -       | -                | 2         | -        |  |  |
| Unity Capacitance $(pF)$        | -         | -                | -         | 0.5      |  |  |
| Switch on-resistance $(\Omega)$ | -         | -                | -         | 200      |  |  |

Table 6.6. Input parameters for the blocks of P-ADC

at the early stages since low resolution contributes higher error. As a result, higher resolutions at the early stages are desired.
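The generation and pruning of stage configurations can be sketched as follows. This is a minimal illustration and not the thesis implementation: the function names are hypothetical, the per-stage resolution range of 2 to 5 bits is an assumption read off the configurations listed in Table 6.7, and the thermal-noise rule is reduced to the requirement that stage resolutions never increase from the first stage to the last. Each stage is assumed to contribute its resolution minus one redundant bit, and a $b$-bit flash sub-ADC is assumed to use $2^b - 1$ comparators.

```python
def enumerate_configs(n_bits, min_res=2, max_res=5, redundancy=1):
    """Return all non-increasing stage-resolution tuples whose effective
    bits (stage resolution minus redundancy) sum to n_bits."""
    results = []

    def extend(prefix, remaining, upper):
        if remaining == 0:
            results.append(tuple(prefix))
            return
        # Try the largest allowed resolution first; never exceed the
        # resolution of the previous stage (non-increasing order).
        for res in range(min(upper, max_res), min_res - 1, -1):
            effective = res - redundancy
            if 0 < effective <= remaining:
                extend(prefix + [res], remaining - effective, res)

    extend([], n_bits, max_res)
    return results


def comparator_count(config):
    """Total comparators, assuming 2**b - 1 per b-bit flash sub-ADC."""
    return sum(2**b - 1 for b in config)


configs = enumerate_configs(10)
print(len(configs), "candidate configurations for 10 bits")
print((5, 5, 3) in configs, comparator_count((5, 5, 3)))
```

Under these assumptions the configuration 553 is one of the candidates for a 10-bit design and needs 69 comparators, which agrees with the corresponding entry of Table 6.7.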

Thermal noise is calculated for the remaining configurations, and some of them are then eliminated, especially for low reference voltages. Next, the errors of the F-ADC and the matching between capacitors are calculated. Capacitor mismatch generates gain errors, which limit the minimum gain for each stage.

Finally, performance parameters such as power, area and delay are estimated for the remaining configurations, and the optimum solutions are evaluated. The flowchart of the automation process is shown in Figure 6.17.

Figure 6.17. Flowchart of P-ADC design automation.

In order to validate the proposed method, an example is considered using the input set given in Table 6.6. The SNR value sets the lower bound of the resolution to 10 bits by (6.8), and the upper bound is 12 bits. In other words, different system configurations from 10-bit to 12-bit resolution are generated. Higher resolutions at the early stages and lower resolutions at the latter stages are selected. A sample of the available configurations for the 10-bit design is given in Table 6.7 and the results are illustrated in Figure 6.18. The configuration column shows the stage resolutions from left to right. The sum of the stage resolutions exceeds the desired resolution because of the redundancy bit: for each stage, one redundant bit is used for the digital correction part. The proposed approach selects the configuration 3333322 as an optimum solution. On the other hand, 5432 has the minimum power consumption; however, its area is high. Besides, 2222222222 has the minimum area due to the small number of capacitors and comparators in each stage. The increasing number of comparators determines the area and power values. Therefore, 553 and 5432 are far from the global minimum. It can be concluded that optimum solutions can be searched among the configurations that have medium resolution.

| Configuration types | Number of op-amps | Number of comparators | Power Dissipation (mW) | Area (mm$^2$) |
|---------------------|-------------------|-----------------------|------------------------|---------------|
| 553                 | 3                 | 69                    | 82 - 92                | 0.034         |
| 2222222222          | 10                | 30                    | 78 - 98                | 0.008         |
| 33222222            | 8                 | 32                    | 62 - 78                | 0.012         |
| 4322222             | 7                 | 37                    | 53 - 66                | 0.017         |
| 33333               | 5                 | 35                    | 38 - 48                | 0.014         |
| 333322              | 6                 | 34                    | 45 - 58                | 0.011         |
| 3332222             | 7                 | 29                    | 54 - 68                | 0.011         |
| 4433                | 4                 | 44                    | 32 - 40                | 0.02          |
| 5432                | 4                 | 56                    | 32 - 40                | 0.03          |
| 44322               | 5                 | 43                    | 38 - 48                | 0.017         |
| 433222              | 6                 | 35                    | 45 - 58                | 0.017         |

Table 6.7. Different configurations for 10-bits and their power-area values

Figure 6.18. PFs of different configurations for 10-bits.

The proposed method is compared to reference models from the open literature in order to show the validity of our approach. For example, an optimization model of the P-ADC, which was given in [53], was compared to our approach. In [53], different configurations were evaluated and 443333 was selected as an optimum solution. Then, our approach was applied to different configurations and the results were demonstrated in Figure 6.19. It was observed that 443333 was found among the optimum solutions. The dashed line represents the Pareto curve of the power and area parameters for the performance space of the 14-bit P-ADC. In addition, using the solutions from the reference model, a PF was obtained and demonstrated together with the proposed method, as given in Figure 6.20. The differences in parameter values stem from technological discrepancies; for example, the supply voltage was 5 V in the reference model. Therefore, the solutions from the reference model were scaled to allow a fairer comparison. A coarse calculation was applied to the solutions of [53]: power values were decreased linearly and the area was shrunk quadratically. Consequently, the scaled solutions came close to the range of our approach. As expected, they were still above the PF of our approach.

Figure 6.19. PFs of different configurations for 14-bits.

Moreover, different P-ADC models, most of which emphasized power optimization, were taken into account. In Table 6.8, several models are listed including their optimum solutions. Furthermore, all available technological knowledge was collected from the models in order to make a consistent comparison. The input parameters given in Table 6.6 were applied to our model. The only difference was the reference voltage, which was selected as 1 V to match the reference models. The comparison is shown in Figure 6.21.
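The coarse scaling of the reference solutions can be sketched as below. The text does not state the scaling factors, so this is only an illustration under assumptions: power is scaled by a supply-voltage ratio and area by the square of a feature-size ratio. The function `scale_solution`, both ratios and the sample point are hypothetical.

```python
def scale_solution(power, area, power_ratio, length_ratio):
    """Scale power linearly and area quadratically.

    power_ratio  : assumed linear factor for power (e.g. a supply ratio)
    length_ratio : assumed linear shrink factor; area scales with its square
    """
    return power * power_ratio, area * length_ratio**2


# Hypothetical reference point (power in mW, area in mm^2) and ratios.
power_ratio = 3.3 / 5.0      # assumed: target supply over reference supply
length_ratio = 0.5           # assumed: feature size halves
scaled_power, scaled_area = scale_solution(100.0, 1.0, power_ratio, length_ratio)
print(scaled_power, scaled_area)
```

Because both factors are below one, every scaled point moves toward the origin of the power-area plane, which is why the scaled solutions approach the range of the proposed PF.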

In Figure 6.21, the best match was obtained in the second comparison because the critical parameters for power, such as supply voltage, reference voltage, unit capacitance and sampling rate, were similar. Besides, all solutions obtained from our work were approximately equal to or smaller than the reference ones.

Figure 6.20. PF of the proposed approach and PF of the reference model including scaled solutions.

| Reference Models | Resolution | Optimum configuration | Technological Knowledge                            | Solution |
|------------------|------------|-----------------------|----------------------------------------------------|----------|
| [54]             | 12-bit     | 3232332               | Supply: 3.3 V, $V_{ref}$: 1 V, $I_{opamp}$: 4 mA   | 230 mW   |
| [55]             | 12-bit     | 43222222              | Supply: 3.3 V, $V_{ref}$: 1 V, $C_U$: 0.45 pF      | 56 mW    |
| [56]             | 12-bit     | 322222223             | Supply: 3 V, $V_{ref}$: 1 V                        | 290 mW   |
| [57]             | 12-bit     | 244443                | Supply: 5 V, $V_{ref}$: 2 V                        | 65 mW    |

Table 6.8. Different reference models and their solutions.

Figure 6.21. A comparison to different models.

Eventually, our work was verified by H-spice simulation. An 8-bit P-ADC was selected as a case study for the verification process. The optimum solution was searched among different configurations, and the circuit parameters of the optimum solution were then gathered for simulation. It was assumed that there was no redundant bit in the stages. For example, the configuration 332 was selected as an optimum solution. As mentioned before, not only were the performance parameters obtained but an approximate design of each subblock was also extracted. A Pareto point was selected from the PF of 332, and its parameters were sent to the H-spice simulator. The selected system has 15 comparators and 3 MDACs. There was no amplification at the last stage, so the number of op-amps was 2. The latched comparator and the Miller op-amp were selected from the library. Their input specifications were given in Table 6.6 together with those of the resistor string, switches and capacitor array. The reference voltage was limited to 2 V and the unity capacitor was 1 pF in each stage. Initially, dc analysis was performed. In Table 6.9, simulation and estimation results of the comparator and the op-amp were compared. It was observed that the deviation of the current values remained below 3%. In other words, the power consumption estimations were accurate enough for the designer. The outputs of the F-ADC and the MDAC are illustrated in Figure 6.22.

| The op-amp  | Idout  | Idif         | Power of a single op-amp | Total Power of F-ADCs |
|-------------|--------|--------------|--------------------------|-----------------------|
| Estimations | 1.1 mA | 10.5 $\mu$A  | 3.7 mW                   | 7.4 mW                |
| Simulation  | 1.2 mA | 11.5 $\mu$A  | 3.99 mW                  | 8 mW                  |

| The comparator | Id          | Idtail     | Power of a single comparator | Total Power of MDACs |
|----------------|-------------|------------|------------------------------|----------------------|
| Estimations    | 30 $\mu$A   | 267 $\mu$A | 1.81 mW                      | 31.2 mW              |
| Simulation     | 31.5 $\mu$A | 275 $\mu$A | 1.92 mW                      | 32.5 mW              |

Table 6.9. A sample of DC parameters of the op-amp and the comparator.



Figure 6.22. A sample of transient analysis for 8-bit P-ADC. (a) Output of MDAC, residue amplification, (b) Output of F-ADC, digital output

Secondly, transient analysis was applied to the example. The estimated bandwidths were 57 MHz and 20 MHz for the comparator and the op-amp, respectively. The maximum delay of the comparator was calculated as 113 ns. The speed of the system is determined by the bandwidth of the op-amp; therefore, a 20 MHz clock frequency was applied to the system. The reference voltage was 1 V and a sinusoidal input voltage was applied. Residue amplification was obtained as expected and the comparison was performed accurately. For example, when the input voltage is higher than the reference voltage, the output of the comparator becomes logic 1, which corresponds to an output between 1.65 V and 3.3 V.

F-ADC comparisons ($V_{ref} = 1.4$ V, LSB of a 3-bit stage = 0.175 V):

| Stage           | $V_{in}$ | Comparator reference | Output |
|-----------------|----------|----------------------|--------|
| First (3 bits)  | 1.1 V    | 1.225 V              | 0      |
|                 | 1.1 V    | 1.05 V               | 1      |
|                 | 1.1 V    | 0.875 V              | 1      |
|                 | 1.1 V    | 0.7 V                | 1      |
|                 | 1.1 V    | 0.525 V              | 1      |
|                 | 1.1 V    | 0.35 V               | 1      |
|                 | 1.1 V    | 0.175 V              | 1      |
| Second (3 bits) | 0.4 V    | 1.225 V              | 0      |
|                 | 0.4 V    | 1.05 V               | 0      |
|                 | 0.4 V    | 0.875 V              | 0      |
|                 | 0.4 V    | 0.7 V                | 0      |
|                 | 0.4 V    | 0.525 V              | 0      |
|                 | 0.4 V    | 0.35 V               | 1      |
|                 | 0.4 V    | 0.175 V              | 1      |
| Last (2 bits)   | 0.4 V    | 1.05 V               | 0      |
|                 | 0.4 V    | 0.7 V                | 0      |
|                 | 0.4 V    | 0.35 V               | 1      |

MDAC operation ($\Delta V = V_{in} - V_{in}'$):

| Stage  | Sample phase                                                                    | Amplification (input for the next stage) |
|--------|---------------------------------------------------------------------------------|------------------------------------------|
| First  | $V_{in}' = V_{ref} \cdot 6/8 = 1.05$ V, $\Delta V = 1.1 - 1.05 = 0.05$ V        | $0.05 \times 8 = 0.4$ V                  |
| Second | $V_{in}' = V_{ref} \cdot 2/8 = 0.35$ V, $\Delta V = 0.4 - 0.35 = 0.05$ V        | $0.05 \times 8 = 0.4$ V                  |

Digital code (see Fig. 6.24):

| Stage  | F-ADC output | Encoded bits |
|--------|--------------|--------------|
| First  | 0111111      | 110          |
| Second | 0000011      | 010          |
| Last   | 001          | 01           |

Resulting 8-bit code: 11001001 = 1.1 V. Verification: $128 + 64 + 8 + 1 = 201$ and $201 \times 1.4 / 256 = 1.099$ V.

Figure 6.23. Expected values of each stage for 8-bit conversion.

After that, a constant input was applied and it was shown that the 8-bit conversion was obtained successfully for the given bandwidth. In Figure 6.23, the expected values of the digital code and of the analog inputs between stages (residue amplification) were tabulated. The simulation results of each F-ADC were shown in Figure 6.24. The outputs of each F-ADC were converted to a binary code by an encoder, for example 0111111 to the 3-bit code 110. Ideally, 1.1 V is represented by the 8-bit digital code 11001001. In simulation, however, there was a bit error at the output of the second stage: 0000111 was obtained instead of 0000011. This kind of error can be corrected by the digital correction block. As a result, 1.12 V (11001101) was obtained by simulation with the 20 MHz clock frequency, as estimated.
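The expected values of Figure 6.23 can be reproduced with a small behavioral sketch of an ideal pipeline without redundancy. This is not the H-spice test bench; the function names are hypothetical and ideal comparators, DAC levels and residue gains are assumed.

```python
def flash_code(v_in, v_ref, bits):
    """Count the comparators whose threshold k*v_ref/2**bits lies below v_in
    (the number of ones in the thermometer code)."""
    levels = 2**bits
    return sum(1 for k in range(1, levels) if v_in > k * v_ref / levels)


def pipeline_convert(v_in, v_ref, stage_bits):
    """Ideal pipeline conversion without redundant bits.

    Returns the combined output code and the list of per-stage codes."""
    code, stage_codes = 0, []
    for i, bits in enumerate(stage_bits):
        d = flash_code(v_in, v_ref, bits)
        stage_codes.append(d)
        code = (code << bits) | d
        if i < len(stage_bits) - 1:
            # MDAC: subtract the DAC level, then amplify the residue.
            v_in = (v_in - d * v_ref / 2**bits) * 2**bits
    return code, stage_codes


code, stage_codes = pipeline_convert(1.1, 1.4, [3, 3, 2])
v_ideal = code * 1.4 / 2**8
print(stage_codes, format(code, "08b"), round(v_ideal, 3))

# Effect of the simulated bit error in the second stage (code 3 instead of 2).
code_err = (6 << 5) | (3 << 2) | 1
v_err = code_err * 1.4 / 2**8
print(format(code_err, "08b"), round(v_err, 3))
```

The ideal run yields the stage codes 6, 2 and 1, that is 11001001 or about 1.099 V, while the erroneous second-stage code gives 11001101 or about 1.12 V, matching the values reported for the simulation.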



Figure 6.24. Output of the F-ADCs (a)  $1^{st}$  stage: 0111111, (b)  $2^{nd}$  stage: 0000111, (c)  $3^{rd}$  stage: 001.

## 7. CONCLUSION

A general methodology for performance estimation of mixed-signal systems is proposed while exploiting the Pareto Front concept without running optimization algorithms and Spice-like simulators. As mentioned before, performance estimation requires a well-determined performance design space exploration for a given technology. Since complexity of mixed-signal systems grows progressively, exploration of a huge design space is required for PE of system blocks with a dramatically increased exploration time. Therefore, a Matlab-based library is presented for fast and accurate PDS exploration. In addition, accurate design estimations of the system blocks and subblocks are obtained. Then, PF approach is applied to the system blocks in order to collect optimum solution sets for selected performance parameters. Finally, PF composition is considered providing the *dominance rule* with algebraic representations. Moreover, not only optimum solution sets are extracted but also an approximate design of the blocks is obtained in this thesis. As a result, a suitable execution time and accuracy tradeoff is achieved for PE in ADA systems. The proposed approach has three main contributions: the library, PF extraction and PF composition.

The first contribution is the model library. It provides PE models of analog blocks such as different op-amp topologies, a comparator, a Flash ADC and an LPF, and of subblocks such as a differential input stage, an output stage and current mirrors, all of which are generated with the EKV MOSFET model. The EKV MOSFET model is preferred for model generation because it has the advantage of continuity in all regions. Although BSIM3 is the most preferred MOSFET model in industrial applications, it has some problems in the moderate inversion region, where most modern designs are modeled. In addition, EKV is more convenient for a faster modeling procedure because it has fewer model parameters than BSIM3. Accurate estimations can be obtained by simple, paper-work analysis.

If the given system does not exist in the library, it is partitioned into subblocks so that the designer can observe the behavior of the subblocks that may affect the performance of the whole system. In addition, execution time is reduced because any approach is applied to a subblock rather than to the main analog block. It should not be overlooked that the execution time of modeling increases exponentially with the number of elements in an analog block. The designer not only evaluates the PE of the subblocks but also obtains an approximate design, including the circuit and design parameters of each subblock, whereas most behavioral models cannot provide design details of the blocks. The PE tool can be integrated into a specific system whose subblocks are already known or are determined by an algorithm called topology divider.

It was observed that the library spent a maximum of three hours collecting the PDS estimation of a selected block within an acceptable error rate. Models of the analog blocks were validated by comparison with different approaches, and the accuracy of the estimated solutions was verified against H-spice simulation results. The PE was embedded into a sigma-delta ADC design automation system, as already proposed in [32], where the library also enables selection of the best op-amp topology during the automation process and thus serves most purposes for mixed-signal systems.

Second contribution is utilization of Pareto Front extraction. PE tool helps the circuit level synthesis by giving hints about optimization and eliminating unfeasible solutions. Since extracting PDS of analog blocks refers to the elimination part, an approach is proposed for obtaining optimum solution sets which can be derived from that PDS set.

PF extraction requires less execution time than complex optimization algorithms. Since the PDS is already explored, only a *simple algorithm* is applied to that data set and the Pareto set is extracted quickly. Each Pareto point refers to circuit and design parameter sets, which are saved by the library for circuit level synthesis. In other words, the designer knows the optimal designs, which are implementable in circuit level synthesis for exact sizing. Furthermore, the PF of a given block can be extracted not only in two-dimensional space but also in three or more dimensions. Case studies showed that PFs were obtained in minutes for different input parameters of different blocks such as the Miller op-amp, LPF, comparator and F-ADC.
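A simple non-dominated filter of this kind can be sketched as follows. It is an illustration only and not necessarily the algorithm used in the library: the function name `pareto_front` is hypothetical, all objectives are assumed to be minimized, and the sample data take the lower power bound and the area of each configuration from Table 6.7.

```python
def pareto_front(points):
    """Return the names of the non-dominated points (all objectives minimized).

    points maps a name to a tuple of objective values."""
    def dominates(a, b):
        # a dominates b if it is no worse everywhere and better somewhere.
        return (all(x <= y for x, y in zip(a, b))
                and any(x < y for x, y in zip(a, b)))

    return [name for name, p in points.items()
            if not any(dominates(q, p) for q in points.values())]


# (lower power bound in mW, area in mm^2), copied from Table 6.7
table_6_7 = {
    "553": (82, 0.034), "2222222222": (78, 0.008), "33222222": (62, 0.012),
    "4322222": (53, 0.017), "33333": (38, 0.014), "333322": (45, 0.011),
    "3332222": (54, 0.011), "4433": (32, 0.02), "5432": (32, 0.03),
    "44322": (38, 0.017), "433222": (45, 0.017),
}
front = pareto_front(table_6_7)
print(front)
```

The pairwise comparison is quadratic in the number of points, which is acceptable here because it runs once over an already explored PDS. The same function works unchanged for three or more objectives.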

The last contribution is necessary when a designer needs a fast way to get insight into the capability of the whole system after the generation of PF of subblocks. Therefore, a fast composition process of PF is proposed. Estimating overall performance using the design space sets is a costly process. Instead, Pareto points are used for that purpose. This is the advantage of PFs, especially when the systems start to become more and more complex. In this thesis, a simple and fast composition approach is proposed: any operation which preserves the domination rule can be used in connecting the performance parameters.
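A minimal sketch of such a composition is given below, assuming that the subblock performances simply add up (as power and area do) and that all objectives are minimized. Addition preserves dominance, so only the Pareto points of the subblocks need to be combined. The function names and the two sample fronts are hypothetical and do not come from the case studies.

```python
from itertools import product


def dominates(a, b):
    """True if a is no worse than b in every objective and better in one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto(points):
    """Keep only the non-dominated points of an iterable of tuples."""
    pts = list(points)
    return [p for p in pts if not any(dominates(q, p) for q in pts)]


def compose(front_a, front_b):
    """Compose two subblock fronts by adding their objectives pairwise
    and filtering the sums down to the system-level front."""
    sums = {tuple(x + y for x, y in zip(a, b))
            for a, b in product(front_a, front_b)}
    return sorted(pareto(sums))


# Hypothetical (power, area) fronts of two subblocks.
comparator_pf = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0)]
rstring_pf = [(1.0, 3.0), (3.0, 1.0)]
system_pf = compose(comparator_pf, rstring_pf)
print(system_pf)
```

Only the product of the two fronts is evaluated, not the product of the full design spaces, which is where the saving in execution time comes from.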

The composition procedure was applied to several examples, and it was observed that composition took only a couple of minutes. In the LPF example, composition was compared to symbolic analysis with the Brute Force approach, and similar results were obtained. Symbolic analysis took 15 hours, whereas modeling of the subblocks, PF extraction and PF composition with the proposed approach took only half an hour. In addition, in the F-ADC example, PF extraction of the comparator and the resistor string took 3.4 and 1.2 minutes, respectively, and the composition procedure took 50 seconds.

As future work, new analog blocks and new technology parameters can be added to the library. In addition, an optimizer and/or behavioral models can be implemented in the tool instead of the Brute Force approach for blocks that cannot be found in the library. Finally, a general interface for the PE tool can be designed, such as a Matlab GUI with add-on tools for different ADC types or for different mixed-signal systems.

## REFERENCES

- 1. S. Balkır, G. Dündar, and A. S. Öğrenci. Analog VLSI Design Automation. CRC, 2003.
- 2. V. Pareto. The new theories of economics. Journal of Political Economy, 5, 1897.
- 3. B. De Smedt and G. Gielen. Watson: design space boundary exploration and model generation for analog and rfic design. *IEEE Journal of CAD*, 22(2):213–224, 2003.
- 4. W. Daems, G. Gielen, and W. Sansen. Simulation-based generation of posynomial performance models for the sizing of analog integrated circuits. *IEEE Journal of CAD*, 22(5):517–534, 2003.
- 5. F. De Bernardinis, M. I. Jordan, and A. Sangiovanni-Vincentelli. Support vector machines for analog circuit performance representation. In *Proc. Design Automation Conf. DATE*, pages 964–969, 2003.
- 6. T. Kiely and G. Gielen. Performance modeling of analog integrated circuits using least-squares support vector machines. In Proc. Design, Automation and Test in Europe Conf. and Exhibition, volume 1, pages 448–453, 2004.
- 7. G. Van der Plas, J. Vandenbussche, G. Gielen, and W. Sansen. Estemate: a tool for automated power and area estimation in analog top-down design and synthesis. In Proc. Custom Integrated Circuits Conf., pages 139–142, 1997.
- 8. T. McConaghy, T. Eeckelaert, and G. Gielen. Caffeine: template-free symbolic model generation of analog circuits via canonical form functions and genetic programming. In Proc. 31st European Solid-State Circuits Conf, ESSCIRC, pages 243–246, 2005.
- 9. J. Shao and R. Harjani. Feasibility region modeling of analog circuits for hierarchical circuit design. In Proc. 37th Midwest Symp. Circuits and Systems, volume 1, pages 407–410, 1994.
- 10. J. Tao, X. Zeng, W. Wang, and C. Chiang. A multi-companding algorithm for analog behavioral modeling via wavelet collocation method. In *Proc. 5th Int ASIC Conf*, volume 1, pages 266–269, 2003.
- 11. H. Taher, D. Schreurs, and B. Nauwelaers. Black box modelling at the circuit level: Op-amp as a case study. In *Proc. IEEE Mediterranean Electrotechnical Conf., MELECON*, pages 133–136, 2006.
- 12. R. L. O. Pinto and F. Maloberti. X ray and blue print: tools for mosfet analog circuit design addressing short-channel effects. In Proc. Int. Symp. Circuits and Systems, ISCAS, volume 5, 2004.
- 13. A. Nunez-Aldana and R. Vemuri. An analog performance estimator for improving the effectiveness of cmos analog systems circuit synthesis. In Proc. Automation and Test in Europe Conf Design and Exhibition 1999, pages 406–411, 1999.
- 14. E. Deniz and G. Dundar. Performance estimator for an analog design automation system using ekv-modeled analog circuits. In Proc. European Conf. Circuit Theory and Design, ECCTD, volume 2, 2005.
- 15. E. Deniz and G. Dundar. Ddpet: Design space exploration tool for opamp configurations and basic analog blocks, appel. In *Proc. Applied Electronics*, 2008.
- 16. D. Mueller, G. Stehr, H. Graeb, and U. Schlichtmann. Deterministic approaches to analog performance space exploration. In Proc. 42<sup>nd</sup> Design Automation Conf, DATE, pages 869–874, 2005.
- 17. G. Stehr, H. E. Graeb, and K. J. Antreich. Analog performance space exploration by normal-boundary intersection and by fourier-motzkin elimination. *IEEE Journal of CAD*, 26(10):1733–1748, 2007.
- 18. D. Mueller, H. Graeb, and U. Schlichtmann. Trade-off design of analog circuits using goal attainment and "wave front" sequential quadratic programming. In Proc. Design, Automation-Test in Europe Conf. and Exhibition, DATE, pages 1–6, 2007.
- 19. H. Graeb, D. Mueller, and U. Schlichtmann. Pareto optimization of analog circuits considering variability. In Proc. 18<sup>th</sup> European Conf. Circuit Theory and Design, ECCTD, pages 28–31, 2007.
- 20. H Graeb. Analog Design Centering and Sizing. Springer, 2007.
- 21. M. Holzer, B. Knerr, and M. Rupp. Design space exploration with evolutionary multi-objective optimisation. In Proc. Int. Symp. Industrial Embedded Systems, SIES, pages 126–133, 2007.
- 22. Flor Castillo, Arthur Kordon, Guido Smits, Ben Christenson, and Dee Dickerson. Pareto front genetic programming parameter selection based on design of experiments and industrial data. In Proc. of the 8<sup>th</sup> annual conference on Genetic and evolutionary computation, GECCO, pages 1613–1620, New York, NY, USA, 2006. ACM.
- 23. John R. Josephson, B. Chandrasekaran, Mark Carroll, Naresh Iyer, Bryon Wasacz, Giorgio Rizzoni, Qingyuan Li, and David A. Erb. An architecture for exploring large design spaces. In Proc. of the 15<sup>th</sup> national/tenth conference on Artificial intelligence/Innovative applications of artificial intelligence, AAAI/IAAI, pages 143–150, Menlo Park, CA, USA, 1998. American Association for Artificial Intelligence.
- 24. Jun Zou, Daniel Mueller, Helmut Graeb, and Ulf Schlichtmann. Pareto-front computation and automatic sizing of cpplls. In Proc. 8th Int. Symp. Quality Electronic Design, ISQED, pages 481–486, 2007.
- 25. R. Schoofs, T. Eeckelaert, M. Steyaert, G. Gielen, and W. Sansen. A continuous-time delta-sigma modulator for 802.11a/b/g wlan implemented with a hierarchical bottom-up optimization methodology. In Proc. IEEE Int. Conf. Electronics, Circuits and Systems, ICECS, pages 950–953, 2006.
- 26. T. McConaghy, P. Palmers, M. Steyaert, and G. G. E. Gielen. Variation-aware structural synthesis of analog circuits via hierarchical building blocks and structural homotopy. *IEEE Journal of CAD*, 28(9):1281–1294, 2009.
- 27. T. McConaghy, P. Palmers, G. Gielen, and M. Steyaert. Automated extraction of expert knowledge in analog topology selection and sizing. In Proc. IEEE/ACM Int. Conf. Computer-Aided Design, ICCAD, pages 392–395, 2008.
- 28. Yu Liu, M. Yoshioka, K. Homma, and T. Shibuya. Efficiently finding the best solution with multi-objectives from multiple topologies in topology library of analog circuit. In *Proc. Asia and South Pacific Design Automation Conf., ASP-DAC*, pages 498–503, 2009.
- 29. R. Castro-Lopez, E. Roca, and F. V. Fernandez. Multimode pareto fronts for design of reconfigurable analogue circuits. *Electronics Letters*, 45(2):95–96, 2009.
- 30. C. Haubelt and J. Teich. Accelerating design space exploration using pareto-front arithmetics [soc design]. In Proc. Asia and South Pacific Design Automation Conf, ASP-DAC, pages 525–531, 2003.
- 31. Christian C. Enz, François Krummenacher, and Eric A. Vittoz. An analytical mos transistor model valid in all regions of operation and dedicated to low-voltage and low-current applications. *Analog Integr. Circuits Signal Process.*, 8(1):83–114, 1995.
- 32. Selçuk Talay, Engin Deniz, and Günhan Dündar. A sigma-delta adc design automation tool with embedded performance estimator. *Integration, the VLSI Journal*, 42(2):181–192, 2009.
- 33. E. Vittoz, C. Enz, and F. Krummenacher. A basic property of mos transistors and its circuit implications. In Workshop on Compact Modeling at the International Conference on Modeling and Simulation of Microsystems, 2003.
- 34. D. Stefanovic and M. Kayal. Bsim2ekv: Bsim3.3 to ekv 2.6 model library file automatic conversion. In MOS Modeling and Parameter Extraction Working Group Workshop, 2004.
- 35. P. Mawet. Low-power circuits and beyond; a designer's prospective on the ekv model and its usage. In MOS Modeling and Parameter Extraction Working Group Workshop, 2006.
- 36. EPFL-EKV-Model. legwww.epfl.ch/ekv/i.
- 37. J. Arabas, L. Bartnik, S. Szostak, and D. Tomaszewski. Global extraction of mosfet parameters using the ekv model: Some properties of the underlying optimization task. In Proc. MIXDES-16<sup>th</sup> Int. Conf. Mixed Design of Integrated Circuits and Systems, pages 67–72, 2009.
- 38. M. Bucher, C. Lallement, and C. C. Enz. An efficient parameter extraction methodology for the ekv most model. In Proc. IEEE Int. Conf. Microelectronic Test Structures, ICMTS, pages 145–150, 1996.
- 39. D. Stefanovic, M. Kayal, M. Pastre, and V. B. Litovski. Procedural analog design (pad) tool. In Proc. Fourth Int Quality Electronic Design Symp, pages 313–318, 2003.
- 40. David M. Binkley. A methodology for analog cmos design based on the ekv mos model. In *Proc. European Conf. Circuit Theory and Design, ECCTD*, 2001.
- 41. M. Bucher, D. Kazazis, F. Krummenacher, D. Binkley, D. Foty, and Y. Papananos. Analysis of transconductances at all levels of inversion in deep submicron cmos. In Proc. 9th Int Electronics, Circuits and Systems Conf, pages 1183–1186, 2002.
- 42. M. Bucher, C. Lallement, C. Enz, F. Théodoloz, and F. Krummenacher. The epfl-ekv mosfet model equations for simulation. Technical report, EPFL, 1997.

- 43. C. Lallement, F. Pecheux, and Y. Herve. Vhdl-ams design of a most model including deep submicron and thermal-electronic effects. In Proc. 5th Int. Workshop Behavioral Modeling and Simulation, BMAS, pages 91–96, 2001.
- 44. G. Angelov, I. Panayotov, and M. Hristov. Ekv mosfet model implementation in matlab and verilog-a. In Proc. 26th Int. Conf. Microelectronics, MIEL, pages 515–518, 2008.
- 45. M. Bucher, C. Enz, F. Krummenacher, J. M. Sallese, C. Lallement, and A. S. Porret. The ekv 3.0 compact mos transistor model: Accounting for deep-submicron aspects. In *Conference on Modeling and Simulation of Microsystems*, 2002.
- 46. M. Bucher, C. Lallement, C. Enz, and F. Krummenacher. Accurate mos modelling for analog circuit simulation using the ekv model. In *Proc. IEEE Int Circuits and Systems, ISCAS*, volume 4, pages 703–706, 1996.
- 47. Willy Sansen. Analog Design Essentials. Springer, 2006.
- 48. K. Francken and G. G. E. Gielen. A high-level simulation and synthesis environment for delta-sigma modulators. *IEEE Journal of CAD*, 22(8):1049–1061, 2003.
- 49. J. Ruiz-Amaya, J. M. de la Rosa, F. V. Fernandez, F. Medeiro, R. del Rio, B. Perez-Verdu, and A. Rodriguez-Vazquez. High-level synthesis of switched-capacitor, switched-current and continuous-time sigma-delta modulators using simulink-based time-domain behavioral models. *IEEE Journal of CASI-RP*, 52(9):1795–1810, 2005.
- 50. J. Zou, H. Graeb, D. Mueller, and U. Schlichtmann. Optimization of sc sd modulators based on worst-case-aware pareto-optimal fronts. In *IEEE Custom Integrated Circuits Conference, CICC*, 2007.
- 51. M. Ismail and T. Fiez. Analog VLSI, Signal and Information Processing. McGraw Hill, 1994.

- 52. R. J. van de Plassche. Integrated Analog to Digital and Digital to Analog Converters. Kluwer Academic Publishers, 1994.
- 53. J. Goes, J. C. Vital, and J. E. Franca. Systematic design for optimization of high-speed self-calibrated pipelined a/d converters. *IEEE Journal of CAS-II*, 45(12):1513–1526, 1998.
- 54. J. Ruiz-Amaya, J. M. de la Rosa, M. Delgado-Restituto, and A. Rodriguez-Vazquez. Behavioral modeling, simulation and high-level synthesis of pipeline a/d converters. In Proc. IEEE Int. Symp. Circuits and Systems, ISCAS, pages 5609–5612, 2005.
- 55. R. Lotfi, M. Taherzadeh-Sani, M. Y. Azizi, and O. Shoaei. Systematic design for power minimization of pipelined analog-to-digital converters. In *Proc. of Computer Aided Design Int. Conf., ICCAD*, pages 371–374, 2003.
- 56. B. Murmann and B. E. Boser. A 12-bit 75-ms/s pipelined adc using open-loop residue amplification. *IEEE Journal of Solid-State Circuits*, 38:2040–2050, 2003.
- 57. Cao Junmin, Chen Zhongjian, Lu Wengao, and Zhao Baoying. A cost-efficient 12-bit 20msamples/s pipelined adc. In Proc. 9th Int. Conf. Solid-State and Integrated-Circuit Technology, ICSICT, pages 1961–1964, 2008.