## DAQ LINK AND IBERT STUDIES OF CMS HGCAL BACK END ELECTRONICS

by

Taylan Tatlı<br/> B.S., Physics, Boğaziçi University, 2018

Submitted to the Institute for Graduate Studies in Science and Engineering in partial fulfillment of the requirements for the degree of Master of Science

Graduate Program in Physics<br/> Boğaziçi University<br/> 2022

## ACKNOWLEDGEMENTS

The studies were carried out within the Boğaziçi University Research Fund Project (Grant Numbers: 21B03P2 and 18581), "Data taking and analysis with the new HF Online RADDAM monitoring system during RunIII period"; the TUBITAK 2232 - 118C223 Project, "AdvancedTCA Development for High-Energy Physics and Other Applications"; and the 2018 TAEK (CERN) A.5H6.F2-18 Project, "CMS dedektöründe B Fiziği, Kuark-Gluon Jet Ayrımı ve Faz 2 Dedektör Güncelleme Çalışmaları" ("B Physics, Quark-Gluon Jet Discrimination and Phase 2 Detector Upgrade Studies at the CMS Detector").

## ABSTRACT

# DAQ LINK AND IBERT STUDIES OF CMS HGCAL BACK END ELECTRONICS

The HL-LHC will pose two main challenges: high radiation levels due to the increased integrated luminosity, and a large amount of data resulting from the high instantaneous luminosity. To overcome these two challenges, CMS plans many upgrades to its detector, such as replacing parts of its calorimeter with a high granularity calorimeter. The new calorimeter uses newly developed technologies for both the front-end and the back-end parts of the data read-out electronics. The design of these parts is based on high-bandwidth data transmission via optical links and on FPGA technology. Many studies are required to establish efficient and long-lasting data collection performance for the detector. One of them is the optimization of the link connections of the hardware in the back-end electronics; another is integrated bit error rate testing on the FPGAs. This thesis first explains the link connection optimization studies, together with the methodology and tools used for them. It then describes the integrated bit error rate tests, introducing the hardware and software used for data transmission in the back-end electronics of the new calorimeter.

## ÖZET

# CMS HGCAL VERİ TOPLAMA ARKA UÇ ELEKTRONİĞİ BAĞLANTI VE IBERT ÇALIŞMALARI

HL-LHC deneyi iki ana zorlukla karşı karşıya kalacaktır: artan entegre luminozite nedeniyle ortaya çıkan yüksek radyasyon ve yüksek anlık luminoziteden kaynaklanan çok miktarda veri. Bu iki zorluğun üstesinden gelmek için CMS, kalorimetrenin bazı kısımlarını yüksek tanecikli kalorimetre ile değiştirmek gibi birçok güncelleme planladı. Yeni kalorimetre, veri okuma elektroniğinin hem ön uç hem de arka uç elektroniği bölümü için yeni geliştirilmiş teknolojileri kullanmaktadır. Bu bölümlerin tasarımı, optik bağlantılar ve FPGA teknolojisi aracılığıyla yüksek bant genişliğine sahip veri iletimine dayanmaktadır. Dedektörün verimli ve uzun ömürlü bir veri toplama performansının oluşturulabilmesi için çok sayıda çalışma yapılması gerekmektedir. Bunlardan biri donanımın arka uç elektroniği üzerindeki optik bağlantılarının optimizasyonu, diğeri ise FPGA'lar üzerinde entegre bit hata oranı testleri. Bu tezde, öncelikle optik bağlantıların optimizasyonu çalışmaları, bu çalışmalarda kullanılan metodoloji ve araçlar anlatılmış, ardından yeni kalorimetrenin arka uç elektroniğinde veri iletimi için kullanılan donanım ve yazılımlar tanıtılarak entegre bit hata oranı testi anlatılmıştır.

# TABLE OF CONTENTS

| Section | Title |
|---------|-------|
| | ACKNOWLEDGEMENTS |
| | ABSTRACT |
| | ÖZET |
| | LIST OF FIGURES |
| | LIST OF TABLES |
| | LIST OF SYMBOLS |
| | LIST OF ACRONYMS/ABBREVIATIONS |
| 1. | INTRODUCTION |
| 2. | LARGE HADRON COLLIDER |
| 2.1. | Technical Details and Design |
| 2.2. | LHC to HL-LHC |
| 2.2.1. | Hardware Upgrades of HL-LHC |
| 2.2.1.1. | Accelerator Upgrades |
| 2.2.1.2. | Injector Upgrades |
| 2.2.1.3. | The upgrade of the major LHC experiments |
| 3. | COMPACT MUON SOLENOID |
| 3.1. | Tracking System |
| 3.2. | ECAL |
| 3.3. | HCAL |
| 3.4. | The Solenoid Magnet |
| 3.5. | The Muon System |
| 3.6. | HL-LHC Era and the CMS |
| 3.6.1. | Radiation Damage to the CMS Detector |
| 3.6.2. | High Pileup |
| 3.7. | The Phase-2 Upgrades |
| 3.7.1. | General Considerations for the Phase 2 Upgrade |
| 3.7.2. | Calorimeter Upgrades |
| 4. | HIGH GRANULARITY CALORIMETER |
| 4.1. | Active Elements |
| 4.1.1. | Silicon Sensors |
| 4.1.2. | Silicon Modules |
| 4.1.3. | Plastic scintillators and Scintillator tile-modules |
| 4.2. | Electronics Overview |
| 4.2.1. | Front-end Electronics |
| 4.2.2. | Back-end Electronics |
| 5. | DAQ LINK STUDIES OF HGCAL BACK END ELECTRONICS |
| 5.1. | Baselines of BE DAQ Architecture |
| 5.2. | Load Balancing Studies |
| 5.2.1. | Description of the Problem |
| 5.2.2. | Main Challenge |
| 5.2.3. | Early Studies and First Chosen Work Strategy |
| 5.2.3.1. | Pyomo Optimization Suite |
| 5.2.3.2. | Couenne MINLP Solver |
| 5.2.4. | First Studies with Couenne Solver |
| 5.2.4.1. | For 5 FPGA Case |
| 5.2.4.2. | For 6 FPGA Case |
| 5.2.5. | Motivation after Conclusions |
| 5.2.5.1. | New Objective |
| 5.2.5.2. | Multi-Run Solving |
| 5.2.6. | Some Results of Multi-Run Solving |
| 5.2.6.1. | 7-FPGA 2-MultiRuns |
| 5.2.6.2. | 7-FPGA 3-MultiRuns |
| 5.2.7. | WarmStart and HardStart |
| 5.2.7.1. | 7 FPGA HardStart Results |
| 5.2.7.2. | 7-FPGA 2 Multi-Runs WarmStart Results |
| 5.2.7.3. | 7 FPGA 3 Multi-Run WarmStart Results |
| 5.2.8. | 14 and 28 FPGA Studies |
| 5.2.8.1. | Work Strategy |
| 5.2.8.2. | 14 FPGA Results |
| 5.2.8.3. | 28 FPGA Results |
| 5.2.9. | Link Connections after Redistribution and Swap Algorithm |
| 5.2.9.1. | The Solution of Total Sector (28 FPGA, 336 Slink) |
| 5.2.9.2. | The Solution of Total Sector after Redistribution (28 FPGA, 336 Slink) |
| 5.2.9.3. | The Solution of the Total Sector (28 FPGA, 336 Slink) after Swap |
| 5.2.10. | The First BE DAQ Architecture Updates, 224 and 312 Slink Studies |
| 5.2.10.1. | A New Constraint for the Slinks Connection to FPGAs: Event Buffer Number |
| 5.2.10.2. | Swap Algorithm for Slink connections on FPGAs |
| 5.2.10.3. | Work Strategy for New Studies |
| 5.2.10.4. | 224 Slink Results |
| 5.2.10.5. | 312 Slink Results |
| 5.2.10.6. | 16 FPGA Studies |
| 6. | THE IBERT OF DIFFERENT FPGAS |
| 6.1. | Serenity ATCA Board as an HGCAL DAQ Prototype |
| 6.1.1. | The Serenity Hardware |
| 6.1.2. | The Serenity Software |
| 6.1.3. | Logical Connectivity of the Serenity Board |
| 6.2. | IBERT of different FPGAs using EMP Framework |
| 6.2.1. | Creating Bit and Auxiliary Files using EMP Framework |
| 6.2.2. | NearPMA IBERT of VU7P so1.v1 |
| 6.2.3. | NearPMA IBERT of FPGA KU15P so2.v0 |
| 6.2.4. | NearPMA IBERT of FPGA KU15P so1.v1 |
| 6.2.5. | NearPMA IBERT of FPGA VU7P so1.v1 at Different Speeds |
| 6.2.5.1. | NearPMA IBERT of FPGA VU7P so1.v1 using 16 Gb/s Null Algorithm |
| 6.2.5.2. | NearPMA IBERT of FPGA VU7P so1.v1 using 25 Gb/s Null Algorithm |
| 7. | CONCLUSION |
| | REFERENCES |
| | APPENDIX A: ADDITIONAL RESULTS AND INFORMATION |
| A.1. | Early Studies |
| A.1.1. | A Realistic Toy Example with GEKKO |
| A.1.2. | The First Results for Toy Example |
| A.1.3. | Results for Toy Example with 2 FPGA |
| A.1.4. | Input Order Effect on Solutions |
| A.2. | Pyomo Optimization Suite |
| A.3. | Couenne MINLP Solver |
| A.3.1. | How does Couenne Work? |
| A.4. | First Results with Couenne Solver |
| A.4.1. | For 5 FPGA |
| A.4.2. | For 6 FPGA |
| A.5. | Some Results of Multi-Run Solving |
| A.5.1. | 7-FPGA 2-MultiRuns |
| A.5.2. | 7-FPGA 3-MultiRuns |
| A.6. | HardStart and WarmStart |
| A.6.1. | 7 FPGA HardStart Results |
| A.6.2. | 7-FPGA 2 Multi-Runs WarmStart Results |
| A.6.3. | 7 FPGA 3 Multi-Run WarmStart Results |


# LIST OF FIGURES

| Figure | Caption |
|--------|---------|
| Figure 2.1. | CERN's accelerator complex. |
| Figure 2.2. | Large Hadron Collider schematics. |
| Figure 2.3. | Schematic view of LHC sections. |
| Figure 2.4. | The LHC to HL-LHC program. |
| Figure 3.1. | CMS detector with components. |
| Figure 3.2. | CMS coordinate system. |
| Figure 3.3. | The tracker layout (1/4 of the z view). |
| Figure 3.4. | The view of the locations of subsystems of the HCAL and ECAL in the CMS detector. |
| Figure 3.5. | The layout of one quarter of the CMS muon system. |
| Figure 3.6. | Expected radiation levels in the CMS detector. R is the transverse distance from the beamline and Z is the distance along the beamline from the interaction point. |
| Figure 3.7. | High pileup event with 78 reconstructed vertices taken in 2012. |
| Figure 4.1. | HGCAL with key parameters. |
| Figure 4.2. | CE-E silicon module, showing stacked layers. |
| Figure 4.3. | Schematic representation and key components of the CMS HGCAL FE readout, trigger and control chain. |
| Figure 5.1. | Basic data transmission scheme. FE data transmission is shown on the left and BE data transmission on the right. |
| Figure 5.2. | lpGBT rates for the whole sector. |
| Figure 5.3. | On the left, Slink rates vs number of Slinks and on the right, total FPGA rate plot. |
| Figure 5.4. | On the left, Slink rates vs number of Slinks and on the right, total FPGA rate plot. |
| Figure 5.5. | Results for 7 FPGA 2 MultiRuns. On the left, total rate distribution on number of Slinks. On the right, total FPGA rates are given. |
| Figure 5.6. | Results for 7 FPGA 3 MultiRuns. On the left, total rate distribution on number of Slinks. On the right, total FPGA rates are given. |
| Figure 5.7. | Total rate distribution on number of Slinks. |
| Figure 5.8. | Total rate distribution on number of Slinks for 7-FPGA 2 Multi-Runs. |
| Figure 5.9. | Total rate distribution on number of Slinks. 7 FPGA results for HardStart including high and double lpGBTs, and WarmStart including medium and low lpGBTs in two separate runs. |
| Figure 5.10. | Total rate distribution on number of Slinks. High and double lpGBTs are connected. |
| Figure 5.11. | Total rate distribution on number of Slinks. Double, high and medium lpGBTs are connected. |
| Figure 5.12. | Total rate distribution on number of Slinks. Double, high, medium and low lpGBTs are connected. |
| Figure 5.13. | On the left, how lpGBTs are distributed to Slinks; on the right, 14 FPGA lpGBTs vs Slinks and lpGBTs vs total rates. |
| Figure 5.14. | Total rate vs number of Slinks for each run. Run 0 (swap) is the naive distribution algorithm part for double and high lpGBT rates. The connections of each run are saved and carried over to the other runs. |
| Figure 5.15. | On the left, total rate distribution on number of Slinks; on the right, how the objective value changes run by run. |
| Figure 5.16. | Total rate distribution on number of Slinks. Double, high, medium and half of low lpGBTs are connected. |
| Figure 5.17. | Total rate distribution on number of Slinks. Double, high, medium and low lpGBTs are connected (whole solution). |
| Figure 5.18. | The number of double, high, medium and low lpGBTs connected for some selected Slinks. |
| Figure 5.19. | The number of double, high, medium and low lpGBTs connected for some selected Slinks. |
| Figure 5.20. | On the left, how lpGBTs are distributed to Slinks; on the right, lpGBTs vs Slinks and lpGBTs vs total rates. |
| Figure 5.21. | Total rate distribution on number of Slinks. |
| Figure 5.22. | Total rate distribution on number of Slinks. |
| Figure 5.23. | Total rate distribution on number of Slinks. |
| Figure 5.24. | Total rate of each Slink is shown on the left and total lpGBT number of each Slink is shown on the right. |
| Figure 5.25. | On the left, total lpGBT rates and on the right, total lpGBT number distribution on FPGAs are shown. |
| Figure 5.26. | On the left, Couenne solver result for the whole sector and on the right, the result after the algorithms are applied. |
| Figure 5.27. | On the left, total lpGBT number and on the right, total rate distribution on the FPGAs are shown. |
| Figure 5.28. | On the left, total lpGBT rates and on the right, total lpGBT number distribution on the FPGAs are shown. |
| Figure 5.29. | Total rate distributions on number of Slinks. On the left, Couenne solver result for the whole sector and on the right, the result after the algorithms are applied. |
| Figure 5.30. | On the left, total lpGBT rates and on the right, total lpGBT number distribution on the FPGAs are shown. |
| Figure 5.31. | On the left, the result after all lpGBTs are even and on the right, the result after the algorithms are applied. |
| Figure 5.32. | On the left, lpGBT rates and on the right, total lpGBT number distribution on the FPGAs are shown. |
| Figure 5.33. | Total lpGBT number (on the left) and rate distribution (on the right) on the number of FPGAs after the swap algorithm for Slink connections on FPGAs. |
| Figure 5.34. | On the left, total rate distribution on Slinks from the Couenne solver result for the whole sector and on the right, the result after the algorithms are applied. |
| Figure 5.35. | Total rate distribution on number of Slinks after the improved swap algorithm. |
| Figure 5.36. | On the left, lpGBT rates and on the right, total lpGBT number distribution on the FPGAs are shown. |
| Figure 5.37. | On the left, total lpGBT rates and on the right, total lpGBT number distribution on the FPGAs are shown. |
| Figure 6.1. | A Serenity board with components. |
| Figure 6.2. | All connection possibilities of the Serenity board. |
| Figure 6.3. | The chosen method of programming the FPGA. |
| Figure 6.4. | The bathtub plots for VU7P so1.v1. |
| Figure 6.5. | The bathtub plots for channels 0-35 of KU15P so2.v0. |
| Figure 6.6. | The bathtub plots for channels 36-71 of KU15P so2.v0. |
| Figure 6.7. | The bathtub plots for channels 12-23 of KU15P so1.v1. |
| Figure 6.8. | The bathtub plots for channels 0-11 of VU7P so1.v1 at speed 16 Gb/s. |
| Figure 6.9. | The bathtub plots for channels 12-23 of VU7P so1.v1 at speed 16 Gb/s. |
| Figure 6.10. | The bathtub plots (25 Gb/s) for VU7P so1.v1 for channels 0-11. |
| Figure A.1. | Total rate distribution on number of Slinks is at top left, lpGBT rates for the toy example are next to it, and double and single lpGBT connections are shown in the bottom part of the figure. |
| Figure A.2. | On the left, total rate distribution on number of Slinks and on the right, double lpGBT connections for 2 FPGAs are shown. |
| Figure A.3. | Single lpGBT connections for the 2 FPGA case. |
| Figure A.4. | lpGBT input orders according to rates. On the left, Low/Medium/High and on the right, Low/High/Medium. |
| Figure A.5. | Total rate distribution on Slinks. On the left, Low/Medium/High and on the right, Low/High/Medium. |
| Figure A.6. | The assigning pattern of lpGBTs. On the left, Low/Medium/High and on the right, Low/High/Medium. |
| Figure A.7. | Results of Apopt Solver for Pyomo. |
| Figure A.8. | Standard deviation and double lpGBT assigning results. |
| Figure A.9. | Single lpGBT assignment to Slinks is on the left and Slink assignment to FPGAs is on the right of the figure. |
| Figure A.10. | Slink rates vs number of Slinks and total FPGA rate plot. |
| Figure A.11. | Standard deviation and double lpGBT assigning results for the 6 FPGA case. |
| Figure A.12. | Total rate distribution on Slinks and FPGA rate plot after Slink connection. |
| Figure A.13. | Single lpGBT assignment to Slinks is on the left and Slink assignment to FPGAs is on the right of the figure. |
| Figure A.14. | For the 7 FPGA 2 Multi-Run case, total rate distribution on number of Slinks is at top left, the value of the new objective function is next to it, and all lpGBT assignments and total FPGA rates are shown in the bottom part of the figure. |
| Figure A.15. | For the 7 FPGA 3-MultiRuns case, total rate distribution on number of Slinks is on the left and objective function values are on the right. |
| Figure A.16. | Total rate distribution on number of Slinks is on the left and the naive distribution result for 7 FPGA is on the right of the figure. |
| Figure A.17. | Total rate distributions on Slinks and FPGAs are on the left; the value of the objective function and lpGBT assignment to Slinks are at top and bottom right. |
| Figure A.18. | For the 7 FPGA 3 Multi-Run WarmStart case, total rate distribution on number of Slinks is at top left, the value of the objective function is next to it, and lpGBT assignment to Slinks and total FPGA rates are shown in the bottom part of the figure. |

# LIST OF TABLES

| Table 5.1. | Event buffer number of FPGAs before using the swap algorithm | 49 |
|------------|--------------------------------------------------------------|----|
| Table 5.2. | Problematic event buffer number of FPGAs after swap          | 50 |
| Table 5.3. | Event buffer number of FPGAs before swap                     | 52 |
| Table 5.4. | Event buffer number of FPGAs after using swap algorithm      | 52 |
| Table 5.5. | Event buffer number of FPGAs before using swap algorithm     | 55 |
| Table 5.6. | Event buffer number of FPGAs after using swap algorithm      | 55 |

# LIST OF SYMBOLS

| $E_T$     | Transverse Energy   |
|-----------|---------------------|
| $p_T$     | Transverse Momentum |
| $X_0$     | Radiation Length    |
| $\eta$    | Pseudorapidity      |
| $\lambda$ | Interaction Length  |

# LIST OF ACRONYMS/ABBREVIATIONS

| ADC   | Analog-to-Digital Converter                        |
|-------|----------------------------------------------------|
| ALICE | A Large Ion Collider Experiment                    |
| APD   | Avalanche Photo-Diodes                             |
| ASICs | Application Specific Integrated Circuits           |
| ATCA  | Advanced Telecommunications Computing Architecture |
| ATLAS | A Toroidal LHC Apparatus                           |
| BX    | Bunch Crossing                                     |
| BE    | Back End                                           |
| cDAQ  | Central Data Acquisition System                    |
| CE-E  | Electromagnetic Calorimeter                        |
| CE-H  | Hadronic Calorimeter                               |
| CMS   | The Compact Muon Solenoid                          |
| CSC   | Cathode Strip Chambers                             |
| DCS   | Detector Control System                            |
| DSS   | Detector Safety System                             |
| DTC   | Drift Tube Chambers                                |
| DTH   | Data Timing Hub                                    |
| EB    | ECAL Barrel                                        |
| ECAL  | Electromagnetic Calorimeter                        |
| EMP   | Extensible Modular Processor                       |
| EW    | Electro Weak                                       |
| FE    | Front End                                          |
| FEC5  | The Forward Error Correction                       |
| FIFO  | First-in-First-out                                 |
| FPGA  | Field Programmable Gate Array                      |
| HB    | HCAL Barrel                                        |
| HCAL  | Hadron Calorimeter                                 |
| HE    | HCAL Endcap                                        |
| HF    | HCAL Forward                                       |
| HGCROC  | HGCAL Readout Chip                    |
| HL-LHC  | High Luminosity Large Hadron Collider |
| HLT     | High Level Trigger                    |
| HO      | HCAL Outer                            |
| HPDs    | Hybrid Photodiodes                    |
| IBERT   | Integrated Bit Error Rate Test        |
| L1A     | Level1 Accept                         |
| LHC     | Large Hadron Collider                 |
| LHCb    | Large Hadron Collider beauty          |
| LIU     | LHC Injector Upgrade                  |
| LS1     | Long Shutdown 1                       |
| LS2     | Long Shutdown 2                       |
| LS3     | Long Shutdown 3                       |
| LpGBT   | Low-Power Gigabit Transceiver         |
| MINLP   | Mixed Integer Non-Linear Programming  |
| MIP     | Minimum Ionizing Particle             |
| QCD     | Quantum Chromodynamics                |
| PCB     | Printed Circuit Board                 |
| PS      | Proton Synchrotron                    |
| PS      | Polystyrene-based                     |
| PSB     | Proton Synchrotron Booster            |
| PVT     | Polyvinyltoluene-based                |
| RPC     | Resistive Plate Chambers              |
| SAR ADC | Successive-Approximation ADC          |
| SiPM    | Silicon Photomultiplier               |
| SM      | Standard Model                        |
| SPS     | Super Proton Synchrotron              |
| TDC     | Time-to-Digital Converter             |
| TPG     | Trigger Primitive Generator           |
| VPT     | Vacuum Phototriodes                   |

VTRx+

### 1. INTRODUCTION

The Large Hadron Collider (LHC) is the largest particle accelerator ever built. The experiments at the LHC have enhanced our understanding of matter and energy. On July 4th, 2012, the discovery of the Higgs boson, one of the main cornerstones of the Standard Model (SM), was announced at CERN. To increase the discovery potential for rare processes and new particles, it was decided to increase the center-of-mass energy of the collisions, because the production rate of rare processes at an accelerator depends directly on it. Heavy-ion interactions (lead-lead, proton-lead) and proton-proton interactions at very high center-of-mass energy can enable the observation of physics phenomena that have not been observed yet. In 2015, the LHC produced proton-proton collisions at an energy of 13 TeV. This was an important improvement, but it was not enough to see rare events at the desired level. To achieve this, the High Luminosity Large Hadron Collider (HL-LHC) is being designed to increase the total integrated luminosity by a factor of 10. This project will provide more accurate measurements of new particles as well as the observation of rare processes occurring below the current sensitivity level, and thus increase our understanding of the universe.

CMS plans a Phase-2 physics program designed to fully exploit the potential of the HL-LHC. As part of this program, some systems of the CMS detector will be upgraded. One of the sub-systems being upgraded is the endcap calorimeter. It must maintain its performance against integrated radiation levels that are ten times higher and against the huge volume of data resulting from high-luminosity collisions. Such radiation levels and data rates call for new computing and electronics technology.

The High Granularity Calorimeter (HGCAL) is a new sampling calorimeter with high transverse and longitudinal granularity, designed for the CMS Phase-2 physics program to handle the challenges of the HL-LHC. The challenges concern engineering (electronics, mechanical, and thermal) and data transmission. Because of the high radiation level, the CMS systems that are close to the interaction point will be damaged considerably. To extend the lifetime of the detector and maintain its performance, the front part in particular must be radiation tolerant, and the HGCAL will use recent advances in radiation-tolerant fast electronics. The other challenge is that the very large number of collisions will lead to a high data bandwidth. The data must be transmitted to the data acquisition system through the HGCAL back-end electronics at high speed, so that information from the collision point can be collected as efficiently as possible. To achieve this, the HGCAL back-end electronics rely on high-bandwidth data transmission via optical fibers and on FPGA technology for the first level of event selection. The optical fibers transmit data to the FPGAs, so the connections to the FPGAs via optical connectors must be optimized to ensure that the FPGAs can handle the large data volume efficiently. In addition, integrated bit error rate tests (IBERT) must be performed on the FPGAs to verify that they can operate without data loss.

In this thesis, the LHC and HL-LHC are discussed in Chapter 2, CMS in Chapter 3, and the HGCAL in Chapter 4. The data acquisition link studies of the HGCAL are then explained in Chapter 5, and the IBERT studies for different FPGAs in Chapter 6.

### 2. LARGE HADRON COLLIDER

The LHC started up on September 10, 2008. It is the latest addition to CERN's accelerator complex, shown in Figure 2.1. It is housed in an underground tunnel at a depth of approximately 100 m, on the France-Switzerland border, as shown in Figure 2.2.



Figure 2.1. CERN's accelerator complex.

The LHC consists of a 26.7-kilometer ring of superconducting magnets with a number of accelerating structures that increase the energy of the particles [1]. Particle colliders are accelerators in which two beams of particles collide with each other. They are powerful research instruments because they provide much more center-of-mass (CM) energy than fixed-target setups. Many particles can only be produced in high-energy collisions and decay within a short time, so it is almost impossible to observe them in any other way. The analysis of the particles produced in these collisions may provide answers to many fundamental questions in physics. These questions mainly concern the deep structure of space and time, the relation between quantum mechanics and general relativity, and the fundamental laws governing the interactions and forces among the elementary particles.



Figure 2.2. Large Hadron Collider Schematics.

#### 2.1. Technical Details and Design

The LHC collides protons and also heavy-ion beams, enabling lead-lead and proton-lead collisions for studies of the quark-gluon plasma. All the controls, services and technical infrastructure for the accelerator are managed from the CERN Control Center. The beams inside the LHC are made to collide at four crossing points around the accelerator ring, as illustrated in Figure 2.3. These points are the positions of the four particle detectors, also called experiments: A Large Ion Collider Experiment (ALICE), A Toroidal LHC ApparatuS (ATLAS), Large Hadron Collider beauty (LHCb) and the Compact Muon Solenoid (CMS). In addition to the four major experiments, there are some smaller experiments: Large Hadron Collider forward (LHCf), Total Cross Section, Elastic Scattering and Diffraction Dissociation at the LHC (TOTEM), and the Monopole and Exotics Detector At the LHC (MoEDAL).

The LHC is not a perfect circle. It consists of eight arcs with a length of 2.45 km each and straight sections with a length of 545 m each. Inside the collider, two high-energy particle beams travel at close to the speed of light before they collide. The beams travel in opposite directions in separate beam pipes that are kept at ultrahigh vacuum. Much of the accelerator is connected to a liquid helium distribution system [1]. The particles are guided around the accelerator ring by a strong magnetic field produced by superconducting magnets. There are thousands of magnets of different sizes, including 1232 dipole magnets, each 15 m long, that bend the beams and 392 quadrupole magnets, each 5–7 m long, that focus the beams. Each arc has a regular lattice structure with 154 dipole magnets and 23 arc cells. Each arc cell has a FODO structure with a length of 106.9 m. This structure includes main dipole magnets, quadrupole magnets and other multipole magnets [2].
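The lattice numbers quoted above can be cross-checked with simple arithmetic. The following is a minimal sketch that uses only the values given in the text; the variable names are illustrative.

```python
# Values quoted in the text.
N_ARCS = 8
DIPOLES_PER_ARC = 154
CELLS_PER_ARC = 23
CELL_LENGTH_M = 106.9

# Derived quantities.
total_dipoles = N_ARCS * DIPOLES_PER_ARC                 # dipoles in the whole ring
cells_length_km = CELLS_PER_ARC * CELL_LENGTH_M / 1000.0  # length of the cells in one arc

print(f"total dipoles: {total_dipoles}")
print(f"length of 23 arc cells: {cells_length_km:.2f} km")
```

The dipole count reproduces the 1232 magnets quoted for the ring, and the 23 cells add up to a length close to the 2.45 km quoted for one arc.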

The particles are prepared by a series of systems that increase their energy gradually [3] before they are injected into the main accelerator. These systems are illustrated in Figure 2.1. The first system is the linear particle accelerator (LINAC 2 or 4), which generates 160 MeV negative hydrogen ions and injects them into the Proton Synchrotron Booster (PSB). In the PSB, the electrons are stripped from the hydrogen ions, and the protons are then accelerated to 2 GeV. They then move into the next system, the Proton Synchrotron (PS), where they are accelerated to 26 GeV. Finally, they move into the Super Proton Synchrotron (SPS), which increases their energy to 450 GeV before they are injected into the main ring. The main ring accelerates the protons to their peak energy and circulates them for 5 to 24 hours while collisions occur at the four intersection points.
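The staged acceleration described above can be summarised as a small data structure. The sketch below treats the quoted values as the energy at the exit of each stage and takes the 6.5 TeV peak energy from later in this chapter; the function name `energy_gains` is illustrative.

```python
# (stage name, proton energy at the end of the stage in GeV), as quoted in the text.
INJECTOR_CHAIN = [
    ("Linac", 0.160),
    ("PSB", 2.0),
    ("PS", 26.0),
    ("SPS", 450.0),
    ("LHC main ring", 6500.0),  # 6.5 TeV per beam, quoted later in this chapter
]


def energy_gains(chain):
    """Return (stage name, ratio of its output energy to that of the previous stage)."""
    return [(nxt[0], nxt[1] / prev[1]) for prev, nxt in zip(chain, chain[1:])]


for stage, gain in energy_gains(INJECTOR_CHAIN):
    print(f"{stage:14s} raises the energy by a factor of {gain:.1f}")
```

Each stage raises the energy by roughly one order of magnitude, which is why the protons pass through several machines before reaching the main ring.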



Figure 2.3. Schematic view of LHC sections.

When the protons have an energy of 6.5 TeV, the total collision energy reaches 13 TeV. At this energy, the protons have a Lorentz factor of about 6930 and move at a speed of 0.999999990 c. It takes a proton less than 90 µs to travel once around the main ring. A beam at the LHC is not a continuous string of particles. It is split into compressed chunks called bunches. Each bunch is several cm long and contains about a hundred billion protons. The separation between two consecutive bunches is called the bunch spacing and is expressed in nanoseconds (ns).
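The kinematic numbers in this paragraph follow from special relativity. The sketch below assumes a proton rest energy of 0.938272 GeV and the 26.7 km ring length quoted earlier; the function names are illustrative.

```python
import math

# Physical constants (assumed values, not given in the text).
PROTON_REST_ENERGY_GEV = 0.938272
SPEED_OF_LIGHT_M_S = 299_792_458.0

# Values quoted in the text.
BEAM_ENERGY_GEV = 6500.0   # 6.5 TeV per proton
RING_LENGTH_M = 26_700.0   # 26.7 km ring


def lorentz_factor(total_energy_gev, rest_energy_gev=PROTON_REST_ENERGY_GEV):
    """gamma = E / (m c^2)."""
    return total_energy_gev / rest_energy_gev


def beta_from_gamma(gamma):
    """beta = v / c = sqrt(1 - 1 / gamma^2)."""
    return math.sqrt(1.0 - 1.0 / gamma**2)


gamma = lorentz_factor(BEAM_ENERGY_GEV)
beta = beta_from_gamma(gamma)
revolution_time_s = RING_LENGTH_M / (beta * SPEED_OF_LIGHT_M_S)

print(f"gamma = {gamma:.0f}")
print(f"beta  = {beta:.9f}")
print(f"revolution time = {revolution_time_s * 1e6:.1f} us")
```

The results agree with the Lorentz factor of about 6930, the speed of 0.999999990 c and the revolution time of just under 90 µs quoted above.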

The collisions are expected to produce massive amounts of data. An international collaborative project, the LHC Computing Grid [4], was set up to handle this huge amount of data using fiber-optic cable connections and existing high-speed portions of the public internet.

To date, LHC operation has gone through four periods: Run 1, Long Shutdown 1 (LS1), Run 2, and Long Shutdown 2 (LS2). Over these periods, the luminosity of the LHC and the energy per beam increased gradually. Since 2009, LHC operation has involved many successful collision periods as well as machine maintenance and upgrades.

#### 2.2. LHC to HL-LHC

Luminosity is an important quantity for particle accelerator experiments. It is proportional to the number of collisions in a given time, so a higher luminosity means that more data can be collected to observe rare processes and particles. The HL-LHC project aims at enhancing the performance of the LHC and expanding the potential for discoveries after 2027. It has the following targets: a peak luminosity of  $5 \times 10^{34}$  cm<sup>-2</sup>s<sup>-1</sup> and an integrated luminosity of 3000 fb<sup>-1</sup> [5]. The HL-LHC upgrade is applicable to almost all LHC experiments and has varied physics goals. The main goals of the HL-LHC [6] include improved SM measurements, flavor physics of heavy quarks and leptons, searches for physics beyond the SM (BSM), and studies of the properties of the Higgs boson and of Quantum Chromodynamics (QCD) matter at high density and temperature [7, 8]. Since 2012, the Higgs physics program has turned into a major program of precision measurements and searches for rare processes. Measuring the properties of the Higgs boson and their relation to Electro Weak (EW) symmetry breaking is one of the main purposes of the HL-LHC physics program. The HL-LHC will provide about 140 collisions per bunch crossing (BX), which will greatly improve the prospects for studying Higgs boson properties.
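The link between integrated luminosity and the number of observable events can be made concrete with the relation $N = \sigma \times \int L\,dt$. The sketch below uses a hypothetical process with a cross section of 1 fb, chosen only for illustration, and the function name `expected_events` is likewise illustrative.

```python
def expected_events(cross_section_fb, integrated_luminosity_fb_inv, efficiency=1.0):
    """Expected number of events: N = sigma * integrated luminosity * efficiency.

    The cross section is in fb and the integrated luminosity in fb^-1,
    so the units cancel and the result is a pure number.
    """
    return cross_section_fb * integrated_luminosity_fb_inv * efficiency


HL_LHC_INTEGRATED_LUMINOSITY = 3000.0  # fb^-1, target quoted in the text
HYPOTHETICAL_CROSS_SECTION_FB = 1.0    # illustrative value, not from the text

n_events = expected_events(HYPOTHETICAL_CROSS_SECTION_FB, HL_LHC_INTEGRATED_LUMINOSITY)
print(f"expected events for a 1 fb process: {n_events:.0f}")
```

Since the expected yield scales linearly with the integrated luminosity, a tenfold increase in integrated luminosity gives ten times as many events for any given process.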

The year-by-year schedule from the LHC to the HL-LHC is shown in Figure 2.4. Increasing the luminosity, and with it the number of collisions per bunch crossing from 30 to 140, will open new avenues for observing rare processes and particles, but it will also create many challenges for the hardware of the HL-LHC.



Figure 2.4. The LHC to HL-LHC program.

#### 2.2.1. Hardware Upgrades of HL-LHC

<u>2.2.1.1. Accelerator Upgrades.</u> The high-luminosity upgrade will stimulate research into new technologies and concepts for particle physics experiments. In total, the hardware upgrade of the LHC is equivalent to manufacturing and installing about 1.2 km of new collider at various places around the LHC ring. The upgrade of the machine systems makes up the main part of the new HL-LHC. The upgrades focus mainly on quadrupole magnets, dipole magnets, crab cavities, beam optics, cryogenics, machine protection and collimators, and superconducting power lines. A detailed summary of the accelerator upgrade is given in [9].

<u>2.2.1.2. Injector Upgrades.</u> Some important changes will be made to the proton injectors as part of the HL-LHC. The beams that arrive at the LHC are pre-accelerated by four accelerators, also called injectors: Linac4, PSB, PS, and SPS. All four injectors will be upgraded through the LHC Injector Upgrade (LIU) project during Long Shutdown 2 (LS2) so that they can deliver the proton beams required by the HL-LHC project. More details about the injector upgrades can be found in [10] and [11].

<u>2.2.1.3. The upgrade of the major LHC experiments.</u> The main purpose of the upgrades is to maintain and improve the performance of the detectors in terms of efficiency, resolution and background rejection. The challenges for the detector upgrades are the radiation damage resulting from the high integrated luminosity of the HL-LHC and the high pileup (the number of individual proton collisions per bunch crossing) resulting from its high instantaneous luminosity [9].

The particles from the HL-LHC collisions, and the radioactivity they induce in the detector material and on-board electronics, degrade the detector performance. The particles from many simultaneous interactions, the so-called pileup interactions, are recorded together. At the HL-LHC, most of the data read out will be associated with pileup collisions. Pileup creates more hits in the detector and confuses the trigger as well as the reconstruction and interpretation of events in the data analysis. The high pileup and the radiation damage are discussed in more detail for the CMS detector in the next chapter. Each major LHC experiment plans many upgrades to overcome these challenges. The CMS experiment plans a Phase-2 program designed to fully exploit the potential of the HL-LHC, and the CMS detector will be upgraded according to the Phase-2 Upgrade Program [12] to maintain its performance.

### 3. COMPACT MUON SOLENOID

The CMS detector is one of the two large general-purpose particle physics detectors built at the LHC at CERN. The purpose of the CMS experiment is to study a wide range of physics, including the search for the SM Higgs boson, supersymmetric particles, heavy-ion physics and new massive vector bosons. The CMS detector is built around a 3.8 T superconducting solenoid magnet, 13 m long and 6 m in diameter, which provides large bending power for momentum measurements. The magnetic field is confined by a steel return "yoke" that accounts for most of the detector's 14000-tonne weight. The detector is quite compact, 15 m high and 21 m long. The CMS detector consists of 5 main sub-systems, as shown in Figure 3.1: from inside to outside, the Tracker, the Electromagnetic Calorimeter (ECAL), the Hadron Calorimeter (HCAL), the Solenoid Magnet, and the Muon System. The diameter of the magnet is large enough to accommodate the tracking and calorimetry systems. The innermost layer is a silicon-based tracker. The tracking volume is contained in a cylinder 5.8 m long and 2.6 m in diameter. The ECAL, based on scintillating crystals, surrounds the silicon-based tracker, and the HCAL, a sampling calorimeter, surrounds the ECAL. The large muon detectors are located outside the magnet.



Figure 3.1. CMS detector with components.

The CMS coordinate system [13] is shown in Figure 3.2. CMS uses a right-handed Cartesian coordinate system whose origin is at the center of the detector. The x-axis points towards the center of the LHC ring, the y-axis points upwards and the z-axis points along the beam direction. Because of the cylindrical shape of the detector, the azimuthal angle  $\phi$, measured from the x-axis in the x-y plane, and the polar angle  $\theta$, measured from the z-axis, are also used. The pseudorapidity  $\eta$  can be used instead of  $\theta$  and is defined as  $\eta = -\ln(\tan(\theta/2))$. It expresses the angle of a trajectory with respect to the axis of the colliding beams. It is 0 for particle trajectories perpendicular to the beam and takes positive or negative values for trajectories at an angle to the beam. The transverse momentum  $p_T$  and the transverse energy  $E_T$  are also used; they measure the momentum and energy in the plane transverse (the x-y plane) to the beam direction (z-axis).



Figure 3.2. CMS coordinate system.
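The pseudorapidity definition given above can be sketched in a few lines. The function names `pseudorapidity` and `polar_angle` are illustrative and angles are in radians.

```python
import math


def pseudorapidity(theta):
    """eta = -ln(tan(theta / 2)) for a polar angle 0 < theta < pi (radians)."""
    if not 0.0 < theta < math.pi:
        raise ValueError("theta must lie strictly between 0 and pi")
    return -math.log(math.tan(theta / 2.0))


def polar_angle(eta):
    """Inverse relation: theta = 2 * atan(exp(-eta))."""
    return 2.0 * math.atan(math.exp(-eta))


for theta_deg in (90.0, 45.0, 10.0):
    eta = pseudorapidity(math.radians(theta_deg))
    print(f"theta = {theta_deg:5.1f} deg  ->  eta = {eta:.3f}")
```

Trajectories perpendicular to the beam give $\eta = 0$, and $|\eta|$ grows without bound as the trajectory approaches the beam axis, which is why detector acceptances are usually quoted as ranges in $|\eta|$.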

#### 3.1. Tracking System

Measuring the momentum of the particles is very important for reconstructing the events at the collision point. The CMS tracker determines the paths taken by charged particles from their position information. It can reconstruct the paths of high-energy muons and electrons. It can also record the paths of hadrons and the tracks resulting from the decay of short-lived particles [14].

When a charged particle passes through a tracker layer, it produces a hit through its interaction with the silicon, and these hits can be combined to determine the track of the particle. The tracking system has two main parts. One is the silicon pixel detector, located at the very core of the detector, and the other is the silicon microstrip detector that surrounds it [14]. The CMS tracker layout is shown in Figure 3.3. The outer radius and total length of the tracker are 110 cm and 540 cm, respectively.



Figure 3.3. The tracker layout (1/4 of the z view).

The tracker consists of 66 million pixels and 9.6 million silicon strips. In the barrel region, there are 3 layers of pixel detectors and 10 layers of silicon microstrip detectors. The forward region has 2 pixel and 9 microstrip layers in each of the 2 endcaps. The total areas of the pixel detector and the silicon strip detector are  $\approx 1\ \text{m}^2$  and  $200\ \text{m}^2$, respectively [15].

#### 3.2. ECAL

The energies of electrons and photons from the collisions are measured by the ECAL. It comprises 61200 lead tungstate (PbWO$_4$) crystals mounted in the central barrel part and 7324 PbWO$_4$ crystals in each of the two endcaps [15]. The use of PbWO$_4$ crystals makes the ECAL fast, radiation resistant and finely granular. The barrel section (EB) has an inner radius of 129 cm. It is structured as 36 identical super-modules, each covering half the barrel length, and covers the pseudorapidity range  $0 < |\eta| < 1.479$ . The barrel crystals have a cross section of  $22 \times 22 \text{ mm}^2$  and a length of 230 mm. The endcap sections are placed at a distance of 314 cm from the interaction point and cover the pseudorapidity range  $1.479 < |\eta| < 3.0$ . Each endcap consists of 2 semi-circular aluminum plates that carry units of crystals called supercrystals; the endcap crystals have a cross section of  $28.6 \times 28.6 \text{ mm}^2$  and a length of 220 mm [15]. The scintillation light is detected by Avalanche Photo-Diodes (APD) in the barrel and Vacuum Phototriodes (VPT) in the endcaps. A preshower detector (ES) based on lead absorbers is placed in front of the endcap crystals.

#### 3.3. HCAL

The HCAL is a sampling calorimeter that measures hadron energies; it determines a particle's energy, position and time of arrival using absorber and scintillator material. In addition, it provides indirect detection of particles that hardly interact with matter, such as neutrinos. The HCAL surrounds the ECAL and has separate barrel and endcap parts.

The HCAL design focuses on maximizing the material inside the magnet coil in terms of interaction lengths. However, an additional layer of scintillators, called the Hadron Outer (HO), is placed outside the magnet. Brass is used as the absorber material because it has a short interaction length and is non-magnetic [15]. The extensive use of absorber material limits the space left for the active material. The active medium consists of plastic scintillator tiles, read out with fibres, that form scintillator layers with a thickness of 3.7 mm. The photodetection readout is done by multi-channel hybrid photodiodes (HPDs). The CMS HCAL has four sub-parts: the HCAL Barrel (HB), HCAL Endcap (HE), HO, and HCAL Forward (HF) [12]. The locations of these sub-parts in the CMS detector are shown in Figure 3.4.



Figure 3.4. The view of the locations of subsystems of the HCAL and ECAL in CMS detector.

#### 3.4. The Solenoid Magnet

The CMS experiment is built around a huge superconducting solenoid magnet whose purpose is to bend the trajectories of the particles from the LHC collisions. CMS can make accurate momentum measurements of the particles from high-energy collisions by combining the curvature of the bent trajectories with position measurements in the tracker and muon detectors. The parameters chosen for the magnet are: a field of 4 T, an inner bore of 5.9 m, a length of 12.9 m, 2168 turns, a current of 19.5 kA, a stored energy of 2.7 GJ and a hoop stress of 64 atm [16]. The magnet also contains the tracker, ECAL and HCAL within its 5.9 m bore.
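The quoted magnet parameters can be cross-checked with the ideal long-solenoid formula $B = \mu_0 N I / \ell$. The sketch below is a rough estimate only: it ignores the finite length of the coil and the iron return yoke, and it assumes that the quoted 5.9 m inner bore is a diameter, consistent with the 6 m diameter given earlier in this chapter. The function names are illustrative.

```python
import math

MU_0 = 4.0e-7 * math.pi  # vacuum permeability in T*m/A

# Values quoted in the text.
N_TURNS = 2168
CURRENT_A = 19.5e3
LENGTH_M = 12.9
BORE_DIAMETER_M = 5.9  # assumption: the quoted inner bore is read as a diameter


def long_solenoid_field(n_turns, current_a, length_m):
    """Field inside an ideal long solenoid: B = mu_0 * N * I / length."""
    return MU_0 * n_turns * current_a / length_m


def stored_energy(field_t, bore_diameter_m, length_m):
    """Energy of a uniform field filling the bore: (B^2 / (2 mu_0)) * volume."""
    volume_m3 = math.pi * (bore_diameter_m / 2.0) ** 2 * length_m
    return field_t**2 / (2.0 * MU_0) * volume_m3


field_t = long_solenoid_field(N_TURNS, CURRENT_A, LENGTH_M)
energy_j = stored_energy(field_t, BORE_DIAMETER_M, LENGTH_M)

print(f"estimated field: {field_t:.2f} T")
print(f"estimated stored energy in the bore: {energy_j / 1e9:.1f} GJ")
```

The estimated field is close to the nominal 4 T, and the energy stored in the bore volume alone is of the same order as the quoted 2.7 GJ, which also includes the field outside the bore.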

#### 3.5. The Muon System

Muons have a much longer mean lifetime than most other unstable particles. Muon studies are highly important because muons suffer little radiative energy loss in matter and decay only through the weak interaction. The muon system is located outside the solenoid magnet. It has three main functions: muon identification, momentum measurement, and triggering. The return yoke of the solenoid magnet is interleaved with the layers of the muon system, which enables good momentum resolution and triggering. Three types of detector are used for the muon system, considering the large surface to be covered and the radiation environment. These are drift tubes (DT), cathode strip chambers (CSC) and resistive plate chambers (RPC). The DT chambers are used in the barrel region ( $|\eta| < 1.2$ ), where the neutron-induced background is small and the muon rate is low. The CSCs are used in the two endcaps, covering the range up to  $|\eta| < 2.4$ , where the muon rate, the neutron-induced background rate and the magnetic field are high. The RPCs are used in both the barrel and the endcap regions, covering the interval  $|\eta| < 1.6$ . The layout of one quarter of the CMS muon system is shown in Figure 3.5. In total, the muon system includes 25,000 m<sup>2</sup> of active detection planes and approximately a million electronic channels. More detail can be found in [15].



Figure 3.5. The layout of one quarter of the CMS muon system.

#### 3.6. HL-LHC Era and the CMS

The HL-LHC will pose significant challenges because of the increased luminosity. CMS prepared the Phase-2 Upgrade Program [12] to maintain the detector performance in terms of efficiency, resolution, and background rejection. There are two main challenges: radiation damage from the high integrated luminosity and very high pileup resulting from the high instantaneous luminosity of the HL-LHC collisions.

#### 3.6.1. Radiation Damage to the CMS Detector

Particles produced in HL-LHC collisions induce radioactivity in the detector material and on-board electronics. This leads to significant damage and may result in a gradual decrease in detector performance. Charged particles from the collisions produce ionization in the detectors. In addition, particles back-scattered from the calorimeters spread out and interact with other components. Neutrons in particular can travel long distances, slowing down and scattering many times in the detector. The expected radiation levels in the CMS detector for HL-LHC collisions are illustrated in Figure 3.6, which shows the distribution of the absorbed dose over the CMS detector for an integrated luminosity of  $3000 \text{ fb}^{-1}$ .



Figure 3.6. Expected radiation levels in the CMS detector. R is the transverse distance from the beamline and Z is the distance along the beamline from the interaction point.

The damage caused by the radiation varies from sub-detector to sub-detector. One consequence of the high-radiation environment is the degradation of the electrical properties of the silicon detectors, which makes the collection of signals from the interaction point less efficient. The main problem for the calorimeters is the loss of transmission of the medium through which the scintillation light or wavelength-shifted light must pass. This causes a reduction in the signal, which can be quite large, sometimes more than 90%, and therefore leads to a degradation of the resolution.

#### 3.6.2. High Pileup

Each beam in LHC collisions consists of many bunches of protons. Each bunch has an rms length of 5 cm and contains several  $\times 10^{11}$  protons. The bunches are separated in time by 25 ns, corresponding to a spatial separation of about 750 cm. Each beam has 2800 filled bunches. The bunch crossings (BX) occur at a rate of 40 MHz. At the HL-LHC, the average number of interactions in a single BX is around 140. If these interactions are not well understood, they do not contribute to the search for new physics. Since the number of bunches cannot be increased, the rise in luminosity at the LHC leads to higher pileup. High pileup creates more hits in the detector, which leads to tracks that are wrongly measured or identified and confuses the trigger as well as the reconstruction and interpretation of events in the data analysis. It also adds extra energy to calorimeter measurements, such as the energies of jets from the collision that contains the hard scatter.
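The bunch-structure numbers above can be cross-checked, and the pileup estimated, with a short calculation. The sketch below is a rough estimate only: the inelastic proton-proton cross section of about 80 mb is an assumption that is not given in the text, while the peak luminosity and ring length are the values quoted in Chapter 2. The variable names are illustrative.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0

# Values quoted in the text.
BUNCH_SPACING_S = 25e-9
FILLED_BUNCHES = 2800
PEAK_LUMINOSITY_CM2_S = 5e34   # HL-LHC target from Section 2.2
RING_LENGTH_M = 26_700.0       # 26.7 km ring from Chapter 2

# ASSUMPTION (not from the text): inelastic pp cross section of about 80 mb.
SIGMA_INELASTIC_CM2 = 80e-27   # 1 mb = 1e-27 cm^2

bunch_separation_cm = SPEED_OF_LIGHT_M_S * BUNCH_SPACING_S * 100.0
crossing_rate_hz = 1.0 / BUNCH_SPACING_S

# Only filled bunches collide, so the effective crossing rate is lower than 40 MHz.
revolution_frequency_hz = SPEED_OF_LIGHT_M_S / RING_LENGTH_M
filled_crossing_rate_hz = FILLED_BUNCHES * revolution_frequency_hz

interaction_rate_hz = PEAK_LUMINOSITY_CM2_S * SIGMA_INELASTIC_CM2
mean_pileup = interaction_rate_hz / filled_crossing_rate_hz

print(f"bunch separation: {bunch_separation_cm:.0f} cm")
print(f"nominal crossing rate: {crossing_rate_hz / 1e6:.0f} MHz")
print(f"estimated mean pileup: {mean_pileup:.0f}")
```

The separation and crossing rate reproduce the quoted 750 cm and 40 MHz, and the estimated pileup is of the same order as the value of around 140 quoted above; the exact number depends on the assumed cross section and filling scheme.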

A relatively high pileup event recorded in a study in 2012 is shown in Figure 3.7. There are 78 reconstructed vertices. The total number of pileup collisions is actually slightly higher, because some vertices have too few tracks to be reconstructed.



Figure 3.7. High pileup event with 78 reconstructed vertices taken in 2012.

#### 3.7. The Phase-2 Upgrades

#### 3.7.1. General Considerations for the Phase 2 Upgrade

According to the Phase-2 Upgrade Program, the tracker and the endcap calorimeters must be replaced because of the radiation levels expected in HL-LHC collisions. To benefit fully from the HL-LHC, efficient event selection for data acquisition is essential. The first-level trigger system (L1 Trigger) makes quick decisions about which events should be recorded. It uses coarse-grained detector information that is available much faster than the full detector readout data. For efficient L1 Trigger performance in high-pileup conditions, the trigger electronics must be upgraded. Enabling tracking in the L1 Trigger is a main focus of the Phase-2 tracker design, and a new hardware architecture will be required. To maintain the trigger performance within the current coverage, the muon system will be equipped with new chambers. In the forward region, the physics acceptance will be maximized over the largest possible solid angle in order to reach processes with small production cross-sections and decay branching ratios.

The readout electronics will be upgraded for efficient data taking in high-pileup conditions. Measuring the time of arrival of particles from the HL-LHC collisions with high precision will reduce the misidentification and mismeasurement caused by high pileup, and will therefore also improve the trigger and the reconstruction and interpretation of events.

#### 3.7.2. Calorimeter Upgrades

The HL-LHC conditions are challenging for both calorimeter longevity and performance. These difficulties will be addressed by different upgrades for each sub-part of the calorimeters. The challenge is particularly severe in the endcap region between  $|\eta| = 1.48$  and  $|\eta| = 3.0$, where the radiation levels are very high. In the HCAL, the radiation dose at the location of the HE scintillating tiles will reach up to 300 kGy. Such dose levels and data rates would result in a significant loss of detector performance, which requires the replacement of the CMS endcap calorimeters for the HL-LHC. The endcap calorimeters will be replaced by the HGCAL [17] in the Phase-2 Upgrade Program. The HGCAL is explained in more detail in the next chapter. Detailed information about the calorimeter upgrades can be found in [17] and [18].

### 4. HIGH GRANULARITY CALORIMETER

The HGCAL is a sampling calorimeter that uses both silicon and scintillator technology for detection. The building blocks of the calorimeter are the silicon or scintillator modules, an absorber plate, and the readout electronics. There are 28 layers in the electromagnetic section and 22 layers in the hadronic section, for a total of 50 layers per endcap with more than 6 million channels. The HGCAL must meet two key requirements: it must maintain acceptable performance after an integrated luminosity of 3,000 fb<sup>-1</sup>, and it must provide a timing resolution of about 30 ps to mitigate pileup. At 3,000 fb<sup>-1</sup>, the total neutron fluence in the innermost region will be  $10^{16}\ n_{eq}/\text{cm}^2$  and the total ionizing dose will reach 2 MGy.

The HGCAL consists of an electromagnetic section (CE-E) followed by a hadronic section (CE-H). The electromagnetic part consists of 28 sampling layers with a thickness of 34 cm and a depth of approximately 26  $X_0$  and 1.7  $\lambda$ . The active detector element is a 163 mm wide hexagonal silicon sensor from an 8-inch wafer sandwiched between a 1.4 mm thick WCu baseplate and a printed circuit board (PCB) carrying front-end (FE) electronics to form a silicon module. Figure 4.1 shows the schematic cross-section of HGCAL with key parameters and active elements.

The absorber in the hadronic section consists of 12 stainless steel planes with a thickness of 35 mm and 12 stainless steel planes with a thickness of 68 mm. Between these absorber plates, there are silicon modules and scintillator plates mounted on 6 mm thick copper cooling plates, and these form cassettes 30 degrees wide.

The total power required for the FE electronics is approximately 110 kW per endcap. The FE electronics requires three voltages: 1.5 V for analog, 1.2 V for digital, and 2.5 V for optical transmission electronics. The power is fed to the silicon and scintillator modules through FE PCBs that also host the HGCAL application specific integrated circuits (ASICs).

All layers are read out for use in the energy measurement. Alternate layers in CE-E and layers in CE-H are used to make L1 trigger primitives. In the CE-H, the radius where the active medium changes from silicon sensors to plastic scintillators is determined according to the radiation level. Sufficient margin must be allowed so that the light loss due to the radiation dose in the scintillators does not exceed a reasonable amount (50%) and the neutron fluence does not exceed $8 \times 10^{13} n_{eq}/\text{cm}^2$. In this way, the electronic noise from increased silicon photomultiplier (SiPM) leakage current, together with the light loss, is kept low enough to allow a good measurement of the minimum ionizing particle (MIP) response. A MIP is a particle whose mean energy loss rate through matter is close to the minimum. Detailed information about the HGCAL can be found in reference [17].



Electromagnetic calorimeter: Si, Cu/CuW/Pb absorbers, 28 layers, $26 X_0$ and $\sim 1.7 \lambda$.

Hadronic calorimeter: Si and scintillator, steel absorbers, 22 layers, $\sim 9.0 \lambda$.

Figure 4.1. HGCAL with key parameters.

#### 4.1. Active Elements

The HGCAL has two types of active elements: silicon sensors covering an area of  $600 \text{ m}^2$  and small plastic scintillator tiles covering an area of  $500 \text{ m}^2$ .

#### 4.1.1. Silicon Sensors

The silicon sensors for the CE-E and the inner parts of the CE-H will be planar DC-coupled hexagonal silicon sensors fabricated on 8-inch wafers. The hexagonal shape of the sensors provides more efficient use of the available area of circular wafers and a significant cost reduction compared to square or rectangular sensors.

The silicon sensors will have three different active thicknesses (300, 200 and 120 $\mu$m) in order to optimize the charge collection and operating conditions. Each sensor is segmented into cells that are read out individually and DC-coupled to the on-module readout electronics. The readout chips will be connected to each sensor cell by wire bonds that pass through holes in the readout PCB to the sensor. The size of a single cell is optimised for an improved signal-over-noise ratio. The 120-$\mu$m thick sensors contain 432 cells of 0.52 cm<sup>2</sup> each, and the others contain 192 cells of 1.18 cm<sup>2</sup> each. P-type epitaxial wafers are the baseline substrate material for the 120-$\mu$m sensors, and physically thinned p-type float zone silicon wafers are used for the others. A comprehensive irradiation campaign is in progress to converge on optimal sensor design choices and parameters [19–22].

#### 4.1.2. Silicon Modules

The HGCAL will have about 30,000 silicon modules to be placed in the CE-E and the CE-H sections. A silicon module consists of 4 components, as shown in Figure 4.2: the baseplate, the Kapton-gold sheet, the silicon sensor, and the PCB with the FE electronics, called the hexaboard. All of these components have the same area as the sensor. The baseplate has precise reference holes for assembly and placement onto the cassettes. A 105 $\mu$m thick Kapton foil covered with a thin layer of gold is glued to the baseplate. The Kapton itself provides electrical insulation of the sensor back-plane from the baseplate, which is held at ground.



Figure 4.2. CE-E silicon module, showing stacked layers.

The silicon sensors and the hexaboard are hexagonal with small cutouts at each of the six corners. The cutouts provide access to the positioning and mounting holes in the baseplate. They also provide access to a portion of the Kapton-Au layer for wirebond connections to the hexaboard, for the biasing of the sensor back-plane. The hexaboard will contain the HGCAL readout chip (HGCROC) FE readout ASICs. The signals from the sensor pads are routed to the HGCROC for on-board signal digitization.

#### 4.1.3. Plastic scintillators and Scintillator tile-modules

The HGCAL will use the scintillator as the active material in regions where the integrated radiation dose is low enough (<3 kGy) for the scintillator, and fluence is currently limited to  $8 \times 10^{13} n_{eq}/\text{cm}^2$  to maintain good overall performance during the HL-LHC project. As a result, for the region  $|\eta| > 2.4$ , silicon sensors are used.

Two types of scintillator material are considered for the hadronic section: polyvinyltoluene-based (PVT) and polystyrene-based (PS). Based on cost, performance, and ease of assembly, cast and machined PVT-based scintillators will be used in the front, and injection-molded PS-based scintillators will be used for the rest of the hadronic section.

#### 4.2. Electronics Overview

The HGCAL readout, trigger, and control system uses a combination of custom and commercial parts. Some of the custom parts are specific to HGCAL, and some are common with other CERN projects.

#### 4.2.1. Front-end Electronics

The FE electronics digitizes either the silicon sensor or SiPM signal, provides a high-precision time-of-arrival measurement, and transmits the digitized data to the back-end (BE) electronics. It also computes, at every bunch crossing, the digital sums of neighbouring cells ($2 \times 2$ cells in the case of the 1.18 cm<sup>2</sup> silicon sensors, $3 \times 3$ cells in the case of the 0.52 cm<sup>2</sup> silicon sensors, and 2 cells in the case of scintillator tiles) to build trigger primitives. The HGCROC measures the charge and the time-of-arrival at 40 MHz. The HGCROC requirements are extremely challenging: a high dynamic range from a few fC to 10 pC, low noise of about 2,000 electrons, high-precision timing information for pileup mitigation, and low power consumption of 15 mW/channel. The HGCROC will also face a harsh radiation environment, up to 300 MRad. The HGCROC has 72 channels of analog low-noise, high-gain preamplifiers and shapers, and a 10-bit 40 MHz successive-approximation analog-to-digital converter (SAR ADC), which provides the charge measurement over the linear range of the preamplifier. In the saturation range of the preamplifier, a discriminator and time-to-digital converter (TDC) provide the charge information over a 200 ns dynamic range using 50 ps binning. A fast discriminator and TDC provide timing information to 25 ps accuracy. Both charge and timing information are kept in a memory while waiting for a Level-1 accept (L1A). At a bunch crossing rate of 40 MHz, data corresponding to 4 or 9 adjacent channels are sent out to participate in the generation of the trigger primitives. As part of the ongoing development and testing of the HGCAL electronics, the performance of the HGCROC prototype in terms of signal-over-noise ratio, charge, and timing, as well as its radiation qualification with total ionizing dose and single-event effects, has been studied [23].
The data are zero-suppressed and transmitted to a concentrator ASIC (ECON-D) via 1.28 Gb/s electrical links. Digital sums of 4 or 9 adjacent channels (depending on the sensor granularity) are computed. These sums are transmitted for every bunch crossing from the HGCROC to another concentrator ASIC (ECON-T), via separate 1.28 Gb/s electrical links, to be used for the formation of trigger primitives. The ECON-D aggregates, formats, and serialises the data at the L1 accept rate of up to 750 kHz. It also receives and distributes the fast control signals (clock, L1A, reset, bunch crossing count) and all the controls needed to configure the FE electronics, and conversely sends data and control information to the DAQ electronics. For the trigger path, the ECON-T selects the trigger sums of interest, aggregates and formats their data in packets, and stores them in a first-in-first-out (FIFO) buffer. These data are sent within a defined latency to the trigger primitive generator (TPG) electronics, using separate optical links. The ECON-D and ECON-T therefore act as hubs, receiving up to seven 1.28 Gb/s electrical links from the HGCROCs and serialising the corresponding data, trigger and L1A readout, with outputs at 10.24 Gb/s. They are connected to the BE electronics via a bi-directional link using low-power gigabit transceivers (LpGBTs) and are coupled to a slow control adapter ASIC (SCA) [24] and a versatile transceiver plus (VTRx+) [25]. The fast control links will run at 320 Mb/s and the slow control links will use the I<sup>2</sup>C protocol at 80 Mb/s. The optical transmission is performed through the VTRx+ optoelectronic transceivers.

The PCB layouts for the FE electronics optimize the board shapes and the number of optical links, as required by the data rates. The ECON-D and ECON-T are located on one PCB, the Wagon board, and the LpGBTs and VTRx+ are located on a different PCB, the Engine board. The silicon and scintillator modules are connected to Wagon boards, which are connected to Engine boards with miniature connectors [13, 14]. The LpGBTs use the forward error correction (FEC5) protocol [26] for the data path. The total average data volume for the whole HGCAL is around 2.5 MByte per event, which scales approximately linearly with the average pileup. At an L1A rate of 750 kHz, this corresponds to an average data rate from the FE electronics of 15 Tb/s. For the trigger path, in most cases one or two optical links per Engine board are sufficient, but the ECON-T can transmit the trigger data on up to three links for the busiest Engine boards. The total average rate of trigger cell data will be 1.25 Mbit per bunch crossing, or 50 Tb/s. The total number of 10.24 Gb/s LpGBT links from the FE electronics is 8028 for the DAQ and 8502 for the TPG BE electronics. A schematic representation of the HGCAL FE readout, trigger and control chain is shown in Figure 4.3.
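The two average rates quoted above follow directly from the event size, the L1A rate, and the bunch crossing rate. The short check below reproduces that arithmetic; it assumes decimal units (1 MByte = 10<sup>6</sup> bytes, 1 Tb/s = 10<sup>12</sup> bit/s), and all variable names are illustrative rather than taken from any HGCAL software.

```python
# Back-of-the-envelope check of the average FE data rates quoted in the text.
# Assumption: decimal prefixes (1 MByte = 1e6 bytes, 1 Tb/s = 1e12 bit/s).

EVENT_SIZE_BYTES = 2.5e6      # ~2.5 MByte average data volume per event
L1A_RATE_HZ = 750e3           # Level-1 accept rate
TRIGGER_BITS_PER_BX = 1.25e6  # ~1.25 Mbit of trigger cell data per bunch crossing
BX_RATE_HZ = 40e6             # bunch crossing rate


def to_tbps(bits_per_second):
    """Convert bit/s to Tb/s."""
    return bits_per_second / 1e12


# DAQ path: bytes per event -> bits per event -> bits per second.
daq_rate_tbps = to_tbps(EVENT_SIZE_BYTES * 8 * L1A_RATE_HZ)

# Trigger path: data are sent at every bunch crossing, not only on L1A.
trigger_rate_tbps = to_tbps(TRIGGER_BITS_PER_BX * BX_RATE_HZ)

print(f"DAQ path:     {daq_rate_tbps:.1f} Tb/s")
print(f"Trigger path: {trigger_rate_tbps:.1f} Tb/s")
```

The trigger path carries more data than the DAQ path even though each trigger packet is small, because it is read out at the full 40 MHz bunch crossing rate rather than at the L1A rate.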



Figure 4.3. Schematic representation and key components of the CMS HGCAL FE readout, trigger and control chain.

#### 4.2.2. Back-end Electronics

The BE electronics consists of the DAQ and TPG systems, the detector control system (DCS), and the detector safety system (DSS). The DAQ and TPG systems are implemented in the Advanced Telecommunications Computing Architecture (ATCA) [27] format. The systems consist of CMS common 'Serenity' boards [28] housed in ATCA crates, with DAQ and timing hub (DTH) ATCA boards [29] also in these crates to provide the interface to the central DAQ and timing systems. The DAQ system consists of ATCA boards containing two FPGAs with sufficient bandwidth to drive links in both directions to and from the FE electronics, and 12 optical links to and from a DTH in the same crate. The links from and to the FE electronics will run at 10.24 Gb/s and 2.56 Gb/s, respectively. The links to the DTH board will run at 16 Gb/s and/or 25 Gb/s. The FPGAs on these boards will handle and process the data, with the required buffering, event building, front-end emulation, and monitoring implemented in firmware.

The TPG system has two tasks: to form 3D clusters from trigger cells, and to form the overall energy map from the coarse-granularity HGCROC energy sums. The current plan is that the TPG will perform these tasks in two stages. In the first stage, each layer is considered separately: two-dimensional (2D) clusters are formed from the trigger cells, and all the HGCROC sums are combined into a single energy map for the layer being processed. In the second stage, the 2D clusters are combined in depth to form 3D clusters, and all the single-layer energy maps are combined, with an appropriate weighting, into the complete transverse energy map. The completed energy maps and 3D clusters form the trigger primitives that are transmitted from the TPG to the central Level-1 Trigger system [30]. The concept and design of the TPG system can be found in [31].

# 5. DAQ LINK STUDIES OF HGCAL BACK END ELECTRONICS

The HGCAL BE electronics system consists of four subsystems. All four BE electronic subsystems will be implemented on Serenity baseboards, having ATCA form factor. The sub-detector custom electronics uses high speed optical data transmission between FE and BE. One of the subsystems is control, timing, and DAQ, interfacing to the FE and central DAQ (cDAQ). In the cDAQ, the data from the different subsystems are combined into event data structures and then provided to the high level trigger data analysis. The BE electronic architecture must guarantee that no data are lost in the process of data transmission from FE to BE and cDAQ.

Data from the BE DAQ Serenity boards will be transmitted to cDAQ boards via optical links. The cDAQ boards can accept up to 24 input optical links using a serial point-to-point protocol with flow control, evolved from the Run-2 S-Link Express [32]. For this reason, these optical links are referred to as Slinks in these studies.

In this thesis, two main issues for the HGCAL BE electronics were studied. One of them is the optimization of the link connections of the FPGAs on the Serenity boards, and the other is the integrated bit error rate test (IBERT) for the FPGAs. These FPGAs receive data from the FE and aggregate and process the data, with the required buffering, event building, front-end emulation, and monitoring implemented in firmware. Each FPGA can take data from optical links using lpGBTs in both directions to and from the FE electronics, and a further 12 optical links (Slinks) to and from a cDAQ in the same crate at a speed of 16 Gb/s or 25 Gb/s.

#### 5.1. Baselines of BE DAQ Architecture

The overall design of the HGCAL BE system has been changing with a better understanding of the firmware implementation, the required resources, and various external factors. This resulted in many updates to the baseline of the BE DAQ architecture. Any choice of baseline must be consistent with the required bandwidth. The first baselines for the DAQ architecture only served as a basis for the cost estimate of the design. They were consistent with the required bandwidth, but no firmware engineering had yet been done to provide FPGA resource estimates, so the choice of FPGAs was based on predictions.

As the firmware engineering progressed, the baseline of the BE electronics was updated. The update took into account some changes to the HGCAL FE electronics and the prices of FPGAs available at CERN. It proposed 16 Gb/s optical links for all BE communication. The DAQ boards can contain two KU15P FPGAs. A board can take 120 lpGBT connections and 24 Slink connections to the cDAQ, which means 60 lpGBT and 12 Slink connections per FPGA. This baseline was consistent with the estimated bandwidths, but the FPGA resource estimates were incomplete.

By mid-2021, a significant amount of DAQ FPGA firmware had been written, which gives a reasonable estimate of the required resources [33]. According to these resource estimates, the KU15P with 60 lpGBT inputs is not feasible. The one-to-one replacement of VU7P with KU15P is found feasible.

These baseline updates affect some complex parts of the DAQ firmware design. One of them is the inhomogeneity of the data rates across the HGCAL: the average data rate on the lpGBTs varies widely. For this reason, load balancing is required for the lpGBTs that are connected to an FPGA and combined into one output Slinkrocket (or Slink) to the cDAQ, in order to build an efficient data transmission system. Data rates from the FE should stay within the available bandwidth in the BE to avoid buffer saturation and data loss. Many load balancing studies were carried out for different numbers of FPGAs and Slinks, following the baseline and its updates.

#### 5.2. Load Balancing Studies

#### 5.2.1. Description of the Problem

Data are transferred to the FPGAs by the ECON-Ds over lpGBT links, and inside each FPGA the lpGBT inputs are mapped onto Slinks, so the assignment of lpGBTs to Slinks determines how much data each Slink and each FPGA carries. Each FPGA transmits data to the cDAQ through its 12 Slinks, each carried on an optical channel. The total data rates of the Slinks should be balanced so that the FPGAs do not lose their data transmission synchronization. The schematic representation of data transmission from FE to BE is given in Figure 5.1. The main purpose is to reach a balanced distribution of the total data rates of the Slinks and FPGAs after the lpGBTs are assigned to Slinks, so minimizing the standard deviation of the total data rates of the Slinks is the main consideration. However, the event size (data rate) of the lpGBTs covers a wide range and the system imposes constraints on the link connections, so assigning lpGBTs to Slinks in a balanced way is not easy.

This is a mixed integer non-linear programming (MINLP) problem [34]. MINLP problems are optimization problems with continuous and discrete variables and nonlinear functions in the objective function and/or the constraints. In these studies, the objective function is the standard deviation of the Slink loads, which is a nonlinear function. There are discrete variables (e.g. the number of lpGBT connections to an Slink) and fixed constraints (e.g. the maximum number of lpGBT connections that an Slink can take), and the quantity sought is itself integer-valued (how many lpGBTs, e.g. 5, 6, 7, ..., should be assigned to an Slink). For this reason, MINLP problem solvers are used to reach optimized link connections.

System constraints:

- Each lpGBT can be assigned to only one Slink.
- Each Slink can be assigned to only one FPGA.
- Each Slink has a fixed maximum number of lpGBT connections, set by the baseline model.
- Each FPGA has a fixed maximum number of Slink connections, set by the baseline model.
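The constraints and the standard-deviation objective described above can be made concrete with a small sketch. The data layout (dictionaries mapping lpGBT names to Slinks and Slinks to FPGAs), the function names, and the toy rates are assumptions made for illustration; they are not the model used in the thesis, which is built with an optimization suite and a MINLP solver.

```python
from statistics import stdev


def slink_loads(rates, lpgbt_to_slink, slinks):
    """Sum the lpGBT rates carried by each Slink."""
    loads = {s: 0.0 for s in slinks}
    for lpgbt, slink in lpgbt_to_slink.items():
        loads[slink] += rates[lpgbt]
    return loads


def capacity_violations(lpgbt_to_slink, slink_to_fpga,
                        max_lpgbt_per_slink, max_slink_per_fpga):
    """List the Slinks and FPGAs that exceed their connection capacity.

    Storing the maps as dicts already enforces "one Slink per lpGBT"
    and "one FPGA per Slink", so only the capacities are checked here.
    """
    per_slink = {}
    for slink in lpgbt_to_slink.values():
        per_slink[slink] = per_slink.get(slink, 0) + 1
    per_fpga = {}
    for fpga in slink_to_fpga.values():
        per_fpga[fpga] = per_fpga.get(fpga, 0) + 1
    bad = [s for s, n in per_slink.items() if n > max_lpgbt_per_slink]
    bad += [f for f, n in per_fpga.items() if n > max_slink_per_fpga]
    return bad


# Toy numbers, not HGCAL data.
rates = {"lp_a": 30.0, "lp_b": 10.0, "lp_c": 20.0, "lp_d": 20.0}
slink_to_fpga = {"S0": "F0", "S1": "F0"}
slinks = list(slink_to_fpga)

balanced = {"lp_a": "S0", "lp_b": "S0", "lp_c": "S1", "lp_d": "S1"}
unbalanced = {"lp_a": "S0", "lp_c": "S0", "lp_b": "S1", "lp_d": "S1"}

# Objective: sample standard deviation of the Slink loads.
obj_balanced = stdev(list(slink_loads(rates, balanced, slinks).values()))
obj_unbalanced = stdev(list(slink_loads(rates, unbalanced, slinks).values()))
print(obj_balanced, obj_unbalanced)
```

Both assignments satisfy the capacity constraints with two lpGBTs per Slink, but only the first one balances the loads, which is what the objective function is meant to distinguish.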



Figure 5.1. Basic data transmission scheme. FE data transmission is shown on the left and BE data transmission on the right.

Average lpGBT rates from FE simulation data are shown in Figure 5.2. The lpGBTs are categorized as double (paired) or single. Double lpGBTs carry information from the same ECON-D ASIC; their data should not be reconstructed in different Slinks in the BE, so they must be kept as a pair. To obtain a result with the optimization solvers, the lpGBTs are also categorized by event size as high, medium, or low rate. High lpGBTs are the third with the largest event sizes, medium lpGBTs are the next third, and low lpGBTs are the third with the smallest event sizes.
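The high, medium, and low categories described above amount to sorting the single lpGBTs by event size and cutting the sorted list into thirds. The sketch below shows one way to do this; the function name, the toy rates, and the choice to put any remainder into the low group are assumptions, and the double lpGBTs are assumed to have been set aside beforehand.

```python
def split_by_rate(rates):
    """Split lpGBT names into high / medium / low thirds by average rate.

    rates: dict mapping lpGBT name -> average rate (event size).
    If the count is not divisible by three, the remainder goes to 'low'
    (an arbitrary choice made for this sketch).
    """
    ordered = sorted(rates, key=rates.get, reverse=True)
    third = len(ordered) // 3
    return {
        "high": ordered[:third],
        "medium": ordered[third:2 * third],
        "low": ordered[2 * third:],
    }


# Toy rates, not HGCAL data.
rates = {"lp0": 410.0, "lp1": 95.0, "lp2": 260.0, "lp3": 30.0,
         "lp4": 180.0, "lp5": 520.0, "lp6": 12.0}
groups = split_by_rate(rates)
print(groups)
```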



Figure 5.2. lpGBT rates for the whole sector.

#### 5.2.2. Main Challenge

The link connection problem has many possible solutions, and the best-balanced one must be found within a very large solution set. The size of this set reflects the large number of variables, since all connection possibilities must be defined at the same time: lpGBT connections to Slinks, Slink connections to FPGAs, and lpGBT connections to FPGAs. The number of variables grows much faster than linearly with the amount of input data (roughly quadratically with the number of FPGAs in the cases listed below), because adding FPGAs also adds the components attached to them. For some numbers of FPGAs, the number of variables is as follows:

- For 4 FPGA: 9792 variables
- For 5 FPGA: 15060 variables
- For 6 FPGA: 22032 variables
- For 7 FPGA: 29988 variables
- For 28 FPGA: 470064 variables (the total sector (120 $\phi$) has 28 FPGAs and 1508 lpGBTs)

#### 5.2.3. Early Studies and First Chosen Work Strategy

Early studies cover a small number of FPGAs in order to determine an efficient working strategy for the whole system. Until the first DAQ architecture updates, the key numbers for the whole system were as follows:

- Assignment done for a 120  $\phi$  sector
- 28 FPGAs / 1508 lpGBT
- Each FPGA can have at most 60 lpGBTs as inputs and 12 Slinks as outputs, giving 12 × 28 = 336 Slink outputs in total

MINLP problem solvers were used to determine the lpGBT-to-Slink connections, and optimization suites were used to model the BE DAQ link connection system. The first study was done for the KU15P FPGA choice. After a solution was obtained for this case, the results were improved or changed according to the baseline updates of the BE DAQ architecture. For the first studies, the chosen optimization suites were Gekko and Pyomo, and the MINLP solvers were Apopt and Convex Over and Under Envelopes for Nonlinear Estimation (Couenne). The first results can be seen in Appendices A.1.1, A.1.2, A.1.3, and A.1.4. For mixed-integer problems, Gekko can work only with the Apopt solver, and it did not give satisfactory results for the whole sector: the Apopt solver in Gekko did not converge on a solution for cases with more than 2 FPGAs. The Apopt solver in Pyomo gave worse results than in Gekko and likewise could not produce results for more than 2 FPGAs. Therefore, another solver had to be tried with Pyomo.

5.2.3.1. Pyomo Optimization Suite. The chosen tool for modelling the link connections is the Pyomo Optimization Suite [35]. Pyomo is a Python-based open-source software package for creating, solving, and analyzing optimization models. Pyomo can be used to define specific problem instances and to solve them using commercial and open-source solvers. Pyomo supports a wide range of problem types, including MINLP. The first results of the Pyomo Optimization Suite are given in Appendix A.2.

5.2.3.2. Couenne MINLP Solver. Couenne was considered the best solver for this study since it can keep running until it reaches a solution for a problem with a large number of variables. Couenne is an open-source library for MINLP problems. Its source code is available for download in the Computational Infrastructure for Operations Research (COIN-OR) repository and on GitHub<sup>1</sup>. Couenne uses the Spatial Branch and Bound (sBB) algorithm, a divide-and-conquer variant of the Branch and Bound (BB) algorithm. MINLP optimization problems are exponential in terms of time complexity, and the BB technique solves them relatively quickly. A BB algorithm consists of a systematic enumeration of candidate solutions. The set of candidate solutions is considered as forming a rooted tree, and the algorithm explores the branches of this tree, which are subsets of the solution set. Within the given time limit, the best solution found so far is kept until a better one is found and replaces it, so the run returns the best result reached in the given time. A more detailed explanation of the Couenne solver can be found in Appendix A.3 and [36–38].

<sup>1</sup>https://github.com/coin-or/Couenne.
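The enumeration-with-pruning idea behind BB can be illustrated on a toy version of the balancing problem. The sketch below is a plain combinatorial branch and bound written for this illustration; it is not Couenne's spatial branch and bound, and the function name, the bound, and the toy rates are all assumptions. It keeps the best assignment found so far (the incumbent) and abandons any branch whose lower bound cannot beat it.

```python
def balance_by_branch_and_bound(rates, n_slinks, max_per_slink):
    """Assign rates to identical Slinks, pruning branches that cannot win.

    Returns (objective, groups): objective is the sum of
    |Slink load - mean load|, groups lists the rates placed on each Slink.
    """
    items = sorted(rates, reverse=True)  # large items first tightens the bound early
    mean = sum(items) / n_slinks
    loads = [0.0] * n_slinks
    counts = [0] * n_slinks
    choice = []                          # choice[i] = Slink index of items[i]
    best = {"obj": float("inf"), "choice": None}

    def lower_bound():
        # Loads only grow, and deviations from the mean sum to zero once all
        # items are placed, so no completion can undercut twice the current overload.
        return 2.0 * sum(max(0.0, load - mean) for load in loads)

    def branch(i):
        if lower_bound() >= best["obj"]:
            return                       # bound: subtree cannot beat the incumbent
        if i == len(items):
            obj = sum(abs(load - mean) for load in loads)
            if obj < best["obj"]:
                best["obj"], best["choice"] = obj, list(choice)
            return
        seen = set()
        for s in range(n_slinks):
            state = (loads[s], counts[s])
            if counts[s] >= max_per_slink or state in seen:
                continue                 # full Slink, or symmetric to a tried branch
            seen.add(state)
            previous = loads[s]
            loads[s] = previous + items[i]
            counts[s] += 1
            choice.append(s)
            branch(i + 1)
            choice.pop()
            counts[s] -= 1
            loads[s] = previous

    branch(0)
    if best["choice"] is None:
        raise ValueError("no feasible assignment for the given capacities")
    groups = [[] for _ in range(n_slinks)]
    for rate, s in zip(items, best["choice"]):
        groups[s].append(rate)
    return best["obj"], groups


# Toy rates, not HGCAL data: six links, three Slinks, at most two links each.
objective, groups = balance_by_branch_and_bound([7, 5, 4, 3, 3, 2], 3, 2)
print(objective, groups)
```

Even with pruning, the search tree grows very quickly with the number of links, which is consistent with the run times and time limits discussed in the studies that follow.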

#### 5.2.4. First Studies with Couenne Solver

The first studies cover the 5 and 6 FPGA cases. The objective function value is given as Obj; in the first studies, it is the standard deviation.

5.2.4.1. For 5 FPGA Case. Inputs are 60 Slinks, 230 single and 20 double lpGBTs. Figure 5.3 shows the number of Slinks vs Slink rate and total FPGA rates after assignment. More detailed results are given in Appendix A.4.1.



Figure 5.3. On the left, Slink Rates vs Number of Slinks and on the right total FPGA rate plot.

5.2.4.2. For 6 FPGA Case. Inputs are 72 Slinks, 276 single and 24 double lpGBTs. Figure 5.4 shows the number of Slinks vs Slink rate and the total FPGA rates. More detailed results are given in Appendix A.4.2.

The conclusions reached after the 5 FPGA and 6 FPGA studies are listed below:

- The Slink rate plots do not look promising. They have a high standard deviation, which can be problematic for the efficiency of the system.
- The time limit is an extremely important factor. A longer time limit is better since Couenne can reach a more optimized result.
- The number of variables increases considerably with each FPGA added to the model. This is the biggest limiting factor for the performance of the Couenne solver. More variables mean a longer reformulation time (see Appendix A.3.1).
- The limit on Slink rates needs to be tuned to reach the minimum standard deviation. Couenne searches for solutions subject to the Slink rate threshold. A higher threshold is not necessarily better, since it allows more possible assignments and Couenne can reach the end of the time limit with a non-optimal solution. The threshold should therefore be high enough to leave Couenne the flexibility to build solutions, but not so high that the solver loses effectiveness in a very large pool of possible solutions.



Figure 5.4. On the left, Slink rates vs number of Slinks and on the right, total FPGA rate plot.

#### 5.2.5. Motivation after Conclusions

- To reduce the run time, the objective function is linearized so that the reformulation step of Couenne takes less time.
- To reduce the number of variables, the lpGBTs are divided into related groups (high, medium, low), which are given to Couenne in separate runs.

5.2.5.1. New Objective. The first objective was the standard deviation, $\sigma = \sqrt{\frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{n-1}}$, where $x_i$ is the total rate of Slink $i$, $\bar{x}$ is the average Slink rate, and $n$ is the number of Slinks. Because of the squared and square-root terms in this formula, the reformulation step of the Couenne solver takes longer and longer. To reduce this time, the objective function is reorganized: instead of the squared term, the absolute value (a factorable function) is used so that the objective function is linearized. The new objective, instead of directly minimizing the standard deviation, is to minimize the absolute value of the difference between the total rate of each Slink and the average Slink rate, which also tends to reduce the standard deviation.
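The two objectives can be compared numerically on a small example. The sketch below computes the sample standard deviation and the average absolute distance to the mean Slink rate for two toy load vectors; the function names and the numbers are illustrative assumptions, and the average (rather than the sum) of the absolute differences is used here as the objective value.

```python
from statistics import mean, stdev


def std_objective(loads):
    """Original objective: sample standard deviation (n - 1 in the denominator)."""
    return stdev(loads)


def abs_objective(loads):
    """New objective: average absolute distance to the mean Slink rate."""
    average = mean(loads)
    return sum(abs(x - average) for x in loads) / len(loads)


# Toy Slink loads with the same mean (460), not HGCAL data.
well_balanced = [460, 455, 465, 460]
poorly_balanced = [400, 520, 430, 490]

for name, loads in [("well balanced", well_balanced),
                    ("poorly balanced", poorly_balanced)]:
    print(f"{name}: std = {std_objective(loads):.2f}, "
          f"abs = {abs_objective(loads):.2f}")
```

In this example both objectives rank the two load vectors in the same order. The two measures are not identical in general, so the absolute-value objective should be read as a cheaper proxy for the standard deviation rather than an exact replacement.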

5.2.5.2. Multi-Run Solving. To handle the large number of variables, the model is reorganized so that Couenne deals with fewer variables at a time. The lpGBTs are divided into groups of 2 or 4. The double and high-rated lpGBTs are solved in runs 1-2, and the medium- and low-rated lpGBTs in runs 2-3. All Slinks are given at once. Variables that have been solved and linked are fixed and saved for the next run.
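The essential mechanism of multi-run solving is that each run only decides the links of its own group, while the Slink loads produced by earlier runs stay fixed. The sketch below shows this control flow with a simple greedy placement standing in for the solver; the thesis uses Couenne for that step, so `solve_run` here is a placeholder, and all names and toy rates are assumptions.

```python
def solve_run(group, loads, counts, max_per_slink):
    """Placeholder for one solver run (greedy: largest rate to least-loaded Slink).

    `loads` and `counts` hold the fixed result of earlier runs and are
    updated in place, so later runs see them as constants.
    """
    placed = {}
    for name, rate in sorted(group.items(), key=lambda kv: kv[1], reverse=True):
        free = [s for s in range(len(loads)) if counts[s] < max_per_slink]
        if not free:
            raise ValueError("all Slinks are at capacity")
        target = min(free, key=lambda s: loads[s])
        loads[target] += rate
        counts[target] += 1
        placed[name] = target
    return placed


def multi_run(groups, n_slinks, max_per_slink):
    """Solve the groups one after another, fixing earlier assignments."""
    loads = [0.0] * n_slinks
    counts = [0] * n_slinks
    assignment = {}
    for group in groups:
        assignment.update(solve_run(group, loads, counts, max_per_slink))
    return assignment, loads


# Toy groups (high, medium, low), not HGCAL data.
groups = [
    {"h0": 90.0, "h1": 80.0},
    {"m0": 50.0, "m1": 40.0},
    {"l0": 10.0, "l1": 20.0},
]
assignment, loads = multi_run(groups, n_slinks=2, max_per_slink=3)
print(assignment, loads)
```

Because all Slinks are available in every run, later groups can still compensate for imbalances left by earlier ones, but links fixed in an earlier run are never moved.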

#### 5.2.6. Some Results of Multi-Run Solving

5.2.6.1. 7-FPGA 2-MultiRuns. Inputs are 7 FPGAs, 84 Slinks, 462 single and 42 double lpGBTs. The first run contains the double and high lpGBT rates. The second run contains the medium and low lpGBT rates. The total Slink and FPGA rates after all lpGBTs are assigned can be seen in Figure 5.5, where the average Slink rate is drawn as a red line on the Slink rate plot. The average Slink rate is around 460.27 and the objective function value is around 27.60. The details are given in Appendix A.5.1.



Figure 5.5. Results for 7 FPGA 2 MultiRuns. On the left, total rate distribution on number of Slinks. On the right, total FPGA rates are given.

5.2.6.2. 7-FPGA 3-MultiRuns. The first run contains the double and high lpGBT rates. The second run contains the medium lpGBT rates. The third run contains the low lpGBT rates. The total number of lpGBTs is increased to make the case more difficult. Results are shown in Figure 5.6. The average Slink rate is around 482.73 and the objective function value is around 19.69. The details are given in Appendix A.5.2.



Figure 5.6. Results for 7 FPGA 3 MultiRuns. On the left, total rate distribution on number of Slinks. On the right, total FPGA rates are given.

Better objective values were reached with the 7 FPGA 3 multi-run case than with the 7 FPGA 2 multi-run case, since Couenne reaches better results when it deals with fewer variables. Therefore, the multi-run solving method was updated. The inputs are separated into two groups. One group is solved with a naive distribution that assigns lpGBTs to different Slinks in turn, from the highest-rated lpGBT to the lowest-rated one; this is named HardStart. The other group is assigned with the Couenne solver to complete the solution; this is named WarmStart. This change also reduces the total run time.
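One possible reading of the naive distribution is a round-robin deal: the lpGBTs are sorted from the highest to the lowest rate and handed to the Slinks one after another. The sketch below implements that reading; the exact rule used in the thesis is not spelled out in the text, so the round-robin order, the function name, and the toy rates are assumptions.

```python
def naive_distribution(rates, n_slinks):
    """Deal lpGBTs to Slinks in turn, from highest to lowest rate.

    rates: dict mapping lpGBT name -> rate.
    Returns a dict mapping lpGBT name -> Slink index.
    """
    ordered = sorted(rates.items(), key=lambda kv: kv[1], reverse=True)
    return {name: position % n_slinks
            for position, (name, _rate) in enumerate(ordered)}


# Toy rates, not HGCAL data.
rates = {"lp0": 90.0, "lp1": 80.0, "lp2": 70.0, "lp3": 60.0, "lp4": 50.0}
assignment = naive_distribution(rates, n_slinks=2)

loads = [0.0, 0.0]
for name, slink in assignment.items():
    loads[slink] += rates[name]
print(assignment, loads)
```

Such a rule needs no solver time at all, which is why it is attractive for the first group of links; the imbalance it leaves behind is then reduced by the solver runs on the remaining groups.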

#### 5.2.7. WarmStart and HardStart

WarmStart takes the naive distribution as initial values and starts from there. HardStart takes the naive distribution as the solution for the first part and solves the rest with the Couenne solver. This method was first used for the 7 FPGA case. The naive distribution results are fixed for the 2 multi-run and 3 multi-run methods with 7 FPGAs. The results look promising, and the run time is reduced since Couenne does not run for half of the time.

5.2.7.1. 7 FPGA HardStart Results. The first run is solved for the double and high lpGBT rates with the naive distribution, and the result is given in Figure 5.7.



Figure 5.7. Total rate distribution to number of Slinks.

5.2.7.2. 7-FPGA 2 Multi-Runs WarmStart Results. The naive distribution solution is fixed, and a run containing the medium and low lpGBT rates is added to it and solved with the Couenne solver. Results are shown in Figure 5.8. The average Slink rate is around 482.73 and the objective function value is around 17.39. The details are given in Appendix A.6.2.



Figure 5.8. Total rate distribution on number of Slinks for 7-FPGA 2 Multi-Runs

5.2.7.3. 7 FPGA 3 Multi-Run WarmStart Results. The naive distribution solution is fixed, and two runs containing the medium and low lpGBT rates are added to it and solved with the Couenne solver. Results are shown in Figure 5.9. The average Slink rate is around 482.73 and the objective function value is around 8.84. The details are given in Appendix A.6.3.



Figure 5.9. Total rate distribution on number of Slinks. 7 FPGA results for HardStart including High and Double lpGBTs, and WarmStart including medium and low lpGBTs in two separate runs.

The HardStart and WarmStart method gives better results, and the best solution is reached with three runs. Solving the problem in parts over different runs is an important way to work with fewer variables at a time. Each run can be saved into a ".pkl" file and restarted from that file to continue with different lpGBTs. These conclusions are promising for the studies with more FPGAs.
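Saving a run to a ".pkl" file and resuming from it can be done with Python's standard `pickle` module. The sketch below shows the round trip; the contents of the saved state (run number, fixed links, Slink loads) and the file name are hypothetical, since the text only states that each run is stored in a ".pkl" file.

```python
import os
import pickle
import tempfile


def save_run(path, state):
    """Write the state of a finished run to a .pkl file."""
    with open(path, "wb") as handle:
        pickle.dump(state, handle)


def load_run(path):
    """Read a previously saved run so the next run can start from it."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


# Hypothetical state layout: fixed lpGBT -> Slink links and the Slink loads.
state = {
    "run": 1,
    "fixed_links": {"lp0": 3, "lp1": 7},
    "slink_loads": [210.0, 140.0],
}

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "run1.pkl")
    save_run(path, state)
    restored = load_run(path)

print(restored == state)
```

Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.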

#### 5.2.8. 14 and 28 FPGA Studies

5.2.8.1. Work Strategy.

- Multi-Run Solving
  - (i) To reduce number of variables.
  - (ii) Divide lpGBTs into groups according to event-size (Double, high, medium and low).
  - (iii) Solve for high and double lpGBTs together with naive distribution algorithm, use Couenne solver for medium and low parts separately.
  - (iv) Divide mediums into 8 runs and lows into 2 runs for 28 FPGA so that Couenne can find a solution for 28 FPGA.
  - (v) For 14 FPGA use just 3 runs. Run1 (high+double) with naive distribution algorithm, Run2 (mediums) and Run3 (lows) with Couenne solver.
  - (vi) Solved and linked variables are fixed for the next run and all Slinks are given at once.

- New Objective
  - (i) To reduce solving time.
  - (ii) Minimize absolute value of difference between total event size of each Slink and average Slink rate instead of minimizing standard deviation.
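The run plan in the strategy above (one naive run for the high and double lpGBTs, then 8 runs for the mediums and 2 runs for the lows in the 28 FPGA case) can be sketched as a simple chunking step. Whether the thesis splits the groups into contiguous chunks is not stated, so the chunking rule, the names, and the toy list sizes below are assumptions.

```python
def split_into_runs(items, n_runs):
    """Split a list into n_runs contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(items), n_runs)
    chunks, start = [], 0
    for run in range(n_runs):
        size = base + (1 if run < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


# Toy lists standing in for the medium- and low-rate lpGBTs.
mediums = [f"med{i}" for i in range(20)]
lows = [f"low{i}" for i in range(5)]

# Run 0 uses the naive distribution; every later run goes to the solver.
plan = [("naive", ["high and double lpGBTs"])]
plan += [("solver", chunk) for chunk in split_into_runs(mediums, 8)]
plan += [("solver", chunk) for chunk in split_into_runs(lows, 2)]

for number, (method, chunk) in enumerate(plan):
    print(number, method, len(chunk))
```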

5.2.8.2. 14 FPGA Results. The 14 FPGA case is solved using only 3 runs. All Slinks and half of all the lpGBTs are given. Run 1 has the double and high, Run 2 the medium, and Run 3 the low lpGBT rates. The results are illustrated below. The value of the objective function, which is the average distance of the Slink rates to the average Slink rate, is given as Obj. How the lpGBTs are distributed to Slinks and the total rates are shown in Figure 5.13. For Run 1, the naive distribution is used and the double and high lpGBT rates are given; the average Slink rate is around 330.67 and the objective function value is around 41.42. The result is shown in Figure 5.10.



Figure 5.10. Total rate distribution on number of Slinks, high and double lpGBTs are connected.

For Run 2, the Couenne solver is used with the medium lpGBTs. The average Slink rate is around 439.28 and the objective function value is around 22.97; the result is shown in Figure 5.11.

For Run 3, the Couenne solver is used with the low lpGBTs. At the end of this run, all low, medium, high and double lpGBT rates are assigned to Slinks; the average Slink rate is around 471.71 and the objective function value is around 13.16. The whole solution for 14 FPGAs is given in Figure 5.12. How the lpGBTs are distributed to Slinks and the total rates are given in Figure 5.13.



Figure 5.11. Total rate distribution on number of Slinks. Double, high and medium lpGBTs are connected.



Figure 5.12. Total rate distribution on number of Slinks. Double, high, medium and low lpGBTs are connected.



Figure 5.13. On the left, how lpGBTs are distributed to Slinks; on the right, lpGBTs vs Slinks and lpGBTs vs total rates for 14 FPGAs.

5.2.8.3. 28 FPGA Results. The medium lpGBT rates are divided into 8 parts. Each part runs for 16 hours. The result of each part is given in Figure 5.14.



Figure 5.14. Total rate vs number of Slinks for each run. Run 0 (swap) is the naive distribution algorithm part for the double and high lpGBT rates. The connections of each run are saved and carried over to the next run.

The plot of run 8 is the whole solution of the 28 FPGA medium part and is shown in Figure 5.15. At this point the double, high and medium lpGBTs are connected. The low lpGBTs are divided into 2 runs. The first run takes 2 days; the average Slink rate is 431.96 and the objective function value is 21.86. The result is shown in Figure 5.16.



Figure 5.15. Total rate distribution on number of Slinks is shown on the left and how the objective value changes run by run is shown on the right.



Figure 5.16. Total rate distribution on number of Slinks. Double, high, medium and half of low lpGBTs are connected.

The second run of the low part takes approximately 5 days; the average Slink rate is 457.42 and the objective function value is 19.25. The result is shown in Figure 5.17 and it is also the first solution of the whole sector (28 FPGA, $120^\circ$ in $\phi$).



Figure 5.17. Total rate distribution on number of Slinks. Double, high, medium and low lpGBTs are connected, whole solution.

How many double, high, medium and low lpGBTs are assigned to the Slinks for 28 FPGAs is shown in Figure 5.18 and Figure 5.19. There appear to be some patterns in these assignments. A swapping algorithm is needed to reduce the objective function further and obtain a better total rate distribution on the Slinks.

| lpGBT Rates  | Slight5 | Slight6 | Slink4      | Slink6 | SEek2     | Stat 22 | Slink22  | Slink24 | Slink25  | Slink20 | Slink56  | Slink57  | Slink60  | Stok61  | Stel:12 | SFek14  | Slinka | Slink10 | Slight1    |
|--------------|---------|---------|-------------|--------|-----------|---------|----------|---------|----------|---------|----------|----------|----------|---------|---------|---------|--------|---------|------------|
| Single Low   | 2       | 2       | 2           | 2      | 2         | 1       | 1        | 1       | 1        | 1       | 0        | 0        | 0        | 0       | 0       | 0       | 0      | 0       | 0          |
| Single Mid   | 2       | 2       | 2           | 2      | 2         | 2       | 2        | 2       | 2        | 2       | 2        | 2        | 2        | 2       | 3       | 3       | 3      | 3       | 3          |
| Single High  | 1       | 1       | 1           | 1      | 1         | 1       | 1        | 1       | 1        | 1       | 2        | 2        | 2        | 2       | 1       | 1       | 1      | 1       | 1          |
| Double Low   | 0       | 0       | 0           | 0      | 0         | 0       | 0        | 0       | 0        | 0       | 0        | 0        | 0        | 0       | 0       | 0       | 0      | 0       | 0          |
| Double Mid   | 0       | 0       | 0           | 0      | 0         | 0       | 0        | 0       | 0        | 0       | 0        | 0        | 0        | 0       | 0       | 0       | 0      | 0       | 0          |
| Double High  | 0       | 0       | 0           | 0      | 0         | 0       | 0        | 0       | 0        | 0       | 0        | 0        | 0        | 0       | 0       | 0       | 0      | 0       | 0          |
| Total Rate   | 453     | 456     | 470         | 456    | 447       | 452     | 439      | 445     | 433      | 431     | 459      | 467      | 457      | 439     | 460     | 442     | 473    | 471     | 468        |

Figure 5.18. The number of double, high, medium and low lpGBTs connected to some selected Slinks.

| lpgbt Rates | Slink112 | Slink115 | Slink117 | Slink103 | Slink104 | Slink102 | Slink108 | Slink109 | Slink157 | Slink160 | Slink137 | Slink98 | Slink129 | Slink221 | Slink177 | Slink62 | Slink63 | Slink93 | Slink94 |
|-------------|----------|----------|----------|----------|----------|----------|----------|----------|----------|----------|----------|---------|----------|----------|----------|---------|---------|---------|---------|
| Single Low  | 1        | 1        | 1        | 1        | 1        | 2        | 2        | 2        | 0        | 0        | 0        | 0       | 4        | 4        | 3        | 3       | 3       | 3       | 3       |
| Single Mid  | 2        | 2        | 2        | 2        | 2        | 2        | 2        | 2        | 3        | 3        | 3        | 4       | 1        | 1        | 1        | 1       | 1       | 1       | 1       |
| Single High | 2        | 2        | 2        | 2        | 2        | 2        | 2        | 2        | 2        | 2        | 2        | 2       | 2        | 2        | 2        | 2       | 2       | 2       | 2       |
| Double Low  | 0        | 0        | 0        | 0        | 0        | 0        | 0        | 0        | 0        | 0        | 0        | 0       | 0        | 0        | 0        | 0       | 0       | 0       | 0       |
| Double Mid  | 0        | 0        | 0        | 0        | 0        | 0        | 0        | 0        | 0        | 0        | 0        | 0       | 0        | 0        | 0        | 0       | 0       | 0       | 0       |
| Double High | 0        | 0        | 0        | 0        | 0        | 0        | 0        | 0        | 0        | 0        | 0        | 0       | 0        | 0        | 0        | 0       | 0       | 0       | 0       |
| Total Rate  | 498      | 509      | 500      | 484      | 499      | 508      | 506      | 498      | 535      | 540      | 552      | 568     | 560      | 504      | 501      | 494     | 480     | 486     | 488     |

Figure 5.19. The number of double, high, medium and low lpGBTs connected to some selected Slinks.



Figure 5.20. How lpGBTs are distributed to Slinks is shown on the left; lpGBTs vs Slinks and lpGBTs vs total rates are shown on the right.

This 28 FPGA result was the first main result for the whole sector (28 FPGAs, 336 Slinks), as shown in Figure 5.17. However, it was not the best result for the whole sector. So a redistribution and swap algorithm was developed and applied to the connections found by the Couenne solver.

# 5.2.9. Link Connections after Redistribution and Swap Algorithm

What does the Redistribution Algorithm do?

- Order the Slinks according to their total rates. Match the highest-rated with the lowest-rated ones and calculate the total rate difference between the two Slinks of each pair.
- Choose an lpGBT from the higher-rated Slink whose rate is smaller than, and closest to, half of the total rate difference and assign it to the lower-rated Slink, so that both Slink rates get closer to the average Slink rate.
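
One pass of the two steps above can be sketched in Python as follows. The data layout (a dictionary mapping each Slink to the list of its lpGBT rates) and the pairing of the i-th lowest with the i-th highest Slink are assumptions made for this illustration, not the thesis implementation.

```python
def redistribute_once(slinks):
    """One pass: pair highest- with lowest-rated Slinks and move one lpGBT.

    `slinks` maps a Slink id to the list of lpGBT rates connected to it.
    Returns the number of lpGBTs that were moved.
    """
    order = sorted(slinks, key=lambda s: sum(slinks[s]))
    moved = 0
    for i in range(len(order) // 2):
        low, high = order[i], order[-1 - i]
        half_gap = (sum(slinks[high]) - sum(slinks[low])) / 2
        # A rate below half the gap cannot push the low Slink above the
        # high one, so the two totals always end up closer together.
        candidates = [rate for rate in slinks[high] if rate < half_gap]
        if not candidates:
            continue
        best = max(candidates)  # closest to half_gap from below
        slinks[high].remove(best)
        slinks[low].append(best)
        moved += 1
    return moved
```

For example, with Slink totals of 200 and 460 the half difference is 130, so an lpGBT with rate 60 on the higher Slink would be moved, giving totals of 260 and 400.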

What does the Swap Algorithm do?

- Order the Slinks according to their total rates. Match the highest-rated with the lowest-rated ones and calculate the total rate difference between the two Slinks of each pair. Take half of the difference as a key.
- Search for an lpGBT pair, one lpGBT from each Slink, by comparing the rate differences between the lpGBTs of the higher-rated Slink and those of the lower-rated Slink with the key. The higher-rated lpGBT must come from the higher-rated Slink and the lower-rated lpGBT from the lower-rated one, so that both Slink rates get closer to the average. Then switch the assignment of the two lpGBTs.
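
A single swap pass can be sketched in the same spirit. Again the data layout is hypothetical, and the rule of taking the pair whose rate difference is closest to the key is an interpretation of the description above.

```python
from itertools import product


def swap_once(slinks):
    """One pass of pairwise lpGBT swaps between high- and low-rated Slinks.

    `slinks` maps a Slink id to the list of lpGBT rates connected to it.
    Returns the number of swaps that were made.
    """
    order = sorted(slinks, key=lambda s: sum(slinks[s]))
    swaps = 0
    for i in range(len(order) // 2):
        low, high = order[i], order[-1 - i]
        gap = sum(slinks[high]) - sum(slinks[low])
        key = gap / 2
        # Only pairs with 0 < (a - b) < gap bring the two totals closer.
        pairs = [(a, b) for a, b in product(slinks[high], slinks[low])
                 if 0 < a - b < gap]
        if not pairs:
            continue
        a, b = min(pairs, key=lambda p: abs((p[0] - p[1]) - key))
        slinks[high].remove(a)
        slinks[high].append(b)
        slinks[low].remove(b)
        slinks[low].append(a)
        swaps += 1
    return swaps
```

Unlike the redistribution step, a swap keeps the number of lpGBTs on each Slink unchanged.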

5.2.9.1. The Solution of Total Sector (28 FPGA, 336 Slink). The Couenne solver result is shown in Figure 5.21; the average Slink rate is 457.42 and the objective function value is 19.25.



Figure 5.21. Total rate distribution on number of Slinks.

5.2.9.2. The Solution of Total Sector after Redistribution (28 FPGA, 336 Slink). The redistribution algorithm is applied to the connections first and the result is shown in Figure 5.22; the average Slink rate is 457.42 and the objective function value is 11.35.



Figure 5.22. Total rate distribution on number of Slinks.

5.2.9.3. The Solution of the Total Sector (28 FPGA, 336 Slink) after Swap. The swap algorithm is applied after the redistribution algorithm. The result is shown in Figure 5.23; the average Slink rate is 457.42 and the objective function value is 10.18. The range of Slink rates is much narrower than before. The lpGBT connections to Slinks are shown in Figure 5.24 and the lpGBT connections to FPGAs are shown in Figure 5.25.



Figure 5.23. Total rate distribution on number of Slinks.



Figure 5.24. Total rate of each Slink is shown on the left and total lpGBT number of each Slink is shown on the right.



Figure 5.25. On the left total lpGBT rates and on the right total lpGBT number distribution on FPGAs are shown.

A method to obtain optimized lpGBT connections to Slinks was thus created. The swap and redistribution algorithms can still be improved to get better optimized results. After that point, some updates were made to the BE DAQ architecture, and the link connections for the updated architecture were determined with the same method that gave the solution for 28 FPGAs and 336 Slinks.

## 5.2.10. The First BE DAQ Architecture Updates, 224 and 312 Slink Studies

- One update is about data transmission speeds. 25 Gb/s optical links (Slinks) should be used instead of 16 Gb/s links for data transmission from the BE to the DTH boards.
- Another update considers reducing the cost of the DAQ architecture by decreasing the number of FPGAs. 26 FPGAs and 312 Slinks are considered for such a reduced-cost architecture.

5.2.10.1. A New Constraint for the Slinks Connection to FPGAs: Event Buffer Number. An event buffer is an IP block on the FPGAs between the lpGBT and Slink connections, as shown in Figure 5.1. Each event buffer can hold at most 2 lpGBTs, and those lpGBTs must be assigned to the same Slink. Because of that, a Slink with an odd number of lpGBT connections leaves one event buffer half used and increases the total event buffer number of its FPGA. However, the digital design of the FPGAs allows at most 30 event buffers. So Slinks with an odd number of lpGBT connections can push an FPGA over the maximum event buffer number. To solve this problem, a swap algorithm for the Slink connections to FPGAs is developed.
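
The effect of odd lpGBT counts can be made concrete with a short sketch. It assumes that the event buffer number of an FPGA is simply the sum, over its Slinks, of the lpGBT count divided by two and rounded up; the function name is hypothetical.

```python
def event_buffers(lpgbts_per_slink):
    """Event buffers needed by one FPGA.

    Each buffer holds at most 2 lpGBTs of the same Slink, so a Slink
    with n lpGBTs needs ceil(n / 2) buffers.
    """
    return sum((n + 1) // 2 for n in lpgbts_per_slink)


# Same 12 lpGBTs on three Slinks, distributed evenly or with odd counts.
even_case = event_buffers([4, 4, 4])
odd_case = event_buffers([5, 4, 3])
```

In this example the even distribution needs 6 buffers while the one with two odd Slinks needs 7, although the number of lpGBTs is the same.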

5.2.10.2. Swap Algorithm for Slink connections on FPGAs. Once the result after redistribution and swap for the lpGBT connections to Slinks is reached, the swap algorithm for Slink connections on FPGAs does the following:

- Calculate the event buffer number of each FPGA. Pick an FPGA whose event buffer number is more than 30. Search the other FPGAs and pick the one with the lowest event buffer number.
- From the picked FPGAs, switch the pair of Slinks that has the minimum rate difference between them.
- Repeat until the event buffer numbers of all FPGAs do not exceed 30 and are close to each other.
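
A possible realisation of this loop is sketched below. The data layout is hypothetical. The extra condition that a swap must lower the buffer count of the overloaded FPGA by less than the gap between the two FPGAs is an assumption added here so that every swap makes progress and the loop terminates; it is not stated in the text.

```python
def buffers(n_lpgbts):
    """Event buffers needed by one Slink (two lpGBTs per buffer)."""
    return (n_lpgbts + 1) // 2


def fpga_buffers(slinks):
    """Event buffers of one FPGA; each Slink is (total_rate, n_lpgbts)."""
    return sum(buffers(n) for _, n in slinks)


def balance_event_buffers(fpgas, limit=30):
    """Swap Slinks between FPGAs until no FPGA exceeds `limit` buffers.

    `fpgas` maps an FPGA id to a list of (total_rate, n_lpgbts) Slinks.
    Returns True if the limit is met, False if no useful swap is left.
    """
    while True:
        worst = max(fpgas, key=lambda f: fpga_buffers(fpgas[f]))
        best = min(fpgas, key=lambda f: fpga_buffers(fpgas[f]))
        if fpga_buffers(fpgas[worst]) <= limit:
            return True
        gap = fpga_buffers(fpgas[worst]) - fpga_buffers(fpgas[best])
        # Assumption: keep only swaps that reduce the imbalance.
        pairs = [(i, j)
                 for i, a in enumerate(fpgas[worst])
                 for j, b in enumerate(fpgas[best])
                 if 0 < buffers(a[1]) - buffers(b[1]) < gap]
        if not pairs:
            return False
        # Among those, take the pair of Slinks with the closest rates.
        i, j = min(pairs, key=lambda p: abs(fpgas[worst][p[0]][0]
                                            - fpgas[best][p[1]][0]))
        fpgas[worst][i], fpgas[best][j] = fpgas[best][j], fpgas[worst][i]
```

Choosing the Slinks with the smallest rate difference keeps the total rate of each FPGA almost unchanged while the event buffer numbers are balanced.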

#### 5.2.10.3. Work Strategy for New Studies.

- lpGBTs are divided into parts: double, high, medium and low. Double and high lpGBTs are assigned to Slinks with the naive distribution algorithm. Mediums and lows are divided into sub-parts and then the Couenne solver is used to solve these parts and get the whole solution of the system.
- After the Couenne solver, a redistribution and swap algorithm is used to improve the Couenne results and get a better standard deviation of the total rates of the Slinks.
- Finally, another swap algorithm is used for the Slink connections on FPGAs to get the total event buffer number of each FPGA lower than 30 or determine a new maximum.

5.2.10.4. 224 Slink Results. Inputs are 28 FPGAs, 224 Slinks, 110 double and 1288 single lpGBTs, 1508 lpGBTs in total. The maximum number of Slinks per FPGA is 8. The Couenne solver result for the whole sector and the result after the redistribution and swap algorithms are used for the lpGBT connections to Slinks are shown in Figure 5.26.



Figure 5.26. On the left Couenne solver result for the whole sector and on the right the result after algorithms applied are shown.

The lpGBT connections of the FPGAs before using the swap algorithm for the Slink connections to FPGAs are shown in Figure 5.27. The lpGBT connections of the FPGAs after using the swap algorithm are shown in Figure 5.28. The event buffer numbers of the FPGAs for the 224 Slink case, before and after the swap, are given in Table 5.1.



Figure 5.27. On the left total lpGBT number and on the right total rate distribution on the FPGAs are shown.

Table 5.1. Event buffer number of FPGAs before using the swap algorithm is given first and then the numbers after using swap algorithm are given in the event buffer number columns (before-after).

| FPGA | Event Buffer Number | FPGA | Event Buffer Number | FPGA | Event Buffer Number | FPGA | Event Buffer Number |
|------|---------------------|------|---------------------|------|---------------------|------|---------------------|
| 1    | 29-29               | 8    | 28-28               | 15   | 29-29               | 22   | 27-27               |
| 2    | 28-28               | 9    | 31-30               | 16   | 29-29               | 23   | 29-29               |
| 3    | 30-30               | 10   | 30-30               | 17   | 32-30               | 24   | 29-29               |
| 4    | 26-26               | 11   | 30-30               | 18   | 30-30               | 25   | 29-29               |
| 5    | 29-29               | 12   | 29-29               | 19   | 29-29               | 26   | 28-28               |
| 6    | 28-28               | 13   | 28-28               | 20   | 30-30               | 27   | 26-29               |
| 7    | 28-28               | 14   | 24-24               | 21   | 30-30               | 28   | 31-30               |



Figure 5.28. On the left total lpGBT rates and on the right total lpGBT number distribution on the FPGAs are shown.

5.2.10.5. 312 Slink Results. Inputs are 26 FPGAs, 312 Slinks, 110 double and 1288 single lpGBTs, 1508 lpGBTs in total. The maximum number of Slinks per FPGA is 12. The Couenne solver result for the whole sector and the result after the redistribution and swap algorithms are used for the lpGBT connections of Slinks are shown in Figure 5.29. The lpGBT connections of the FPGAs after using the swap algorithm for the Slink connections to FPGAs are shown in Figure 5.30, and the event buffer numbers of the FPGAs are given in Table 5.2.



Figure 5.29. Total rate distributions on number of Slinks are given. On the left Couenne solver result for the whole sector and on the right, result after algorithms applied are shown.



Figure 5.30. On the left, total lpGBT rates and on the right, total lpGBT number distribution on the FPGAs are shown.

At that point the 312 Slink case should have been completed, but it turned out to be a very tight case for the maximum event buffer number constraint. After the swap algorithm for Slink connections on FPGAs, the event buffer numbers of the FPGAs look very problematic: almost all of them are above 30, so it is not possible to bring all of them below 30. The swap algorithm was therefore improved to obtain better event buffer numbers.

Table 5.2. Problematic event buffer number of FPGAs after swap.

| FPGA | Event Buffer Number | FPGA | Event Buffer Number | FPGA | Event Buffer Number | FPGA | Event Buffer Number |
|------|--------|------|--------|------|--------|------|--------|
| 1    | 36     | 8    | 32     | 15   | 33     | 22   | 33     |
| 2    | 32     | 9    | 32     | 16   | 33     | 23   | 30     |
| 3    | 30     | 10   | 31     | 17   | 33     | 24   | 33     |
| 4    | 31     | 11   | 31     | 18   | 31     | 25   | 31     |
| 5    | 32     | 12   | 32     | 19   | 31     | 26   | 34     |
| 6    | 34     | 13   | 32     | 20   | 35     |      |        |
| 7    | 32     | 14   | 30     | 21   | 33     |      |        |

The problematic event buffer numbers of the 312 Slink case are solved with the extra steps listed below:

- A new redistribution algorithm is developed so that the total lpGBT number of each Slink is even.
- First the even-lpGBT-number redistribution algorithm is used, and after that the swap algorithm is used to improve the total Slink rate distribution. The results are given in Figure 5.31.
- The improved swap algorithm for Slink connections on FPGAs is used to get optimized event buffer numbers.



Figure 5.31. On the left, result after all lpGBTs are even and on the right, result after algorithms applied are shown.

Before using swap algorithm for Slink connections on FPGAs, when even number of lpGBTs are connected to each Slink, lpGBT connections of FPGAs are shown in Figure 5.32 and event buffer numbers of FPGAs are illustrated in Table 5.3.



Figure 5.32. On the left, lpGBT rates and on the right, total lpGBT number distribution on the FPGAs are shown.

| FPGA | Event Buffer Number | FPGA | Event Buffer Number | FPGA | Event Buffer Number | FPGA | Event Buffer Number |
|------|--------|------|--------|------|--------|------|--------|
| 1    | 34     | 8    | 29     | 15   | 31     | 22   | 30     |
| 2    | 27     | 9    | 27     | 16   | 30     | 23   | 28     |
| 3    | 28     | 10   | 30     | 17   | 27     | 24   | 29     |
| 4    | 29     | 11   | 27     | 18   | 31     | 25   | 27     |
| 5    | 29     | 12   | 29     | 19   | 26     | 26   | 28     |
| 6    | 29     | 13   | 30     | 20   | 32     |      |        |
| 7    | 29     | 14   | 27     | 21   | 31     |      |        |

Table 5.3. Event buffer number of FPGAs before swap.

After using swap algorithm for Slink connections on FPGAs, when even number of lpGBTs are connected to each Slink, lpGBT connections of FPGAs are shown in Figure 5.33 and event buffer numbers of FPGAs are shown in Table 5.4.



Figure 5.33. Total lpGBT number (on the left) and rate distribution (on the right) on the number of FPGAs after swap algorithm for Slink connections on FPGAs.

| FPGA | Event Buffer Number | FPGA | Event Buffer Number | FPGA | Event Buffer Number | FPGA | Event Buffer Number |
|------|------------|----|--------|------|--------|------|--------|
| 1    | 30         | 8  | 29     | 15   | 30     | 22   | 29     |
| 2    | 28         | 9  | 28     | 16   | 30     | 23   | 29     |
| 3    | 28         | 10 | 30     | 17   | 28     | 24   | 29     |
| 4    | 29         | 11 | 28     | 18   | 30     | 25   | 29     |
| 5    | 29         | 12 | 29     | 19   | 28     | 26   | 29     |
| 6    | 29         | 13 | 30     | 20   | 29     |      |        |
| 7    | 30         | 14 | 28     | 21   | 29     |      |        |

Table 5.4. Event buffer number of FPGAs after using swap algorithm.

After the 224 Slink and 312 Slink studies, another update was made to the BE DAQ architecture: it has been adapted to the use of VU13P FPGAs. As a result, a new hardware baseline is proposed. In the new baseline, a DAQ FPGA receives 108 input lpGBTs (nine 12-channel transceivers) instead of 60 lpGBTs as in the former studies. There are still 12 Slink outputs to cDAQ, running at 25 Gb/s instead of 16 Gb/s. With the expected number of FE DAQ lpGBTs, up to 96 boards are likely to be needed.

5.2.10.6. 16 FPGA Studies. Inputs are 16 FPGAs, 192 Slinks, 110 double and 1288 single lpGBTs, 1508 lpGBTs in total. The maximum Slink number per FPGA, lpGBT number per FPGA and event buffer number per FPGA are 12, 108 and 54, respectively.
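
These inputs can be cross-checked with simple arithmetic. The check assumes that each double lpGBT counts as two lpGBT inputs, which is what makes the quoted total consistent, and that each event buffer holds at most two lpGBTs as described in Section 5.2.10.1:

$$
N_{\mathrm{lpGBT}} = 2 \times 110 + 1288 = 1508, \qquad
N_{\mathrm{Slink}} = 16 \times 12 = 192, \qquad
N_{\mathrm{EB}}^{\max} = \frac{108}{2} = 54 .
$$

Under the same assumption, the total input capacity of $16 \times 108 = 1728$ lpGBTs exceeds the 1508 lpGBTs that have to be connected.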



Figure 5.34. On the left total rate distribution on Slinks of Couenne solver result for the whole sector and on the right, result after algorithms applied are shown.

The swap algorithm for the lpGBT connections of Slinks was improved so that a more balanced Slink rate load is obtained. The improved swap algorithm does the following:

- Order the Slinks according to their total rates. Match the highest-rated with the lowest-rated ones and calculate the total rate difference between the two Slinks of each pair. Take half of the difference as a key.
- Search for an lpGBT pair by comparing the rate differences between the lpGBTs of the higher-rated Slink and those of the lower-rated Slink with the key. The pair should consist of a higher-rated lpGBT from the higher-rated Slink and a lower-rated lpGBT from the lower-rated Slink, so that both Slink rates get closer to the average Slink rate. Then switch the assignment of the two lpGBTs.
- When the search is over, store the objective value, start the algorithm again from the beginning and update the objective value. Repeat until the difference between two successive objective values is around 0.003.
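
The repetition until the objective value settles can be sketched as below. This is a self-contained illustration with a hypothetical data layout (Slink id mapped to a list of lpGBT rates); the objective is taken to be the average absolute distance of the Slink totals from their mean, and `max_passes` is a safety limit added for the example.

```python
from itertools import product
from statistics import mean


def objective(slinks):
    """Average absolute distance of the Slink totals from their mean."""
    totals = [sum(rates) for rates in slinks.values()]
    average = mean(totals)
    return sum(abs(total - average) for total in totals) / len(totals)


def swap_pass(slinks):
    """Swap one lpGBT pair between each high/low Slink pair."""
    order = sorted(slinks, key=lambda s: sum(slinks[s]))
    for i in range(len(order) // 2):
        low, high = order[i], order[-1 - i]
        gap = sum(slinks[high]) - sum(slinks[low])
        pairs = [(a, b) for a, b in product(slinks[high], slinks[low])
                 if 0 < a - b < gap]
        if pairs:
            a, b = min(pairs, key=lambda p: abs(p[0] - p[1] - gap / 2))
            slinks[high].remove(a)
            slinks[high].append(b)
            slinks[low].remove(b)
            slinks[low].append(a)


def improved_swap(slinks, tolerance=0.003, max_passes=10_000):
    """Repeat swap passes until the objective changes by at most `tolerance`."""
    previous = objective(slinks)
    for _ in range(max_passes):
        swap_pass(slinks)
        current = objective(slinks)
        if abs(previous - current) <= tolerance:
            return current
        previous = current
    return previous
```

Because the Slinks are re-sorted at the start of every pass, Slinks that were in the middle of the ranking in one pass can be paired with different partners in the next one.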

After this improvement and the related results, the redistribution algorithm was no longer needed; only the improved swap algorithm is used. The Slink rate result of the improved swap algorithm is given in Figure 5.35. This is the best result achieved for minimizing the standard deviation of the Slink rate loads.



Figure 5.35. Total rate distribution on number of Slinks after improved swap algorithm.

What about lpGBTs vs FPGAs? Before using swap algorithm for Slink connections on FPGAs, lpGBT rates and connections of FPGAs are shown in Figure 5.36 and event buffer numbers of FPGAs are calculated and illustrated in Table 5.5.



Figure 5.36. On the left, lpGBT rates and on the right, total lpGBT number distribution on the FPGAs are shown.

| FPGA | Event Buffer Number | FPGA | Event Buffer Number | FPGA | Event Buffer Number | FPGA | Event Buffer Number |
|------|---------------------|------|---------------------|------|---------------------|------|---------------------|
| 1    | 50                  | 5    | 48                  | 9    | 50                  | 13   | 52                  |
| 2    | 53                  | 6    | 52                  | 10   | 51                  | 14   | 50                  |
| 3    | 52                  | 7    | 52                  | 11   | 49                  | 15   | 51                  |
| 4    | 49                  | 8    | 50                  | 12   | 50                  | 16   | 48                  |

Table 5.5. Event buffer number of FPGAs before using swap algorithm.

After using the swap algorithm for Slink connections on FPGAs, the lpGBT connections to FPGAs are shown in Figure 5.37 and the event buffer numbers of the FPGAs are shown in Table 5.6. All rates and numbers are balanced and within the limits.



Figure 5.37. On the left, total lpGBT rates and on the right, total lpGBT number distribution on the FPGAs are shown.

| FPGA | Event Buffer Number | FPGA | Event Buffer Number | FPGA | Event Buffer Number | FPGA | Event Buffer Number |
|------|---------------------|------|---------------------|------|---------------------|------|---------------------|
| 1    | 50                  | 5    | 51                  | 9    | 50                  | 13   | 51                  |
| 2    | 51                  | 6    | 51                  | 10   | 50                  | 14   | 50                  |
| 3    | 51                  | 7    | 50                  | 11   | 50                  | 15   | 51                  |
| 4    | 50                  | 8    | 50                  | 12   | 50                  | 16   | 51                  |

Table 5.6. Event buffer number of FPGAs after using swap algorithm.

The HGCAL BE DAQ electronics have many tasks, and one of them is the optimization of the link connections so that the load is balanced. For the 16 FPGA case, very promising results are obtained. For this load balancing problem there is now a reliable method that gives well optimized results for both the lpGBT connections to Slinks and the Slink connections on FPGAs. The other task of the HGCAL BE DAQ electronics studied here is the IBERT of different FPGAs to ensure efficient data transmission.

# 6. THE IBERT OF DIFFERENT FPGAS

As mentioned before, lpGBTs will be connected to FPGAs in the BE DAQ and the data will be transmitted from the Slinks to cDAQ. lpGBTs will be connected to the Slinks via the Firefly modules [39]. On the Serenity board, a framework called the extensible modular (data) processor (EMP)<sup>2</sup> [40] will be used for the FPGAs, which allows rapid algorithm development independent of the underlying infrastructure or even the choice of FPGA. Optical links are designed for speeds of 16 Gb/s and 25 Gb/s. Tests are conducted with IBERT eye/bathtub diagrams for the optical channels and connectors, and with long-term data transfers, to ensure that the quality of the physical connection implementation is good enough. An eye diagram is used to get an idea of the signal quality. The bathtub diagram of an IBERT shows how the bit error rate approaches zero as a function of the sampling time within the bit period. In the ideal case of very efficient data transmission, the diagram consists of two parallel lines forming a bathtub shape, with fall times very close to zero.

## 6.1. Serenity ATCA Board as an HGCAL DAQ Prototype

The Serenity board is a data engine that uses optical connectors and allows high data output (up to 5 TBps). It is combined with EMP. As illustrated in Figure 6.1, Serenity is a carrier card with two daughter card sites, a COM Express [41] (small CPU) and an FPGA (Artix-7), based around an ATCA [27] form factor providing power, Ethernet, clock and cooling, with a micro-backplane adaptor for benchtop usage.

## 6.1.1. The Serenity Hardware

- Carrier card - the main board containing a CPU, power adaptors, Ethernet, two daughter card sites, a service FPGA, IPMC card, various LED indicators and other headers.
- Daughter card site - the two interposers (X0/X1) that contain the main FPGAs for data processing.
- CPU - this is a COM Express processor running CentOS 7. The SSD hard drive is located on the underside of the board. There are connections for USB3 (keyboard, mouse, external drive/Ethernet) and DisplayPort (switchable to VGA/HDMI). The COM Express can access multiple locations on the board as well as the FPGAs.
- ATCA socket - the card is designed to be mounted in a full ATCA crate; however, a micro-backplane can also be provided to allow benchtop running with an external power supply (+48V) and an RJ45 Ethernet connection.

<sup>2</sup>It is a framework that provides common infrastructural firmware as well as corresponding control and monitoring software to support arbitrary firmware.



Figure 6.1. A Serenity board with components.

# 6.1.2. The Serenity Software

- EMP is the main firmware development framework, which allows the main FPGA infrastructure to be decoupled from algorithmic development and deployment.
- IPBB is an associated tool which manages firmware repositories and the generation of projects for synthesis and building using the Vivado tool.
- IPMC has an associated slow control protocol and software for monitoring and controlling aspects of the board, e.g. temperature.
- SMASH is a framework that provides communication among the Serenity components and is attached to the slow control and diagnostics.

## 6.1.3. Logical Connectivity of the Serenity Board

The Serenity board has flexible connectivity. There are a number of ways to communicate with the carrier card, the FPGAs and the other hardware components. A logical diagram of all the connection possibilities is shown in Figure 6.2.



Figure 6.2. All connection possibilities of the Serenity board.

On the diagram, three main methods of connectivity are shown:

- Ethernet to the ComExpress CPU (through the ATCA backplane)
- JTAG for direct programming of the FPGA Daughter Cards
- PCIe to "talk" to the daughter cards through the CPU (empbutler etc...)

Also, there are three main methods to program the FPGA daughter cards:

- Method 1: ATCA  $\rightarrow$  ETHERNET-switch  $\rightarrow$  ComExpress  $\rightarrow$  PCIe  $\rightarrow$  artix-7  $\rightarrow$  JTAG  $\rightarrow$  daughter-card.
- Method 2: ComExpress  $\rightarrow$  PCIe  $\rightarrow$  artix-7  $\rightarrow$  JTAG  $\rightarrow$  daughter-card.
- Method 3: JTAG directly  $\rightarrow$  artix-7  $\rightarrow$  JTAG  $\rightarrow$  daughter-card.

In the IBERT studies, the chosen method of connectivity was method 1 and it has a structure as seen in Figure 6.3.



Figure 6.3. The chosen method of programming FPGA.

#### 6.2. IBERT of different FPGAs using EMP Framework

The FE and read-out systems contain ASICs. These chips digitize signals and send them to the BE part. Sometimes the part of the detector design in charge of data transmission has to be changed or fixed. With ASICs, even minor design changes take significant time and money, because the replacement chip has to be built from scratch. That is why the trigger electronics are no longer implemented with ASICs as before; FPGAs will be used instead. FPGAs can be reprogrammed in the field without having to be taken out. All programming of the FPGAs on the Serenity board will use the EMP framework. Hence, the IBERT of the FPGAs is conducted using the EMP framework to ensure efficient data transmission.

To perform IBERT, one first needs to configure the link firmware of the FPGA and then scan the links. The empbutler commands are used for manually configuring and testing the link firmware. As part of these tests, the data path firmware is configured to inject pattern data out of the transmitter (TX) and into the receiver (RX) modules, and then capture the data. To test the FPGA, these steps should be followed:

- Reset the TTC<sup>3</sup> block.
- Load pattern into the output channels' buffers, and configure those buffers to send that data into the TX.
- Reset and configure the link firmware.
- Align data received on each channel, so that the data valid signal rises from 0 to 1 on the same clock edge for all channels.
- Scan the links.
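
The alignment step can be pictured with a small software model. This is only an illustration of the idea; it does not reproduce the EMP firmware or the empbutler commands, and the function names and the captured data-valid sequences are hypothetical.

```python
def first_rising_edge(valid):
    """Index of the first clock cycle where data-valid goes from 0 to 1."""
    for cycle in range(1, len(valid)):
        if valid[cycle - 1] == 0 and valid[cycle] == 1:
            return cycle
    raise ValueError("no rising edge found")


def alignment_delays(valid_by_channel):
    """Extra delay (in clock cycles) per channel so that all edges coincide.

    `valid_by_channel` maps a channel number to its captured data-valid
    sequence. Every channel is delayed to match the latest one.
    """
    edges = {ch: first_rising_edge(v) for ch, v in valid_by_channel.items()}
    latest = max(edges.values())
    return {ch: latest - edge for ch, edge in edges.items()}
```

In this model a channel whose data-valid signal rises one cycle earlier than the latest channel gets one cycle of additional delay, so that after alignment all channels rise on the same clock edge.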

To do this, we first need to create a bit file that contains the algorithm to be loaded onto the FPGA; it also enables the FPGA connectors to take data under conditions close to real collisions. After the FPGA is programmed, this bit file works within the EMP framework installed on the ComExpress, together with auxiliary files that allow all the other hardware parts on Serenity to work and communicate in harmony. Vivado Design Suite [42] was used to create the necessary bit files and auxiliary files.

## 6.2.1. Creating Bit and Auxiliary Files using EMP Framework

<sup>3</sup>The TTC system is a unidirectional optical fibre based transmission system. It has two information channels. One channel carries L1A information and the other carries data information for the sending of various reset commands or calibration, control and test parameters.

Two basic algorithms are used for the IBERT of the FPGAs: one is the null algorithm, which moves the data in the input buffers to the output buffers, and the other is a custom algorithm (plus 5 algorithm), which adds 5 to the data in the input buffers. Using the Vivado design program, several bit files and auxiliary files were created that load these algorithms onto the FPGAs KU15P so2.v0, KU15P so1.v1 and VU7P so1.v1. These are:

For VU7P so1.v1:

- Bit file containing null algorithm and auxiliary files.
- Bit file with plus 5 algorithm and auxiliary files.
- Bit file and auxiliary files with null algorithm at 25 Gb/s.
- Bit file and auxiliary files with null algorithm at 16 Gb/s.
- Note: The usual frequency value of the bit file algorithms is 40 MHz.

For KU15P so2.v0 :

- Bit file containing null algorithm and auxiliary files.
- Bit file with plus 5 algorithm and auxiliary files.
- Bit file and auxiliary files with null algorithm at 240 MHz frequency.

For KU15P so1.v1:

- Bit file containing null algorithm and auxiliary files.
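
The behaviour of the two test algorithms can be modelled in a few lines of Python, which is also a convenient way to check captured output against injected patterns. The 64-bit word width and all function names are assumptions made for this sketch; the real algorithms are implemented in firmware.

```python
def null_algorithm(words):
    """Model of the null algorithm: output equals input."""
    return list(words)


def plus5_algorithm(words, width=64):
    """Model of the plus 5 algorithm: add 5 to every input word.

    The word width is an assumption; results wrap around at 2**width.
    """
    mask = (1 << width) - 1
    return [(word + 5) & mask for word in words]


def check_capture(injected, captured, algorithm):
    """Return the indices at which the captured data differ from the model."""
    expected = algorithm(injected)
    return [i for i, (e, c) in enumerate(zip(expected, captured)) if e != c]
```

An empty list returned by `check_capture` means that the captured data match what the chosen algorithm should have produced from the injected pattern.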

# 6.2.2. NearPMA IBERT of VU7P so1.v1

A NearPMA IBERT is done for each generated bit file to test the data transmission through the optical channels of the FPGAs. In this test the data move only within the FPGA channels, without passing through the optical connector. First, the NearPMA test was performed for the bit file containing the null algorithm for VU7P so1.v1. The tested channels are channels 12-23. After successful NearPMA tests, bathtub graphs were obtained showing the variation of the bit error rate with the sampling time. Bathtub plots for channels 12-23 of VU7P so1.v1 are shown in Figure 6.4. In the bathtub graphs, the bit error rate decreases to zero as the sampling time moves away from the edges of the bit period. The faster and smoother this happens, the wider the gap between the two curves on the graph, which indicates data transmission without problems. The percentage on the graph represents the width of this opening.
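
How such an opening percentage can be derived from a bathtub curve is sketched below. The threshold value, the function name and the definition of the opening as the widest contiguous run of error-free sampling points are assumptions made for this illustration; they are not taken from the IBERT tool.

```python
def open_percentage(ber_by_offset, threshold=1e-9):
    """Width of the bathtub opening as a percentage of the scanned range.

    `ber_by_offset` lists the measured bit error rate at equally spaced
    sampling points across one bit period. The opening is the widest
    contiguous run of points with a bit error rate below `threshold`.
    """
    longest = current = 0
    for ber in ber_by_offset:
        current = current + 1 if ber < threshold else 0
        longest = max(longest, current)
    return 100.0 * longest / len(ber_by_offset)


# Steep walls on both sides and a flat, error-free region in the middle.
example_scan = [0.5, 1e-3, 1e-12, 1e-12, 1e-12, 1e-12, 1e-3, 0.5]
example_opening = open_percentage(example_scan)
```

In this example four of the eight sampling points are below the threshold, so the opening is 50 % of the scanned range.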



Figure 6.4. The bathtub plots for VU7P so1.v1.
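The idea of an opening width expressed as a percentage can be sketched as follows. This is a minimal illustration, assuming the bathtub curve is available as a list of bit error rates sampled at equally spaced points; the function name, the threshold value and the sample data are hypothetical, and the percentage reported by the test software may be defined differently.

```python
def open_width_percent(ber_samples, threshold=1e-9):
    """Widest contiguous run of samples with BER at or below the threshold,
    expressed as a percentage of all samples."""
    if not ber_samples:
        raise ValueError("need at least one BER sample")
    best = run = 0
    for ber in ber_samples:
        # Extend the current error-free run, or reset it on a bad sample.
        run = run + 1 if ber <= threshold else 0
        best = max(best, run)
    return 100.0 * best / len(ber_samples)


if __name__ == "__main__":
    # Hypothetical bathtub curve: high BER at both edges, low in the middle.
    ber = [1e-2, 1e-4, 1e-7, 1e-12, 0.0, 0.0, 0.0, 1e-12, 1e-6, 1e-3]
    print(f"{open_width_percent(ber):.0f}% open")
```

A larger percentage means that the bit error rate stays low over a larger part of the curve, which corresponds to the wider gap described above.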

# 6.2.3. NearPMA IBERT of FPGA KU15P so2.v0

For this test, a bit file was created for KU15P so2.v0 with the null algorithm at 240 MHz. The purpose is to see whether testing with the EMP framework works at high frequency (high-speed data transmission). All channels of KU15P so2.v0 were tested with this bit file. The bathtub plots of the NearPMA IBERT of KU15P so2.v0 for channels 0-35 and 36-71, using the null algorithm bit file at 240 MHz, are shown in Figure 6.5 and Figure 6.6, respectively.



Figure 6.5. The bathtub plots for channels 0-35 of KU15P so2.v0.



Figure 6.6. The bathtub plots for channels 36-71 of KU15P so2.v0.

# 6.2.4. NearPMA IBERT of FPGA KU15P so1.v1

All tests for VU7P so1.v1 and KU15P so2.v0 were conducted at CERN. The same Serenity ATCA board, together with all required equipment and software infrastructure, was set up in the KareBlok lab at Boğaziçi University in Istanbul. There, the IBERT of FPGA KU15P so1.v1 with the null algorithm bit file and auxiliary files succeeded for channels 12-23, and the bathtub plots obtained are shown in Figure 6.7.



Figure 6.7. The bathtub plots for channels 12-23 of KU15P so1.v1.

# 6.2.5. NearPMA IBERT of FPGA VU7P so1.v1 at Different Speeds

Another kind of IBERT is the loopback test, in which the data are sent out to the optical connectors and the optical links transmit them back to the FPGA, in order to assess the performance of the optical connectors. Since the loopback IBERT of FPGA VU7P so1.v1 gave errors, two separate bit files were created for VU7P so1.v1 at the two output speeds of the firmware design (16 Gb/s and 25 Gb/s) to understand the compatibility of the tested FPGA with the EMP firmware, and the FPGA was tested separately with each of them. The created bit files are the bit files and auxiliary files with the null algorithm at 16 Gb/s and at 25 Gb/s.

6.2.5.1. NearPMA IBERT of FPGA VU7P so1.v1 using 16 Gb/s Null Algorithm. The NearPMA IBERT completed successfully, and bathtub plots were created for channels 0-11 and 12-23 of VU7P so1.v1 using the bit file containing the 16 Gb/s null algorithm. The plots for channels 0-11 are shown in Figure 6.8 and those for channels 12-23 in Figure 6.9.



Figure 6.8. The bathtub plots for channels 0-11 of VU7P so1.v1 at speed 16 Gb/s.



Figure 6.9. The bathtub plots for channels 12-23 of VU7P so1.v1 at speed 16 Gb/s.

6.2.5.2. NearPMA IBERT of FPGA VU7P so1.v1 using 25 Gb/s Null Algorithm. The NearPMA IBERT completed successfully. Bathtub plots were created for channels 0-11 using the bit file containing the 25 Gb/s null algorithm, as shown in Figure 6.10.



Figure 6.10. The bathtub plots for channels 0-11 of VU7P so1.v1 at speed 25 Gb/s.

After the successful IBERT results at different speeds, IBERT loopback tests are planned at the Kare Blok Lab in Istanbul. Many IBERTs were conducted for different FPGAs. Many successful NearPMA tests were carried out, and bathtub plots were obtained, to test data transmission through the optical channels of the FPGAs; these results indicate efficient transmission through those channels. An IBERT that gives errors for particular optical channels or connectors indicates that the EMP firmware needs to be enhanced for the tested FPGAs and the Serenity ATCA board.

# 7. CONCLUSION

Two main tasks for the HGCAL readout electronics of the Phase-2 upgrade were studied to handle the huge data volume expected from HL-LHC collisions. The first is the optimization of the link connections of the FPGAs on Serenity ATCA boards. The HGCAL readout electronics design relies on high-bandwidth data transmission via optical links and FPGA technology. As the DAQ FPGA firmware progresses, the DAQ architecture changes, and for each architecture load-balanced link connections must be obtained again to create an efficient data acquisition system. The link connections of the KU15P, VU7P and VU13P FPGAs were optimized according to the BE DAQ architecture baselines. For the load-balancing problem there is now a reliable method that gives one of the most optimized results; it was used for the VU13P, 16-FPGA case.

The other study is the IBERT of FPGAs, to see whether those FPGAs can work within the EMP framework with efficient data transmission. Optical connector tests for VU7P so1.v1 showed that the EMP firmware of the FPGAs needs to be improved. The IBERT on all channels of FPGA KU15P so2.v0 and on channels 12-23 of FPGA KU15P so1.v1 was completed successfully. The IBERT on channels 0-11 and 12-23 of FPGA VU7P so1.v1 at 16 Gb/s, and on channels 0-11 of FPGA VU7P so1.v1 at 25 Gb/s, was also completed successfully. These results indicate that data transmission using the tested channels of KU15P so2.v0, KU15P so1.v1 and VU7P so1.v1 can work efficiently with the EMP framework.

The optical channels and connectors that give IBERT errors indicate that the EMP firmware needs to be enhanced for the tested FPGAs and the Serenity ATCA board; this work is still in progress. The successful optimization method for link connections offers a reliable tool for link connection studies of the BE DAQ. A new optimization study is in progress for the new HGCAL FE simulation data.

# REFERENCES

- 1. CERN, "The Large Hadron Collider", https://home.cern/science/accelerators/large-hadron-collider, accessed on January 2022.
- 2. Vidal, X. and R. Manzano, "Taking a Closer Look at LHC", https://www.lhc-closer.es/taking\_a\_closer\_look\_at\_lhc/1.home, accessed on January 2022.
- 3. Wikipedia, "The Large Hadron Collider", https://en.wikipedia.org/wiki/Large\_Hadron\_Collider, accessed on January 2022.
- 4. CERN, "About Worldwide LHC Computing Grid.", https://wlcgpublic.web.cern.ch/about, accessed on February 2022.
- 5. CMS Collaboration, *High-Luminosity Large Hadron Collider (HL-LHC): Technical Design Report*, CERN Yellow Reports: Monographs, CERN, Geneva, 2020.
- 6. CERN, "The Physics Potential of HL-LHC", https://twiki.cern.ch/twiki/pub/LHCPhysics/HLHELHCWorkshop/report.pdf, accessed on January 2022.
- 7. CERN, "Report Reveals Full Reach of LHC Programme", https://cerncourier.com/a/report-reveals-full-reach-of-lhc-programme/, accessed on January 2022.
- 8. Wikipedia, "High Luminosity Large Hadron Collider", https://en.wikipedia.org/wiki/High\_Luminosity\_Large\_Hadron\_Collider, accessed on January 2022.
- 9. Schmidt, B., "The High-Luminosity Upgrade of the LHC: Physics and Technology Challenges for the Accelerator and the Experiments", *Journal of Physics: Conference Series*, Vol. 706, p. 022002, 2016.

- 10. Damerau, H., A. Funken, R. Garoby, S. Gilardoni, B. Goddard, K. Hanke, A. Lombardi, D. Manglunki, M. Meddahi, B. Mikulec, G. Rumolo, E. Shaposhnikova, M. Vretenar and J. Coupard, *LHC Injectors Upgrade, Technical Design Report*, CERN, Geneva, 2014.
- 11. CERN, "LHC Injectors Upgrade, Technical Design Report", http://cds.cern.ch/record/2153863, accessed on February 2022.
- 12. Contardo, D., M. Klute, J. Mans, L. Silvestris and J. Butler, "Technical Proposal for the Phase-II Upgrade of the CMS Detector", https://cds.cern.ch/record/2020886, accessed on March 2022.
- 13. Handl, J., "Jet Energy Calibration of the CMS Detector with $Z(\rightarrow e^{+}e^{-})$+Jet Events at $\sqrt{s}=13$ TeV.", https://publish.etp.kit.edu/record/21378, accessed on January 2022.
- 14. CMS Collaboration, "Tracking CMS Experiment", https://cms.cern/detector/identifying-tracks, accessed on January 2022.
- 15. CMS Collaboration, *CMS Physics: Technical Design Report Volume 1: Detector Performance and Software*, Technical Design Report. CMS, CERN, Geneva, 2006.
- 16. LHCb Collaboration, "Letter of Intent for the LHCb Upgrade", http://cds.cern.ch/record/1333091, accessed on February 2022.
- 17. CMS Collaboration, "The Phase-2 Upgrade of the CMS Endcap Calorimeter", https://cds.cern.ch/record/2293646, accessed on March 2022.
- 18. CMS Collaboration, "The Phase-2 Upgrade of the CMS Barrel Calorimeters", https://cds.cern.ch/record/2283187, accessed on March 2022.
- 19. Dias De Almeida, P., "Versatile Systems for Characterization of Large-area Silicon Pad Sensors for Highly-granular Calorimetry", *JINST*, Vol. 15, p. C06050. 8 p, 2020.

- 20. Brondolin, E., "Silicon Sensors for the HGCAL Upgrade Challenges, Sensor Design and Electrical Characterization. Silicon Sensors for the CMS HGCAL Upgrade: Challenges, Sensor Design & Electrical Characterization", http://cds.cern.ch/record/2797460, accessed on March 2022.
- 21. Pitters, F., E. Brondolin, D. Dannheim, S. Kulis, A. A. Maier, T. Quast and E. Sicking, "ARRAY: An Open Source, Modular and Probe-Card Based System with Integrated Switching Matrix for Characterisation of Large Area Silicon Pad Sensors", *Nuclear Instruments and Methods in Physics Research*, Vol. 940, pp. 168–173. 6 p, 2019.
- 22. Akchurin, N., P. Almeida, G. Altopp, M. Alyari, T. Bergauer, E. Brondolin, B. Burkle, W. Frey, Z. Gecse, U. Heintz, N. Hinton, V. Kuryatkov, R. Lipton, M. Mannelli, T. Mengke, P. Paulitsch, T. Peltola, F. Pitters, E. Sicking, E. Spencer, M. Tripathi, M. B. Pinto, J. Voelker, Z. Wang and R. Yohay, "Charge Collection and Electrical Characterization of Neutron Irradiated Silicon Pad Detectors for The CMS High Granularity Calorimeter", *Journal of Instrumentation*, Vol. 15, No. 09, pp. P09031–P09031, 2020.
- 23. Thienpont, D. and C. de la Taille, "Performance Study of HGCROCv2: The Front-end Electronics for the CMS High Granularity Calorimeter", https://cds.cern.ch/record/2712269, accessed on April 2022.
- 24. Caratelli, A., S. Bonacini, K. Kloukinas, A. Marchioro, P. Moreira, R. D. Oliveira and C. Paillard, "The GBT-SCA, A Radiation Tolerant ASIC for Detector Control and Monitoring Applications In HEP Experiments", *Journal of Instrumentation*, Vol. 10, No. 03, pp. C03034–C03034, 2015.
- 25. Soós, C., S. Détraz, L. Olanterä, C. Sigaud, J. Troska, F. Vasey and M. Zeiler, "Versatile Link PLUS Transceiver Development", *Journal of Instrumentation*, Vol. 12, pp. C03068–C03068, 2017.

- 26. Luyi, S., F. Jinyi and Y. Xiaohua, "Forward Error Correction", Fourth International Conference on Computational and Information Sciences, pp. 37–40, 2012.
- 27. PICMG, "AdvancedTCA Overview", https://www.picmg.org/openstandards/advancedtca/, accessed on May 2022.
- 28. Rose, A., D. Parker, G. Iles, O. Sahin, P.-A. Bausson, A. Tsirou, G. Fedi, P. Verdini, L. Ardila, M. Balzer, T. Schuh, T. Williams, A. Thea, K. Harder, S. Dugad, R. Shukla and I. Mirza, "Serenity: An ATCA Prototyping Platform for CMS Phase-2", p. 115, 2019.
- 29. Hegeman, J., R. Blažek, U. Behrens, J. Branson, P. Brummer, S. Cittolin, D. Silva-Gomes, G.-L. Darle, C. Deldicque, M. Dobson, N. Doualot, J. Fulcher, D. Gigi, M. Gladki, F. Glege, G. Gomez-Ceballos, M. Hansen, D. Jevtic, W. Li and P. Zejdl, "First Measurements with the CMS DAQ and Timing Hub Prototype-1", p. 111, 2020.
- 30. Varela, J., "CMS L1 Trigger Control System", http://cds.cern.ch/record/687458, accessed on April 2022.
- 31. Ahuja, S., "Concepts and Design of the CMS High Granularity Calorimeter Level 1 Trigger", http://cds.cern.ch/record/2780124, accessed on April 2022.
- 32. CMS Collaboration, "The Phase-2 Upgrade of the CMS Data Acquisition and High Level Trigger", https://cds.cern.ch/record/2759072, accessed on April 2022.
- 33. Mallios, S. and P. Vichoudis, "The HGCAL Data Acquisition System: Architecture and Implementation Overview", https://indico.cern.ch/event/1034827/contributions/4385481/attachments/2256521/3829149/20210602\_hgc\_daq\_overview\_smallios.pdf, accessed on March 2022.

- 34. NEOS, "Mixed Integer Nonlinear Programming", https://www.samtec.com/optics/optical-cable/mid-board/firefly#features, accessed on March 2022.
- 35. PYOMO, "Pyomo Overview", https://pyomo.readthedocs.io/en/stable/pyomo\_overview/index.html, accessed on March 2022.
- 36. Belotti, P., J. Lee, L. Liberti, F. Margot and A. Wächter, "Branching and Bounds Tightening Techniques for Non-convex MINLP", *Optimization Methods and Software*, Vol. 24, pp. 597–634, 2009.
- 37. COIN-OR, "Couenne a User's Manual", https://www.coin-or.org/Couenne/couenne-user-manual.pdf, accessed on March 2022.
- 38. Huang, L., X. Chen, W. Huo, J. Wang, F. Zhang, B. Bai and L. Shi, "Branch and Bound in Mixed Integer Linear Programming Problems: A Survey of Techniques and Trends", *CoRR*, Vol. abs/2111.06257, 2021.
- 39. SAMTEC, "Firefly Micro Flyover System", https://www.samtec.com/optics/optical-cable/mid-board/firefly#features, accessed on March 2022.
- 40. CERN EMP Collaboration, "EMP Framework Guide", https://serenity.web.cern.ch/serenity/emp-fwk/, accessed on March 2022.
- 41. PICMG, "COM Express Overview.", https://www.picmg.org/openstandards/com-express/, accessed on May 2022.
- 42. Xilinx, "Vivado Overview.", https://www.xilinx.com/products/design-tools/vivado.html, accessed on March 2022.
- 43. GEKKO, "GEKKO Optimization Suite", https://gekko.readthedocs.io/en/latest/index.html#, accessed on March 2022.

# APPENDIX A: ADDITIONAL RESULTS AND INFORMATION

## A.1. Early Studies

## A.1.1. A Realistic Toy Example with GEKKO

The studies started with a small, realistic toy example with 66 single and 6 double lpGBT inputs, 12 Slinks and 1 FPGA. The lpGBT rates used are shown in Figure A.1. The GEKKO optimization suite [43] is used to model the link connections and the system constraints, and the Apopt solver (an open source MINLP solver) is used to obtain the assignment of lpGBTs to Slinks. The following constraints are implemented to imitate the whole sector:

- 1 lpGBT (single or double) can only be connected to 1 Slink.
- A maximum of 12 lpGBTs can be connected to 1 Slink.
- 1 Slink can only be connected to 1 FPGA.
- A total of 12 Slinks can be on 1 FPGA.
- A maximum data rate per Slink (the Slinkrate threshold) is imposed so that the MINLP solver can converge on a solution.
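To make the constraints concrete, the sketch below assigns lpGBTs to the Slinks of a single FPGA under the same limits. It deliberately does not use GEKKO or the Apopt solver: it swaps in a simple heaviest-first greedy heuristic so that it runs with the Python standard library only. The function name, the parameter names and the example rates are hypothetical, and the heuristic is not guaranteed to reach the quality of a MINLP solution.

```python
def assign_greedy(rates, n_slinks=12, max_per_slink=12, slink_rate_threshold=None):
    """Heaviest-first greedy assignment of lpGBTs to Slinks (a heuristic, not MINLP).

    Returns (assignment, loads): assignment[i] is the Slink index of lpGBT i,
    loads[s] is the summed rate on Slink s.
    """
    loads = [0.0] * n_slinks
    counts = [0] * n_slinks
    assignment = [None] * len(rates)
    # Place the largest rates first; each lpGBT ends up on exactly one Slink.
    order = sorted(range(len(rates)), key=lambda i: rates[i], reverse=True)
    for i in order:
        allowed = [
            s for s in range(n_slinks)
            if counts[s] < max_per_slink
            and (slink_rate_threshold is None
                 or loads[s] + rates[i] <= slink_rate_threshold)
        ]
        if not allowed:
            raise ValueError(f"lpGBT {i} does not fit on any Slink")
        s = min(allowed, key=lambda k: loads[k])  # least loaded allowed Slink
        assignment[i] = s
        loads[s] += rates[i]
        counts[s] += 1
    return assignment, loads


if __name__ == "__main__":
    # Hypothetical rates: 66 single lpGBTs in three classes plus 6 double lpGBTs.
    rates = [12.0] * 22 + [98.0] * 22 + [220.0] * 22 + [300.0] * 6
    assignment, loads = assign_greedy(rates)
    print([round(load, 1) for load in loads])
```

Because the heuristic sorts the rates before placing them, the resulting Slink loads do not depend on the order in which the lpGBTs are supplied.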

## A.1.2. The First Results for Toy Example

The results for the toy example are given in Figure A.1, which shows the total data load on the Slinks after the lpGBTs are connected, the lpGBT rates, and the double and single lpGBT connections to Slinks according to rates.



Figure A.1. The total rate distribution over the Slinks is at top left and the lpGBT rates for the toy example are next to it; the double and single lpGBT connections are shown in the bottom part of the figure.

## A.1.3. Results for Toy Example with 2 FPGA

The toy example was then conducted with 2 FPGAs, with every input almost doubled. The results are shown in Figure A.2 and Figure A.3.



Figure A.2. On the left, total rate distribution on number of Slinks and on the right, double lpGBT connections for 2 FPGAs are shown.



Figure A.3. Single lpGBT connections for 2 FPGA case.

GEKKO is very sensitive to its parameters. When the input order changes, the solution also changes. The Slinkrate threshold has a big impact on whether the solver can find a solution.

## A.1.4. Input Order Effect on Solutions

The solver is fed with the lpGBT rates in the orders Low/Medium/High and Low/High/Medium, as given in Figure A.4. The order in which the lpGBTs are given affects the results, as shown in Figure A.5. Also, there was no pattern in how the lpGBTs are assigned to Slinks, as can be seen from Figure A.6.

| Low/Medium/High | Low/High/Medium |
|---|---|
| <pre>array([ 12.189, 12.1951, 12.1958, 12.1976, 12.1992, 12.2023,<br/>12.2093, 12.2107, 12.2131, 12.2151, 12.2246, 12.2254,<br/>12.3632, 12.3746, 12.3769, 12.3806, 12.3828, 12.3908,<br/>12.3921, 12.3953, 12.394, 12.4002, 97.8859, 97.9267,<br/>97.9481, 97.9649, 97.9662, 98.0463, 98.0734, 98.1765,<br/>98.408, 98.421, 98.4423, 98.4748, 98.5403, 98.583,<br/>98.69173, 98.7012, 98.6008, 99.103, 99.2019, 99.3158,<br/>99.4075, 99.4091, 211.6275, 211.6944, 213.5566, 214.1199,<br/>214.637, 214.7666, 219.1277, 219.258, 222.475, 222.559,<br/>223.7453, 224.1645, 225.882, 225.937, 228.3437, 228.557,<br/>228.602, 228.8748, 240.2424, 240.4057, 262.6971, 262.9341 ])</pre> | <pre>array([ 12.189, 12.1951, 12.1958, 12.1976, 12.1992, 12.2023,<br/>12.2093, 12.2107, 12.2131, 12.2151, 12.2246, 12.2254,<br/>12.3632, 12.3748, 12.3769, 12.3806, 12.3828, 12.3908,<br/>12.3921, 12.3953, 12.3994, 12.4002, 211.6275, 211.6944,<br/>213.5566, 214.1199, 214.637, 214.7686, 219.1277, 219.2958,<br/>222.475, 222.559, 223.7453, 224.1645, 225.887, 225.937,<br/>228.4377, 228.557, 228.602, 228.8748, 240.2424, 240.4457,<br/>262.6971, 262.9341, 97.8859, 97.9267, 97.9481, 97.9649,<br/>97.9682, 98.0463, 98.6734, 98.158, 98.69173, 98.7012,<br/>98.4423, 98.4748, 98.5403, 98.583, 98.69173, 98.7012,<br/>98.8008, 99.1103, 99.2019, 99.3158, 99.4075, 99.4091 ])</pre> |

Figure A.4. lpGBTs input orders according to rates. On the left, Low/Medium/High and on the right, Low/High/Medium.



Figure A.5. Total rate distribution on Slinks. On the left, Low/Medium/High and on the right, Low/High/Medium.

| lpGBT rates | Slink1 | Slink2 | Slink3 | Slink4 | Slink5 | Slink6 | Slink7 | Slink8 | Slink9 | Slink10 | Slink11 | Slink12 | lpGBT rates | Slink1 | Slink2 | Slink3 | Slink4 | Slink5 | Slink6 | Slink7 | Slink8 | Slink9 | Slink10 | Slink11 | Slink12 |
|-------------|--------|--------|--------|--------|--------|--------|--------|--------|--------|---------|---------|---------|-------------|--------|--------|--------|--------|--------|--------|--------|--------|--------|---------|---------|---------|
| Single Low  | 1      | 4      | 1      | 0      | 1      | 2      | 2      | 1      | 3      | 2       | 1       | 4       | Single Low  | 1      | 4      | 1      | 0      | 1      | 2      | 2      | 1      | 3      | 2       | 1       | 4       |
| Single Mid  | 1      | 0      | 5      | 4      | 5      | 3      | 1      | 0      | 1      | 1       | 1       | 0       | Single Mid  | 1      | 0      | 5      | 4      | 5      | 3      | 1      | 0      | 1      | 1       | 1       | 0       |
| Single High | 0      | 3      | 1      | 2      | 1      | 2      | 3      | 2      | 1      | 3       | 1       | 3       | Single High | 0      | 3      | 1      | 2      | 1      | 2      | 3      | 2      | 1      | 3       | 1       | 3       |
| Double Low  | 0      | 0      | 0      | 0      | 0      | 0      | 0      | 2      | 0      | 0       | 0       | 0       | Double Low  | 0      | 0      | 0      | 0      | 0      | 0      | 0      | 2      | 0      | 0       | 0       | 0       |
| Double Mid  | 2      | 0      | 0      | 0      | 0      | 0      | 0      | 0      | 0      | 0       | 0       | 0       | Double Mid  | 2      | 0      | 0      | 0      | 0      | 0      | 0      | 0      | 0      | 0       | 0       | 0       |
| Double High | 0      | 0      | 0      | 0      | 0      | 0      | 0      | 0      | 1      | 0       | 1       | 0       | Double High | 0      | 0      | 0      | 0      | 0      | 0      | 0      | 0      | 1      | 0       | 1       | 0       |

Figure A.6. The assigning pattern of lpGBTs. On the left, Low/Medium/High and on the right, Low/High/Medium.

It is clear from these results that different results are obtained when the order of the lpGBTs is changed. Also, for more than 1 FPGA it is very difficult for the Apopt MINLP solver to reach a solution, and for more than 2 FPGAs finding any solution seems to be impossible.

## A.2. Pyomo Optimization Suite

Pyomo is first used with the Apopt solver, and the results are shown in Figure A.7. The standard deviation is shown as Std and its value is around 25. Figure A.7 also shows how many lpGBTs, grouped by event size, are connected to each Slink.

- With the Apopt solver, integer solutions are found in Pyomo, but the solutions are not as good as the solutions in GEKKO.
- The input order does not affect the results.
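The average Slink rate and the Std value quoted in Figure A.7 can be checked with a few lines of Python. The twelve Slink rates below are read from Figure A.7 and rounded to four decimals; the variable names are illustrative. Treating Std as the sample standard deviation (n - 1 in the denominator) is an assumption, made because it reproduces the printed value of about 25.36.

```python
from statistics import mean, stdev

# Slink rates as read from Figure A.7, rounded to four decimals.
slink_rates = [
    761.4948, 793.9154, 779.3435, 731.2480, 765.7824, 763.7504,
    789.7142, 717.9223, 716.3788, 757.6106, 770.5332, 763.0798,
]

average_rate = mean(slink_rates)
spread = stdev(slink_rates)  # sample standard deviation (n - 1 denominator)

print(f"Average Slink rate: {average_rate:.2f}")
print(f"Std: {spread:.2f}")
```

A smaller Std means that the total load is spread more evenly over the 12 Slinks, which is the quantity the load balancing aims to reduce.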

|                                                                      | Fc                         | or :                                     | 1 F                        | PG                              | iA:                                          |                      |                            |                              |                  |                                  |                                                          |                                                                                                 |                                                                                             |                                                                                    |                                 |                            |                                  |                                   |                                          |                              |                                       |                                 |                  |                                  |                                                                    |                                                                                                   |                |
|----------------------------------------------------------------------|----------------------------|------------------------------------------|----------------------------|---------------------------------|----------------------------------------------|----------------------|----------------------------|------------------------------|------------------|----------------------------------|----------------------------------------------------------|-------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------|---------------------------------|----------------------------|----------------------------------|-----------------------------------|------------------------------------------|------------------------------|---------------------------------------|---------------------------------|------------------|----------------------------------|--------------------------------------------------------------------|---------------------------------------------------------------------------------------------------|----------------|
| Avera<br>Std:<br>objec<br>K                                          | ge S<br>25.3<br>tive<br>ey | <b>DW</b><br>link<br>3631<br>: S<br>: AC | to<br>Rat<br>1207<br>Size= | Hi<br>te:<br>7180<br>=1,        | <b>gh</b><br>759.2<br>3698<br>Index<br>Value | 2311<br>k=No         | 1158<br>one,               | 3333<br>Act                  | 33<br>ive=       | =T <b>r</b> u                    | 7<br>7<br>7<br>7<br>7<br>7<br>7<br>7<br>7<br>7<br>7<br>7 | 261.494<br>293.915<br>279.343<br>231.247<br>265.782<br>263.750<br>289.714<br>217.922<br>216.378 | 7999999999<br>400000035<br>540000001<br>999999999<br>3699999999<br>3699999999<br>3699999999 | Avera<br>Std:<br>objec                                                             | age S<br>25.<br>ctive<br>Key    | Hi<br>Slin<br>363:<br>: A  | <b>ch</b><br>Ra<br>1120<br>Size: | to<br>te:<br>7180<br>=1,          | <b>LO</b><br>759<br>3710<br>Inde<br>Valu | <b>W</b><br>231<br>6<br>ex=N | 1158<br>one,                          | 33333<br>, Ac <sup>.</sup>      | 833<br>tive      | =Tru                             | 761<br>793<br>779<br>731<br>765<br>763<br>789<br>717<br>716<br>757 | .4948<br>.915406<br>.343546<br>.247999<br>.78237<br>.7504<br>.714159<br>.9223<br>.3788<br>.610600 | 00000          |
|                                                                      | one                        | •                                        | TTUG                       |                                 | 23.30                                        | 121.                 | 1207                       | 1003                         | 090              |                                  | 7<br>7<br>7                                              | 57.610<br>70.533<br>63.079                                                                      | 600000036<br>1999999964<br>8199999963                                                       | ſ                                                                                  | one                             |                            | 110                              |                                   | 23.                                      | 5031                         | 1207                                  | 100.                            | 5710             |                                  | 770<br>763                                                         | .079819                                                                                           | 99999<br>99999 |
|                                                                      | Slink1                     | •<br>Slink2                              | Slink3                     | Slink4                          | Slink5 S                                     | link6                | Slink7                     | Slink8                       | 090<br>Slink9    | Slink10                          | /<br>7<br>Slink11                                        | 57.610<br>70.533<br>63.079<br>Slink12                                                           | 600000036<br>1999999964<br>8199999963                                                       | inght Rates                                                                        | Slink1                          | •<br>Slink2                | Slink3                           | slink4 :                          | ۷J.                                      | Slink6                       | Slink7                                | Slink8                          | Slink9           | Slink10                          | 770<br>763<br>Slink11                                              | .079819<br>.079819<br>Slink12                                                                     | 99999          |
| lpgbt Rates                                                          | Slink1                     | •<br>Slink2                              | Slink3                     | Slink4                          | Slink5 S                                     | link6                | Slink7                     | Slink8                       | Slink9           | Slink10                          | /<br>7<br>Slink11                                        | 57.610<br>70.533<br>63.079<br>Slink12                                                           | 600000036<br>19999999964<br>8199999963                                                      | ipgbt Rates<br>Single Low                                                          | Slink1                          | •<br>Slink2                | Slink3                           | Slink4 :                          | Slink5                                   | Slink6                       | Slink7                                | Slink8                          | Slink9           | Slink10                          | 770<br>763<br>Slink11<br>2                                         | . 0 7 9 8 1 9<br>. 0 7 9 8 1 9<br>. 0 7 9 8 1 9<br>. 0 7 9 8 1 9<br>. 0 7 9 8 1 9                 | 999999         |

Figure A.7. Results of Apopt Solver for Pyomo.

## A.3. Couenne MINLP Solver

Couenne is an open-source solver for MINLP problems. Its source code is available for download in the Computational Infrastructure for Operations Research (COIN-OR) repository and on GitHub.<sup>4</sup>

Couenne uses the Spatial Branch and Bound (sBB) algorithm, a variant of the divide-and-conquer Branch and Bound (BB) technique, to solve global optimization problems. In the "branch" step, a large problem is divided into a few smaller ones (sub-problems). In the "bound" step, which is the conquering part, the quality of the solution obtainable from each sub-problem is estimated. The goal of a BB algorithm is to find a value x that maximizes or minimizes an objective function f(x), where x ranges over the set of candidate solutions in the feasible region of the problem. The candidate solutions are organized as a solution tree in which each node contains a candidate solution. The root node contains the first candidate solution, and branching divides every node into further nodes, which builds up the tree. More information about BB, MINLP and Couenne can be found in references [36–38].
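The branch and bound idea can be illustrated with a toy problem. The sketch below is not Couenne's sBB and not the model used in this thesis; it assumes a simplified objective (minimize the largest bin load) and hypothetical integer rates, and only shows how branching, bounding and pruning interact on a solution tree.

```python
def branch_and_bound(rates, n_bins):
    """Assign every rate to one bin so that the largest bin load is minimal.

    Toy problem with hypothetical integer rates, for illustration only.
    """
    rates = sorted(rates, reverse=True)  # large items first prunes earlier
    best = {"value": float("inf"), "pairs": None}
    loads = [0] * n_bins
    pairs = []  # (rate, bin) decisions along the current tree path

    def visit(i):
        # Bound: loads only grow, so the current maximum is a lower bound
        # on every leaf below this node.
        bound = max(loads)
        if bound >= best["value"]:
            return  # prune: this subtree cannot improve the incumbent
        if i == len(rates):
            best["value"] = bound  # leaf: new incumbent solution
            best["pairs"] = list(pairs)
            return
        tried = set()
        for b in range(n_bins):
            if loads[b] in tried:
                continue  # bins with equal load give equivalent subtrees
            tried.add(loads[b])
            # Branch: one child node per distinct bin choice.
            loads[b] += rates[i]
            pairs.append((rates[i], b))
            visit(i + 1)
            pairs.pop()
            loads[b] -= rates[i]

    visit(0)
    return best["value"], best["pairs"]


value, pairs = branch_and_bound([8, 7, 6, 5, 4], n_bins=2)
print(value, pairs)
```

Each call to `visit` corresponds to one node of the solution tree; the root is the empty assignment and the leaves are complete assignments.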

<sup>4</sup> https://github.com/coin-or/Couenne

# A.3.1. How does Couenne Work?

- Reformulation: In this step, the objective function and the auxiliary variables created in connection with it are reformulated so that the MINLP can be solved with reliable nonlinear programming (NLP) solvers.
- Generic Relaxation Strategies (Linearization): For most MINLP problems, the objective function and constraints are factorable rather than linear. They can therefore be expressed as sums of products of unary functions drawn from a finite set (sin, cos, exp, log, abs), which allows them to be linearized.
- After the reformulation and linearization steps, Couenne adds the following components:
  - (i) Bound tightening techniques: Bound tightening is a class of algorithms that reduce the bound intervals of the variables, which improves the lower bound on the optimal objective function value of a sub-problem. These algorithms are essential for obtaining an optimal solution in a reasonable time. Two bound reduction techniques are commonly used: feasibility-based bound tightening (FBBT) and optimality-based bound tightening (OBBT). FBBT algorithms tighten the bounds on all variables to obtain a tighter lower bound, and they are fast enough to be applied to large problems. OBBT uses the existing linearization to improve the bounding box and provides extra information to improve the linearization.
  - (ii) Heuristics: Some problems cannot be solved easily. Because they are very large, they can generate a large search tree, yet they still have to be solved in a reasonable time. In such situations, obtaining a good solution is preferable to waiting for an optimal one, and heuristic search techniques provide a feasible solution without any guarantee of optimality.
  - (iii) Branching techniques: These aim at minimizing the size of the sBB tree to increase the performance of sBB algorithms.
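To make the FBBT step more concrete, the following is a minimal sketch of feasibility-based bound tightening on a single linear constraint. It is not Couenne's implementation; the function name, the constraint and the bounds are hypothetical and chosen only to show how a constraint shrinks variable intervals.

```python
def fbbt_linear_le(coeffs, bounds, rhs):
    """Tighten variable bounds implied by sum(coeffs[i] * x[i]) <= rhs.

    bounds is a list of (lower, upper) pairs; a new list is returned.
    """
    # Smallest value each term a*x can take inside the current bounds.
    min_terms = [a * (lo if a > 0 else hi) for a, (lo, hi) in zip(coeffs, bounds)]
    total_min = sum(min_terms)

    tightened = []
    for a, (lo, hi), m in zip(coeffs, bounds, min_terms):
        if a != 0:
            # Room left for this term when all other terms sit at their minimum.
            slack = rhs - (total_min - m)
            limit = slack / a
            if a > 0:
                hi = min(hi, limit)  # a*x <= slack  =>  x <= slack/a
            else:
                lo = max(lo, limit)  # dividing by a negative flips the sign
        tightened.append((lo, hi))
    return tightened


# x + y <= 10 with x in [0, 20] and y in [3, 20]
print(fbbt_linear_le([1, 1], [(0, 20), (3, 20)], 10))
```

In this example the upper bound of x drops from 20 to 7 and that of y from 20 to 10, without solving any optimization problem, which is why this kind of tightening is cheap enough for large models.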

Workflow of the Couenne solver:

$\mbox{Reformulation} \rightarrow \mbox{Linearization} \rightarrow \mbox{Bound Tightening} \rightarrow \mbox{Heuristics} \rightarrow \mbox{Branching} \rightarrow \mbox{Integer Solution}$

## A.4. First Results with Couenne Solver

# A.4.1. For 5 FPGA

The inputs are 60 Slinks, 230 single and 20 double lpGBTs. The results are shown below. Figure A.8 shows the value of the standard deviation, Figure A.9 shows the single lpGBT connections to Slinks and the Slink connections on FPGAs, and Figure A.10 shows the Slink rates and total FPGA rates after assignment.
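As a rough illustration of the quantity reported in these results, the sketch below computes the standard deviation of per-Slink total rates for a given assignment. It assumes that the balance measure is the population standard deviation of the Slink totals; the function names and the numbers are hypothetical and are not the thesis inputs.

```python
from statistics import pstdev


def slink_totals(assignment, rates, n_slinks):
    """Sum the lpGBT rates landing on each Slink.

    assignment[i] is the Slink index chosen for lpGBT i (hypothetical layout).
    """
    totals = [0.0] * n_slinks
    for slink, rate in zip(assignment, rates):
        totals[slink] += rate
    return totals


def balance(assignment, rates, n_slinks):
    """Population standard deviation of the Slink totals (0 means perfectly even)."""
    return pstdev(slink_totals(assignment, rates, n_slinks))


rates = [3, 1, 2, 2]  # made-up lpGBT rates
print(balance([0, 1, 0, 1], rates, 2))  # uneven split: totals 5 and 3
print(balance([0, 0, 1, 1], rates, 2))  # even split: totals 4 and 4
```

A lower value means the data rates are spread more evenly across the Slinks.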



Figure A.8. Standard deviation and double lpGBT assignment results.



Figure A.9. Assignment of single lpGBTs to Slinks is on the left and assignment of Slinks to FPGAs is on the right of the figure.



Figure A.10. Slink Rates vs Number of Slinks and total FPGA rate plot.

# A.4.2. For 6 FPGA

The inputs are 72 Slinks, 276 single and 24 double lpGBTs. The results are shown below. Figure A.11 shows the value of the standard deviation, Figure A.12 shows the Slink rates and total FPGA rates after assignment, and Figure A.13 shows the single lpGBT connections to Slinks and the Slink connections on FPGAs.



Figure A.11. Standard deviation and double lpGBT assignment results for the 6 FPGA case.



Figure A.12. Total rate distribution on Slinks and FPGA rate plot after Slink connection.



Figure A.13. Assignment of single lpGBTs to Slinks is on the left and assignment of Slinks to FPGAs is on the right of the figure.

## A.5. Some Results of Multi-Run Solving

# A.5.1. 7-FPGA 2-MultiRuns

The Multi-Run solving method was applied first to the 7 FPGA case. The inputs are 7 FPGAs, 84 Slinks, 462 single and 42 double lpGBTs. The first run contains the double and high lpGBT rates, and the second run contains the medium and low lpGBT rates. The results are shown in Figure A.14: the total Slink and FPGA rates after all lpGBTs are assigned, the objective values, and how the lpGBTs are assigned to Slinks according to their rate class (high, medium and low).

# A.5.2. 7-FPGA 3-MultiRuns

The first run contains the double and high lpGBT rates, the second run contains the medium lpGBT rates, and the third run contains the low lpGBT rates. The results are shown in Figure A.15.
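The staging idea behind the multi-run method can be sketched as follows. This is only an illustration: a simple greedy placement stands in for each Couenne solve, and the function names, rate values and Slink count are hypothetical. What it shows is that each run starts from the loads left by the previous runs, which stay fixed.

```python
def place_stage(loads, stage_rates):
    """Greedy stand-in for one solver run.

    Each rate of the stage goes to the currently least loaded Slink.
    Loads from earlier stages are kept and only added to.
    """
    loads = list(loads)  # copy so the earlier result is not modified
    placed = []
    for rate in sorted(stage_rates, reverse=True):
        target = min(range(len(loads)), key=loads.__getitem__)
        loads[target] += rate
        placed.append((rate, target))
    return loads, placed


def multi_run(stages, n_slinks):
    """Solve the stages one after another, carrying the Slink loads forward."""
    loads = [0] * n_slinks
    history = []
    for stage_rates in stages:
        loads, placed = place_stage(loads, stage_rates)
        history.append(placed)
    return loads, history


# Three runs: high rates first, then medium, then low (made-up numbers).
loads, history = multi_run([[9, 8], [5, 4], [1, 1]], n_slinks=2)
print(loads)
print(history)
```

Splitting the problem this way keeps each run small; the price is that decisions taken in an earlier run cannot be revised by a later one.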



Figure A.14. For the 7 FPGA 2 Multi-Run case, the total rate distribution over the number of Slinks is at the top left and the value of the new objective function is next to it; all lpGBT assignments and the total FPGA rates are shown in the bottom part of the figure.



Figure A.15. For the 7 FPGA 3-MultiRuns case, the total rate distribution over the number of Slinks is on the left and the objective function values are on the right.

## A.6. HardStart and WarmStart

# A.6.1. 7 FPGA HardStart Results

The first run is solved for the double and high lpGBT rates with the naive distribution. The results are shown in Figure A.16 and Figure A.17.



Figure A.16. Total rate distribution over the number of Slinks is on the left and the naive distribution result for 7 FPGAs is on the right of the figure.

# A.6.2. 7 FPGA 2 Multi-Run WarmStart Results

The naive distribution solution is fixed, and a run containing the medium and low lpGBT rates is added to it and solved with the Couenne solver. The results are shown in Figure A.17.



Figure A.17. Total rate distributions on Slinks and FPGAs are on the left; the value of the objective function and the lpGBT assignment to Slinks are at the top right and bottom right.

# A.6.3. 7 FPGA 3 Multi-Run WarmStart Results

The naive distribution solution is fixed. Two runs, containing the medium and the low lpGBT rates respectively, are solved with the Couenne solver and added to the first solution. The results are shown in Figure A.18.



Figure A.18. For the 7 FPGA 3 Multi-Run WarmStart case, the total rate distribution over the number of Slinks is at the top left and the value of the objective function is next to it; the lpGBT assignment to Slinks and the total FPGA rates are shown in the bottom part of the figure.