..
   SPDX-FileCopyrightText: © 2020 Open Networking Foundation
   SPDX-License-Identifier: Apache-2.0

Overview
========

There are many ways to deploy Aether, depending on the requirements of the edge
site. The Reliability, Availability, and Serviceability (RAS) of each set of
equipment will differ depending on the characteristics of each edge.

This document provides several hardware deployment options and explains the
differences between them.

For assistance in setting up a production deployment of Aether, please contact
Timon Sloane at timon@opennetworking.org.

Aether Central
--------------

A production deployment of Aether typically has a single set of centralized
control components, informally referred to as ``Aether Central``, together with
one or more edge sites. Aether Central is often deployed in the cloud.

|MULTIEDGE|

Edge and Central can be further expanded to show their internal architecture:

|ARCHITECTURE|

The architecture shown is only one potential deployment of Aether Central,
leveraging the following subsystems:

* Rancher, to handle deployment of Kubernetes as well as to manage the
  lifecycle of Kubernetes-based components. Rancher RKE or RKE2 are potential
  Kubernetes deployments.

* EFK Stack and Prometheus stack for Logging and Monitoring.

* Cloud storage, such as cloud volumes provided by Google, as well as Velero, a
  backup mechanism to handle backup and restore of the cloud volumes.

* Ubuntu and Docker as the underlying operating system and containerization
  solution.

* Keycloak and LDAP, as an authentication mechanism for the Aether GUI.

* Netbox, to inventory equipment and describe relationships between equipment.

* Helm and Docker repositories for holding Helm charts and images.

* Gerrit and Jenkins, to support the CI/CD pipeline.

An example CI/CD pipeline is depicted below:

|CICD|

The example CI/CD pipeline uses Jenkins as an automation tool to perform the
necessary acceptance testing of incoming patches, as well as to carry out
post-merge operations. The desired deployment state is described in GitOps
repos contained in Gerrit, and Fleet and Terraform are used to automate the
deployment.

Deployment Options
------------------

Development Environments
""""""""""""""""""""""""

For users looking for a development or fully software-simulated environment,
there is ``Aether-in-a-Box (AiaB)``. Instructions for running it can be found
in :doc:`Setting Up Aether-in-a-Box `. AiaB is only suitable for testing and
developing software, and can't connect to physical hardware, but is a good
choice for learning about the different software components within Aether.

Production Environments
"""""""""""""""""""""""

Deploying Aether on hardware is required for both production deployments and
hardware testing. Before deploying Aether, a detailed plan covering the network
topology, hardware, and all cabling needs to be created.

For redundancy of workloads running in Kubernetes, at least 3 compute nodes
must be available. One or two compute nodes can be used instead, but the
software must then be configured without High Availability (HA) enabled.

The topologies below are *simplified physical topologies* to show the equipment
needed and the minimal connectivity between devices. Within these topologies,
multiple VLANs, routing, and other network-level configuration are required to
make a functional Aether edge.

There are also possible RAS improvements that can be made at the topology
level. For example, fabric switch connections can be made with two cables and
configured to tolerate the failure or replacement of one cable or NIC port,
which is recommended especially for inter-switch links.

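
The three-node minimum for Kubernetes redundancy can be illustrated with a
little arithmetic. The sketch below assumes that HA depends on a
majority-quorum datastore (as etcd does in many Kubernetes deployments); this
document does not state the reason for the requirement, so treat that as an
assumption. The function names are hypothetical and exist only for
illustration.

```python
def quorum_size(members: int) -> int:
    """Smallest majority of a cluster with `members` voting nodes."""
    if members < 1:
        raise ValueError("a cluster needs at least one member")
    return members // 2 + 1


def tolerated_failures(members: int) -> int:
    """Number of nodes that can fail while a majority still remains."""
    return members - quorum_size(members)


# Compare the node counts discussed above (assumption: majority quorum).
for nodes in (1, 2, 3):
    print(
        f"{nodes} node(s): quorum={quorum_size(nodes)}, "
        f"tolerated failures={tolerated_failures(nodes)}"
    )
```

Under this assumption, one or two nodes tolerate no failures, while three
nodes tolerate the loss of one, which is consistent with the guidance to
disable HA on smaller clusters.
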
SD-Fabric Network Topology
--------------------------

The P4-based SD-Fabric UPF is an advanced feature and has graduated to
production use in the Aether 2.0 release. It requires one or more P4-capable
switches using the Tofino chipset. This topology can run both the P4-based UPF
on switching hardware and the software-based BESS UPF on compute servers.

Single or multi-switch topologies can be used as described in the
:ref:`SD-Fabric Specifications for Topology `.

The following topologies are actively being tested as a part of Aether:

If only a single P4 switch is used, the **Single Switch** topology can be used,
but it provides no network redundancy:

.. image:: images/edge_single.svg
   :alt: Single Switch Topology

If another switch is added, the **Paired Leaves** (aka "Single Leaf Pair")
topology can be used, which can tolerate the loss of a leaf switch and still
retain connections for all dual-homed devices. Single-homed devices on the
failed leaf would lose their connections (the single-homed server is shown for
reference, and is not required). If HA is needed for single-homed devices, one
option would be to deploy multiple of those devices in a way that provides
that redundancy - for example, multiple eNBs where some are connected to each
leaf and have overlapping radio coverage:

.. image:: images/edge_paired_leaves.svg
   :alt: Paired Leaves Topology

All SD-Fabric P4-based topologies can support running both the BESS UPF and P4
UPF on the same hardware at the same time within an edge deployment.

Software-only UPF Network Topology
----------------------------------

If a P4-based switch is not available, the software-based BESS UPF can be used
on compute hardware. The :doc:`Software-only BESS UPF ` is supported for
production in Aether 1.5 and later releases.

.. image:: images/edge_mgmt_only.svg
   :alt: BESS network topology

`BESS `_ runs on an x86 compute server, and is deployed using Kubernetes.
In production it requires an SR-IOV capable network card configured with
virtual function (VF) interfaces in the base OS, along with specific K8s CNIs
that make the VFs usable within the container. Additionally, the Management
Router and Switch must be configured with multiple VLANs and subnets, with the
routing required for the BESS UPF.

Connectivity Alternatives
-------------------------

The diagrams above show logical topologies, but depending on the site strategy,
alternative topologies may be desirable. The diagrams below use the "Single
Switch" topology, but could be applied to any of the Aether equipment
topologies given above.

One example would be to place the rackmount equipment in a datacenter
environment away from the radio hardware and use existing networking equipment
to route from the radios back to the Aether edge hardware. Also shown in this
example is the use of a PoE switch to power the radios.

.. image:: images/edge_routed_radios.svg
   :alt: Edge with routed radios

Another example would be to use the management switch as the main network
connection point, and possibly use it to power the radios over PoE as well:

.. image:: images/edge_mgmtswitch_primary.svg
   :alt: Edge with mgmtswitch as primary connection point

Note that these topologies may require additional configuration in the
switching and routing equipment, including the equipment outside of the Aether
edge.

Hardware Descriptions
---------------------

Fabric Switch
"""""""""""""

See the :ref:`SD-Fabric Switch Hardware Selection Documentation `.

Compute Server
""""""""""""""

The Compute Servers run Kubernetes, Aether connectivity apps, and edge
applications.

Minimum hardware specifications:

* AMD64 (aka x86-64) architecture

* 8 CPU Cores (minimum), 16+ recommended

* 32GB of RAM (minimum), 128GB+ recommended

* 250GB of storage (SSD preferred), 1TB+ recommended

* 2x 40GbE or 100GbE Ethernet network card to P4 switches, with DPDK support

* 1x 1GbE management network port, with PXE boot support. 2x required for BESS
  UPF.

Optional but highly recommended:

* Lights out management support, with either a shared or separate NIC and
  support for HTML5 console access.

Management Router
"""""""""""""""""

One Management Router is required. This is a standard server which must have
at least two 1GbE network ports, and which performs network tasks such as
running a VPN connection to Aether Central, performing NAT for the management
network, and running a variety of network services to bootstrap and support
the edge.

Minimum hardware specifications:

* AMD64 (aka x86-64) architecture

* 4 CPU cores, or more

* 8GB of RAM, or more

* 120GB of storage (SSD preferred), or more

* 2x 1GbE Network interfaces (one for WAN, one to the management switch) with
  PXE boot support.

Optional:

* 10GbE or 40GbE network card with DPDK support to connect to fabric switch

* Lights out management support, with either a shared or separate NIC and
  support for HTML5 console access.

Management Switch
"""""""""""""""""

A managed L2/L3 management switch is required to provide connectivity within
the cluster for bootstrapping equipment. It is configured with multiple VLANs
to separate the management plane, fabric, and the out-of-band and lights out
management connections on the equipment.

Minimum requirements:

* 8x 1GbE Copper Ethernet ports (adjust to provide a sufficient number for
  every copper 1GbE port in the system)

* 2x 10GbE SFP+ or 40GbE QSFP interfaces (only required if management router
  does not have a network card with these ports)

* Managed via SSH or web interface

* LLDP protocol support, for debugging cabling issues

* Capable of supporting VLANs on each port, with both tagged and untagged
  traffic sharing a port.

Optional:

* PoE+ support, which can power eNB and monitoring hardware, if using the
  Management Switch to host these devices.

eNB Radio
"""""""""

The LTE eNB used in most deployments is the `Sercomm P27-SCE4255W Indoor CBRS Small Cell `_.
While this unit ships with a separate power brick, it also supports PoE+ power
on the WAN port, which provides deployment location flexibility. Either a PoE+
capable switch or a PoE+ power injector should be purchased.

If connecting directly to the fabric switch through a QSFP to 4x SFP+ split
cable, a 10GbE SFP+ to 1GbE Copper media converter should be purchased. The `FS
UMC-1S1T `_ has been used for this purpose successfully.

Alternatively, the Fabric's 10GbE SFP+ could be connected to another switch
(possibly the Management Switch), which would adapt the speed difference and
provide PoE+ power and power control for remote manageability.

Testing Hardware
----------------

The following hardware is used to test the network and determine uptime of
edges. It is currently required in order to validate that an edge site is
functioning properly.

Monitoring Raspberry Pi and CBRS dongle
"""""""""""""""""""""""""""""""""""""""

One Raspberry Pi paired with an LTE dongle that supports the CBRS band is
required to monitor the connectivity service at the edge.

The Raspberry Pi model used in Aether is a `Raspberry Pi 4 Model B/2GB `_,
which is configured with:

* Raspberry Pi case (HiPi is recommended for PoE Hat)

* A power source, either one of:

  * PoE Hat used with a PoE switch (recommended, allows remote power control)

  * USB-C Power Supply

* MicroSD Card with Raspbian - 16GB

One LTE dongle model supported in Aether is the `Sercomm Adventure Wingle `_.

Example BoMs
------------

To help provision a site, a few example Bills of Materials (BoMs) are given
below, which reference the hardware descriptions given above.

Some quantities are dependent on other quantities - for example, the number of
DAC cables frequently depends on the number of servers in use.

These BoMs do not include UE devices. It's recommended that the testing
hardware given above be added to every BoM for monitoring purposes.

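
The way some quantities follow from others can be sketched in Python. The
snippet uses the P4 UPF Testing BoM as its example and derives the per-server
and per-eNB counts from two inputs. The function name, its parameters, and the
rule of one split cable per four eNBs (inferred from the "QSFP to 4x SFP+"
cable type) are assumptions made for illustration, not part of Aether.

```python
import math


def p4_upf_testing_bom(servers, enbs, poe_switch=False, converting_switch=False):
    """Return {item: quantity} for the P4 UPF Testing BoM (illustrative only)."""
    if not 1 <= servers <= 3:
        raise ValueError("this BoM lists 1-3 compute servers")
    if enbs < 1:
        raise ValueError("at least one eNB is required")

    bom = {
        "P4 Fabric Switch": 1,
        "Management Switch": 1,
        "Management Router": 1,
        "Compute Servers": servers,
        # "2x #Server" in the table.
        "40GbE QSFP DAC cable": 2 * servers,
        # Assumption: each split cable fans out to at most four eNBs.
        "QSFP to 4x SFP+ DAC": math.ceil(enbs / 4),
        "eNB": enbs,
    }
    # "1x #eNB", unless another switch adapts 10GbE down to 1GbE.
    if not converting_switch:
        bom["10GbE to 1GbE Media converter"] = enbs
    # "1x #eNB", unless a PoE+ switch powers the radios.
    if not poe_switch:
        bom["PoE+ Injector"] = enbs
    return bom


for item, quantity in p4_upf_testing_bom(servers=3, enbs=2).items():
    print(f"{quantity:>2}  {item}")
```

Cat6 cabling is left out because the tables only specify a "sufficient" amount
rather than a count.
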
BESS UPF Testing BoM
""""""""""""""""""""

The following is the minimum BoM required to run Aether with the BESS UPF.

============ ===================== ===============================================
Quantity     Type                  Purpose
============ ===================== ===============================================
1            Management Switch     Must be Layer 2/3 capable for BESS VLANs
1            Management Router
1-3          Compute Servers       Recommended at least 3 for Kubernetes HA
1 (or more)  eNB
1x #eNB      PoE+ Injector         Required unless using a PoE+ Switch
Sufficient   Cat6 Network Cabling  Between all equipment
============ ===================== ===============================================

P4 UPF Testing BoM
""""""""""""""""""

============ ===================== ===============================================
Quantity     Type                  Description/Use
============ ===================== ===============================================
1            P4 Fabric Switch
1            Management Switch     Must be Layer 2/3 capable
1            Management Router     At least 1x 40GbE QSFP port recommended
1-3          Compute Servers       Recommended at least 3 for Kubernetes HA
2x #Server   40GbE QSFP DAC cable  Between Compute, Management, and Fabric Switch
1            QSFP to 4x SFP+ DAC   Split cable between Fabric and eNB
1 (or more)  eNB
1x #eNB      10GbE to 1GbE Media   Required unless using switch to convert from
             converter             fabric to eNB
1x #eNB      PoE+ Injector         Required unless using a PoE+ Switch
Sufficient   Cat6 Network Cabling  Between all equipment
============ ===================== ===============================================

P4 UPF Paired Leaves BoM
""""""""""""""""""""""""

============ ===================== ===============================================
Quantity     Type                  Description/Use
============ ===================== ===============================================
2            P4 Fabric Switch
1            Management Switch     Must be Layer 2/3 capable
1            Management Router     2x 40GbE QSFP ports recommended
3            Compute Servers
2            100GbE QSFP DAC cable Between Fabric switches
2x #Server   40GbE QSFP DAC cable  Between Compute, Management, and Fabric Switch
1 (or more)  QSFP to 4x SFP+ DAC   Split cable between Fabric and eNB
1 (or more)  eNB
1x #eNB      10GbE to 1GbE Media   Required unless using switch to convert from
             converter             fabric to eNB
1x #eNB      PoE+ Injector         Required unless using a PoE+ Switch
Sufficient   Cat6 Network Cabling  Between all equipment
============ ===================== ===============================================

.. |MULTIEDGE| image:: images/aether-multi-edge.svg
   :width: 1000
   :alt: Aether Central and Edge Deployment

.. |ARCHITECTURE| image:: images/aether-central-architecture.svg
   :width: 1500
   :alt: Aether Central and Edge Architecture

.. |CICD| image:: images/aether-CICD.svg
   :width: 1500
   :alt: Aether Central and Edge Architecture