Aurora represents Eurotech’s latest HPC product offering, an innovative, leading-edge architecture. Aurora systems scale from a single working unit (node card) to many computing racks with no performance degradation. Aurora HPC systems offer high density and power efficiency, thanks to an advanced design and to liquid cooling used for all their modules.
Aurora is available in a range of configurations, all based on the same building blocks.
The Aurora node card is the main processing unit of every Aurora system.
Each node card is a single blade hosting two Intel Xeon 5500 series processors (quad-core, up to 2.93 GHz), each connected to 6/12 GB of DDR3 memory running at 1333 MHz. The CPUs are linked to peripherals and interfaces via the Intel 5500 (Tylersburg) chipset, using QPI (Intel QuickPath Interconnect). Each node hosts one Mellanox ConnectX-2 InfiniBand device, which provides 40 Gbps of bandwidth and is used both to implement a switched network and for data storage.
A large Altera Stratix IV FPGA allows implementation of a point-to-point 3D-torus network, for applications that do not require a centralized switched network. The main features and advantages of the 3D torus are low latency (less than 1 ns), an aggregated bandwidth of 60+60 Gbps, and high robustness and reliability thanks to redundant lines. The same on-board FPGA also enables reconfigurable computing functions such as coprocessing, acceleration and GPU-like arithmetic, thanks to its available logic resources (up to 700 Gops per device). One node card provides a maximum computing power of 95 Gflops for a total power consumption of 300 W.
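The defining property of a 3D torus is that each node links directly to its six nearest neighbours, with coordinates wrapping around at the edges of each dimension. The sketch below illustrates that addressing scheme; the 4×4×4 dimensions are hypothetical, chosen only for illustration, and the code is not Eurotech's implementation.

```python
# Illustrative sketch of 3D-torus neighbour addressing (dimensions are
# hypothetical, not Aurora's actual topology).
DX, DY, DZ = 4, 4, 4

def torus_neighbors(x, y, z):
    """Return the six nearest neighbours of node (x, y, z).

    Coordinates wrap around in each dimension, so edge nodes
    connect to the opposite face instead of having no link.
    """
    return [
        ((x + 1) % DX, y, z), ((x - 1) % DX, y, z),
        (x, (y + 1) % DY, z), (x, (y - 1) % DY, z),
        (x, y, (z + 1) % DZ), (x, y, (z - 1) % DZ),
    ]

# A node on the face of the torus wraps to the opposite face:
# torus_neighbors(3, 0, 0) includes (0, 0, 0).
```

The wraparound is what keeps every node's hop count symmetric and lets the network scale without a central switch, at the cost of multi-hop routes between distant nodes.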
Heat removal is performed by means of an aluminium plate with channels drilled inside, through which water flows; the water is purified and treated to prevent corrosion, blockages or, where applicable, freezing.
Aurora node cards are hosted in chassis, in multiples of 16. Each chassis also features a 36-port QDR InfiniBand switch, which leaves 20 ports usable for external connections, each at 40 Gbps. All chassis feature two independent monitoring and control networks, for redundancy and safe operation. Maintenance and inspection of Aurora systems can be carried out using a touchscreen monitor that also displays diagnostic data, resulting in a user-friendly man-machine interface.
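The chassis port budget can be checked from the figures above. Assuming one switch port per node card (an assumption consistent with the 16 node cards and the 20-port figure):

```python
# Sanity check of the chassis InfiniBand figures stated above.
# Assumption (not stated explicitly): one switch port per node card.
switch_ports = 36       # 36-port QDR InfiniBand switch per chassis
node_cards = 16         # node cards hosted in each chassis
port_speed_gbps = 40    # speed of each external port

external_ports = switch_ports - node_cards            # 20 ports left over
external_bw_gbps = external_ports * port_speed_gbps   # aggregate uplink bandwidth
```

Under that assumption, each chassis exposes 20 × 40 Gbps = 800 Gbps of aggregate external bandwidth to the rest of the system.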
Power conversion takes place in two separate steps in Aurora, always with maximum efficiency in mind. Each chassis receives a 48 VDC supply, which has the obvious advantage of inherent safety: Aurora is a low-voltage piece of equipment, which makes safety approvals and regulatory tests easier to perform. The 48 VDC supply is stepped down to 12 VDC by a PSU within each chassis (the DCDC Tray). This conversion is performed at 92% efficiency, and all Aurora boards are then fed with a 12 VDC supply. Failsafe operation is also taken into account: the conversion modules within each PSU board are n+1 redundant. Each chassis is attached via rails to the main rack metalwork, and can be extracted in order to perform maintenance procedures, or for assembly and disassembly.
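A back-of-the-envelope calculation shows what the 92% conversion efficiency means per node card, using only the figures stated above (300 W per node card at the 12 VDC rail):

```python
# Rough per-node-card figures for the 48 VDC -> 12 VDC conversion step,
# based on the 92% efficiency and 300 W consumption stated above.
node_power_w = 300.0      # node card consumption on the 12 VDC rail
dcdc_efficiency = 0.92    # efficiency of the in-chassis DCDC Tray

# Power drawn from the 48 VDC supply, and the loss dissipated in conversion:
input_power_w = node_power_w / dcdc_efficiency   # ~326 W
loss_w = input_power_w - node_power_w            # ~26 W lost as heat
```

Roughly 26 W per node card is dissipated in the conversion step itself, which is why the DCDC Tray sits inside the liquid-cooled chassis rather than outside the cooling loop.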
Aurora racks can contain up to 16 chassis, providing power, cabling, signal connections, and piping for heat removal via the liquid-cooling circuit.
Aurora systems can scale to many racks, each connected to its nearest neighbour via short in-rack (and therefore hidden) cables. Such an arrangement causes no performance degradation or difficulties in installation, management and maintenance, regardless of system size.
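Combining the building-block figures stated above (16 node cards per chassis, up to 16 chassis per rack, 95 Gflops and 300 W per node card) gives an illustrative per-rack aggregate:

```python
# Illustrative full-rack aggregates derived from the figures stated above.
nodes_per_chassis = 16
chassis_per_rack = 16
gflops_per_node = 95      # peak computing power per node card
watts_per_node = 300      # total consumption per node card

nodes = nodes_per_chassis * chassis_per_rack    # 256 node cards per rack
peak_tflops = nodes * gflops_per_node / 1000    # ~24.3 Tflops peak per rack
compute_kw = nodes * watts_per_node / 1000      # ~76.8 kW of node-card load
```

These are simple products of the stated per-node figures, ignoring switch, PSU and conversion overheads, so the actual rack-level power draw would be somewhat higher.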
SOFTWARE

The adoption of Intel processors ensures compatibility with a vast range of applications, tools and operating systems. Intel software can be used, as well as free toolchains. Being x86-based gives Aurora an almost unlimited choice of compilers, debuggers, libraries and clustering tools, be they open source or proprietary.
Monitoring and control tools are a very important part of the Aurora software stack, and must offer the highest reliability and robustness in operation. There are two different and independent monitoring and control networks, which operate in parallel for maximum coverage. One
of them, ServNet, can operate even when an Aurora system is completely