Big compute architecture style

The term big compute describes large-scale workloads that require a large number of cores, often numbering in the hundreds or thousands. Scenarios include image rendering, fluid dynamics, financial risk modeling, oil exploration, drug design, and engineering stress analysis, among others.

Source: Big compute architecture style

.

Logical diagram for big compute architecture style

Here are some typical characteristics of big compute applications:

  • The work can be split into discrete tasks, which can be run across many cores simultaneously.
  • Each task is finite. It takes some input, does some processing, and produces output. The entire application runs for a finite amount of time (minutes to days). A common pattern is to provision a large number of cores in a burst, and then spin down to zero once the application completes.
  • The application does not need to stay up 24/7. However, the system must handle node failures or application crashes.
  • For some applications, tasks are independent and can run in parallel. In other cases, tasks are tightly coupled, meaning they must interact or exchange intermediate results. In that case, consider using high-speed networking technologies such as InfiniBand and remote direct memory access (RDMA).
  • Depending on your workload, you might use compute-intensive VM sizes (H16r, H16mr, and A9).

When to use this architecture

  • Computationally intensive operations such as simulation and number crunching.
  • Simulations that are computationally intensive and must be split across CPUs in multiple computers (10-1000s).
  • Simulations that require too much memory for one computer, and must be split across multiple computers.
  • Long-running computations that would take too long to complete on a single computer.
  • Smaller computations that must be run 100s or 1000s of times, such as Monte Carlo simulations.

Benefits

  • High performance with “embarrassingly parallel” processing.
  • Can harness hundreds or thousands of computer cores to solve large problems faster.
  • Access to specialized high-performance hardware, with dedicated high-speed InfiniBand networks.
  • You can provision VMs as needed to do work, and then tear them down.

Challenges

  • Managing the VM infrastructure.
  • Managing the volume of number crunching
  • Provisioning thousands of cores in a timely manner.
  • For tightly coupled tasks, adding more cores can have diminishing returns. You may need to experiment to find the optimum number of cores.

Big compute using Azure Batch

Azure Batch is a managed service for running large-scale high-performance computing (HPC) applications.

SaleBestseller No. 1
HP Elite Desktop PC Computer Intel Core i5 3.1-GHz, 8 gb Ram, 1 TB Hard Drive, DVDRW, 19 Inch LCD Monitor, Keyboard, Mouse, Wireless WiFi, Windows 10 (Renewed)
  • This Certified Refurbished product is tested and...
  • HP Elite 6200 Small Form Factor Desktop PC, Intel...
  • Includes: USB Keyboard & Mouse, WiFi Adapter,...
  • Ports: USB 2.0, DisplayPort, VGA, PS/2 keyboard,...
  • Operating System: Windows 10 64 Bit –...
SaleBestseller No. 2
HP 2022 Newest All-in-One Desktop, 21.5" FHD Display, Intel Celeron J4025 Processor, 16GB RAM, 512GB PCIe SSD, Webcam, HDMI, RJ-45, Wired Keyboard&Mouse, WiFi, Windows 11 Home, White
  • 【High Speed RAM And Enormous Space】16GB DDR4...
  • 【Processor】Intel Celeron J4025 processor (2...
  • 【Display】21.5" diagonal FHD VA ZBD anti-glare...
  • 【Tech Specs】2 x SuperSpeed USB Type-A 5Gbps...
  • 【Authorized KKE Mousepad】Include KKE Mousepad

Using Azure Batch, you configure a VM pool, and upload the applications and data files. Then the Batch service provisions the VMs, assign tasks to the VMs, runs the tasks, and monitors the progress. Batch can automatically scale out the VMs in response to the workload. Batch also provides job scheduling.

Diagram of big compute using Azure Batch

Big compute running on Virtual Machines

You can use Original Postowershell/high-performance-computing/overview?view=hpc19-ps&preserve-view=true" target="_blank" rel="noreferrer noopener">Microsoft HPC Pack to administer a cluster of VMs, and schedule and monitor HPC jobs. With this approach, you must provision and manage the VMs and network infrastructure. Consider this approach if you have existing HPC workloads and want to move some or all it to Azure. You can move the entire HPC cluster to Azure, or you can keep your HPC cluster on-premises but use Azure for burst capacity. For more information, see Batch and HPC solutions for large-scale computing workloads.

HPC Pack deployed to Azure

In this scenario, the HPC cluster is created entirely within Azure.

Diagram of HPC Pack deployed to Azure
New
HP Stream 14 inch Laptop for Student and Business, Intel Quad-Core Processor, 16GB RAM, 64GB eMMC, 1-Year Office 365, Webcam, 12H Long Battery Life, Lightweight & Slim Laptop, Wi-Fi, Win 11 H in S
  • 【Processor】Intel Celeron N4120, 4 Cores & 4...
  • 【Display】14.0-inch diagonal, HD (1366 x 768),...
  • 【Storage】16GB high-bandwidth DDR4 Memory (2400...
  • 【Connectivity】1 x USB 3.1 Type-C ports, 2 x...
  • 【System】Windows 11 Home in S mode operating...
New
HAJAAN SuperX Gaming PC | Liquid Cooled | GeForce RTX 4060 8GB | AMD Ryzen 5 5600G | 32GB DDR4 | 1TB SSD | Windows 11 Pro | WiFi | Bluetooth - Black
  • Configured with AMD Ryzen 5 5600G Processor and...
  • 8GB GeForce RTX 4060 GDDR6 dedicated graphics card...
  • Liquid cooling system keeps internal components at...
  • Integrated PCIE Wi-Fi provides excellent wireless...
  • Includes USB Gaming RGB Mechanical Keyboard, Mouse...
New
Lenovo 2023 IdeaPad 1i Essential Laptop Computer, Intel Core i5-1235U 12th Gen, 15.6" FHD Anti-Glare Display, (16GB DDR4 RAM, 512GB SSD), HDMI, Bluetooth, Windows 11, Cloud Grey, W/GaLiMu
  • ✔【Display】 15.6" FHD (1920x1080) TN 220nits...
  • ✔【Memory & Storage】RAM Size 16GB 3200MHz...
  • ✔【Connectivity】 1x USB 2.0, 1x USB 3.2 Gen...
  • ✔【Processor & Graphics】 12th Generation...
  • ✔【Operating System】 Windows 11

The head node provides management and job scheduling services to the cluster. For tightly coupled tasks, use an RDMA network that provides very high bandwidth, low latency communication between VMs. For more information, see Deploy an HPC Pack 2016 cluster in Azure.

Burst an HPC cluster to Azure

In this scenario, an organization is running HPC Pack on-premises, and uses Azure VMs for burst capacity. The cluster head node is on-premises. ExpressRoute or VPN Gateway connects the on-premises network to the Azure VNet.

Diagram of a hybrid big compute cluster