Lucas Matheus

How Supercomputers Simulate the Real World: Parallel Heat Diffusion!

Introduction

Have you ever wondered how heat spreads through an object? How the temperature of a metal plate evolves over time? These phenomena are essential in engineering, climatology, and even medicine.

But how can scientists and engineers predict this? The answer lies in computer simulation, which allows us to model the behavior of heat and predict its propagation. However, simulating these phenomena can be computationally expensive, requiring resources that go beyond a simple home computer.

In this article, we will explore:

  • How heat diffusion works.

  • How we can program a computer to simulate this phenomenon.

  • How we use parallel computing to speed up the simulation.

  • How to use MPI (Message Passing Interface) to divide the problem among multiple processors.


What is Heat Diffusion?

Heat diffusion is the physical process by which heat spreads from hotter to colder regions. This behavior can be modeled mathematically by the Heat Equation:

$$
\frac{\partial u}{\partial t} = \alpha \left( \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} \right)
$$

Where:

  • $u(x, y, t)$ is the temperature at a point in space over time.
  • $\alpha$ is the thermal diffusivity coefficient, which depends on the material.
  • $x, y$ are the spatial coordinates.
  • $t$ represents time.

This model is useful for predicting the distribution of heat in solid objects, such as metal parts, electronic circuit boards, and even the Earth's mantle.

(Figure: heat simulation example)


Parallel Computing: Why Do We Need It?

If we want to simulate heat diffusion in a small metal sheet with just a few points, a regular computer can solve the problem quickly. But what if we want to simulate a large piece of metal, a nuclear reactor, or the Earth's crust?

As we increase the size of the simulation, the amount of computation grows rapidly: doubling the resolution of a 2D grid quadruples the number of points that must be updated at every time step. A laptop can take days or even weeks to solve these problems.

The solution to this problem is parallel computing, where we distribute the calculations among multiple processors. Using MPI (Message Passing Interface), we can divide the simulation matrix among multiple cores, allowing multiple calculations to happen simultaneously.

MPI (Message Passing Interface) and a Little Parallelism

Parallelism is a technique used in computing to perform multiple operations simultaneously. Instead of processing a task sequentially—one instruction after another—parallelism divides the task into several smaller parts, which can be executed at the same time by different processing units. This speeds up the total execution time and optimizes the use of available resources.

MPI (Message Passing Interface) is a standard for interprocess communication, designed specifically for distributed parallel computing. It is widely used in environments where multiple computers (nodes) work together to solve one large, complex problem: different parts of a task are processed by different nodes in a cluster, and the processes exchange messages to coordinate their progress. Because the processes do not share memory, MPI programs can scale across thousands of networked nodes. Below is an example of interprocess communication using MPI_Send and MPI_Recv:


// Send our first interior row up, and receive the neighbor's
// border row into our ghost row (row 0).
MPI_Send(&grid[1][0], cols, MPI_DOUBLE, rank - 1, 0, MPI_COMM_WORLD);
MPI_Recv(&grid[0][0], cols, MPI_DOUBLE, rank - 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);

Here, a process sends its first interior row to the neighbor above it (rank - 1) and receives that neighbor's last interior row into its own ghost row (row 0). This exchange keeps the borders of neighboring sub-grids consistent from one iteration to the next.
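
To see these calls in a complete program, here is a minimal, self-contained sketch of my own (not part of the simulation code) in which rank 1 sends a row of values to rank 0:

#include <mpi.h>
#include <cstdio>
#include <vector>

// Minimal example: rank 1 sends one row of doubles to rank 0.
// Run with: mpirun -np 2 ./send_recv_demo
int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int cols = 4;
    std::vector<double> row(cols, 0.0);

    if (rank == 1) {
        for (int j = 0; j < cols; ++j) row[j] = 100.0; // pretend this is a hot border row
        MPI_Send(row.data(), cols, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
    } else if (rank == 0) {
        MPI_Recv(row.data(), cols, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        std::printf("rank 0 received: %.1f %.1f %.1f %.1f\n", row[0], row[1], row[2], row[3]);
    }

    MPI_Finalize();
    return 0;
}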

Parallelism significantly increases processing efficiency and speed. MPI takes advantage of this concept, allowing independent processes in distributed systems to communicate effectively. Together, they are essential to solve complex problems in areas such as scientific simulations, large-scale data analysis, and machine learning.

How Does the Simulation Code Work?

The simulation was implemented in C++ with MPI, where:

1️⃣ We create a 2D grid representing the material, stored as nested std::vector containers.

2️⃣ We define the initial conditions: a heated top edge whose temperature we can change.

3️⃣ We use an iterative method to calculate the temperature of each point over time.

4️⃣ We divide the matrix between different processors, where each one calculates a part.

5️⃣ In the end, we merge the results in a single processor and save the data.
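
These steps are implemented as fragments of a single C++ file. The fragments below assume a preamble roughly like the following; the includes are required, while the constant values here are illustrative guesses (the real ones are in the repository linked at the end):

#include <mpi.h>
#include <vector>

// Illustrative values; the actual constants are defined in the full code.
const int    GRID_SIZE  = 100;  // points per side of the square grid
const int    TIME_STEPS = 500;  // number of simulation iterations
const double ALPHA      = 0.1;  // diffusion coefficient with dt/dx^2 folded in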

Step 1: Grid Initialization

Each point in the matrix starts with an initial temperature, with the top edge of the grid being heated (simulating a heat source).

void initialize_grid(std::vector<std::vector<double>> &grid, int rank, int size) {
    for (std::size_t i = 0; i < grid.size(); ++i) {
        for (std::size_t j = 0; j < grid[0].size(); ++j) {
            if (rank == 0 && i == 0) {
                grid[i][j] = 100.0; // hot row at the top (the heat source)
            } else {
                grid[i][j] = 0.0;   // everything else starts cold
            }
        }
    }
}

Step 2: Update the Temperature

We use the finite difference method to calculate the temperature at each point in the grid. Replacing the derivatives in the heat equation with differences between neighboring grid points gives the update rule:
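
$$
u_{i,j}^{\,t+1} = u_{i,j}^{\,t} + \alpha \left( u_{i-1,j}^{\,t} + u_{i+1,j}^{\,t} + u_{i,j-1}^{\,t} + u_{i,j+1}^{\,t} - 4\,u_{i,j}^{\,t} \right)
$$

Here the grid spacing and time step are folded into the single coefficient $\alpha$ (the ALPHA constant in the code); for this explicit scheme to stay stable, it must satisfy $\alpha \le 0.25$. In code: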

void update_grid(std::vector<std::vector<double>> &current, std::vector<std::vector<double>> &next) {
    for (std::size_t i = 1; i < current.size() - 1; ++i) {
        for (std::size_t j = 1; j < current[0].size() - 1; ++j) {
            // New value = old value + ALPHA * discrete Laplacian
            // (sum of the four neighbors minus four times the center).
            next[i][j] = current[i][j] +
                         ALPHA * (current[i - 1][j] + current[i + 1][j] +
                                  current[i][j - 1] + current[i][j + 1] -
                                  4 * current[i][j]);
        }
    }
}

Step 3: Running in Parallel

Now we use parallelism to distribute the calculations: MPI_Comm_rank gives each process its rank (a unique ID from 0 to size - 1), and MPI_Comm_size returns the total number of processes running:


int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    // Divide the rows between processes (assumes GRID_SIZE is divisible by size).
    // Each process stores local_rows interior rows plus two ghost rows
    // (index 0 and local_rows + 1) that mirror its neighbors' border rows.
    int local_rows = GRID_SIZE / size;
    std::vector<std::vector<double>> current_grid(local_rows + 2,
                                                  std::vector<double>(GRID_SIZE, 0.0));
    std::vector<std::vector<double>> next_grid = current_grid;

    initialize_grid(current_grid, rank, size);
    initialize_grid(next_grid, rank, size); // also set the hot row in the second
                                            // buffer, so swapping never loses it

    for (int t = 0; t < TIME_STEPS; ++t) {
        update_grid(current_grid, next_grid);
        current_grid.swap(next_grid);

        // Communication between processes: exchange border rows
        // with the neighbor above...
        if (rank > 0) {
            MPI_Send(&current_grid[1][0], GRID_SIZE, MPI_DOUBLE, rank - 1, 0, MPI_COMM_WORLD);
            MPI_Recv(&current_grid[0][0], GRID_SIZE, MPI_DOUBLE, rank - 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        }
        // ...and with the neighbor below.
        if (rank < size - 1) {
            MPI_Recv(&current_grid[local_rows + 1][0], GRID_SIZE, MPI_DOUBLE, rank + 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(&current_grid[local_rows][0], GRID_SIZE, MPI_DOUBLE, rank + 1, 0, MPI_COMM_WORLD);
        }
    }


Step 4: Final Aggregation

The final temperature grid is collected and reconstructed by the master process.

    if (rank == 0) {
        std::vector<std::vector<double>> full_grid(GRID_SIZE,
                                                   std::vector<double>(GRID_SIZE, 0.0));
        // Copy rank 0's own interior rows.
        for (int i = 1; i <= local_rows; ++i) {
            full_grid[i - 1] = current_grid[i];
        }
        // Receive the interior rows of every other process. The rows of a
        // std::vector<std::vector<double>> are not contiguous in memory,
        // so each row is transferred separately.
        for (int p = 1; p < size; ++p) {
            for (int i = 0; i < local_rows; ++i) {
                MPI_Recv(&full_grid[p * local_rows + i][0], GRID_SIZE, MPI_DOUBLE,
                         p, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            }
        }
        save_grid_to_file(full_grid, rank); // save the complete grid to a file
        print_grid(full_grid, rank);        // optional: print the grid to the terminal
    } else {
        // Send our interior rows (again, one row at a time).
        for (int i = 1; i <= local_rows; ++i) {
            MPI_Send(&current_grid[i][0], GRID_SIZE, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
        }
    }

    MPI_Finalize();
    return 0;
}
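
The helpers save_grid_to_file and print_grid are defined in the full code on GitHub. As a rough sketch of what saving could look like (my illustration, not necessarily the repository's exact implementation):

#include <fstream>
#include <vector>

// Illustrative sketch: writes one grid row per line, values separated by spaces.
// The unused rank parameter is kept only to match the call site above.
void save_grid_to_file(const std::vector<std::vector<double>> &grid, int /*rank*/) {
    std::ofstream out("output_grid.txt");
    for (const auto &row : grid) {
        for (double value : row) {
            out << value << ' ';
        }
        out << '\n';
    }
}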

Compilation

To compile the code, use the following command:


mpicxx -o heat_simulation heat_simulation.cpp


Execution

Run the program with mpirun, specifying the number of processes:

mpirun -np 4 ./heat_simulation


or, if you are requesting more processes than your machine has cores (Open MPI refuses to oversubscribe by default):

mpirun --oversubscribe -np 4 ./heat_simulation


Results

  • The program will save the final temperature grid in output_grid.txt.
  • If enabled, the program will also print the grid to the terminal.

Conclusion

I would have liked to go through the code line by line and explain every function, but to keep this article from getting too long, I will leave the complete code linked below. Supercomputing is a very, very cool area. I would say it is the path for anyone who wants to apply computer science to solving scientific problems. In the future, I plan to write articles about training deep learning models using supercomputing resources.

Link to GitHub: https://github.com/samsepiol1/HeatSimulation-HPC

Link to LinkedIn: https://www.linkedin.com/in/lucas-matheus-3809aa121/
