License : Creative Commons Attribution 4.0 International (CC BY-NC-SA 4.0)
Copyright : Frédéric Pennerath, CentraleSupelec
Last modified : April 19, 2024 10:22
Link to the source : futures.md

Multi-core programming based on futures

Installation

Download futures.zip and generate a suitable build environment using CMake:

unzip futures.zip ; cd futures
mkdir build ; cd build
cmake ../src
make
./futures 0  # Launch the compiled binary passing as argument the test number 0 to run the demo code

The code to edit is in subdirectory src. The subdirectory solution contains a possible solution.

Introduction

This labwork is composed of a series of exercises whose goal is to practice multi-core programming using threads and futures. Each exercise is designed to introduce new classes or functions (first std::future and std::async, then std::shared_future, std::thread, std::promise, std::jthread, etc).

Every exercise consists in computing as quickly as possible a task formalized by a computational graph, as the one given below:

Symbol table

Nodes of the graph represent so-called atomic tasks that cannot be furthed divided. Every node shows the expected time in milliseconds that it takes for the corresponding task to finish. Arcs of the graph indicate the inputs and the output of every atomic task. Depending on the graph structure, some of these atomic tasks can be run in parallel on multiple cores (e.g tasks A and B) whereas others are dependent, and must be run sequentially (e.g task C requires as inputs the results produced by task A and B). The objective is to design a multithreaded implementation of each computational graph that produces the expected result in a minimal time, whatever the processing times of atomic tasks.

For simplification purposes, the atomic computations will be simulated: they will consist in producing a std::string containing the name of the computation and the values of the received arguments, themselves supposed to be const std::string &. The duration of the computation will be simulated by introducing a timeout. For instance computation C of the previous graph can be implemented as:

// a and b are supposed to contain return values of computation A and B
std::string C(const std::string& a, const std::string& b) {
  std::this_thread::sleep_for(100ms);
  return "C("s + a + "," + b + ")"; 
}

The expected result for the previous graph and for the input "X" will thus be "C(A(X),B(X))".

For every exercise, you should only modify in file futures.cpp the corresponding computationX function (X being the number 1, 2, etc of the exercise), whose goal is to implement the computational graph of exercise number X. The C++ functions of the various atomic tasks (like the previous C function) are already implemented (using file test_utils.hpp that you should not modify). These functions are accessible as attributes of the graph object passed as argument to the computationX() functions. For example, function demo() in file futures.cpp computes sequentially, without any parallelization, the previous computation graph:

std::string demo(std::string input, const Graph1& graph) {
  std::string a = graph.A(input); // Functions A, B and C are members of the graph argument
  std::string b = graph.B(input);
  return graph.C(a, b);
}

When running from a terminal the corresponding test (0 for the demo, 1 for exercise 1, etc), the test framework will check:

> ./futures 0  # 0 for running the demo

|--------|--------|--------|--------|--------|------|
|        |        |        |EXPECTED|MEASURED|      |
|       A|       B|       C|  TIME  |  TIME  |STATUS|
|--------|--------|--------|--------|--------|------|
|  100 ms|  200 ms|  100 ms|  400 ms|  401 ms|    OK|

Last few remarks to take into account in your implementation:

“Simple” parallel computing

Implement the computation expressed by the previous dependency graph by completing function computation1 in file futures.cpp.

“Interleaved” parallel computations

Consider the following computational graph, in which atomic tasks C and D both need the results of tasks A and B. However, tasks C and D can start working on one of the results without waiting for the other (e.g. C can work on the result of A without immediately needing the result of B). This is reflected in the dependency graph by splitting tasks C and D into two sub-taskss (C into C1 and C2, D into D1 and D2).

Symbol table

Code your solution, this time by completing function computation2.

“Concurrent” parallel computations

Consider the following dependency graph, in which task C can be executed as soon as one of the two computations A OR B produces a result, without having to wait for the result of the other task. Such a case occurs, for example, in parallelizable search problems (such as searching for a given element in two arrays) where the result is determined as soon as the searched element is found by one of the threads.

Symbol table

Code your solution, this time by completing function computation3.

Improvement of the previous problem

Uncomment the two last lines of function test3 and runs the test again. Why does it fail?

To regain optimality, it is necessary to interrupt the threads still active as soon as a thread finds the result (for the durations given in the graph aboce, task B should be interrupted as soon as A terminates). To do this, we can

Whatever the solution, this can only work by checking periodically if the stopping flag (prior to C++20) or the internal state of std::jthread (from C++20) has been set. If so the thread should return, otherwise it should continue its job. To implement this polling, the functions implementing atomic tasks provided to function computation4 are slightly different as a single call to them only lasts 10ms. While the total time spent calling such a function is less than the one required to compute the results, the return value of this function is an empty string. Once the total time reaches the required time, the return values becomes equal to the expected output. The example of code given below illustrates how to compute the result of A with such function:

std::string res; // String to contain the result of A
while(res.empty()) { // While the result is not ready
  res = graph.A(input); // "Works" for 10ms
}

Code such a solution, this time by completing function computation4.

Frédéric Pennerath,