UBCO COSC370, BSPlib project: Week 1: Getting up to speed with MulticoreBSP

In this exercise you will install MulticoreBSP, learn how to write a simple BSPlib program and perform some simple communication between processes.

MulticoreBSP is available here.

1 Install MulticoreBSP

First install MulticoreBSP as described here. Follow the instructions and verify that you get the same output when running the compiled example.c (the example program you find on the quick-start page, also available here):

Task 1: Verify that the output you get when running example.c is similar to the following:

How many threads do you want started? There are 4 cores available.
4
Hello world from thread 3 out of 4!
Hello world from thread 2 out of 4!
Hello world from thread 0 out of 4!
Hello world from thread 1 out of 4!

Of course, the output will differ depending on the number of cores in your machine. Furthermore, the order of the output will change for each run.

Do not worry if you do not understand all the code of example.c. The important code is between bsp_begin() and bsp_end() in the function spmd:

bsp_begin( P );
printf( "Hello world from thread %d out of %d!\n", bsp_pid(), bsp_nprocs() );
bsp_end();

The first line (bsp_begin(P)) starts a SPMD-section. SPMD stands for Single Program, Multiple Data, and denotes the part of the program that will be run in parallel. As the name implies, all processors run the same code, (Single Program), but each process has their own variables and memory (Multiple Data).

The argument P given to bsp_begin denotes the number of processors that are requested by the user when running the program. The code between bsp_begin() and bsp_end() will then be executed by P procesors running in parallel. There are thus P calls to printf. The functions bsp_pid() and bsp_nprocs() returns the unique identifer of each process (an integer between \(0\) and \(P-1\)) and bsp_nprocs() returns P.

Task 2: Note the number the number of cores available in your machine, which you can read from the output of the program (for instance, the output above comes from a machine with 4 cores). What happens if you change the bsp_begin( P ); line to one less or one more than the number of available cores, e.g. bsp_begin( 3 ); or bsp_begin( 5 );?

2 A first foray into BSP

After the example program, we shall try writing a real program. BSP programs consist of a series of supersteps. In each superstep, each process does some local computation. Processes then synchronize before starting the next superstep.

The first BSP program we write is called shift and consists of 3 supersteps:

(1) Each process \(i\) declares a variable x in which it stores the it's own process identifier.
(2) Then process \(i\) sends its processes identifier to their neighbor to the left, i.e. process \(i+1 % p\), where \(p\) is the number of processors. This ensure that the last process sends its identifier to the first process.
(3) Finally, each process outputs their own identifier and the one they have received from their neighbor.

In more detail, the program shall do the following:

First superstep:
1. Each process declares an integer s and stores its process identifier inside the integer (given by bsp_pid()). It also declares an integer r.
2. Each process registers the address of r, given by &r using bsp_push_reg(). In order for processes to communicate, they must register the source or destination memory areas.
3. Registers take effect at the next superstep, so we must complete this superstep using bsp_sync() before communicating.
Second superstep:
1. Each process sends their value of s to the variable r of their neighbor to the right using bsp_put().
2. The register r is now no longer needed, and can be deregistered using bsp_pop_reg.
3. Communication is buffered and takes effect at the next superstep, so we must complete this superstep using bsp_sync() before continuing.
Third and final superstep:
1. Each process now have the identifier of their neighbor in their variable r. The value of r is printed using printf.
2. We have finished the computation and end the SPMD-section with the bsp_end()-function.

In code, this gives:

bsp_begin(P);

// 1.1
int s = bsp_pid();
int r;
// 1.2
bsp_push_reg(&r, sizeof(r));
// 1.3
bsp_sync();

// 2.1
bsp_put((bsp_pid() + 1) % bsp_nprocs(), &s, &r, 0, sizeof(s));
// 2.2
bsp_pop_reg(&r);
// 2.3
bsp_sync();

// 3.1
printf("Process %d received the identifier %d\n", s, r);
// 3.2

bsp_end();

Task 2: Paste the code above in the function spmd of example.c. Compile and run the program. Note the output. If it works, you should see something like:

Process 0 received the identifier 3
Process 3 received the identifier 2
Process 1 received the identifier 0
Process 2 received the identifier 1

Task 3: Modify the program so that processes sends their identifier to the neighbor to the left instead of right.

Task 4: Modify the program so that a random value is sent instead of a the process identifier. A random value can be obtained using the rand-function.

Task 5: If we shift \(p\) times, where \(p\) is the number of processors as returned by bsp_nprocs(), then each process should have the same value in r as in s. Verify that this is the case.

Back to overview.