{"id":484,"date":"2019-11-06T09:15:49","date_gmt":"2019-11-06T08:15:49","guid":{"rendered":"https:\/\/www.kth.se\/blogs\/pdc\/?p=484"},"modified":"2019-11-06T10:34:52","modified_gmt":"2019-11-06T09:34:52","slug":"parallel-programming-in-python-mpi4py-part-2","status":"publish","type":"post","link":"https:\/\/www.kth.se\/blogs\/pdc\/2019\/11\/parallel-programming-in-python-mpi4py-part-2\/","title":{"rendered":"Parallel programming in Python: mpi4py (part 2)"},"content":{"rendered":"<div class=\"post-content-wrapper\"><p>In <a href=\"https:\/\/www.kth.se\/blogs\/pdc\/2019\/08\/parallel-programming-in-python-mpi4py-part-1\/\">part 1 of this post<\/a>, we introduced the <a href=\"https:\/\/mpi4py.readthedocs.io\/en\/stable\/index.html\"><code>mpi4py<\/code><\/a> module (MPI for Python), which provides an object-oriented interface for Python resembling the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Message_Passing_Interface\">message passing interface (MPI)<\/a> and enables Python programs to exploit multiple processors on multiple compute nodes.<\/p>\n<p>The <code>mpi4py<\/code> module provides methods for communicating various types of Python objects in different ways. In <a href=\"https:\/\/www.kth.se\/blogs\/pdc\/2019\/08\/parallel-programming-in-python-mpi4py-part-1\/\">part 1 of this post<\/a> we showed how to communicate generic Python objects between MPI processes \u2013 the methods for doing this have names that are all lowercase letters. It is also possible to directly send buffer-like objects, where the data is exposed in a raw format and can be accessed without copying, between MPI processes. 
The methods for doing this start with an uppercase letter.<\/p>\n<p>In this post we continue introducing the <a href=\"https:\/\/mpi4py.readthedocs.io\/en\/stable\/index.html\"><code>mpi4py<\/code><\/a> module, with a focus on the direct communication of buffer-like objects using the latter type of methods (that is, those starting with a capital letter), including <code>Send<\/code>, <code>Recv<\/code>, <code>Isend<\/code>, <code>Irecv<\/code>, <code>Bcast<\/code>, and <code>Reduce<\/code>, as well as <code>Scatterv<\/code> and <code>Gatherv<\/code>, which are vector variants of <code>Scatter<\/code> and <code>Gather<\/code>, respectively.<\/p>\n<p><!--more--><\/p>\n<h2>Buffer-like objects<\/h2>\n<p>The <code>mpi4py<\/code> module provides methods for directly sending and receiving buffer-like objects. The advantage of working with buffer-like objects is that the communication is fast (close to the speed of MPI communication in C). However, there is a disadvantage in that the user needs to be more explicit about the allocation of memory: the memory of the receiving buffer needs to be allocated prior to the communication, and the size of the sending buffer should not exceed that of the receiving buffer. One should also be aware that <code>mpi4py<\/code> expects the buffer-like objects to have contiguous memory. Fortunately, this is usually the case with <a href=\"https:\/\/numpy.org\/\">Numpy<\/a> arrays, which are probably the most commonly used buffer-like objects in scientific computing in Python.<\/p>\n<p>In <code>mpi4py<\/code>, a buffer-like object can be specified using a list or tuple with 2 or 3 elements (or 4 elements for the vector variants, which will be elaborated in later sections). 
For example, for a Numpy array that is named \u201c<code>data<\/code>\u201d and that consists of double-precision numbers, we can use <code>[data, n, MPI.DOUBLE]<\/code>, where <code>n<\/code> is a positive integer, to refer to the buffer of the first <code>n<\/code> elements. It is also possible to use <code>[data, MPI.DOUBLE]<\/code>, or simply <code>data<\/code>, to refer to the buffer of the whole array. In the following sections, we&#8217;ll demonstrate the communication of <a href=\"https:\/\/numpy.org\/\">Numpy<\/a> arrays using <code>mpi4py<\/code>.<\/p>\n<h2>Point-to-point communication<\/h2>\n<p>In the <a href=\"https:\/\/www.kth.se\/blogs\/pdc\/2019\/08\/parallel-programming-in-python-mpi4py-part-1\/\">previous post<\/a> we introduced point-to-point communication using all-lowercase methods (<code>send<\/code> and <code>recv<\/code>) in <code>mpi4py<\/code>. The use of methods with a leading uppercase letter (<code>Send<\/code> and <code>Recv<\/code>) is quite similar, except that the receiving buffer needs to be initialized before the <code>Recv<\/code> method is called. 
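<p>The buffer rules above (a pre-allocated receiving buffer with a matching dtype, and contiguous memory) can be checked without any MPI communication at all. The following is a minimal MPI-free sketch using plain NumPy; the variable names are our own, not part of the <code>mpi4py<\/code> API.<\/p>

```python
import numpy as np

# A receiving buffer must exist before Recv is called; allocate it with a
# dtype that matches the message (MPI.DOUBLE corresponds to np.float64).
recv_buf = np.zeros(4, dtype=np.float64)
assert recv_buf.flags['C_CONTIGUOUS']

# mpi4py expects contiguous memory; a strided slice is a non-contiguous
# view and should not be used directly as a communication buffer.
strided_view = np.arange(8.0)[::2]
assert not strided_view.flags['C_CONTIGUOUS']

# np.ascontiguousarray produces a contiguous copy that is safe to communicate.
safe_buf = np.ascontiguousarray(strided_view)
assert safe_buf.flags['C_CONTIGUOUS']
```

<p>Passing a non-contiguous view to one of the uppercase methods typically fails with a buffer error rather than copying silently, so making an explicit contiguous copy as above is the safe route.<\/p>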
Example code for passing a Numpy array from the master process (which has rank = 0) to the worker processes (that all have rank &gt; 0) is shown below.<\/p>\n<div class=\"highlight-default\">\n<div class=\"highlight\">\n<pre style=\"color: #000000;background: #ffffff\"><span style=\"color: #800000;font-weight: bold\">from<\/span> mpi4py <span style=\"color: #800000;font-weight: bold\">import<\/span> MPI\r\n<span style=\"color: #800000;font-weight: bold\">import<\/span> numpy <span style=\"color: #800000;font-weight: bold\">as<\/span> np\r\n\r\ncomm <span style=\"color: #808030\">=<\/span> MPI<span style=\"color: #808030\">.<\/span>COMM_WORLD\r\nrank <span style=\"color: #808030\">=<\/span> comm<span style=\"color: #808030\">.<\/span>Get_rank<span style=\"color: #808030\">(<\/span><span style=\"color: #808030\">)<\/span>\r\nsize <span style=\"color: #808030\">=<\/span> comm<span style=\"color: #808030\">.<\/span>Get_size<span style=\"color: #808030\">(<\/span><span style=\"color: #808030\">)<\/span>\r\n\r\n<span style=\"color: #696969\"># master process<\/span>\r\n<span style=\"color: #800000;font-weight: bold\">if<\/span> rank <span style=\"color: #44aadd\">==<\/span> <span style=\"color: #008c00\">0<\/span><span style=\"color: #808030\">:<\/span>\r\n    data <span style=\"color: #808030\">=<\/span> np<span style=\"color: #808030\">.<\/span>arange<span style=\"color: #808030\">(<\/span><span style=\"color: #008000\">4.<\/span><span style=\"color: #808030\">)<\/span>\r\n    <span style=\"color: #696969\"># master process sends data to worker processes by<\/span>\r\n    <span style=\"color: #696969\"># going through the ranks of all worker processes<\/span>\r\n    <span style=\"color: #800000;font-weight: bold\">for<\/span> i <span style=\"color: #800000;font-weight: bold\">in<\/span> <span style=\"color: #400000\">range<\/span><span style=\"color: #808030\">(<\/span><span style=\"color: #008c00\">1<\/span><span style=\"color: #808030\">,<\/span> size<span 
style=\"color: #808030\">)<\/span><span style=\"color: #808030\">:<\/span>\r\n        comm<span style=\"color: #808030\">.<\/span>Send<span style=\"color: #808030\">(<\/span>data<span style=\"color: #808030\">,<\/span> dest<span style=\"color: #808030\">=<\/span>i<span style=\"color: #808030\">,<\/span> tag<span style=\"color: #808030\">=<\/span>i<span style=\"color: #808030\">)<\/span>\r\n        <span style=\"color: #800000;font-weight: bold\">print<\/span><span style=\"color: #808030\">(<\/span><span style=\"color: #0000e6\">'Process {} sent data:'<\/span><span style=\"color: #808030\">.<\/span>format<span style=\"color: #808030\">(<\/span>rank<span style=\"color: #808030\">)<\/span><span style=\"color: #808030\">,<\/span> data<span style=\"color: #808030\">)<\/span>\r\n\r\n<span style=\"color: #696969\"># worker processes<\/span>\r\n<span style=\"color: #800000;font-weight: bold\">else<\/span><span style=\"color: #808030\">:<\/span>\r\n    <span style=\"color: #696969\"># initialize the receiving buffer<\/span>\r\n    data <span style=\"color: #808030\">=<\/span> np<span style=\"color: #808030\">.<\/span>zeros<span style=\"color: #808030\">(<\/span><span style=\"color: #008c00\">4<\/span><span style=\"color: #808030\">)<\/span>\r\n    <span style=\"color: #696969\"># receive data from master process<\/span>\r\n    comm<span style=\"color: #808030\">.<\/span>Recv<span style=\"color: #808030\">(<\/span>data<span style=\"color: #808030\">,<\/span> source<span style=\"color: #808030\">=<\/span><span style=\"color: #008c00\">0<\/span><span style=\"color: #808030\">,<\/span> tag<span style=\"color: #808030\">=<\/span>rank<span style=\"color: #808030\">)<\/span>\r\n    <span style=\"color: #800000;font-weight: bold\">print<\/span><span style=\"color: #808030\">(<\/span><span style=\"color: #0000e6\">'Process {} received data:'<\/span><span style=\"color: #808030\">.<\/span>format<span style=\"color: #808030\">(<\/span>rank<span style=\"color: #808030\">)<\/span><span 
style=\"color: #808030\">,<\/span> data<span style=\"color: #808030\">)<\/span>\r\n<\/pre>\n<\/div>\n<\/div>\n<p>Note that the <code>data<\/code> array was initialized on the worker processes before the <code>Recv<\/code> method was called, and that the <code>Recv<\/code> method takes <code>data<\/code> as the first argument (in contrast to the <code>recv<\/code> method which returns the <code>data<\/code> object). The output from running the above example (send-recv.py) on 4 processes is as follows:<\/p>\n<div class=\"highlight-default\">\n<div class=\"highlight\">\n<pre style=\"color: #000000;background: #ffffff\">$ module load mpi4py\/3.0.2\/py37\r\n\r\n$ salloc --nodes 1 -t 00:10:00 -A &lt;your-project-account&gt;\r\nsalloc: Granted job allocation &lt;your-job-id&gt;\r\n\r\n$ srun -n 4 python3 send-recv.py\r\nProcess 0 sent data: [0. 1. 2. 3.]\r\nProcess 0 sent data: [0. 1. 2. 3.]\r\nProcess 0 sent data: [0. 1. 2. 3.]\r\nProcess 1 received data: [0. 1. 2. 3.]\r\nProcess 2 received data: [0. 1. 2. 3.]\r\nProcess 3 received data: [0. 1. 2. 3.]<\/pre>\n<\/div>\n<\/div>\n<p>When using the <code>Send<\/code> and <code>Recv<\/code> methods, one thing to keep in mind is that the sending buffer and the receiving buffer should match in size. An error will occur if the size of the sending buffer is greater than that of the receiving buffer. It is, however, possible to have a receiving buffer that is larger than the sending buffer. This is illustrated in the following example, where the sending buffer has 4 elements and the size of the receiving buffer is 6. 
The communication will complete without error, but only the first 4 elements of the receiving buffer will be overwritten (the last 2 elements will remain zero).<\/p>\n<div class=\"highlight-default\">\n<div class=\"highlight\">\n<pre style=\"color: #000000;background: #ffffff\"><span style=\"color: #800000;font-weight: bold\">if<\/span> rank <span style=\"color: #44aadd\">==<\/span> <span style=\"color: #008c00\">0<\/span><span style=\"color: #808030\">:<\/span>\r\n    data <span style=\"color: #808030\">=<\/span> np<span style=\"color: #808030\">.<\/span>arange<span style=\"color: #808030\">(<\/span><span style=\"color: #008000\">4.<\/span><span style=\"color: #808030\">)<\/span>\r\n    <span style=\"color: #800000;font-weight: bold\">for<\/span> i <span style=\"color: #800000;font-weight: bold\">in<\/span> <span style=\"color: #400000\">range<\/span><span style=\"color: #808030\">(<\/span><span style=\"color: #008c00\">1<\/span><span style=\"color: #808030\">,<\/span> size<span style=\"color: #808030\">)<\/span><span style=\"color: #808030\">:<\/span>\r\n        comm<span style=\"color: #808030\">.<\/span>Send<span style=\"color: #808030\">(<\/span>data<span style=\"color: #808030\">,<\/span> dest<span style=\"color: #808030\">=<\/span>i<span style=\"color: #808030\">,<\/span> tag<span style=\"color: #808030\">=<\/span>i<span style=\"color: #808030\">)<\/span>\r\n        <span style=\"color: #800000;font-weight: bold\">print<\/span><span style=\"color: #808030\">(<\/span><span style=\"color: #0000e6\">'Process {} sent data:'<\/span><span style=\"color: #808030\">.<\/span>format<span style=\"color: #808030\">(<\/span>rank<span style=\"color: #808030\">)<\/span><span style=\"color: #808030\">,<\/span> data<span style=\"color: #808030\">)<\/span>\r\n\r\n<span style=\"color: #800000;font-weight: bold\">else<\/span><span style=\"color: #808030\">:<\/span>\r\n    <span style=\"color: #696969\"># note: the size of the receiving buffer is larger than<\/span>\r\n    <span 
style=\"color: #696969\"># that of the sending buffer<\/span>\r\n    data <span style=\"color: #808030\">=<\/span> np<span style=\"color: #808030\">.<\/span>zeros<span style=\"color: #808030\">(<\/span><span style=\"color: #008c00\">6<\/span><span style=\"color: #808030\">)<\/span>\r\n    comm<span style=\"color: #808030\">.<\/span>Recv<span style=\"color: #808030\">(<\/span>data<span style=\"color: #808030\">,<\/span> source<span style=\"color: #808030\">=<\/span><span style=\"color: #008c00\">0<\/span><span style=\"color: #808030\">,<\/span> tag<span style=\"color: #808030\">=<\/span>rank<span style=\"color: #808030\">)<\/span>\r\n    <span style=\"color: #800000;font-weight: bold\">print<\/span><span style=\"color: #808030\">(<\/span><span style=\"color: #0000e6\">'Process {} has data:'<\/span><span style=\"color: #808030\">.<\/span>format<span style=\"color: #808030\">(<\/span>rank<span style=\"color: #808030\">)<\/span><span style=\"color: #808030\">,<\/span> data<span style=\"color: #808030\">)<\/span>\r\n<\/pre>\n<\/div>\n<\/div>\n<p>The output is as follows. Note that the last two elements of the receiving buffers are zero.<\/p>\n<div class=\"highlight-default\">\n<div class=\"highlight\">\n<pre style=\"color: #000000;background: #ffffff\">$ srun -n 4 python3 send-recv.py\r\nProcess 0 sent data: [0. 1. 2. 3.]\r\nProcess 0 sent data: [0. 1. 2. 3.]\r\nProcess 0 sent data: [0. 1. 2. 3.]\r\nProcess 1 has data: [0. 1. 2. 3. 0. 0.]\r\nProcess 2 has data: [0. 1. 2. 3. 0. 0.]\r\nProcess 3 has data: [0. 1. 2. 3. 0. 0.]<\/pre>\n<\/div>\n<\/div>\n<p>In <a href=\"https:\/\/www.kth.se\/blogs\/pdc\/2019\/08\/parallel-programming-in-python-mpi4py-part-1\/\">part 1 of this post<\/a> we also discussed blocking and non-blocking methods for point-to-point communication. In <code>mpi4py<\/code>, the non-blocking communication methods for buffer-like objects are <code>Isend<\/code> and <code>Irecv<\/code>. 
The use of non-blocking methods is shown in the example below.<\/p>\n<div class=\"highlight-default\">\n<div class=\"highlight\">\n<pre style=\"color: #000000;background: #ffffff\"><span style=\"color: #800000;font-weight: bold\">if<\/span><span style=\"color: #000000\"> rank <\/span><span style=\"color: #44aadd\">==<\/span> <span style=\"color: #008c00\">0<\/span><span style=\"color: #808030\">:<\/span><span style=\"color: #000000\">\r\n    data <\/span><span style=\"color: #808030\">=<\/span><span style=\"color: #000000\"> np<\/span><span style=\"color: #808030\">.<\/span><span style=\"color: #000000\">arange<\/span><span style=\"color: #808030\">(<\/span><span style=\"color: #008000\">4.<\/span><span style=\"color: #808030\">)<\/span>\r\n    <span style=\"color: #800000;font-weight: bold\">for<\/span><span style=\"color: #000000\"> i <\/span><span style=\"color: #800000;font-weight: bold\">in<\/span> <span style=\"color: #400000\">range<\/span><span style=\"color: #808030\">(<\/span><span style=\"color: #008c00\">1<\/span><span style=\"color: #808030\">,<\/span><span style=\"color: #000000\"> size<\/span><span style=\"color: #808030\">)<\/span><span style=\"color: #808030\">:<\/span><span style=\"color: #000000\">\r\n        req <\/span><span style=\"color: #808030\">=<\/span><span style=\"color: #000000\"> comm<\/span><span style=\"color: #808030\">.<\/span><span style=\"color: #000000\">Isend<\/span><span style=\"color: #808030\">(<\/span><span style=\"color: #000000\">data<\/span><span style=\"color: #808030\">,<\/span><span style=\"color: #000000\"> dest<\/span><span style=\"color: #808030\">=<\/span><span style=\"color: #000000\">i<\/span><span style=\"color: #808030\">,<\/span><span style=\"color: #000000\"> tag<\/span><span style=\"color: #808030\">=<\/span><span style=\"color: #000000\">i<\/span><span style=\"color: #808030\">)<\/span><span style=\"color: #000000\">\r\n        req<\/span><span style=\"color: #808030\">.<\/span><span style=\"color: 
#000000\">Wait<\/span><span style=\"color: #808030\">(<\/span><span style=\"color: #808030\">)<\/span>\r\n        <span style=\"color: #800000;font-weight: bold\">print<\/span><span style=\"color: #808030\">(<\/span><span style=\"color: #0000e6\">'Process {} sent data:'<\/span><span style=\"color: #808030\">.<\/span><span style=\"color: #000000\">format<\/span><span style=\"color: #808030\">(<\/span><span style=\"color: #000000\">rank<\/span><span style=\"color: #808030\">)<\/span><span style=\"color: #808030\">,<\/span><span style=\"color: #000000\"> data<\/span><span style=\"color: #808030\">)<\/span>\r\n\r\n<span style=\"color: #800000;font-weight: bold\">else<\/span><span style=\"color: #808030\">:<\/span>\r\n<span style=\"color: #000000\">    data <\/span><span style=\"color: #808030\">=<\/span><span style=\"color: #000000\"> np<\/span><span style=\"color: #808030\">.<\/span><span style=\"color: #000000\">zeros<\/span><span style=\"color: #808030\">(<\/span><span style=\"color: #008c00\">4<\/span><span style=\"color: #808030\">)\r\n<\/span><span style=\"color: #000000\">    <span style=\"color: #000000\">req <\/span><span style=\"color: #808030\">=<\/span> comm<\/span><span style=\"color: #808030\">.<\/span><span style=\"color: #000000\">Irecv<\/span><span style=\"color: #808030\">(<\/span><span style=\"color: #000000\">data<\/span><span style=\"color: #808030\">,<\/span><span style=\"color: #000000\"> source<\/span><span style=\"color: #808030\">=<\/span><span style=\"color: #008c00\">0<\/span><span style=\"color: #808030\">,<\/span><span style=\"color: #000000\"> tag<\/span><span style=\"color: #808030\">=<\/span><span style=\"color: #000000\">rank<\/span><span style=\"color: #808030\">)<\/span>\r\n    <span style=\"color: #000000\">req<\/span><span style=\"color: #808030\">.<\/span><span style=\"color: #000000\">Wait<\/span><span style=\"color: #808030\">(<\/span><span style=\"color: #808030\">)<\/span>\r\n    <span style=\"color: #800000;font-weight: 
bold\">print<\/span><span style=\"color: #808030\">(<\/span><span style=\"color: #0000e6\">'Process {} received data:'<\/span><span style=\"color: #808030\">.<\/span><span style=\"color: #000000\">format<\/span><span style=\"color: #808030\">(<\/span><span style=\"color: #000000\">rank<\/span><span style=\"color: #808030\">)<\/span><span style=\"color: #808030\">,<\/span><span style=\"color: #000000\"> data<\/span><span style=\"color: #808030\">)<\/span>\r\n<\/pre>\n<\/div>\n<h2>Collective communication<\/h2>\n<p>As we mentioned in <a href=\"https:\/\/www.kth.se\/blogs\/pdc\/2019\/08\/parallel-programming-in-python-mpi4py-part-1\/\">part 1 of this post<\/a>, collective communication methods are very useful in parallel programming. In the example below we use the <code>Bcast<\/code> method to broadcast a buffer-like object <code>data<\/code> from the master process to all the worker processes. Note that <code>data<\/code> needs to be initialized on the worker processes before <code>Bcast<\/code>\u00a0is called.<\/p>\n<div class=\"highlight-default\">\n<div class=\"highlight\">\n<pre style=\"color: #000000;background: #ffffff\"><span style=\"color: #800000;font-weight: bold\">from<\/span> mpi4py <span style=\"color: #800000;font-weight: bold\">import<\/span> MPI\r\n<span style=\"color: #800000;font-weight: bold\">import<\/span> numpy <span style=\"color: #800000;font-weight: bold\">as<\/span> np\r\n\r\ncomm <span style=\"color: #808030\">=<\/span> MPI<span style=\"color: #808030\">.<\/span>COMM_WORLD\r\nrank <span style=\"color: #808030\">=<\/span> comm<span style=\"color: #808030\">.<\/span>Get_rank<span style=\"color: #808030\">(<\/span><span style=\"color: #808030\">)<\/span>\r\n\r\n<span style=\"color: #800000;font-weight: bold\">if<\/span> rank <span style=\"color: #44aadd\">==<\/span> <span style=\"color: #008c00\">0<\/span><span style=\"color: #808030\">:<\/span>\r\n    data <span style=\"color: #808030\">=<\/span> np<span style=\"color: 
#808030\">.<\/span>arange<span style=\"color: #808030\">(<\/span><span style=\"color: #008000\">4.0<\/span><span style=\"color: #808030\">)<\/span>\r\n<span style=\"color: #800000;font-weight: bold\">else<\/span><span style=\"color: #808030\">:<\/span>\r\n    data <span style=\"color: #808030\">=<\/span> np<span style=\"color: #808030\">.<\/span>zeros<span style=\"color: #808030\">(<\/span><span style=\"color: #008c00\">4<\/span><span style=\"color: #808030\">)<\/span>\r\n\r\ncomm<span style=\"color: #808030\">.<\/span>Bcast<span style=\"color: #808030\">(<\/span>data<span style=\"color: #808030\">,<\/span> root<span style=\"color: #808030\">=<\/span><span style=\"color: #008c00\">0<\/span><span style=\"color: #808030\">)<\/span>\r\n\r\n<span style=\"color: #800000;font-weight: bold\">print<\/span><span style=\"color: #808030\">(<\/span><span style=\"color: #0000e6\">'Process {} has data:'<\/span><span style=\"color: #808030\">.<\/span>format<span style=\"color: #808030\">(<\/span>rank<span style=\"color: #808030\">)<\/span><span style=\"color: #808030\">,<\/span> data<span style=\"color: #808030\">)<\/span>\r\n<\/pre>\n<\/div>\n<\/div>\n<p>The output from running the above code (broadcast.py) on 4 processes is as follows:<\/p>\n<div class=\"highlight-default\">\n<div class=\"highlight\">\n<pre style=\"color: #000000;background: #ffffff\">$ srun -n 4 python3 broadcast.py\r\nProcess 0 has data: [0. 1. 2. 3.]\r\nProcess 1 has data: [0. 1. 2. 3.]\r\nProcess 2 has data: [0. 1. 2. 3.]\r\nProcess 3 has data: [0. 1. 2. 3.]<\/pre>\n<\/div>\n<\/div>\n<p>Another collective communication method is <code>Scatter<\/code>, which sends slices of a large array to the worker processes. In <a href=\"https:\/\/www.kth.se\/blogs\/pdc\/2019\/08\/parallel-programming-in-python-mpi4py-part-1\/\">part 1 of this post<\/a>, we showed that the all-lowercase <code>scatter<\/code> method is convenient when sending slices of Python objects. 
However, <code>Scatter<\/code> is not as convenient in practice since it requires the size of the large array to be divisible by the number of processes. For instance, if we know beforehand that the array we are going to distribute has 16 elements, it will be straightforward to use <code>Scatter<\/code> to distribute the array on 4 processes in such a way that each process gets 4 elements. The problem is that, in practice, the size of the array is not known beforehand and is therefore not guaranteed to be divisible by the number of available processes. It is more practical to use <code>Scatterv<\/code>, the vector version of <code>Scatter<\/code>, which offers a much more flexible way to distribute the array. The code below distributes 15 numbers over 4 processes. Note that we use \u201c<code>[sendbuf, count, displ, MPI.DOUBLE]<\/code>\u201d to specify the buffer-like object, where <code>count<\/code> contains the number of elements to be sent to each process and <code>displ<\/code> contains the starting indices of the sub-tasks. 
(These indices are often known as displacements.)<\/p>\n<div class=\"highlight-default\">\n<div class=\"highlight\">\n<pre style=\"color: #000000;background: #ffffff\"><span style=\"color: #800000;font-weight: bold\">from<\/span> mpi4py <span style=\"color: #800000;font-weight: bold\">import<\/span> MPI\r\n<span style=\"color: #800000;font-weight: bold\">import<\/span> numpy <span style=\"color: #800000;font-weight: bold\">as<\/span> np\r\n\r\ncomm <span style=\"color: #808030\">=<\/span> MPI<span style=\"color: #808030\">.<\/span>COMM_WORLD\r\nrank <span style=\"color: #808030\">=<\/span> comm<span style=\"color: #808030\">.<\/span>Get_rank<span style=\"color: #808030\">(<\/span><span style=\"color: #808030\">)<\/span>\r\nnprocs <span style=\"color: #808030\">=<\/span> comm<span style=\"color: #808030\">.<\/span>Get_size<span style=\"color: #808030\">(<\/span><span style=\"color: #808030\">)<\/span>\r\n\r\n<span style=\"color: #800000;font-weight: bold\">if<\/span> rank <span style=\"color: #44aadd\">==<\/span> <span style=\"color: #008c00\">0<\/span><span style=\"color: #808030\">:<\/span>\r\n    sendbuf <span style=\"color: #808030\">=<\/span> np<span style=\"color: #808030\">.<\/span>arange<span style=\"color: #808030\">(<\/span><span style=\"color: #008000\">15.0<\/span><span style=\"color: #808030\">)<\/span>\r\n\r\n    <span style=\"color: #696969\"># count: the size of each sub-task<\/span>\r\n    ave<span style=\"color: #808030\">,<\/span> res <span style=\"color: #808030\">=<\/span> <span style=\"color: #400000\">divmod<\/span><span style=\"color: #808030\">(<\/span>sendbuf<span style=\"color: #808030\">.<\/span>size<span style=\"color: #808030\">,<\/span> nprocs<span style=\"color: #808030\">)<\/span>\r\n    count <span style=\"color: #808030\">=<\/span> <span style=\"color: #808030\">[<\/span>ave <span style=\"color: #44aadd\">+<\/span> <span style=\"color: #008c00\">1<\/span> <span style=\"color: #800000;font-weight: bold\">if<\/span> p <span 
style=\"color: #44aadd\">&lt;<\/span> res <span style=\"color: #800000;font-weight: bold\">else<\/span> ave <span style=\"color: #800000;font-weight: bold\">for<\/span> p <span style=\"color: #800000;font-weight: bold\">in<\/span> <span style=\"color: #400000\">range<\/span><span style=\"color: #808030\">(<\/span>nprocs<span style=\"color: #808030\">)<\/span><span style=\"color: #808030\">]<\/span>\r\n    count <span style=\"color: #808030\">=<\/span> np<span style=\"color: #808030\">.<\/span>array<span style=\"color: #808030\">(<\/span>count<span style=\"color: #808030\">)<\/span>\r\n\r\n    <span style=\"color: #696969\"># displacement: the starting index of each sub-task<\/span>\r\n    displ <span style=\"color: #808030\">=<\/span> <span style=\"color: #808030\">[<\/span><span style=\"color: #400000\">sum<\/span><span style=\"color: #808030\">(<\/span>count<span style=\"color: #808030\">[<\/span><span style=\"color: #808030\">:<\/span>p<span style=\"color: #808030\">]<\/span><span style=\"color: #808030\">)<\/span> <span style=\"color: #800000;font-weight: bold\">for<\/span> p <span style=\"color: #800000;font-weight: bold\">in<\/span> <span style=\"color: #400000\">range<\/span><span style=\"color: #808030\">(<\/span>nprocs<span style=\"color: #808030\">)<\/span><span style=\"color: #808030\">]<\/span>\r\n    displ <span style=\"color: #808030\">=<\/span> np<span style=\"color: #808030\">.<\/span>array<span style=\"color: #808030\">(<\/span>displ<span style=\"color: #808030\">)<\/span>\r\n<span style=\"color: #800000;font-weight: bold\">else<\/span><span style=\"color: #808030\">:<\/span>\r\n    sendbuf <span style=\"color: #808030\">=<\/span> <span style=\"color: #074726\">None<\/span>\r\n    <span style=\"color: #696969\"># initialize count on worker processes<\/span>\r\n    count <span style=\"color: #808030\">=<\/span> np<span style=\"color: #808030\">.<\/span>zeros<span style=\"color: #808030\">(<\/span>nprocs<span style=\"color: #808030\">,<\/span> 
dtype<span style=\"color: #808030\">=<\/span><span style=\"color: #400000\">int<\/span><span style=\"color: #808030\">)<\/span>\r\n    displ <span style=\"color: #808030\">=<\/span> <span style=\"color: #074726\">None<\/span>\r\n\r\n<span style=\"color: #696969\"># broadcast count<\/span>\r\ncomm<span style=\"color: #808030\">.<\/span>Bcast<span style=\"color: #808030\">(<\/span>count<span style=\"color: #808030\">,<\/span> root<span style=\"color: #808030\">=<\/span><span style=\"color: #008c00\">0<\/span><span style=\"color: #808030\">)<\/span>\r\n\r\n<span style=\"color: #696969\"># initialize recvbuf on all processes<\/span>\r\nrecvbuf <span style=\"color: #808030\">=<\/span> np<span style=\"color: #808030\">.<\/span>zeros<span style=\"color: #808030\">(<\/span>count<span style=\"color: #808030\">[<\/span>rank<span style=\"color: #808030\">]<\/span><span style=\"color: #808030\">)<\/span>\r\n\r\ncomm<span style=\"color: #808030\">.<\/span>Scatterv<span style=\"color: #808030\">(<\/span><span style=\"color: #808030\">[<\/span>sendbuf<span style=\"color: #808030\">,<\/span> count<span style=\"color: #808030\">,<\/span> displ<span style=\"color: #808030\">,<\/span> MPI<span style=\"color: #808030\">.<\/span>DOUBLE<span style=\"color: #808030\">]<\/span><span style=\"color: #808030\">,<\/span> recvbuf<span style=\"color: #808030\">,<\/span> root<span style=\"color: #808030\">=<\/span><span style=\"color: #008c00\">0<\/span><span style=\"color: #808030\">)<\/span>\r\n\r\n<span style=\"color: #800000;font-weight: bold\">print<\/span><span style=\"color: #808030\">(<\/span><span style=\"color: #0000e6\">'After Scatterv, process {} has data:'<\/span><span style=\"color: #808030\">.<\/span>format<span style=\"color: #808030\">(<\/span>rank<span style=\"color: #808030\">)<\/span><span style=\"color: #808030\">,<\/span> recvbuf<span style=\"color: #808030\">)<\/span>\r\n<\/pre>\n<\/div>\n<\/div>\n<p>The output from running the 
above code on 4 processes is as follows. Note that, as there are 15 elements in total, each of the first three processes received 4 elements, and the last process received 3 elements.<\/p>\n<div class=\"highlight-default\">\n<div class=\"highlight\">\n<pre style=\"color: #000000;background: #ffffff\">$ srun -n 4 python3 scatter-array.py\r\nAfter Scatterv, process 0 has data: [0. 1. 2. 3.]\r\nAfter Scatterv, process 1 has data: [4. 5. 6. 7.]\r\nAfter Scatterv, process 2 has data: [8. 9. 10. 11.]\r\nAfter Scatterv, process 3 has data: [12. 13. 14.]<\/pre>\n<\/div>\n<\/div>\n<p><code>Gatherv<\/code> is the reverse operation of <code>Scatterv<\/code>. When using <code>Gatherv<\/code>, one needs to specify the receiving buffer as \u201c<code>[recvbuf2, count, displ, MPI.DOUBLE]<\/code>\u201d, as shown in the following code. The <code>sendbuf2<\/code> arrays will be gathered into a large array <code>recvbuf2<\/code> on the master process.<\/p>\n<div class=\"highlight-default\">\n<div class=\"highlight\">\n<pre style=\"color: #000000;background: #ffffff\">sendbuf2 <span style=\"color: #808030\">=<\/span> recvbuf\r\nrecvbuf2 <span style=\"color: #808030\">=<\/span> np<span style=\"color: #808030\">.<\/span>zeros<span style=\"color: #808030\">(<\/span><span style=\"color: #400000\">sum<\/span><span style=\"color: #808030\">(<\/span>count<span style=\"color: #808030\">)<\/span><span style=\"color: #808030\">)<\/span>\r\ncomm<span style=\"color: #808030\">.<\/span>Gatherv<span style=\"color: #808030\">(<\/span>sendbuf2<span style=\"color: #808030\">,<\/span> <span style=\"color: #808030\">[<\/span>recvbuf2<span style=\"color: #808030\">,<\/span> count<span style=\"color: #808030\">,<\/span> displ<span style=\"color: #808030\">,<\/span> MPI<span style=\"color: #808030\">.<\/span>DOUBLE<span style=\"color: #808030\">]<\/span><span style=\"color: #808030\">,<\/span> root<span style=\"color: #808030\">=<\/span><span style=\"color: #008c00\">0<\/span><span style=\"color: 
#808030\">)<\/span>\r\n\r\n<span style=\"color: #800000;font-weight: bold\">if<\/span> comm<span style=\"color: #808030\">.<\/span>Get_rank<span style=\"color: #808030\">(<\/span><span style=\"color: #808030\">)<\/span> <span style=\"color: #44aadd\">==<\/span> <span style=\"color: #008c00\">0<\/span><span style=\"color: #808030\">:<\/span>\r\n    <span style=\"color: #800000;font-weight: bold\">print<\/span><span style=\"color: #808030\">(<\/span><span style=\"color: #0000e6\">'After Gatherv, process 0 has data:'<\/span><span style=\"color: #808030\">,<\/span> recvbuf2<span style=\"color: #808030\">)<\/span>\r\n<\/pre>\n<\/div>\n<\/div>\n<p>The output from running the <code>Gatherv<\/code> code on 4 processes is as follows.<\/p>\n<div class=\"highlight-default\">\n<div class=\"highlight\">\n<pre style=\"color: #000000;background: #ffffff\">$ srun -n 4 python3 scatter-gather.py\r\nAfter Gatherv, process 0 has data: [ 0.  1.  2.  3.  4.  5.  6.  7.  8.  9. 10. 11. 12. 13. 14.]<\/pre>\n<\/div>\n<\/div>\n<p><code>Reduce<\/code>, on the other hand, can be used to compute the sum of (or to perform another operation on) the data collected from all processes. 
For example, after calling <code>Scatterv<\/code>, we can compute the sum of the numbers in <code>recvbuf<\/code> on each process, and then call <code>Reduce<\/code> to add all of those partial contributions and store the result on the master process.<\/p>\n<div class=\"highlight-default\">\n<div class=\"highlight\">\n<pre style=\"color: #000000;background: #ffffff\">partial_sum <span style=\"color: #808030\">=<\/span> np<span style=\"color: #808030\">.<\/span>zeros<span style=\"color: #808030\">(<\/span><span style=\"color: #008c00\">1<\/span><span style=\"color: #808030\">)<\/span>\r\npartial_sum<span style=\"color: #808030\">[<\/span><span style=\"color: #008c00\">0<\/span><span style=\"color: #808030\">]<\/span> <span style=\"color: #808030\">=<\/span> <span style=\"color: #400000\">sum<\/span><span style=\"color: #808030\">(<\/span>recvbuf<span style=\"color: #808030\">)<\/span>\r\n<span style=\"color: #800000;font-weight: bold\">print<\/span><span style=\"color: #808030\">(<\/span><span style=\"color: #0000e6\">'Partial sum on process {} is:'<\/span><span style=\"color: #808030\">.<\/span>format<span style=\"color: #808030\">(<\/span>rank<span style=\"color: #808030\">)<\/span><span style=\"color: #808030\">,<\/span> partial_sum<span style=\"color: #808030\">[<\/span><span style=\"color: #008c00\">0<\/span><span style=\"color: #808030\">]<\/span><span style=\"color: #808030\">)<\/span>\r\n\r\ntotal_sum <span style=\"color: #808030\">=<\/span> np<span style=\"color: #808030\">.<\/span>zeros<span style=\"color: #808030\">(<\/span><span style=\"color: #008c00\">1<\/span><span style=\"color: #808030\">)<\/span>\r\ncomm<span style=\"color: #808030\">.<\/span><span style=\"color: #400000\">Reduce<\/span><span style=\"color: #808030\">(<\/span>partial_sum<span style=\"color: #808030\">,<\/span> total_sum<span style=\"color: #808030\">,<\/span> op<span style=\"color: #808030\">=<\/span>MPI<span style=\"color: #808030\">.<\/span><span style=\"color: 
#400000\">SUM<\/span><span style=\"color: #808030\">,<\/span> root<span style=\"color: #808030\">=<\/span><span style=\"color: #008c00\">0<\/span><span style=\"color: #808030\">)<\/span>\r\n<span style=\"color: #800000;font-weight: bold\">if<\/span> comm<span style=\"color: #808030\">.<\/span>Get_rank<span style=\"color: #808030\">(<\/span><span style=\"color: #808030\">)<\/span> <span style=\"color: #44aadd\">==<\/span> <span style=\"color: #008c00\">0<\/span><span style=\"color: #808030\">:<\/span>\r\n    <span style=\"color: #800000;font-weight: bold\">print<\/span><span style=\"color: #808030\">(<\/span><span style=\"color: #0000e6\">'After Reduce, total sum on process 0 is:'<\/span><span style=\"color: #808030\">,<\/span> total_sum<span style=\"color: #808030\">[<\/span><span style=\"color: #008c00\">0<\/span><span style=\"color: #808030\">]<\/span><span style=\"color: #808030\">)<\/span>\r\n<\/pre>\n<\/div>\n<\/div>\n<p>The output from running the <code>Reduce<\/code> code on 4 processes is as follows.<\/p>\n<div class=\"highlight-default\">\n<div class=\"highlight\">\n<pre style=\"color: #000000;background: #ffffff\">$ srun -n 4 python3 scatter-reduce.py\r\nPartial sum on process 0 is: 6.0\r\nPartial sum on process 1 is: 22.0\r\nPartial sum on process 2 is: 38.0\r\nPartial sum on process 3 is: 39.0\r\nAfter Reduce, total sum on process 0 is: 105.0<\/pre>\n<\/div>\n<\/div>\n<h2>Summary<\/h2>\n<p>We have shown how to directly communicate buffer-like objects using the <code>mpi4py<\/code> module and its methods that start with an uppercase letter. 
The communication of buffer-like objects is faster, but less flexible, than the communication of Python objects.<\/p>\n<p>You may read the <a href=\"https:\/\/mpi4py.readthedocs.io\/en\/stable\/tutorial.html\">MPI for Python tutorial page<\/a> for more information about the <code>mpi4py<\/code> module.<\/p>\n<\/div>\n<\/div>","protected":false},"excerpt":{"rendered":"<p>In part 1 of this post, we introduced the mpi4py module (MPI for Python) which provides an object-oriented interface for Python resembling the message passing interface (MPI) and enables Python programs to exploit multiple processors on multiple compute nodes. The mpi4py module provides methods for communicating various types of Python objects in different ways. In [&hellip;]<\/p>\n","protected":false},"author":1140,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"inline_featured_image":false,"jetpack_post_was_ever_published":false,"footnotes":""},"categories":[5],"tags":[9,19],"class_list":["post-484","post","type-post","status-publish","format-standard","hentry","category-performance","tag-parallelization","tag-python"],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/p9W9Im-7O","_links":{"self":[{"href":"https:\/\/www.kth.se\/blogs\/pdc\/wp-json\/wp\/v2\/posts\/484","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.kth.se\/blogs\/pdc\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.kth.se\/blogs\/pdc\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.kth.se\/blogs\/pdc\/wp-json\/wp\/v2\/users\/1140"}],"replies":[{"embeddable":true,"href":"https:\/\/www.kth.se\/blogs\/pdc\/wp-json\/wp\/v2\/comments?post=484"}],"version-history":[{"count":33,"href":"https:\/\/www.kth.se\/blogs\/pdc\/wp-json\/wp\/v2\/posts\/484\/revisions"}],"predecessor-version":[{
"id":543,"href":"https:\/\/www.kth.se\/blogs\/pdc\/wp-json\/wp\/v2\/posts\/484\/revisions\/543"}],"wp:attachment":[{"href":"https:\/\/www.kth.se\/blogs\/pdc\/wp-json\/wp\/v2\/media?parent=484"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.kth.se\/blogs\/pdc\/wp-json\/wp\/v2\/categories?post=484"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.kth.se\/blogs\/pdc\/wp-json\/wp\/v2\/tags?post=484"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}