[Cin] OpenMP

Andrew Randrianasulu randrianasulu at gmail.com
Wed Mar 11 01:17:13 CET 2020

Hi, all!

Currently I'm experimenting with OpenMP.


Support in different compilers
GCC (GNU Compiler Collection) has supported OpenMP 4.5 since version 6.1, OpenMP 4.0 since version 4.9, OpenMP 3.1 since version 4.7, OpenMP 3.0 since version 4.4, and OpenMP 2.5 since version 4.2. Add the command-line option -fopenmp to enable it. OpenMP offloading is supported for Intel MIC targets (Intel Xeon Phi KNL + emulation) since version 5.1, and for NVIDIA (NVPTX) targets since version 7 or so.
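A small sketch for checking whether -fopenmp actually took effect in a given build. It relies on the standard _OPENMP macro, which holds the release date (yyyymm) of the supported specification: 200505 for 2.5, 200805 for 3.0, 201107 for 3.1, 201307 for 4.0, 201511 for 4.5. The function names openmp_version_date and report_openmp are made up for this example.

```c
#include <stdio.h>

/* Returns the OpenMP specification date (yyyymm) the compiler reports,
   or 0 when the file was compiled without OpenMP (e.g. no -fopenmp). */
long openmp_version_date(void)
{
#ifdef _OPENMP
    return _OPENMP;
#else
    return 0;
#endif
}

/* Prints one line describing the result of the check above. */
void report_openmp(void)
{
    long v = openmp_version_date();
    if (v)
        printf("OpenMP enabled, _OPENMP = %ld\n", v);
    else
        printf("OpenMP disabled (compile with -fopenmp)\n");
}
```

Calling report_openmp() from main() in a build made with gcc -fopenmp should print the date matching the GCC versions listed above; without the flag the pragmas are ignored and the function reports that OpenMP is disabled.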


The syntax
 All OpenMP constructs in C and C++ are indicated with a #pragma omp followed by parameters, ending in a newline. The pragma usually applies only to the statement immediately following it, except for the barrier and flush commands, which do not have associated statements. 
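To illustrate the difference between a pragma that applies to the next statement and a stand-alone one, here is a rough sketch using barrier. The function barrier_demo, the MAX_THREADS bound and the team size of 4 are assumptions made for the example. The code also compiles without -fopenmp, in which case it simply runs single-threaded.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

#define MAX_THREADS 64  /* illustrative bound, larger than the team requested below */

/* Each thread fills its own slot, waits at the barrier, then checks its
   neighbour's slot. Returns 1 if every thread saw the neighbour's value. */
int barrier_demo(void)
{
    int slot[MAX_THREADS] = {0};
    int bad = 0;

    /* This pragma applies to the block that follows it. */
    #pragma omp parallel num_threads(4) reduction(+:bad)
    {
        int id = 0, n = 1, next;
#ifdef _OPENMP
        id = omp_get_thread_num();
        n = omp_get_num_threads();
#endif
        slot[id] = id + 1;

        /* Stand-alone directive: no associated statement. Every thread
           waits here until the whole team has written its slot. */
        #pragma omp barrier

        next = (id + 1) % n;
        if (slot[next] != next + 1)
            bad++;
    }
    return bad == 0;
}
```

Without the barrier, a thread could read its neighbour's slot before it was written, so the check could fail at random.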

The parallel construct
 The parallel construct starts a parallel block. It creates a team of N threads (where N is determined at runtime, usually from the number of CPU cores, but may be affected by a few things), all of which execute the next statement (or the next block, if the statement is a {…} -enclosure). After the statement, the threads join back into one. 

  #pragma omp parallel
  { // Code inside this region runs in parallel.
    printf("Hello!\n");
  }

 This code creates a team of threads, and each thread executes the same code. It prints the text "Hello!" followed by a newline, as many times as there are threads in the team created. For a dual-core system, it will output the text twice. (Note: It may also output something like "HeHlellolo", depending on the system, because the printing happens in parallel.) At the }, the threads are joined back into one, as in a non-threaded program. 
Internally, GCC implements this by creating a magic function and moving the associated code into that function, so that all the variables declared within that block become local variables of that function (and thus local to each thread).

 ICC, on the other hand, uses a mechanism resembling fork(), and does not create a magic function. Both implementations are, of course, valid, and semantically identical. 
Variables shared from the context are handled transparently, sometimes by passing a reference and sometimes by using register variables which are flushed at the end of the parallel block (or whenever a flush is executed).
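The distinction between per-thread locals and variables shared from the context can be sketched as follows. The function count_team_members is a made-up name, and the atomic update is one possible way to make the shared write safe, not something the quoted text prescribes.

```c
/* Counts how many threads executed the parallel region by letting each
   thread add its own private value to a variable shared by the team. */
int count_team_members(void)
{
    int shared_count = 0;      /* declared outside the block: shared */

    #pragma omp parallel
    {
        int mine = 1;          /* declared inside the block: one copy per thread */

        /* Without protection, concurrent updates of shared_count could be lost. */
        #pragma omp atomic
        shared_count += mine;
    }
    /* The threads have joined here, so the shared value is complete. */
    return shared_count;
}
```

With -fopenmp the result equals the team size (typically the number of cores, or whatever OMP_NUM_THREADS is set to); without it the pragmas are ignored and the function returns 1.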
--quote end---

Multicore Image Processing with OpenMP
Greg Slabaugh, Richard Boyes, Xiaoyun Yang

OpenMP by Rob Bateman

 OpenMP is an open standard that lets you easily make use of multi-threaded processors. It's currently supported by the following compilers: Visual C++, gcc (though not the Win32 version that comes with cygwin), Xcode, and the Intel compiler; and it's supported on the following platforms: Win32, Linux, MacOS, XBox360*, and PS3*.
 * Not amazingly well on those platforms
--quote end--

I used the bcast2000 example, namely bcast/overlayframe.C

and these CFLAGS:

CFLAGS = -O3 -fpermissive -fomit-frame-pointer -march=pentium3 -ffast-math -mfpmath=both -fopenmp -I/usr/local/include
I also enabled linking with libgomp (gcc 5.5.0) by adding -lgomp to bcast-2000c/bcast/Makefile

So far it makes the code slower :}

but it uses all processors :} unlike the original code
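For readers without the attachment, the general idea of spreading a per-row pixel loop over all cores can be sketched like this. It is not the code from the diff or from overlayframe.C: blend_rows, its parameters and the single 8-bit channel layout are assumptions for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: blends src over dst with a constant opacity in
   0..255, one 8-bit sample per pixel, width * height samples per buffer. */
void blend_rows(uint8_t *dst, const uint8_t *src,
                int width, int height, int opacity)
{
    int y;

    /* Rows are independent, so each thread gets its own contiguous share
       of them. The loop variable y is made private automatically. */
    #pragma omp parallel for schedule(static)
    for (y = 0; y < height; y++) {
        uint8_t *d = dst + (size_t)y * width;
        const uint8_t *s = src + (size_t)y * width;
        int x;
        for (x = 0; x < width; x++)
            d[x] = (uint8_t)((s[x] * opacity + d[x] * (255 - opacity)) / 255);
    }
}
```

One possible reason for a slowdown with this kind of change is that starting and synchronizing the thread team on every call can cost more than the loop itself when the per-call work is small; whether that applies here I have not measured.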
-------------- next part --------------
A non-text attachment was scrubbed...
Name: openmp_overlay.C.diff
Type: text/x-diff
Size: 6014 bytes
Desc: not available
URL: <https://lists.cinelerra-gg.org/pipermail/cin/attachments/20200311/1e43b703/attachment.bin>
