1365 lines
39 KiB
Plaintext
1365 lines
39 KiB
Plaintext
|
\input texinfo @c -*-texinfo-*-
|
||
|
|
||
|
@c %**start of header
|
||
|
@setfilename libgomp.info
|
||
|
@settitle GNU libgomp
|
||
|
@c %**end of header
|
||
|
|
||
|
|
||
|
@copying
|
||
|
Copyright @copyright{} 2006 Free Software Foundation, Inc.
|
||
|
|
||
|
Permission is granted to copy, distribute and/or modify this document
|
||
|
under the terms of the GNU Free Documentation License, Version 1.1 or
|
||
|
any later version published by the Free Software Foundation; with the
|
||
|
Invariant Sections being ``GNU General Public License'' and ``Funding
|
||
|
Free Software'', the Front-Cover
|
||
|
texts being (a) (see below), and with the Back-Cover Texts being (b)
|
||
|
(see below). A copy of the license is included in the section entitled
|
||
|
``GNU Free Documentation License''.
|
||
|
|
||
|
(a) The FSF's Front-Cover Text is:
|
||
|
|
||
|
A GNU Manual
|
||
|
|
||
|
(b) The FSF's Back-Cover Text is:
|
||
|
|
||
|
You have freedom to copy and modify this GNU Manual, like GNU
|
||
|
software. Copies published by the Free Software Foundation raise
|
||
|
funds for GNU development.
|
||
|
@end copying
|
||
|
|
||
|
@ifinfo
|
||
|
@dircategory GNU Libraries
|
||
|
@direntry
|
||
|
* libgomp: (libgomp). GNU OpenMP runtime library
|
||
|
@end direntry
|
||
|
|
||
|
This manual documents the GNU implementation of the OpenMP API for
|
||
|
multi-platform shared-memory parallel programming in C/C++ and Fortran.
|
||
|
|
||
|
Published by the Free Software Foundation
|
||
|
51 Franklin Street, Fifth Floor
|
||
|
Boston, MA 02110-1301 USA
|
||
|
|
||
|
@insertcopying
|
||
|
@end ifinfo
|
||
|
|
||
|
|
||
|
@setchapternewpage odd
|
||
|
|
||
|
@titlepage
|
||
|
@title The GNU OpenMP Implementation
|
||
|
@page
|
||
|
@vskip 0pt plus 1filll
|
||
|
@comment For the @value{version-GCC} Version*
|
||
|
@sp 1
|
||
|
Published by the Free Software Foundation @*
|
||
|
51 Franklin Street, Fifth Floor@*
|
||
|
Boston, MA 02110-1301, USA@*
|
||
|
@sp 1
|
||
|
@insertcopying
|
||
|
@end titlepage
|
||
|
|
||
|
@summarycontents
|
||
|
@contents
|
||
|
@page
|
||
|
|
||
|
|
||
|
@node Top
|
||
|
@top Introduction
|
||
|
@cindex Introduction
|
||
|
|
||
|
This manual documents the usage of libgomp, the GNU implementation of the
|
||
|
@uref{http://www.openmp.org, OpenMP} Application Programming Interface (API)
|
||
|
for multi-platform shared-memory parallel programming in C/C++ and Fortran.
|
||
|
|
||
|
|
||
|
|
||
|
@comment
|
||
|
@comment When you add a new menu item, please keep the right hand
|
||
|
@comment aligned to the same column. Do not use tabs. This provides
|
||
|
@comment better formatting.
|
||
|
@comment
|
||
|
@menu
|
||
|
* Enabling OpenMP:: How to enable OpenMP for your applications.
|
||
|
* Runtime Library Routines:: The OpenMP runtime application programming
|
||
|
interface.
|
||
|
* Environment Variables:: Influencing runtime behavior with environment
|
||
|
variables.
|
||
|
* The libgomp ABI:: Notes on the external ABI presented by libgomp.
|
||
|
* Reporting Bugs:: How to report bugs in GNU OpenMP.
|
||
|
* Copying:: GNU general public license says
|
||
|
how you can copy and share libgomp.
|
||
|
* GNU Free Documentation License::
|
||
|
How you can copy and share this manual.
|
||
|
* Funding:: How to help assure continued work for free
|
||
|
software.
|
||
|
* Index:: Index of this documentation.
|
||
|
@end menu
|
||
|
|
||
|
|
||
|
@c ---------------------------------------------------------------------
|
||
|
@c Enabling OpenMP
|
||
|
@c ---------------------------------------------------------------------
|
||
|
|
||
|
@node Enabling OpenMP
|
||
|
@chapter Enabling OpenMP
|
||
|
|
||
|
To activate the OpenMP extensions for C/C++ and Fortran, the compile-time
|
||
|
flag @command{-fopenmp} must be specified. This enables the OpenMP directive
|
||
|
@code{#pragma omp} in C/C++ and @code{!$omp} directives in free form,
|
||
|
@code{c$omp}, @code{*$omp} and @code{!$omp} directives in fixed form,
|
||
|
@code{!$} conditional compilation sentinels in free form and @code{c$},
|
||
|
@code{*$} and @code{!$} sentinels in fixed form, for Fortran. The flag also
|
||
|
arranges for automatic linking of the OpenMP runtime library
|
||
|
(@ref{Runtime Library Routines}).
|
||
|
|
||
|
A complete description of all OpenMP directives accepted may be found in
|
||
|
the @uref{http://www.openmp.org, OpenMP Application Program Interface} manual,
|
||
|
version 2.5.
|
||
|
|
||
|
|
||
|
@c ---------------------------------------------------------------------
|
||
|
@c Runtime Library Routines
|
||
|
@c ---------------------------------------------------------------------
|
||
|
|
||
|
@node Runtime Library Routines
|
||
|
@chapter Runtime Library Routines
|
||
|
|
||
|
The runtime routines described here are defined by section 3 of the OpenMP
|
||
|
specifications in version 2.5.
|
||
|
|
||
|
Control threads, processors and the parallel environment.
|
||
|
|
||
|
@menu
|
||
|
* omp_get_dynamic:: Dynamic teams setting
|
||
|
* omp_get_max_threads:: Maximum number of threads
|
||
|
* omp_get_nested:: Nested parallel regions
|
||
|
* omp_get_num_procs:: Number of processors online
|
||
|
* omp_get_num_threads:: Size of the active team
|
||
|
* omp_get_thread_num:: Current thread ID
|
||
|
* omp_in_parallel:: Whether a parallel region is active
|
||
|
* omp_set_dynamic:: Enable/disable dynamic teams
|
||
|
* omp_set_nested:: Enable/disable nested parallel regions
|
||
|
* omp_set_num_threads:: Set upper team size limit
|
||
|
@end menu
|
||
|
|
||
|
Initialize, set, test, unset and destroy simple and nested locks.
|
||
|
|
||
|
@menu
|
||
|
* omp_init_lock:: Initialize simple lock
|
||
|
* omp_set_lock:: Wait for and set simple lock
|
||
|
* omp_test_lock:: Test and set simple lock if available
|
||
|
* omp_unset_lock:: Unset simple lock
|
||
|
* omp_destroy_lock:: Destroy simple lock
|
||
|
* omp_init_nest_lock:: Initialize nested lock
|
||
|
* omp_set_nest_lock:: Wait for and set simple lock
|
||
|
* omp_test_nest_lock:: Test and set nested lock if available
|
||
|
* omp_unset_nest_lock:: Unset nested lock
|
||
|
* omp_destroy_nest_lock:: Destroy nested lock
|
||
|
@end menu
|
||
|
|
||
|
Portable, thread-based, wall clock timer.
|
||
|
|
||
|
@menu
|
||
|
* omp_get_wtick:: Get timer precision.
|
||
|
* omp_get_wtime:: Elapsed wall clock time.
|
||
|
@end menu
|
||
|
|
||
|
@node omp_get_dynamic
|
||
|
@section @code{omp_get_dynamic} -- Dynamic teams setting
|
||
|
@table @asis
|
||
|
@item @emph{Description}:
|
||
|
This function returns @code{true} if enabled, @code{false} otherwise.
|
||
|
Here, @code{true} and @code{false} represent their language-specific
|
||
|
counterparts.
|
||
|
|
||
|
The dynamic team setting may be initialized at startup by the
|
||
|
@code{OMP_DYNAMIC} environment variable or at runtime using
|
||
|
@code{omp_set_dynamic}. If undefined, dynamic adjustment is
|
||
|
disabled by default.
|
||
|
|
||
|
@item @emph{C/C++}:
|
||
|
@multitable @columnfractions .20 .80
|
||
|
@item @emph{Prototype}: @tab @code{int omp_get_dynamic();}
|
||
|
@end multitable
|
||
|
|
||
|
@item @emph{Fortran}:
|
||
|
@multitable @columnfractions .20 .80
|
||
|
@item @emph{Interface}: @tab @code{logical function omp_get_dynamic()}
|
||
|
@end multitable
|
||
|
|
||
|
@item @emph{See also}:
|
||
|
@ref{omp_set_dynamic}, @ref{OMP_DYNAMIC}
|
||
|
|
||
|
@item @emph{Reference}:
|
||
|
@uref{http://www.openmp.org/, OpenMP specifications v2.5}, section 3.2.8.
|
||
|
@end table
|
||
|
|
||
|
|
||
|
|
||
|
@node omp_get_max_threads
|
||
|
@section @code{omp_get_max_threads} -- Maximum number of threads
|
||
|
@table @asis
|
||
|
@item @emph{Description}:
|
||
|
Return the maximum number of threads used for parallel regions that do
|
||
|
not use the clause @code{num_threads}.
|
||
|
|
||
|
@item @emph{C/C++}:
|
||
|
@multitable @columnfractions .20 .80
|
||
|
@item @emph{Prototype}: @tab @code{int omp_get_max_threads();}
|
||
|
@end multitable
|
||
|
|
||
|
@item @emph{Fortran}:
|
||
|
@multitable @columnfractions .20 .80
|
||
|
@item @emph{Interface}: @tab @code{integer function omp_get_max_threads()}
|
||
|
@end multitable
|
||
|
|
||
|
@item @emph{See also}:
|
||
|
@ref{omp_set_num_threads}, @ref{omp_set_dynamic}
|
||
|
|
||
|
@item @emph{Reference}:
|
||
|
@uref{http://www.openmp.org/, OpenMP specifications v2.5}, section 3.2.3.
|
||
|
@end table
|
||
|
|
||
|
|
||
|
|
||
|
@node omp_get_nested
|
||
|
@section @code{omp_get_nested} -- Nested parallel regions
|
||
|
@table @asis
|
||
|
@item @emph{Description}:
|
||
|
This function returns @code{true} if nested parallel regions are
|
||
|
enabled, @code{false} otherwise. Here, @code{true} and @code{false}
|
||
|
represent their language-specific counterparts.
|
||
|
|
||
|
Nested parallel regions may be initialized at startup by the
|
||
|
@code{OMP_NESTED} environment variable or at runtime using
|
||
|
@code{omp_set_nested}. If undefined, nested parallel regions are
|
||
|
disabled by default.
|
||
|
|
||
|
@item @emph{C/C++}:
|
||
|
@multitable @columnfractions .20 .80
|
||
|
@item @emph{Prototype}: @tab @code{int omp_get_nested();}
|
||
|
@end multitable
|
||
|
|
||
|
@item @emph{Fortran}:
|
||
|
@multitable @columnfractions .20 .80
|
||
|
@item @emph{Interface}: @tab @code{integer function omp_get_nested()}
|
||
|
@end multitable
|
||
|
|
||
|
@item @emph{See also}:
|
||
|
@ref{omp_set_nested}, @ref{OMP_NESTED}
|
||
|
|
||
|
@item @emph{Reference}:
|
||
|
@uref{http://www.openmp.org/, OpenMP specifications v2.5}, section 3.2.10.
|
||
|
@end table
|
||
|
|
||
|
|
||
|
|
||
|
@node omp_get_num_procs
|
||
|
@section @code{omp_get_num_procs} -- Number of processors online
|
||
|
@table @asis
|
||
|
@item @emph{Description}:
|
||
|
Returns the number of processors online.
|
||
|
|
||
|
@item @emph{C/C++}:
|
||
|
@multitable @columnfractions .20 .80
|
||
|
@item @emph{Prototype}: @tab @code{int omp_get_num_procs();}
|
||
|
@end multitable
|
||
|
|
||
|
@item @emph{Fortran}:
|
||
|
@multitable @columnfractions .20 .80
|
||
|
@item @emph{Interface}: @tab @code{integer function omp_get_num_procs()}
|
||
|
@end multitable
|
||
|
|
||
|
@item @emph{Reference}:
|
||
|
@uref{http://www.openmp.org/, OpenMP specifications v2.5}, section 3.2.5.
|
||
|
@end table
|
||
|
|
||
|
|
||
|
|
||
|
@node omp_get_num_threads
|
||
|
@section @code{omp_get_num_threads} -- Size of the active team
|
||
|
@table @asis
|
||
|
@item @emph{Description}:
|
||
|
The number of threads in the current team. In a sequential section of
|
||
|
the program @code{omp_get_num_threads} returns 1.
|
||
|
|
||
|
The default team size may be initialized at startup by the
|
||
|
@code{OMP_NUM_THREADS} environment variable. At runtime, the size
|
||
|
of the current team may be set either by the @code{NUM_THREADS}
|
||
|
clause or by @code{omp_set_num_threads}. If none of the above were
|
||
|
used to define a specific value and @code{OMP_DYNAMIC} is disabled,
|
||
|
one thread per CPU online is used.
|
||
|
|
||
|
@item @emph{C/C++}:
|
||
|
@multitable @columnfractions .20 .80
|
||
|
@item @emph{Prototype}: @tab @code{int omp_get_num_threads();}
|
||
|
@end multitable
|
||
|
|
||
|
@item @emph{Fortran}:
|
||
|
@multitable @columnfractions .20 .80
|
||
|
@item @emph{Interface}: @tab @code{integer function omp_get_num_threads()}
|
||
|
@end multitable
|
||
|
|
||
|
@item @emph{See also}:
|
||
|
@ref{omp_get_max_threads}, @ref{omp_set_num_threads}, @ref{OMP_NUM_THREADS}
|
||
|
|
||
|
@item @emph{Reference}:
|
||
|
@uref{http://www.openmp.org/, OpenMP specifications v2.5}, section 3.2.2.
|
||
|
@end table
|
||
|
|
||
|
|
||
|
|
||
|
@node omp_get_thread_num
|
||
|
@section @code{omp_get_thread_num} -- Current thread ID
|
||
|
@table @asis
|
||
|
@item @emph{Description}:
|
||
|
Unique thread identification number. In a sequential parts of the program,
|
||
|
@code{omp_get_thread_num} always returns 0. In parallel regions the return
|
||
|
value varies from 0 to @code{omp_get_max_threads}-1 inclusive. The return
|
||
|
value of the master thread of a team is always 0.
|
||
|
|
||
|
@item @emph{C/C++}:
|
||
|
@multitable @columnfractions .20 .80
|
||
|
@item @emph{Prototype}: @tab @code{int omp_get_thread_num();}
|
||
|
@end multitable
|
||
|
|
||
|
@item @emph{Fortran}:
|
||
|
@multitable @columnfractions .20 .80
|
||
|
@item @emph{Interface}: @tab @code{integer function omp_get_thread_num()}
|
||
|
@end multitable
|
||
|
|
||
|
@item @emph{See also}:
|
||
|
@ref{omp_get_max_threads}
|
||
|
|
||
|
@item @emph{Reference}:
|
||
|
@uref{http://www.openmp.org/, OpenMP specifications v2.5}, section 3.2.4.
|
||
|
@end table
|
||
|
|
||
|
|
||
|
|
||
|
@node omp_in_parallel
|
||
|
@section @code{omp_in_parallel} -- Whether a parallel region is active
|
||
|
@table @asis
|
||
|
@item @emph{Description}:
|
||
|
This function returns @code{true} if currently running in parallel,
|
||
|
@code{false} otherwise. Here, @code{true} and @code{false} represent
|
||
|
their language-specific counterparts.
|
||
|
|
||
|
@item @emph{C/C++}:
|
||
|
@multitable @columnfractions .20 .80
|
||
|
@item @emph{Prototype}: @tab @code{int omp_in_parallel();}
|
||
|
@end multitable
|
||
|
|
||
|
@item @emph{Fortran}:
|
||
|
@multitable @columnfractions .20 .80
|
||
|
@item @emph{Interface}: @tab @code{logical function omp_in_parallel()}
|
||
|
@end multitable
|
||
|
|
||
|
@item @emph{Reference}:
|
||
|
@uref{http://www.openmp.org/, OpenMP specifications v2.5}, section 3.2.6.
|
||
|
@end table
|
||
|
|
||
|
|
||
|
@node omp_set_dynamic
|
||
|
@section @code{omp_set_dynamic} -- Enable/disable dynamic teams
|
||
|
@table @asis
|
||
|
@item @emph{Description}:
|
||
|
Enable or disable the dynamic adjustment of the number of threads
|
||
|
within a team. The function takes the language-specific equivalent
|
||
|
of @code{true} and @code{false}, where @code{true} enables dynamic
|
||
|
adjustment of team sizes and @code{false} disables it.
|
||
|
|
||
|
@item @emph{C/C++}:
|
||
|
@multitable @columnfractions .20 .80
|
||
|
@item @emph{Prototype}: @tab @code{void omp_set_dynamic(int);}
|
||
|
@end multitable
|
||
|
|
||
|
@item @emph{Fortran}:
|
||
|
@multitable @columnfractions .20 .80
|
||
|
@item @emph{Interface}: @tab @code{subroutine omp_set_dynamic(set)}
|
||
|
@item @tab @code{integer, intent(in) :: set}
|
||
|
@end multitable
|
||
|
|
||
|
@item @emph{See also}:
|
||
|
@ref{OMP_DYNAMIC}, @ref{omp_get_dynamic}
|
||
|
|
||
|
@item @emph{Reference}:
|
||
|
@uref{http://www.openmp.org/, OpenMP specifications v2.5}, section 3.2.7.
|
||
|
@end table
|
||
|
|
||
|
|
||
|
|
||
|
@node omp_set_nested
|
||
|
@section @code{omp_set_nested} -- Enable/disable nested parallel regions
|
||
|
@table @asis
|
||
|
@item @emph{Description}:
|
||
|
Enable or disable nested parallel regions, i.e., whether team members
|
||
|
are allowed to create new teams. The function takes the language-specific
|
||
|
equivalent of @code{true} and @code{false}, where @code{true} enables
|
||
|
dynamic adjustment of team sizes and @code{false} disables it.
|
||
|
|
||
|
@item @emph{C/C++}:
|
||
|
@multitable @columnfractions .20 .80
|
||
|
@item @emph{Prototype}: @tab @code{void omp_set_dynamic(int);}
|
||
|
@end multitable
|
||
|
|
||
|
@item @emph{Fortran}:
|
||
|
@multitable @columnfractions .20 .80
|
||
|
@item @emph{Interface}: @tab @code{subroutine omp_set_dynamic(set)}
|
||
|
@item @tab @code{integer, intent(in) :: set}
|
||
|
@end multitable
|
||
|
|
||
|
@item @emph{See also}:
|
||
|
@ref{OMP_NESTED}, @ref{omp_get_nested}
|
||
|
|
||
|
@item @emph{Reference}:
|
||
|
@uref{http://www.openmp.org/, OpenMP specifications v2.5}, section 3.2.9.
|
||
|
@end table
|
||
|
|
||
|
|
||
|
|
||
|
@node omp_set_num_threads
|
||
|
@section @code{omp_set_num_threads} -- Set upper team size limit
|
||
|
@table @asis
|
||
|
@item @emph{Description}:
|
||
|
Specifies the number of threads used by default in subsequent parallel
|
||
|
sections, if those do not specify a @code{num_threads} clause. The
|
||
|
argument of @code{omp_set_num_threads} shall be a positive integer.
|
||
|
|
||
|
@item @emph{C/C++}:
|
||
|
@multitable @columnfractions .20 .80
|
||
|
@item @emph{Prototype}: @tab @code{void omp_set_num_threads(int);}
|
||
|
@end multitable
|
||
|
|
||
|
@item @emph{Fortran}:
|
||
|
@multitable @columnfractions .20 .80
|
||
|
@item @emph{Interface}: @tab @code{subroutine omp_set_num_threads(set)}
|
||
|
@item @tab @code{integer, intent(in) :: set}
|
||
|
@end multitable
|
||
|
|
||
|
@item @emph{See also}:
|
||
|
@ref{OMP_NUM_THREADS}, @ref{omp_get_num_threads}, @ref{omp_get_max_threads}
|
||
|
|
||
|
@item @emph{Reference}:
|
||
|
@uref{http://www.openmp.org/, OpenMP specifications v2.5}, section 3.2.1.
|
||
|
@end table
|
||
|
|
||
|
|
||
|
|
||
|
@node omp_init_lock
|
||
|
@section @code{omp_init_lock} -- Initialize simple lock
|
||
|
@table @asis
|
||
|
@item @emph{Description}:
|
||
|
Initialize a simple lock. After initialization, the lock is in
|
||
|
an unlocked state.
|
||
|
|
||
|
@item @emph{C/C++}:
|
||
|
@multitable @columnfractions .20 .80
|
||
|
@item @emph{Prototype}: @tab @code{void omp_init_lock(omp_lock_t *lock);}
|
||
|
@end multitable
|
||
|
|
||
|
@item @emph{Fortran}:
|
||
|
@multitable @columnfractions .20 .80
|
||
|
@item @emph{Interface}: @tab @code{subroutine omp_init_lock(lock)}
|
||
|
@item @tab @code{integer(omp_lock_kind), intent(out) :: lock}
|
||
|
@end multitable
|
||
|
|
||
|
@item @emph{See also}:
|
||
|
@ref{omp_destroy_lock}
|
||
|
|
||
|
@item @emph{Reference}:
|
||
|
@uref{http://www.openmp.org/, OpenMP specifications v2.5}, section 3.3.1.
|
||
|
@end table
|
||
|
|
||
|
|
||
|
|
||
|
@node omp_set_lock
|
||
|
@section @code{omp_set_lock} -- Wait for and set simple lock
|
||
|
@table @asis
|
||
|
@item @emph{Description}:
|
||
|
Before setting a simple lock, the lock variable must be initialized by
|
||
|
@code{omp_init_lock}. The calling thread is blocked until the lock
|
||
|
is available. If the lock is already held by the current thread,
|
||
|
a deadlock occurs.
|
||
|
|
||
|
@item @emph{C/C++}:
|
||
|
@multitable @columnfractions .20 .80
|
||
|
@item @emph{Prototype}: @tab @code{void omp_set_lock(omp_lock_t *lock);}
|
||
|
@end multitable
|
||
|
|
||
|
@item @emph{Fortran}:
|
||
|
@multitable @columnfractions .20 .80
|
||
|
@item @emph{Interface}: @tab @code{subroutine omp_set_lock(lock)}
|
||
|
@item @tab @code{integer(omp_lock_kind), intent(out) :: lock}
|
||
|
@end multitable
|
||
|
|
||
|
@item @emph{See also}:
|
||
|
@ref{omp_init_lock}, @ref{omp_test_lock}, @ref{omp_unset_lock}
|
||
|
|
||
|
@item @emph{Reference}:
|
||
|
@uref{http://www.openmp.org/, OpenMP specifications v2.5}, section 3.3.3.
|
||
|
@end table
|
||
|
|
||
|
|
||
|
|
||
|
@node omp_test_lock
|
||
|
@section @code{omp_test_lock} -- Test and set simple lock if available
|
||
|
@table @asis
|
||
|
@item @emph{Description}:
|
||
|
Before setting a simple lock, the lock variable must be initialized by
|
||
|
@code{omp_init_lock}. Contrary to @code{omp_set_lock}, @code{omp_test_lock}
|
||
|
does not block if the lock is not available. This function returns
|
||
|
@code{true} upon success,@code{false} otherwise. Here, @code{true} and
|
||
|
@code{false} represent their language-specific counterparts.
|
||
|
|
||
|
@item @emph{C/C++}:
|
||
|
@multitable @columnfractions .20 .80
|
||
|
@item @emph{Prototype}: @tab @code{int omp_test_lock(omp_lock_t *lock);}
|
||
|
@end multitable
|
||
|
|
||
|
@item @emph{Fortran}:
|
||
|
@multitable @columnfractions .20 .80
|
||
|
@item @emph{Interface}: @tab @code{subroutine omp_test_lock(lock)}
|
||
|
@item @tab @code{logical(omp_logical_kind) :: omp_test_lock}
|
||
|
@item @tab @code{integer(omp_lock_kind), intent(out) :: lock}
|
||
|
@end multitable
|
||
|
|
||
|
@item @emph{See also}:
|
||
|
@ref{omp_init_lock}, @ref{omp_set_lock}, @ref{omp_set_lock}
|
||
|
|
||
|
@item @emph{Reference}:
|
||
|
@uref{http://www.openmp.org/, OpenMP specifications v2.5}, section 3.3.5.
|
||
|
@end table
|
||
|
|
||
|
|
||
|
|
||
|
@node omp_unset_lock
|
||
|
@section @code{omp_unset_lock} -- Unset simple lock
|
||
|
@table @asis
|
||
|
@item @emph{Description}:
|
||
|
A simple lock about to be unset must have been locked by @code{omp_set_lock}
|
||
|
or @code{omp_test_lock} before. In addition, the lock must be held by the
|
||
|
thread calling @code{omp_unset_lock}. Then, the lock becomes unlocked. If one
|
||
|
ore more threads attempted to set the lock before, one of them is chosen to,
|
||
|
again, set the lock for itself.
|
||
|
|
||
|
@item @emph{C/C++}:
|
||
|
@multitable @columnfractions .20 .80
|
||
|
@item @emph{Prototype}: @tab @code{void omp_unset_lock(omp_lock_t *lock);}
|
||
|
@end multitable
|
||
|
|
||
|
@item @emph{Fortran}:
|
||
|
@multitable @columnfractions .20 .80
|
||
|
@item @emph{Interface}: @tab @code{subroutine omp_unset_lock(lock)}
|
||
|
@item @tab @code{integer(omp_lock_kind), intent(out) :: lock}
|
||
|
@end multitable
|
||
|
|
||
|
@item @emph{See also}:
|
||
|
@ref{omp_set_lock}, @ref{omp_test_lock}
|
||
|
|
||
|
@item @emph{Reference}:
|
||
|
@uref{http://www.openmp.org/, OpenMP specifications v2.5}, section 3.3.4.
|
||
|
@end table
|
||
|
|
||
|
|
||
|
|
||
|
@node omp_destroy_lock
|
||
|
@section @code{omp_destroy_lock} -- Destroy simple lock
|
||
|
@table @asis
|
||
|
@item @emph{Description}:
|
||
|
Destroy a simple lock. In order to be destroyed, a simple lock must be
|
||
|
in the unlocked state.
|
||
|
|
||
|
@item @emph{C/C++}:
|
||
|
@multitable @columnfractions .20 .80
|
||
|
@item @emph{Prototype}: @tab @code{void omp_destroy_lock(omp_lock_t *);}
|
||
|
@end multitable
|
||
|
|
||
|
@item @emph{Fortran}:
|
||
|
@multitable @columnfractions .20 .80
|
||
|
@item @emph{Interface}: @tab @code{subroutine omp_destroy_lock(lock)}
|
||
|
@item @tab @code{integer(omp_lock_kind), intent(inout) :: lock}
|
||
|
@end multitable
|
||
|
|
||
|
@item @emph{See also}:
|
||
|
@ref{omp_init_lock}
|
||
|
|
||
|
@item @emph{Reference}:
|
||
|
@uref{http://www.openmp.org/, OpenMP specifications v2.5}, section 3.3.2.
|
||
|
@end table
|
||
|
|
||
|
|
||
|
|
||
|
@node omp_init_nest_lock
|
||
|
@section @code{omp_init_nest_lock} -- Initialize nested lock
|
||
|
@table @asis
|
||
|
@item @emph{Description}:
|
||
|
Initialize a nested lock. After initialization, the lock is in
|
||
|
an unlocked state and the nesting count is set to zero.
|
||
|
|
||
|
@item @emph{C/C++}:
|
||
|
@multitable @columnfractions .20 .80
|
||
|
@item @emph{Prototype}: @tab @code{void omp_init_nest_lock(omp_nest_lock_t *lock);}
|
||
|
@end multitable
|
||
|
|
||
|
@item @emph{Fortran}:
|
||
|
@multitable @columnfractions .20 .80
|
||
|
@item @emph{Interface}: @tab @code{subroutine omp_init_nest_lock(lock)}
|
||
|
@item @tab @code{integer(omp_nest_lock_kind), intent(out) :: lock}
|
||
|
@end multitable
|
||
|
|
||
|
@item @emph{See also}:
|
||
|
@ref{omp_destroy_nest_lock}
|
||
|
|
||
|
@item @emph{Reference}:
|
||
|
@uref{http://www.openmp.org/, OpenMP specifications v2.5}, section 3.3.1.
|
||
|
@end table
|
||
|
|
||
|
|
||
|
@node omp_set_nest_lock
|
||
|
@section @code{omp_set_nest_lock} -- Wait for and set simple lock
|
||
|
@table @asis
|
||
|
@item @emph{Description}:
|
||
|
Before setting a nested lock, the lock variable must be initialized by
|
||
|
@code{omp_init_nest_lock}. The calling thread is blocked until the lock
|
||
|
is available. If the lock is already held by the current thread, the
|
||
|
nesting count for the lock in incremented.
|
||
|
|
||
|
@item @emph{C/C++}:
|
||
|
@multitable @columnfractions .20 .80
|
||
|
@item @emph{Prototype}: @tab @code{void omp_set_nest_lock(omp_nest_lock_t *lock);}
|
||
|
@end multitable
|
||
|
|
||
|
@item @emph{Fortran}:
|
||
|
@multitable @columnfractions .20 .80
|
||
|
@item @emph{Interface}: @tab @code{subroutine omp_set_nest_lock(lock)}
|
||
|
@item @tab @code{integer(omp_nest_lock_kind), intent(out) :: lock}
|
||
|
@end multitable
|
||
|
|
||
|
@item @emph{See also}:
|
||
|
@ref{omp_init_nest_lock}, @ref{omp_unset_nest_lock}
|
||
|
|
||
|
@item @emph{Reference}:
|
||
|
@uref{http://www.openmp.org/, OpenMP specifications v2.5}, section 3.3.3.
|
||
|
@end table
|
||
|
|
||
|
|
||
|
|
||
|
@node omp_test_nest_lock
|
||
|
@section @code{omp_test_nest_lock} -- Test and set nested lock if available
|
||
|
@table @asis
|
||
|
@item @emph{Description}:
|
||
|
Before setting a nested lock, the lock variable must be initialized by
|
||
|
@code{omp_init_nest_lock}. Contrary to @code{omp_set_nest_lock},
|
||
|
@code{omp_test_nest_lock} does not block if the lock is not available.
|
||
|
If the lock is already held by the current thread, the new nesting count
|
||
|
is returned. Otherwise, the return value equals zero.
|
||
|
|
||
|
@item @emph{C/C++}:
|
||
|
@multitable @columnfractions .20 .80
|
||
|
@item @emph{Prototype}: @tab @code{int omp_test_nest_lock(omp_nest_lock_t *lock);}
|
||
|
@end multitable
|
||
|
|
||
|
@item @emph{Fortran}:
|
||
|
@multitable @columnfractions .20 .80
|
||
|
@item @emph{Interface}: @tab @code{integer function omp_test_nest_lock(lock)}
|
||
|
@item @tab @code{integer(omp_integer_kind) :: omp_test_nest_lock}
|
||
|
@item @tab @code{integer(omp_nest_lock_kind), intent(inout) :: lock}
|
||
|
@end multitable
|
||
|
|
||
|
|
||
|
@item @emph{See also}:
|
||
|
@ref{omp_init_lock}, @ref{omp_set_lock}, @ref{omp_set_lock}
|
||
|
|
||
|
@item @emph{Reference}:
|
||
|
@uref{http://www.openmp.org/, OpenMP specifications v2.5}, section 3.3.5.
|
||
|
@end table
|
||
|
|
||
|
|
||
|
|
||
|
@node omp_unset_nest_lock
|
||
|
@section @code{omp_unset_nest_lock} -- Unset nested lock
|
||
|
@table @asis
|
||
|
@item @emph{Description}:
|
||
|
A nested lock about to be unset must have been locked by @code{omp_set_nested_lock}
|
||
|
or @code{omp_test_nested_lock} before. In addition, the lock must be held by the
|
||
|
thread calling @code{omp_unset_nested_lock}. If the nesting count drops to zero, the
|
||
|
lock becomes unlocked. If one ore more threads attempted to set the lock before,
|
||
|
one of them is chosen to, again, set the lock for itself.
|
||
|
|
||
|
@item @emph{C/C++}:
|
||
|
@multitable @columnfractions .20 .80
|
||
|
@item @emph{Prototype}: @tab @code{void omp_unset_nest_lock(omp_nest_lock_t *lock);}
|
||
|
@end multitable
|
||
|
|
||
|
@item @emph{Fortran}:
|
||
|
@multitable @columnfractions .20 .80
|
||
|
@item @emph{Interface}: @tab @code{subroutine omp_unset_nest_lock(lock)}
|
||
|
@item @tab @code{integer(omp_nest_lock_kind), intent(out) :: lock}
|
||
|
@end multitable
|
||
|
|
||
|
@item @emph{See also}:
|
||
|
@ref{omp_set_nest_lock}
|
||
|
|
||
|
@item @emph{Reference}:
|
||
|
@uref{http://www.openmp.org/, OpenMP specifications v2.5}, section 3.3.4.
|
||
|
@end table
|
||
|
|
||
|
|
||
|
|
||
|
@node omp_destroy_nest_lock
|
||
|
@section @code{omp_destroy_nest_lock} -- Destroy nested lock
|
||
|
@table @asis
|
||
|
@item @emph{Description}:
|
||
|
Destroy a nested lock. In order to be destroyed, a nested lock must be
|
||
|
in the unlocked state and its nesting count must equal zero.
|
||
|
|
||
|
@item @emph{C/C++}:
|
||
|
@multitable @columnfractions .20 .80
|
||
|
@item @emph{Prototype}: @tab @code{void omp_destroy_nest_lock(omp_nest_lock_t *);}
|
||
|
@end multitable
|
||
|
|
||
|
@item @emph{Fortran}:
|
||
|
@multitable @columnfractions .20 .80
|
||
|
@item @emph{Interface}: @tab @code{subroutine omp_destroy_nest_lock(lock)}
|
||
|
@item @tab @code{integer(omp_nest_lock_kind), intent(inout) :: lock}
|
||
|
@end multitable
|
||
|
|
||
|
@item @emph{See also}:
|
||
|
@ref{omp_init_lock}
|
||
|
|
||
|
@item @emph{Reference}:
|
||
|
@uref{http://www.openmp.org/, OpenMP specifications v2.5}, section 3.3.2.
|
||
|
@end table
|
||
|
|
||
|
|
||
|
|
||
|
@node omp_get_wtick
|
||
|
@section @code{omp_get_wtick} -- Get timer precision
|
||
|
@table @asis
|
||
|
@item @emph{Description}:
|
||
|
Gets the timer precision, i.e., the number of seconds between two
|
||
|
successive clock ticks.
|
||
|
|
||
|
@item @emph{C/C++}:
|
||
|
@multitable @columnfractions .20 .80
|
||
|
@item @emph{Prototype}: @tab @code{double omp_get_wtick();}
|
||
|
@end multitable
|
||
|
|
||
|
@item @emph{Fortran}:
|
||
|
@multitable @columnfractions .20 .80
|
||
|
@item @emph{Interface}: @tab @code{double precision function omp_get_wtick()}
|
||
|
@end multitable
|
||
|
|
||
|
@item @emph{See also}:
|
||
|
@ref{omp_get_wtime}
|
||
|
|
||
|
@item @emph{Reference}:
|
||
|
@uref{http://www.openmp.org/, OpenMP specifications v2.5}, section 3.4.2.
|
||
|
@end table
|
||
|
|
||
|
|
||
|
|
||
|
@node omp_get_wtime
|
||
|
@section @code{omp_get_wtime} -- Elapsed wall clock time
|
||
|
@table @asis
|
||
|
@item @emph{Description}:
|
||
|
Elapsed wall clock time in seconds. The time is measured per thread, no
|
||
|
guarantee can bee made that two distinct threads measure the same time.
|
||
|
Time is measured from some "time in the past". On POSIX compliant systems
|
||
|
the seconds since the Epoch (00:00:00 UTC, January 1, 1970) are returned.
|
||
|
|
||
|
@item @emph{C/C++}:
|
||
|
@multitable @columnfractions .20 .80
|
||
|
@item @emph{Prototype}: @tab @code{double omp_get_wtime();}
|
||
|
@end multitable
|
||
|
|
||
|
@item @emph{Fortran}:
|
||
|
@multitable @columnfractions .20 .80
|
||
|
@item @emph{Interface}: @tab @code{double precision function omp_get_wtime()}
|
||
|
@end multitable
|
||
|
|
||
|
@item @emph{See also}:
|
||
|
@ref{omp_get_wtick}
|
||
|
|
||
|
@item @emph{Reference}:
|
||
|
@uref{http://www.openmp.org/, OpenMP specifications v2.5}, section 3.4.1.
|
||
|
@end table
|
||
|
|
||
|
|
||
|
|
||
|
@c ---------------------------------------------------------------------
|
||
|
@c Environment Variables
|
||
|
@c ---------------------------------------------------------------------
|
||
|
|
||
|
@node Environment Variables
|
||
|
@chapter Environment Variables
|
||
|
|
||
|
The variables @env{OMP_DYNAMIC}, @env{OMP_NESTED}, @env{OMP_NUM_THREADS} and
|
||
|
@env{OMP_SCHEDULE} are defined by section 4 of the OpenMP specifications in
|
||
|
version 2.5, while @env{GOMP_CPU_AFFINITY} and @env{GOMP_STACKSIZE} are GNU
|
||
|
extensions.
|
||
|
|
||
|
@menu
|
||
|
* OMP_DYNAMIC:: Dynamic adjustment of threads
|
||
|
* OMP_NESTED:: Nested parallel regions
|
||
|
* OMP_NUM_THREADS:: Specifies the number of threads to use
|
||
|
* OMP_SCHEDULE:: How threads are scheduled
|
||
|
* GOMP_CPU_AFFINITY:: Bind threads to specific CPUs
|
||
|
* GOMP_STACKSIZE:: Set default thread stack size
|
||
|
@end menu
|
||
|
|
||
|
|
||
|
@node OMP_DYNAMIC
|
||
|
@section @env{OMP_DYNAMIC} -- Dynamic adjustment of threads
|
||
|
@cindex Environment Variable
|
||
|
@cindex Implementation specific setting
|
||
|
@table @asis
|
||
|
@item @emph{Description}:
|
||
|
Enable or disable the dynamic adjustment of the number of threads
|
||
|
within a team. The value of this environment variable shall be
|
||
|
@code{TRUE} or @code{FALSE}. If undefined, dynamic adjustment is
|
||
|
disabled by default.
|
||
|
|
||
|
@item @emph{See also}:
|
||
|
@ref{omp_set_dynamic}
|
||
|
|
||
|
@item @emph{Reference}:
|
||
|
@uref{http://www.openmp.org/, OpenMP specifications v2.5}, section 4.3
|
||
|
@end table
|
||
|
|
||
|
|
||
|
|
||
|
@node OMP_NESTED
|
||
|
@section @env{OMP_NESTED} -- Nested parallel regions
|
||
|
@cindex Environment Variable
|
||
|
@cindex Implementation specific setting
|
||
|
@table @asis
|
||
|
@item @emph{Description}:
|
||
|
Enable or disable nested parallel regions, i.e., whether team members
|
||
|
are allowed to create new teams. The value of this environment variable
|
||
|
shall be @code{TRUE} or @code{FALSE}. If undefined, nested parallel
|
||
|
regions are disabled by default.
|
||
|
|
||
|
@item @emph{See also}:
|
||
|
@ref{omp_set_nested}
|
||
|
|
||
|
@item @emph{Reference}:
|
||
|
@uref{http://www.openmp.org/, OpenMP specifications v2.5}, section 4.4
|
||
|
@end table
|
||
|
|
||
|
|
||
|
|
||
|
@node OMP_NUM_THREADS
|
||
|
@section @env{OMP_NUM_THREADS} -- Specifies the number of threads to use
|
||
|
@cindex Environment Variable
|
||
|
@cindex Implementation specific setting
|
||
|
@table @asis
|
||
|
@item @emph{Description}:
|
||
|
Specifies the default number of threads to use in parallel regions. The
|
||
|
value of this variable shall be positive integer. If undefined one thread
|
||
|
per CPU online is used.
|
||
|
|
||
|
@item @emph{See also}:
|
||
|
@ref{omp_set_num_threads}
|
||
|
|
||
|
@item @emph{Reference}:
|
||
|
@uref{http://www.openmp.org/, OpenMP specifications v2.5}, section 4.2
|
||
|
@end table
|
||
|
|
||
|
|
||
|
|
||
|
@node OMP_SCHEDULE
|
||
|
@section @env{OMP_SCHEDULE} -- How threads are scheduled
|
||
|
@cindex Environment Variable
|
||
|
@cindex Implementation specific setting
|
||
|
@table @asis
|
||
|
@item @emph{Description}:
|
||
|
Allows to specify @code{schedule type} and @code{chunk size}.
|
||
|
The value of the variable shall have the form: @code{type[,chunk]} where
|
||
|
@code{type} is one of @code{static}, @code{dynamic} or @code{guided}.
|
||
|
The optional @code{chunk size} shall be a positive integer. If undefined,
|
||
|
dynamic scheduling and a chunk size of 1 is used.
|
||
|
|
||
|
@item @emph{Reference}:
|
||
|
@uref{http://www.openmp.org/, OpenMP specifications v2.5}, sections 2.5.1 and 4.1
|
||
|
@end table
|
||
|
|
||
|
|
||
|
|
||
|
@node GOMP_CPU_AFFINITY
|
||
|
@section @env{GOMP_CPU_AFFINITY} -- Bind threads to specific CPUs
|
||
|
@cindex Environment Variable
|
||
|
@table @asis
|
||
|
@item @emph{Description}:
|
||
|
A patch for this extension has been submitted, but was not yet applied at the
|
||
|
time of writing.
|
||
|
|
||
|
@item @emph{Reference}:
|
||
|
@uref{http://gcc.gnu.org/ml/gcc-patches/2006-05/msg00982.html,
|
||
|
GCC Patches Mailinglist}
|
||
|
@uref{http://gcc.gnu.org/ml/gcc-patches/2006-05/msg01133.html,
|
||
|
GCC Patches Mailinglist}
|
||
|
@end table
|
||
|
|
||
|
|
||
|
|
||
|
@node GOMP_STACKSIZE
|
||
|
@section @env{GOMP_STACKSIZE} -- Set default thread stack size
|
||
|
@cindex Environment Variable
|
||
|
@cindex Implementation specific setting
|
||
|
@table @asis
|
||
|
@item @emph{Description}:
|
||
|
Set the default thread stack size in kilobytes. This is in opposition
|
||
|
to @code{pthread_attr_setstacksize} which gets the number of bytes as an
|
||
|
argument. If the stacksize can not be set due to system constraints, an
|
||
|
error is reported and the initial stacksize is left unchanged. If undefined,
|
||
|
the stack size is system dependent.
|
||
|
|
||
|
@item @emph{Reference}:
|
||
|
@uref{http://gcc.gnu.org/ml/gcc-patches/2006-06/msg00493.html,
|
||
|
GCC Patches Mailinglist},
|
||
|
@uref{http://gcc.gnu.org/ml/gcc-patches/2006-06/msg00496.html,
|
||
|
GCC Patches Mailinglist}
|
||
|
@end table
|
||
|
|
||
|
|
||
|
|
||
|
@c ---------------------------------------------------------------------
|
||
|
@c The libgomp ABI
|
||
|
@c ---------------------------------------------------------------------
|
||
|
|
||
|
@node The libgomp ABI
|
||
|
@chapter The libgomp ABI
|
||
|
|
||
|
The following sections present notes on the external ABI as
|
||
|
presented by libgomp. Only maintainers should need them.
|
||
|
|
||
|
@menu
|
||
|
* Implementing MASTER construct::
|
||
|
* Implementing CRITICAL construct::
|
||
|
* Implementing ATOMIC construct::
|
||
|
* Implementing FLUSH construct::
|
||
|
* Implementing BARRIER construct::
|
||
|
* Implementing THREADPRIVATE construct::
|
||
|
* Implementing PRIVATE clause::
|
||
|
* Implementing FIRSTPRIVATE LASTPRIVATE COPYIN and COPYPRIVATE clauses::
|
||
|
* Implementing REDUCTION clause::
|
||
|
* Implementing PARALLEL construct::
|
||
|
* Implementing FOR construct::
|
||
|
* Implementing ORDERED construct::
|
||
|
* Implementing SECTIONS construct::
|
||
|
* Implementing SINGLE construct::
|
||
|
@end menu
|
||
|
|
||
|
|
||
|
@node Implementing MASTER construct
|
||
|
@section Implementing MASTER construct
|
||
|
|
||
|
@smallexample
|
||
|
if (omp_get_thread_num () == 0)
|
||
|
block
|
||
|
@end smallexample
|
||
|
|
||
|
Alternately, we generate two copies of the parallel subfunction
|
||
|
and only include this in the version run by the master thread.
|
||
|
Surely that's not worthwhile though...
|
||
|
|
||
|
|
||
|
|
||
|
@node Implementing CRITICAL construct
|
||
|
@section Implementing CRITICAL construct
|
||
|
|
||
|
Without a specified name,
|
||
|
|
||
|
@smallexample
|
||
|
void GOMP_critical_start (void);
|
||
|
void GOMP_critical_end (void);
|
||
|
@end smallexample
|
||
|
|
||
|
so that we don't get COPY relocations from libgomp to the main
|
||
|
application.
|
||
|
|
||
|
With a specified name, use omp_set_lock and omp_unset_lock with
|
||
|
name being transformed into a variable declared like
|
||
|
|
||
|
@smallexample
|
||
|
omp_lock_t gomp_critical_user_<name> __attribute__((common))
|
||
|
@end smallexample
|
||
|
|
||
|
Ideally the ABI would specify that all zero is a valid unlocked
|
||
|
state, and so we wouldn't actually need to initialize this at
|
||
|
startup.
|
||
|
|
||
|
|
||
|
|
||
|
@node Implementing ATOMIC construct
|
||
|
@section Implementing ATOMIC construct
|
||
|
|
||
|
The target should implement the @code{__sync} builtins.
|
||
|
|
||
|
Failing that we could add
|
||
|
|
||
|
@smallexample
|
||
|
void GOMP_atomic_enter (void)
|
||
|
void GOMP_atomic_exit (void)
|
||
|
@end smallexample
|
||
|
|
||
|
which reuses the regular lock code, but with yet another lock
|
||
|
object private to the library.
|
||
|
|
||
|
|
||
|
|
||
|
@node Implementing FLUSH construct
|
||
|
@section Implementing FLUSH construct
|
||
|
|
||
|
Expands to the @code{__sync_synchronize} builtin.
|
||
|
|
||
|
|
||
|
|
||
|
@node Implementing BARRIER construct
|
||
|
@section Implementing BARRIER construct
|
||
|
|
||
|
@smallexample
|
||
|
void GOMP_barrier (void)
|
||
|
@end smallexample
|
||
|
|
||
|
|
||
|
@node Implementing THREADPRIVATE construct
|
||
|
@section Implementing THREADPRIVATE construct
|
||
|
|
||
|
In _most_ cases we can map this directly to @code{__thread}. Except
|
||
|
that OMP allows constructors for C++ objects. We can either
|
||
|
refuse to support this (how often is it used?) or we can
|
||
|
implement something akin to .ctors.
|
||
|
|
||
|
Even more ideally, this ctor feature is handled by extensions
|
||
|
to the main pthreads library. Failing that, we can have a set
|
||
|
of entry points to register ctor functions to be called.
|
||
|
|
||
|
|
||
|
|
||
|
@node Implementing PRIVATE clause
|
||
|
@section Implementing PRIVATE clause
|
||
|
|
||
|
In association with a PARALLEL, or within the lexical extent
|
||
|
of a PARALLEL block, the variable becomes a local variable in
|
||
|
the parallel subfunction.
|
||
|
|
||
|
In association with FOR or SECTIONS blocks, create a new
|
||
|
automatic variable within the current function. This preserves
|
||
|
the semantic of new variable creation.
|
||
|
|
||
|
|
||
|
|
||
|
@node Implementing FIRSTPRIVATE LASTPRIVATE COPYIN and COPYPRIVATE clauses
|
||
|
@section Implementing FIRSTPRIVATE LASTPRIVATE COPYIN and COPYPRIVATE clauses
|
||
|
|
||
|
Seems simple enough for PARALLEL blocks. Create a private
|
||
|
struct for communicating between parent and subfunction.
|
||
|
In the parent, copy in values for scalar and "small" structs;
|
||
|
copy in addresses for others TREE_ADDRESSABLE types. In the
|
||
|
subfunction, copy the value into the local variable.
|
||
|
|
||
|
Not clear at all what to do with bare FOR or SECTION blocks.
|
||
|
The only thing I can figure is that we do something like
|
||
|
|
||
|
@smallexample
|
||
|
#pragma omp for firstprivate(x) lastprivate(y)
|
||
|
for (int i = 0; i < n; ++i)
|
||
|
body;
|
||
|
@end smallexample
|
||
|
|
||
|
which becomes
|
||
|
|
||
|
@smallexample
|
||
|
@{
|
||
|
int x = x, y;
|
||
|
|
||
|
// for stuff
|
||
|
|
||
|
if (i == n)
|
||
|
y = y;
|
||
|
@}
|
||
|
@end smallexample
|
||
|
|
||
|
where the "x=x" and "y=y" assignments actually have different
|
||
|
uids for the two variables, i.e. not something you could write
|
||
|
directly in C. Presumably this only makes sense if the "outer"
|
||
|
x and y are global variables.
|
||
|
|
||
|
COPYPRIVATE would work the same way, except the structure
|
||
|
broadcast would have to happen via SINGLE machinery instead.
|
||
|
|
||
|
|
||
|
|
||
|
@node Implementing REDUCTION clause
|
||
|
@section Implementing REDUCTION clause
|
||
|
|
||
|
The private struct mentioned in the previous section should have
|
||
|
a pointer to an array of the type of the variable, indexed by the
|
||
|
thread's @var{team_id}. The thread stores its final value into the
|
||
|
array, and after the barrier the master thread iterates over the
|
||
|
array to collect the values.
|
||
|
|
||
|
|
||
|
@node Implementing PARALLEL construct
|
||
|
@section Implementing PARALLEL construct
|
||
|
|
||
|
@smallexample
|
||
|
#pragma omp parallel
|
||
|
@{
|
||
|
body;
|
||
|
@}
|
||
|
@end smallexample
|
||
|
|
||
|
becomes
|
||
|
|
||
|
@smallexample
|
||
|
void subfunction (void *data)
|
||
|
@{
|
||
|
use data;
|
||
|
body;
|
||
|
@}
|
||
|
|
||
|
setup data;
|
||
|
GOMP_parallel_start (subfunction, &data, num_threads);
|
||
|
subfunction (&data);
|
||
|
GOMP_parallel_end ();
|
||
|
@end smallexample
|
||
|
|
||
|
@smallexample
|
||
|
void GOMP_parallel_start (void (*fn)(void *), void *data, unsigned num_threads)
|
||
|
@end smallexample
|
||
|
|
||
|
The @var{FN} argument is the subfunction to be run in parallel.
|
||
|
|
||
|
The @var{DATA} argument is a pointer to a structure used to
|
||
|
communicate data in and out of the subfunction, as discussed
|
||
|
above with respect to FIRSTPRIVATE et al.
|
||
|
|
||
|
The @var{NUM_THREADS} argument is 1 if an IF clause is present
|
||
|
and false, or the value of the NUM_THREADS clause, if
|
||
|
present, or 0.
|
||
|
|
||
|
The function needs to create the appropriate number of
|
||
|
threads and/or launch them from the dock. It needs to
|
||
|
create the team structure and assign team ids.
|
||
|
|
||
|
@smallexample
|
||
|
void GOMP_parallel_end (void)
|
||
|
@end smallexample
|
||
|
|
||
|
Tears down the team and returns us to the previous @code{omp_in_parallel()} state.
|
||
|
|
||
|
|
||
|
|
||
|
@node Implementing FOR construct
|
||
|
@section Implementing FOR construct
|
||
|
|
||
|
@smallexample
|
||
|
#pragma omp parallel for
|
||
|
for (i = lb; i <= ub; i++)
|
||
|
body;
|
||
|
@end smallexample
|
||
|
|
||
|
becomes
|
||
|
|
||
|
@smallexample
|
||
|
void subfunction (void *data)
|
||
|
@{
|
||
|
long _s0, _e0;
|
||
|
while (GOMP_loop_static_next (&_s0, &_e0))
|
||
|
@{
|
||
|
long _e1 = _e0, i;
|
||
|
for (i = _s0; i < _e1; i++)
|
||
|
body;
|
||
|
@}
|
||
|
GOMP_loop_end_nowait ();
|
||
|
@}
|
||
|
|
||
|
GOMP_parallel_loop_static (subfunction, NULL, 0, lb, ub+1, 1, 0);
|
||
|
subfunction (NULL);
|
||
|
GOMP_parallel_end ();
|
||
|
@end smallexample
|
||
|
|
||
|
@smallexample
|
||
|
#pragma omp for schedule(runtime)
|
||
|
for (i = 0; i < n; i++)
|
||
|
body;
|
||
|
@end smallexample
|
||
|
|
||
|
becomes
|
||
|
|
||
|
@smallexample
|
||
|
@{
|
||
|
long i, _s0, _e0;
|
||
|
if (GOMP_loop_runtime_start (0, n, 1, &_s0, &_e0))
|
||
|
do @{
|
||
|
long _e1 = _e0;
|
||
|
for (i = _s0, i < _e0; i++)
|
||
|
body;
|
||
|
@} while (GOMP_loop_runtime_next (&_s0, _&e0));
|
||
|
GOMP_loop_end ();
|
||
|
@}
|
||
|
@end smallexample
|
||
|
|
||
|
Note that while it looks like there is trickyness to propagating
|
||
|
a non-constant STEP, there isn't really. We're explicitly allowed
|
||
|
to evaluate it as many times as we want, and any variables involved
|
||
|
should automatically be handled as PRIVATE or SHARED like any other
|
||
|
variables. So the expression should remain evaluable in the
|
||
|
subfunction. We can also pull it into a local variable if we like,
|
||
|
but since its supposed to remain unchanged, we can also not if we like.
|
||
|
|
||
|
If we have SCHEDULE(STATIC), and no ORDERED, then we ought to be
|
||
|
able to get away with no work-sharing context at all, since we can
|
||
|
simply perform the arithmetic directly in each thread to divide up
|
||
|
the iterations. Which would mean that we wouldn't need to call any
|
||
|
of these routines.
|
||
|
|
||
|
There are separate routines for handling loops with an ORDERED
|
||
|
clause. Bookkeeping for that is non-trivial...
|
||
|
|
||
|
|
||
|
|
||
|
@node Implementing ORDERED construct
|
||
|
@section Implementing ORDERED construct
|
||
|
|
||
|
@smallexample
|
||
|
void GOMP_ordered_start (void)
|
||
|
void GOMP_ordered_end (void)
|
||
|
@end smallexample
|
||
|
|
||
|
|
||
|
|
||
|
@node Implementing SECTIONS construct
|
||
|
@section Implementing SECTIONS construct
|
||
|
|
||
|
A block as
|
||
|
|
||
|
@smallexample
|
||
|
#pragma omp sections
|
||
|
@{
|
||
|
#pragma omp section
|
||
|
stmt1;
|
||
|
#pragma omp section
|
||
|
stmt2;
|
||
|
#pragma omp section
|
||
|
stmt3;
|
||
|
@}
|
||
|
@end smallexample
|
||
|
|
||
|
becomes
|
||
|
|
||
|
@smallexample
|
||
|
for (i = GOMP_sections_start (3); i != 0; i = GOMP_sections_next ())
|
||
|
switch (i)
|
||
|
@{
|
||
|
case 1:
|
||
|
stmt1;
|
||
|
break;
|
||
|
case 2:
|
||
|
stmt2;
|
||
|
break;
|
||
|
case 3:
|
||
|
stmt3;
|
||
|
break;
|
||
|
@}
|
||
|
GOMP_barrier ();
|
||
|
@end smallexample
|
||
|
|
||
|
|
||
|
@node Implementing SINGLE construct
|
||
|
@section Implementing SINGLE construct
|
||
|
|
||
|
A block like
|
||
|
|
||
|
@smallexample
|
||
|
#pragma omp single
|
||
|
@{
|
||
|
body;
|
||
|
@}
|
||
|
@end smallexample
|
||
|
|
||
|
becomes
|
||
|
|
||
|
@smallexample
|
||
|
if (GOMP_single_start ())
|
||
|
body;
|
||
|
GOMP_barrier ();
|
||
|
@end smallexample
|
||
|
|
||
|
while
|
||
|
|
||
|
@smallexample
|
||
|
#pragma omp single copyprivate(x)
|
||
|
body;
|
||
|
@end smallexample
|
||
|
|
||
|
becomes
|
||
|
|
||
|
@smallexample
|
||
|
datap = GOMP_single_copy_start ();
|
||
|
if (datap == NULL)
|
||
|
@{
|
||
|
body;
|
||
|
data.x = x;
|
||
|
GOMP_single_copy_end (&data);
|
||
|
@}
|
||
|
else
|
||
|
x = datap->x;
|
||
|
GOMP_barrier ();
|
||
|
@end smallexample
|
||
|
|
||
|
|
||
|
|
||
|
@c ---------------------------------------------------------------------
|
||
|
@c
|
||
|
@c ---------------------------------------------------------------------
|
||
|
|
||
|
@node Reporting Bugs
|
||
|
@chapter Reporting Bugs
|
||
|
|
||
|
Bugs in the GNU OpenMP implementation should be reported via
|
||
|
@uref{http://gcc.gnu.org/bugzilla/, bugzilla}. In all cases, please add
|
||
|
"openmp" to the keywords field in the bug report.
|
||
|
|
||
|
|
||
|
|
||
|
@c ---------------------------------------------------------------------
|
||
|
@c GNU General Public License
|
||
|
@c ---------------------------------------------------------------------
|
||
|
|
||
|
@include gpl.texi
|
||
|
|
||
|
|
||
|
|
||
|
@c ---------------------------------------------------------------------
|
||
|
@c GNU Free Documentation License
|
||
|
@c ---------------------------------------------------------------------
|
||
|
|
||
|
@include fdl.texi
|
||
|
|
||
|
|
||
|
|
||
|
@c ---------------------------------------------------------------------
|
||
|
@c Funding Free Software
|
||
|
@c ---------------------------------------------------------------------
|
||
|
|
||
|
@include funding.texi
|
||
|
|
||
|
@c ---------------------------------------------------------------------
|
||
|
@c Index
|
||
|
@c ---------------------------------------------------------------------
|
||
|
|
||
|
@node Index
|
||
|
@unnumbered Index
|
||
|
|
||
|
@printindex cp
|
||
|
|
||
|
@bye
|