Parallel Programming with OpenMP
1 Parallel Programming with OpenMP
2 Outline
3 Setting the State
4 Overview of Parallel Programming
5 Writing a Parallel Application
6 Parallel Architectures
7 Parallel Programming Models
8 Parallel Computing: Hardware
9 Parallel Computing: Software
10 Introduction to OpenMP
11 The OpenMP Programming Model: master/slave, regions
12 The OpenMP Programming Model: directives
13 The OpenMP Programming Model: number of threads, compiling
14 How do Threads Interact?
15 The Basics of OpenMP
16 General Syntax Rules: directives
17 General Syntax Rules: structured blocks
18 The Parallel Region: Fortran
19 The Parallel Region: C/C++
20 The Parallel Region: thread rules
21 The Parallel Region: thread number
22 Parallel Regions: Execution Modes
23 OpenMP Directive Clauses
24 The private, default and if clauses
25 firstprivate
26 OpenMP Work-Sharing Constructs: overview
27 OpenMP Work-Sharing Constructs - for/DO
28 Detailed syntax - for
29 Detailed syntax - DO
30 The schedule(type,[chunk]) clause
31 schedule(static)
32 schedule(static,chunk)
33 schedule(dynamic,chunk)
34 schedule(guided,chunk)
35 schedule(runtime)
36 lastprivate
37 reduction(operator|intrinsic:var1[,var2])
38 reduction(operator|intrinsic:var1[,var2])
39 OpenMP Work-Sharing Constructs - sections
40 OpenMP Work-Sharing Constructs - sections: Fortran
41 OpenMP Work-Sharing Constructs - sections: C
42 OpenMP Work-Sharing Constructs - single
43 OpenMP Work-Sharing Constructs - single: Fortran
44 OpenMP Work-Sharing Constructs - single: C
45 Combined Parallel Work-Sharing Constructs
46 Parallel DO/for Directive
47 Parallel sections Directive
48 OpenMP Environment Variables
49 OpenMP Environment Variables: examples
50 OpenMP Runtime Environment Routines
51 OpenMP Runtime Environment Routines
52 OpenMP Runtime Environment Routines: rules
53 Interlude: Data Dependencies
54 Data Dependencies (cont.)
55 Data Dependencies (cont.)
56 Data Dependencies (cont.)
57 Data Dependencies (cont.)
58 Minimizing the Cost of a Recurrence
59 Loop Nest Parallelization Possibilities
60 Loop Nest Parallelization Possibilities
61 OpenMP Synchronization Constructs
62 OpenMP Synchronization - critical Section
63 OpenMP Synchronization - critical Section Example
64 OpenMP - Critical Section Problem
65 OpenMP Synchronization - atomic Update
66 OpenMP Synchronization - atomic Update: example
67 OpenMP Synchronization - barrier
68 OpenMP Synchronization - master Region
69 OpenMP Synchronization - master Region: example
70 OpenMP Synchronization - ordered Region
71 OpenMP Synchronization - ordered Region: example
72 OpenMP Synchronization - flush Directive
73 Sample Program: flush Directive
74 Sample Program: flush Directive (cont.)
75 Some Advanced Features of OpenMP
76 Advanced Data Scoping - threadprivate Directive (Fortran)
77 Advanced Data Scoping - threadprivate Directive (C/C++)
78 Sample Program: threadprivate
79 Initializing threadprivate Variables - The copyin Clause
80 Sample Program: The copyin Clause
81 "Orphaning" OpenMP Directives
82 Sample Program: Orphaned parallel do
83 Sample Program: Output
84 Advanced Synchronization: Lock Functions (C/C++)
85 Advanced Synchronization: Lock Functions (Fortran)
86 Lock Functions: Example
87 Debugging OpenMP Code
88 Debugging OpenMP Code
89 Examples: Race Conditions
90 Examples: Race Conditions
91 Examples: Race Conditions
92 Examples: Deadlock
93 Examples: Deadlock
94 Other Danger Zones
95 Performance Tuning and OpenMP
96 Basic Strategies
97 Automatic Parallelization
98 Strategy for Using Automatic Parallelization
99 Performance Tuning: Example 1
100 Example 1 (cont.)
101 Example 1 - Parallel Version
102 Performance Tuning: Example 2
103 The Memory Hierarchy
104 Performance Tuning - Cache Locality
105 Performance Tuning - Data Locality
106 SGI Origin 2000: Basic Architecture
107 SGI Origin 2000: Basic Architecture
108 SGI Origin 2000: Interconnection Network Performance
109 Data Placement Techniques - "First-Touch Policy"
110 Example: First Touch Policy
111 Incorrect use of First-touch Policy
112 The Future of OpenMP
113 Current and Future Releases
114 What's Coming in OpenMP 2.0 (F2.0 rel 11/00)
115 References
116 OpenMP Problem Set (1-3)
117 OpenMP Problem Set (4-6)
118 OpenMP Problem Set (7-9)
119 Appendix A: Auto-Parallelization on the SGI Origin 2000
120 Using the Auto-Parallelizer
121 Auto-Parallelizer Files
122 Example Subroutine (successful)
123 Example Subroutine (successful) (cont.)
124 Data Dependence Example - Indirect Indexing
125 Data Dependence Example - Function Call
126 Appendix B: Data Placement Features - SGI Origin 2000
127 Automatic Page Migration
128 SGI Extensions to OpenMP: Data Distribution Directives (F90 syntax)
129 Regular Data Distributions (cont.)
130 Types of Distributions
131 Data Distribution Examples (4 threads)
132 The ONTO Clause: Data Distribution
133 Data Affinity Clause
134 page_place Directive
135 On whose memory did my array element end up?
136 Reshaping Data Distributions
137 Reshape Sample Program
138 Reshape Sample Program: Output
139 The price you pay...