Parallel Programming with OpenMP

1 Parallel Programming with OpenMP

2 Outline

3 Setting the Stage

4 Overview of Parallel Programming

5 Writing a Parallel Application

6 Parallel Architectures

7 Parallel Programming Models

8 Parallel Computing: Hardware

9 Parallel Computing: Software

10 Introduction to OpenMP

11 The OpenMP Programming Model: master/slave, regions

12 The OpenMP Programming Model: directives

13 The OpenMP Programming Model: number of threads, compiling

14 How do Threads Interact?

15 The Basics of OpenMP

16 General Syntax Rules: directives

17 General Syntax Rules: structured blocks

18 The Parallel Region: Fortran

19 The Parallel Region: C/C++

20 The Parallel Region: thread rules

21 The Parallel Region: thread number

22 Parallel Regions: Execution Modes

23 OpenMP Directive Clauses

24 The private, default and if clauses

25 firstprivate

26 OpenMP Work-Sharing Constructs: overview

27 OpenMP Work-Sharing Constructs - for/DO

28 Detailed syntax - for

29 Detailed syntax - DO

30 The schedule(type,[chunk]) clause

31 schedule(static)

32 schedule(static,chunk)

33 schedule(dynamic,chunk)

34 schedule(guided,chunk)

35 schedule(runtime)

36 lastprivate

37 reduction(operator|intrinsic:var1[,var2])

38 reduction(operator|intrinsic:var1[,var2])

39 OpenMP Work-Sharing Constructs - sections

40 OpenMP Work-Sharing Constructs - sections: Fortran

41 OpenMP Work-Sharing Constructs - sections: C

42 OpenMP Work-Sharing Constructs - single

43 OpenMP Work-Sharing Constructs - single: Fortran

44 OpenMP Work-Sharing Constructs - single: C

45 Combined Parallel Work-Sharing Constructs

46 Parallel DO/for Directive

47 Parallel sections Directive

48 OpenMP Environment Variables

49 OpenMP Environment Variables: examples

50 OpenMP Runtime Environment Routines

51 OpenMP Runtime Environment Routines

52 OpenMP Runtime Environment Routines: rules

53 Interlude: Data Dependencies

54 Data Dependencies (cont.)

55 Data Dependencies (cont.)

56 Data Dependencies (cont.)

57 Data Dependencies (cont.)

58 Minimizing the Cost of a Recurrence

59 Loop Nest Parallelization Possibilities

60 Loop Nest Parallelization Possibilities

61 OpenMP Synchronization Constructs

62 OpenMP Synchronization - critical Section

63 OpenMP Synchronization - critical Section Example

64 OpenMP - Critical Section Problem

65 OpenMP Synchronization - atomic Update

66 OpenMP Synchronization - atomic Update: example

67 OpenMP Synchronization - barrier

68 OpenMP Synchronization - master Region

69 OpenMP Synchronization - master Region: example

70 OpenMP Synchronization - ordered Region

71 OpenMP Synchronization - ordered Region: example

72 OpenMP Synchronization - flush Directive

73 Sample Program: flush Directive

74 Sample Program: flush Directive (cont.)

75 Some Advanced Features of OpenMP

76 Advanced Data Scoping - threadprivate Directive (Fortran)

77 Advanced Data Scoping - threadprivate Directive (C/C++)

78 Sample Program: threadprivate

79 Initializing threadprivate Variables - The copyin Clause

80 Sample Program: The copyin Clause

81 "Orphaning" OpenMP Directives

82 Sample Program: Orphaned parallel do

83 Sample Program: Output

84 Advanced Synchronization: Lock Functions (C/C++)

85 Advanced Synchronization: Lock Functions (Fortran)

86 Lock Functions: Example

87 Debugging OpenMP Code

88 Debugging OpenMP Code

89 Examples: Race Conditions

90 Examples: Race Conditions

91 Examples: Race Conditions

92 Examples: Deadlock

93 Examples: Deadlock

94 Other Danger Zones

95 Performance Tuning and OpenMP

96 Basic Strategies

97 Automatic Parallelization

98 Strategy for Using Automatic Parallelization

99 Performance Tuning: Example 1

100 Example 1 (cont.)

101 Example 1 - Parallel Version

102 Performance Tuning: Example 2

103 The Memory Hierarchy

104 Performance Tuning - Cache Locality

105 Performance Tuning - Data Locality

106 SGI Origin 2000: Basic Architecture

107 SGI Origin 2000: Basic Architecture

108 SGI Origin 2000: Interconnection Network Performance

109 Data Placement Techniques - "First-Touch Policy"

110 Example: First-Touch Policy

111 Incorrect Use of First-Touch Policy

112 The Future of OpenMP

113 Current and Future Releases

114 What's Coming in OpenMP 2.0 (F2.0 rel 11/00)

115 References

116 OpenMP Problem Set (1-3)

117 OpenMP Problem Set (4-6)

118 OpenMP Problem Set (7-9)

119 Appendix A: Auto-Parallelization on the SGI Origin 2000

120 Using the Auto-Parallelizer

121 Auto-Parallelizer Files

122 Example Subroutine (successful)

123 Example Subroutine (successful) (cont.)

124 Data Dependence Example - Indirect Indexing

125 Data Dependence Example - Function Call

126 Appendix B: Data Placement Features - SGI Origin 2000

127 Automatic Page Migration

128 SGI Extensions to OpenMP: Data Distribution Directives (F90 syntax)

129 Regular Data Distributions (cont.)

130 Types of Distributions

131 Data Distribution Examples (4 threads)

132 The ONTO Clause: Data Distribution

133 Data Affinity Clause

134 page_place Directive

135 On whose memory did my array element end up?

136 Reshaping Data Distributions

137 Reshape Sample Program

138 Reshape Sample Program: Output

139 The price you pay...