CS 575 Supercomputing - Final Group Report - Performance on Rohan and SDSC's IBM Blue Horizon


Lecture: 03Dec03

Due 15Dec03: The Group Point of Contact sends an email (with cc: to members) to stewart@rohan.sdsu.edu from the class account, indicating that your web report is ready in your public_html directory. As always, do not make this directory or file readable by others.
Due 15Dec03 Each Individual Group Member sends their own Contribution Form to stewart@rohan.sdsu.edu from your class account.

Dr. Kris Stewart (stewart@rohan.sdsu.edu)
San Diego State University

Purpose: To gain experience using MPI on two computing platforms, Rohan (SunFire 4800 at SDSU) and Blue Horizon (IBM SP2 at SDSC)

You have been provided with running code that produces the solution to a model of the Stommel Equation for Ocean Circulation, stf_00.f or stc_00.c, which you analyzed for its serial performance on Rohan using the autopar capabilities of the compiler. The goal of that assignment was to become familiar with the source code, become familiar with your group members, and develop techniques for working together productively.

You have been provided resources on MPI from our Class Home page. Today you are asked to take the MPI code for the 1D Domain Decomposition of the model using the Stommel Equation, stf_01.f90 or stc_01.c, and add additional MPI calls to create the 2D Domain Decomposition.
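
As a starting point for the 2D decomposition, the index arithmetic can be worked out separately from the MPI calls. The sketch below is one possible way to split an nx-by-ny grid over a px-by-py process grid. The names Block, split_1d, and block_2d, the row-major rank ordering, and the rule of giving leftover points to the lowest-numbered rows and columns are all assumptions for illustration; they are not taken from stf_01.f90 or stc_01.c.

```c
#include <assert.h>

/* Hypothetical helper type: the 1-based, inclusive index range owned by one process. */
typedef struct {
    int i1, i2;   /* first and last x-index */
    int j1, j2;   /* first and last y-index */
} Block;

/* Split n points over p parts; part k (0-based) gets [*lo, *hi], 1-based inclusive.
   The first (n % p) parts receive one extra point. */
static void split_1d(int n, int p, int k, int *lo, int *hi)
{
    int base = n / p;
    int extra = n % p;
    *lo = k * base + (k < extra ? k : extra) + 1;
    *hi = *lo + base - 1 + (k < extra ? 1 : 0);
}

/* Assumed layout: ranks numbered row-major over a px-by-py process grid,
   so rank = row * py + col, with rows splitting x and columns splitting y. */
Block block_2d(int nx, int ny, int px, int py, int rank)
{
    Block b;
    assert(px > 0 && py > 0 && rank >= 0 && rank < px * py);
    split_1d(nx, px, rank / py, &b.i1, &b.i2);
    split_1d(ny, py, rank % py, &b.j1, &b.j2);
    return b;
}
```

With the 200 by 200 grid from st.in and a 2 by 2 process grid, rank 3 would own x-indices 101 to 200 and y-indices 101 to 200 under these assumptions. Each block would still need ghost rows and columns and the matching exchanges with its neighbors, which are not shown here.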

Today's lecture presented the results of running the instructor's (student account masc0155) 1D domain decomposition on Rohan with np = 1, 2, and 4,

yielding, for np = number of processors and runtime = MPI wall-clock time,

np  runtime  
1    12.8
2     7.86
4     9.46 

Why does the performance improve at first, and then degrade?
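
One way to start on this question is to turn the table into speedup and parallel efficiency figures. The sketch below uses the usual definitions, speedup = T(1)/T(np) and efficiency = speedup/np; the function names are illustrative only.

```c
#include <assert.h>

/* Speedup relative to the one-processor run: T(1) / T(np). */
double speedup(double t1, double tnp)
{
    assert(t1 > 0.0 && tnp > 0.0);
    return t1 / tnp;
}

/* Parallel efficiency: speedup divided by the number of processors. */
double efficiency(double t1, double tnp, int np)
{
    assert(np > 0);
    return speedup(t1, tnp) / np;
}
```

For the runs above, speedup(12.8, 7.86) is about 1.63 (efficiency about 0.81), while speedup(12.8, 9.46) is about 1.35 (efficiency about 0.34), so the four-processor run is slower than the two-processor run.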

Ask yourself: What is the ratio of boundary (communication) nodes to interior (computation) nodes?
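
To put numbers on this ratio, the sketch below counts grid points for one process under two assumed layouts: a 1D decomposition into strips, and a 2D decomposition into rectangular blocks, each exchanging one layer of boundary points with every neighbor. The function names and the choice to count a process with neighbors on all sides are assumptions for illustration.

```c
#include <assert.h>

/* 1D strips: each process owns nx * (ny / np) points and exchanges
   one row of nx points with each of its two neighbors. */
double boundary_ratio_1d(int nx, int ny, int np)
{
    assert(nx > 0 && ny > 0 && np > 0);
    double owned = (double)nx * ny / np;
    double exchanged = 2.0 * nx;
    return exchanged / owned;
}

/* 2D blocks on a px-by-py process grid: each process owns
   (nx / px) * (ny / py) points and exchanges one edge with each of four neighbors. */
double boundary_ratio_2d(int nx, int ny, int px, int py)
{
    assert(nx > 0 && ny > 0 && px > 0 && py > 0);
    double bx = (double)nx / px;
    double by = (double)ny / py;
    return 2.0 * (bx + by) / (bx * by);
}
```

For the 200 by 200 grid, a strip with two neighbors at np = 4 exchanges 400 points for 10000 owned (ratio 0.04). At np = 16 the strip ratio would rise to 0.16, whereas 4 by 4 blocks would give 0.08. Refining the grid to 800 by 800 at np = 16 would bring the strip ratio back to 0.04.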

Therefore we must look into making the problem larger, using a finer mesh (or grid). Examine the st.in input file:

200 200
2000000 2000000
1.0e-9 2.25e-11 3.0e-6
750
What do these input values represent? In the file stf_01.f90, the master process reads:
the number of x-grid points and y-grid points,
the length of the x-grid and y-grid,
the parameters of the model,
the number of Jacobi iteration steps to take.
    if(myid .eq. mpi_master)then
        read(*,*)nx,ny
        read(*,*)lx,ly
        read(*,*)alpha,beta,gamma
        read(*,*)steps
    endif
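
For groups working from stc_01.c, the same four reads can be sketched in C. The struct and function names below are hypothetical, the choice of double for the lengths and parameters is an assumption, and the code parses a string rather than standard input so it can be tried without MPI. In the MPI code only the master would read the values, and they would then have to be shared with the other processes (for example with MPI_Bcast), which is not shown.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical container for the values in st.in. */
typedef struct {
    int nx, ny;                 /* number of x- and y-grid points */
    double lx, ly;              /* length of the x- and y-grid */
    double alpha, beta, gamma;  /* model parameters */
    int steps;                  /* number of Jacobi iteration steps */
} StommelInput;

/* Parse the four input lines from text; returns 1 on success, 0 otherwise. */
int parse_stommel_input(const char *text, StommelInput *in)
{
    assert(text != NULL && in != NULL);
    int n = sscanf(text, "%d %d %lf %lf %lf %lf %lf %d",
                   &in->nx, &in->ny, &in->lx, &in->ly,
                   &in->alpha, &in->beta, &in->gamma, &in->steps);
    return n == 8;
}
```

Passing the contents of the st.in file shown above fills in nx = ny = 200, lx = ly = 2000000, the three model parameters, and steps = 750.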

When assigning a larger number of processors to work on your problem, your group should use a finer grid resolution, which will provide a more accurate model of the Stommel Equation. Note that the 5-point stencil implemented in your Jacobi iteration has a basic accuracy that is O(dx^2) in the x-direction and O(dy^2) in the y-direction. This was discussed earlier in "Stommel Model and MPI".
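
The second-order behavior can be checked numerically on a function whose second derivative is known. The sketch below applies the x-direction piece of the stencil, (u(x+dx) - 2u(x) + u(x-dx))/dx^2, to the test function u(x) = x^4. That function is chosen here only for illustration, because the error then works out to exactly 2 dx^2, so halving dx should cut the error by a factor of four. The function names are hypothetical.

```c
#include <assert.h>

/* Test function u(x) = x^4, with exact second derivative 12 x^2. */
static double u(double x) { return x * x * x * x; }

/* Central second difference, the x-direction part of the 5-point stencil. */
double second_difference(double x, double dx)
{
    assert(dx > 0.0);
    return (u(x + dx) - 2.0 * u(x) + u(x - dx)) / (dx * dx);
}

/* Absolute error of the second difference at x for spacing dx. */
double stencil_error(double x, double dx)
{
    double err = second_difference(x, dx) - 12.0 * x * x;
    return err < 0.0 ? -err : err;
}
```

At x = 1, stencil_error(1.0, 0.1) is about 0.02 and stencil_error(1.0, 0.05) is about 0.005, up to rounding, which is the factor of four expected from an O(dx^2) method.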

Working together as members of a group, you are to test your 2D enhancement of the 1D decomposition using MPI on Rohan until you have a running code. Gather experimental data, using the MPI wall-clock times that the code outputs as the metric of performance.

You will want to investigate the effect of optimization (default or -O3) for your C or Fortran 90 codes on both Rohan and Blue Horizon.

You will also want to examine the effect of finer grids and the performance that results on Rohan and then on Blue Horizon, using more processors on Blue Horizon, given the limit on interactive runs.

Since this final report will be a web-page only submission, your report needs to be organized and follow our standard template. At the end of your report, you should have

your appendix of data to support the conclusions made in your narrative
The Appendix of Data should consist of the information you compute that supports the conclusions that you reach in the narrative of your report. I would like the Appendix of Data to include both summarizing tables and the actual data that are being summarized.
a link to the code that you wrote
a link to a README file of how the code was compiled and linked and run - I expect a different README file for Rohan and for Blue Horizon.
