This URL is http://www.stewart.cs.sdsu.edu/cs575/HPC2text_toc.html
Preface
PART 1: Modern Computer Architectures
Chapter 1. What Is High Performance Computing?
Why Worry About Performance?
Scope of High Performance Computing
Studying High Performance Computing
Measuring Performance
The Next Step
Chapter 2. High Performance Microprocessors
Why CISC?
Fundamentals of RISC
Second-Generation RISC Processors
RISC Means Fast
Out-of-Order Execution: The Post-RISC Architecture
Future Trends: Intel IA-64 and EPIC
Closing Notes
Chapter 3. Memory
Memory Technology
Registers
Caches
Cache Organization
Virtual Memory
Improving Memory Performance
Closing Notes
Chapter 4. Floating-Point Numbers
Reality
Representation
Effects of Floating-Point Representation
Improving Accuracy Using Guard Digits
History of IEEE Floating-Point Format
IEEE Floating-Point Standard
IEEE Storage Format
IEEE Operations
Special Values
Exceptions and Traps
Compiler Issues
Closing Notes
PART 2: Programming and Tuning Software
Chapter 5. What a Compiler Does
History of Compilers
Which Language to Optimize?
Optimizing Compiler Tour
Optimization Levels
Classical Optimizations
Closing Notes
Chapter 6. Timing and Profiling
Timing
Subroutine Profiling
Basic Block Profilers
Virtual Memory
Closing Notes
Chapter 7. Eliminating Clutter
Subroutine Calls
Branches
Branches Within Loops
Other Clutter
Closing Notes
Chapter 8. Loop Optimizations
Operation Counting
Basic Loop Unrolling
Qualifying Candidates for Loop Unrolling
Nested Loops
Loop Interchange
Memory Access Patterns
When Interchange Won't Work
Blocking to Ease Memory Access Patterns
Programs That Require More Memory Than You Have
Closing Notes
PART 3: Shared-Memory Parallel Processors
Chapter 9. Understanding Parallelism
Dependencies
Loops
Loop-Carried Dependencies
Ambiguous References
Closing Notes
Chapter 10. Shared-Memory Multiprocessors
Symmetric Multiprocessing Hardware
Multiprocessor Software Concepts
Techniques for Multithreaded Programs
A Real Example
Closing Notes
Chapter 11. Programming Shared-Memory Multiprocessors
Automatic Parallelization
Assisting the Compiler
Closing Notes
PART 4: Scalable Parallel Processing
Chapter 12. Large-Scale Parallel Computing
Amdahl's Law
Interconnect Technology
A Taxonomy of Parallel Architectures
A Survey of Parallel Architectures
The Top 500 Report
Shared Uniform Memory MIMD
Shared Non-Uniform Memory MIMD Systems
Distributed-Memory MIMD Architecture
Single Instruction, Multiple Data
Closing Notes
Chapter 13. Language Support for Performance
Data-Parallel Problem: Heat Flow
Explicitly Parallel Languages
FORTRAN 90
Problem Decomposition
High Performance FORTRAN (HPF)
Closing Notes
Chapter 14. Message-Passing Environments
Parallel Virtual Machine (PVM)
Message-Passing Interface (MPI)
Closing Notes
PART 5: Benchmarking
Chapter 15. Using Published Benchmarks
User Benchmarks
Industry Benchmarks
Closing Notes
Chapter 16. Running Your Own Benchmarks
Choosing What to Benchmark
Types of Benchmarks
Preparing the Code
Closing Notes
PART 6: Appendixes
Appendix A. Processor Architectures
Appendix B. Looking at Assembly Language
Appendix C. Future Trends: Intel IA-64
Appendix D. How FORTRAN Manages Threads at Runtime
Appendix E. Memory Performance
Index