How Easy is to Think in Parallel?

How Easy is to Think in Parallel?

P. Takis Metaxas

Computer Science Department

Wellesley College

Introduction

Several courses on parallel computation are currently taught in many schools nationwide. While research schools are often able to offer many course in the area, smaller schools usually offer a single introductory course. In this paper we describe our approach and experience in offering such an introductory course.

There is no widely acceptable content for these courses, in terms of both syllabus and level of difficulty.

The Course

In the Fall of 1993 we taught a course entitled "How to Tame 60,000 Processors which we developed from scratch the previous Summer. This course was essentially a broad introduction to theory and programming of parallel computation, and was geared towards the senior level. (Commenting on the term "broad", a colleague described it as "squeezing 10 years of research into one semester".) In this abstract we wish to briefly mention the scope of the syllabus and the materials developed, and outline the experience gained.

The course was aimed at the senior level. As only prerequisite we listed our course that corresponds to ACM's CS2, however we felt that a course on fundamental algorithms could help considerably. The aims of the course were to:

(1) Discuss the need for parallel machines.

(2) Link computation with sciences and real-life problems.

(3) Introduce the different models of parallel computation and their existing implementations today.

(4) Discuss fundamental algorithmic techniques for specific models.

(5) Teach parallel programming.

In the appendix we present the intended syllabus of the course.

After searching extensively the literature (as well as posting the question to parallel community network) we concluded that there was no book published that could be used as a text for our course. At the same time we did not want to make this a seminar-type course in which students would have to read from original research papers, not only because one would not be able to cover all of the intended material in one semester, but also because we have reservations on the effectiveness and pedagogical value of such an approach when introducing a new area at the undergraduate level.

Teaching parallelism presents also an extra challenge: It is not easy to visualize the steps taken by a parallel program when you have to keep track of data structures that are being shared by many, potentially thousands of processors. So we wanted to provide some visualization software that could be used to facilitate teaching. Choosing the right language for this course was another non trivial decision, but we eventually adopted Parallaxis 2.0. Finally, we needed to develop a set of assignments and exams that would be used to test the progress of our students.

In the Summer, 1993, we developed an extended set of notes (about 200 TeX-typesetted pages), that, in addition to two sections of Leighton's book were successfully used as a text. In the near future we plan to complete these notes and compile them into a book. We also developed 14 algorithm animations of some of the parallel algorithms described in the notes using Hypercard, which were used along with few animations developed at Connecticut College. Finally, a few dozen of problems and their solutions were developed and used as weekly assignments. The human genome project required considerable time for covering background information and was abandoned because of the very tight semester. We are not happy with this decision, since we think that joining computing with sciences is an aspect that should be given priority. In its place, as a short project, we used parts of the Game of Life developed by Schaller and McCreary.

In general the whole experiment was a success, and we plan to pursue this project further. Some of the areas that need further development are: (a) Selection of a subset of the syllabus as the core of the course (since we found in practice that the syllabus was in fact, too broad). (b) Programming experience on real parallel machines, so that the excitement of learning can be kept high. (c) A strong programming environment (because in the 90's just a programming language is not enough). (d) A more careful description of the short project. (e) Treat carefully some subjects that may seem simple to describe, but are unexpectedly difficult to digest by students that have spent the four last years thinking in a different computing model. Tree contraction falls in this category.

Appendix

The intended syllabus of the course appears below. Starred items were not covered in Fall, 1993. A different project were implemented, as described in the abstract.

CS349: How to Tame 60,000 Processors:

Parallel Machines and Their Algorithms

Syllabus

PART I: Basics

1. Why parallel Computation?

Introduction.

Computation in scientific problems: Biology and Astronomy.

Discussion of the need for parallel computers.

History of the efforts to create successful parallel machines.

2. Parallel models and implementations.

Shared memory (SM) model.

The PRAM model

Implementations of SM prototypes.

3. Fundamental algorithmic techniques.

Prefix sum

List ranking

Tree contraction

Pseudotree contraction

Eulerian tour on trees

Tree nodes ordering

Edge-list augmentation

PART II: Interconnection networks architectures:

1. Linear Arrays

Description and Properties

Num of procs, diameter, I/O bandwidth, bisection width, special properties

Algorithms

0-time sorting, odd-even transpos, vector convolution, matrix-vector multiplic.

2. Meshes

Description and Properties

Algorithms

matrix-matrix mult, sorting

Related networks (3-D Meshes, multigrid, Pyramid)

3. Trees

Description and Properties

Algorithms

Prefix sum, counting

Related Network (fat trees)

4. Meshes of Trees (MOT)s

Description and Properties

Algorithms

simulating bipartite graph, sorting, matrix-vector mult, MST, cc, shortest paths

Related Networks (3-D MOTs)

5. Hypercubes

Description and Properties

Embeddings (linear arrays, multidimensional arrays, complete binary trees)

Related Networks (CCC, butterfly, shuffle-exchange, deBruijn)

PART III : Algorithms

1. Parallaxis: a flexible parallel programming language

Network definitions and algorithm implementations on Parallaxis

2. Parallel sorting on a PRAM, mesh, MOT

Batcher's odd-even mergesort

Odd-even transposition

Mergesort.

3. Graph algorithms

Connectivity

Eulerian Tours

*Biconnectivity

Minimum spanning trees

*Ear decomposition

*st -numbering

*Shortest paths.

*4. Advanced algorithms

Maximal independent set

Database search

Matchings.

*5. Theory.

Model simulations

Lower bounds

P-complete problems.

PART IV : Parallelism in Scientific Problems

*1. The Human Genome Project

Genome Sequence

Simulation and complexity of Human genome mapping

Term Project.