Prof. David Harel

D. Harel, Algorithmics: The Spirit of Computing, Addison-Wesley, Reading, MA, 1st edition, 1987; 2nd edition, 1992. 3rd edition (with Y. Feldman), 2004. Special printing for the Turing Centennial year, published by Springer for the U.S. - Canada market.

(1st edn.: Dutch, 1989; Hebrew (Open University Press), 1991; 2nd edn.: Polish, 1992, 2001; 3rd edn.: Chinese, 2006; German, 2006; Italian, 2008.)

Table of Contents
Preface
New to the 2nd Edition
New to the 3rd Edition
Preface written for the 2012 Printing
Book Covers

~ Book info. from Amazon COM
~ Book info. from Amazon UK
~ Book info. from Barnes & Noble for Addison-Wesley edition
~ Book info. from Barnes & Noble for Springer edition

Table of Contents

Part I. Preliminaries

Introduction and historical review
or, What's it all about?
Algorithms and data
or, Getting it done
Programming languages and paradigms
or, Getting it done by computer

Part II. Methods and Analysis

Algorithmic methods
or, Getting it done methodically
The correctness of algorithms
or, Getting it done right
The efficiency of algorithms
or, Getting it done cheaply

Part III. Limitations and Robustness

Inefficiency and intractability
or, You can't always get it done cheaply
Noncomputability and undecidability
or, Sometimes you can't get it done at all!
Algorithmic universality and its robustness
or, The simplest machines that get it done

Part IV. Relaxing the Rules

Parallelism, concurrency and alternative models
or, Getting lots of stuff done at once
Probabilistic algorithms
or, Getting it done by tossing coins
Cryptography and reliable interaction
or, Getting it done in secret

Part V. The Bigger Picture

Software engineering
or, Getting it done when it's big
Reactive systems
or, Getting it to behave properly over time
Algorithmics and intelligence
or, Are they better at it than us?

Preface

Read this, I pray thee
ISAIAH 29:12

This book tells a story. The story concerns the concepts, ideas, methods and results fundamental to computer science. It is not specifically about computer technology, nor is it about computer programming, though obviously it is heavily influenced by both.

The book is intended to fill a rather disturbing gap in the literature related to the computer revolution. Scores of excellent books can be found on computers themselves, with details of their structure, workings, and operation. There are also numerous books about the act of writing programs for the computers in any of a growing number of languages. These books come at a wide range of levels, some aimed at people with no computer-related background at all, and some aimed at the most computer-literate professionals. In addition, there are many books on subjects peripheral to the technology, such as the social and legal aspects of the revolution, as well as books describing the relevance of computers to a variety of application areas. All this comes as no surprise. People are curious about computers, and want to learn how to put them to use. They are typically interested in specific kinds of computers, and often for specific purposes, too.

Then there are textbooks. Indeed, computer science is a fast-growing academic discipline, with ever-larger numbers of potential students knocking at the doors of admission offices. Well-established academic disciplines have a habit of yielding excellent textbooks, and computer science is no exception. Over the years many comprehensive and clearly written textbooks have appeared, containing detailed technical accounts of the subjects deemed appropriate to students of computer science. However, despite the dizzying speed with which some of the technological innovations become obsolete and are replaced by new ones, the fundamentals of the science of computation, and hence many of the basic concepts that are considered important in a computer science curriculum, change slowly, if at all. Of course, new technologies and new languages require revisions in scientific emphasis, which are eventually reflected in the scientific literature. However, by and large, there is almost universal agreement on a core of fundamental topics that computer science students should be taught.

It would appear that anyone associated with computers ought to be aware of these topics, and not only those who have decided to spend three or four years getting a particular kind of academic diploma. Moreover, given that a revolution is indeed taking place before our very eyes, many of these topics, and the special ways of thinking that go with them, ought to be available to the enquiring person even if that person is not directly associated with a computer at all.

Books concerned primarily with computers or programming are intended to fulfill quite different needs. Computers are made of bits and bytes, and programming is carried out using languages with rigid rules of grammar and punctuation. Consequently, computer books often suffer from the "bit/byte syndrome" and programming books from the "semicolon syndrome". In other words, the reader becomes predominantly involved in the principles of a particular computer or the syntactic rules of a particular programming language (or both). It would seem that things cannot be explained without first describing, in detail, either a machine or a medium for communicating with one (or both).

Many advanced textbooks do treat the fundamentals, but by their very nature they concentrate on specific topics, and do so at an advanced technical level that is usually unsuitable for the general reader. Even professional programmers and systems analysts might lack the background or motivation required to get through books aimed at full-time computer science students.

Curiously, there appears to be very little written material devoted to the science of computing and aimed at the technically-oriented general reader as well as the computer professional. This fact is doubly curious in view of the abundance of precisely this kind of literature in most other scientific areas, such as physics, biology, chemistry, and mathematics, not to mention humanities and the arts. There appears to be an acute need for a technically-detailed, expository account of the fundamentals of computer science; one that suffers as little as possible from the bit/byte or semicolon syndromes an their derivatives, one that transcends the technological and linguistic whirlpool of specifics, and one that is useful both to a sophisticated layperson and to a computer expert. It seems that we have all been too busy with the revolution to be bothered with satisfying such a need.

This book is an attempt in this direction. Its objective is to present a readable account of some of the mot important and basic topics of computer science, stressing the fundamental and robust nature of the science in a form that is virtually independent of the details of specific computers, languages, and formalisms.

This book grew out of a series of lectures given by the author on "Galei Zahal", one of Israel's national radio channels, between October 1984 and January 1985. It is about what shall be called algorithmics in this book, that is, the study of algorithms. An algorithm is an abstract recipe, prescribing a process that might be carried out by a human, by a computer, or by other means. It thus represents a very general concept, with numerous applications. Its principal interest and use, however, is in those areas where the process is to be carried out by a computer.

The book could be used as the basis of one-semester introductory course in computer science or a general computer science literacy course in science and engineering schools. Moreover, it can be used as supplementary reading in many kinds of computer-related educational activities, from basic programming courses to advanced graduate or undergraduate degree programs in computer science. The material covered herein, while not directly aimed at producing better programmers or system analysts, can aid people who work with computers by providing an overall picture of some of the most fundamental issues relevant to their work.

The preliminary chapters discuss the concept of an algorithmic problem and the algorithm that solves it, followed by cursory discussions of the structure of algorithms, the data they manipulate, and the languages in which they are programmed. With the stage thus set, the first chapter of Part Two turns to some general methods and paradigms for algorithmic design. This is followed by two chapters on the analysis of algorithms, treating, respectively, their correctness and efficiency (mainly time efficiency), including techniques for establishing the former and estimating the latter. Part Three of the book is devoted to the inherent limitations of effectively executable algorithms, and hence of the computers that implement them. Certain precisely defined problems, including important and practical ones, are shown to be provably not solvable by any computers of reasonable size in any reasonable amount of time (say, the lifetime of a person), and never will be. Worse still, it is shown that some problems are provably not solvable by computers at all, even with unlimited time! In Part Four of the book the requirements are relaxed, for example, by employing concurrent activities or coin tossing, in order to overcome some of these difficulties. These chapters also discuss reactive and distributed systems, and cryptography. Finally, the relationship of computers to human intelligence is discussed, emphasizing the "soft" heuristic, or intuitive, nature of the latter, and the problems involved in relating it to the "hard" scientific subject of algorithmics.

The book is intended to be read or studied sequentially, not be used as a reference. It is organized so that each chapter depends on the previous ones, but with smooth readability in mind. Most of the material in the preliminary Part One should be familiar to people with a background in programming. Thus, Chapters 1 and 2 and parts of Chapter 3 can be browsed through by such readers.

Certain sections contain relatively technical material and can be skipped by the reader without too much loss of continuity. They are indented, set in smaller type and are prefixed by a small square. It is recommended, however, that even those sections be skimmed, at least to get a superficial idea of their contents.

Whenever appropriate, brief discussions of the research topics that are of current interest to computer scientists are included. The text is followed by Bibliographic Notes for each chapter, with "backward" pointers connecting the discussions in the text with the relevant literature.

It is hoped that his book will facilitate communication between the various groups of people who are actively involved in the computer revolution, and between that group, and those who, for the time being, are observers only.

David Harel, Pittsburgh, PA; February 1987

New to the Second Edition

See, this is new; but it has already been.
ECCLESIASTES 1:10

The first edition of this book was intended to be read from beginning to end; it could also be used as a supplementary reading in a number of courses. Teaching a course based exclusively on it was possible, but would have required that the instructor prepare exercises and add examples and more detail in certain places. The present edition contains numerous exercises, as well as solutions to about a third of them. The solved exercises can thus be used to supplement the text.

Three chapters do not have exercises: Chapter 1 is an introduction, the bulk of Chapter 3 is really just a brief survey of several programming languages, and Chapter 12 is a nontechnical account of some topics in artificial intelligence. In a sense, these chapters are not integral parts of the topic of the book -- algorithmics -- and hence in teaching a course based on the book these should probably be assigned as homework reading.

Apart from the inclusion of exercises and solutions, which mark the most obvious change made in this edition, the text has been revised and updated. The reader may wonder why a more extensive revision of the text was not called for. Have computer scientists been idle during the five years since the first edition was published? Rather than taking this as a criticism of the field, I think that it shows that the topics selected for inclusion in the book are really of fundamental nature, so that no significant changes had to be made. The issues discussed herein are thus probably basic and lasting; maybe the term "classical" is most fitting.

David Harel, Rehovot, Israel; May, 1991

New to the Third Edition

they three were of one measure
EZEKIEL 40:10

This time around, a significant revision was carried out. There are several important changes in this edition of the book, compared to the first and second editions, including two brand new chapters, new sections, and more.

The first noticeable difference is that for this revision I needed real help..., and was fortunately joined by Yishai Feldman. He has taken part in all aspects of the revision, but most significantly took upon himself the thorough revision of the material on programming languages and the writing of the new chapter on software engineering .

The main changes are as follows:

The book now has five Parts, rather than four. In Part I (Preliminaries) Chapter 3 has been completely rewritten, and is now titled "Programming Languages and Paradigms." The list of languages discussed has been revised and is organized into paradigms, thus giving a far more informative and updated exposition of the media we use when we program computers. Discussions of some languages (e.g., APL and Snobol) have been dropped altogether and those of others (e.g., C, C++ and Java) have been added.

Part II (Methods and Analysis) and Part III (Limitations and Robustness), i.e., Chapters 4 through 9, required no sweeping changes. This can again be attributed to the "classical" nature of the topics chosen for these, as mentioned in the ``New to the Second Edition" section above.

The first chapter of Part IV (Relaxing the Rules) was previously titled "Parallelism and Concurrency" and is now called "Parallelism, Concurrency, and Alternative Models." It incorporates new sections on quantum computing, including Shor's factoring algorithm, and a discussion of molecular computing. These topics may be considered to be additional forms of parallelism, albeit more radical ones. The remaining two chapters of Part IV were constructed by separating out the material on probabilistic algorithms (Chapter 11) from that on cryptography
(now Chapter 12) -- presented together in a single chapter in the previous editions -- and extending both by discussions of some of the new developments in these fields.

Part V (The Bigger Picture) ends with the closing chapter of the previous editions, "Algorithms and Intelligence," which is now Chapter 15. However, this is now preceded by two new chapters. Chapter 13, "Software Engineering," and Chapter 14, "Reactive Systems." The first of these is an attempt to provide a general
introduction to the issues and problems arising in the development of large software systems. The second new chapter zeros in on the particular difficulties arising in the special case of reactive systems, as a result of their complex behavior over time.

Besides these more noticeable changes, the entire text has been brought up to date in many less subtle and more subtle ways. There are discussions on abstract data types, on the non-approximability of certain NP-complete problems, on probabilistically checkable proofs, and, of course, on the brand new AKS polynomial-time algorithm for primality. The final chapter has been modified in many places too, e.g., with a discussion added on the Chinese room argument.

While we have left the exercises and solutions essentially as they were in the second edition, the bibliographic notes were a completely different story. Twelve years in Computer Science is almost an eternity... The format of the notes is the same as in the previous editions; i.e., a general section at the start of each chapter, which lists relevant books and periodicals, followed by detailed notes that progress with the text of the chapter itself and point back to its page numbers. In revising them, we had to prepare new notes for the large amount of newly added
material, of course, but we also had to painstakingly reconsider and thoroughly revise the entire set of existing notes. Hopefully, the result of all of this will turn out to be a useful and up-to-date tool linking the text of this expository book with the accepted archival scientific literature.

Now that the revision is done, if hard-pressed to give my list of the most significant developments in pure, "classical" algorithmics (i.e., excluding software and systems engineering) in the last dozen or so years, it would probably contain three: the non-approximability results for NP-complete problems, Shor's QPTIME factoring algorithm, and the AKS PTIME primality test. And all I can say about these is this: wouldn't it be wonderful if the
bulk of the work on the next edition of this book -- if and when, of course -- will be spent on results of similar calibre and importance.

David Harel, Rehovot, Israel; August, 2003

a threefold cord is not quickly broken
ECCLESIASTES 4:12

Write the vision, and make it plain upon tablets,
that he who reads it may run
HABAKKUK 2:2

Preface written for the 2012 Printing

The first edition of this book was published 25 years ago, in 1987. Second and third editions were published in 1992 and 2004, respectively (with Yishai Feldman joining the "team" for the 3rd edition). The preface you are now reading accompanies a special 2012 reprint of the book, published to celebrate 25 years of the its existence, and, more significantly, the centennial year of Alan M. Turing.

Alan Mathison Turing (b. 23 June 1912, d. 7 June 1954) will, in all probability, be remembered as one of the thinkers with the greatest impact ever, and is definitely one of the most important scientists of the 20th century. This is not the appropriate place to describe in any detail Turing's contributions to computing and to humanity in general. The interested reader can find lots of relevant material in the many articles and books written about him. However, in terms of the Algorithmics, the topic of this book, it would not be an exaggeration to say that Turing is the grand ancestor of several of the key ideas and subtopics thereof. Perhaps most significantly, Turing's insights from the mid-1930s, alongside the work of Alonzo Church and others, formed the foundations of our understanding that the general notion of computing, and thus of algorithms and the actual computers that implement them, are severely limited. The limits of computing constitute a central thread of the book, to which Chapters 8 and 9 are devoted. In that respect, Turing's name is associated with both the Church-Turing thesis and the Turing machine, two of the most fundamental notions discussed in these chapters.

Another of Turing's pioneering contributions to computer science revolves around his deep insights into what later became known as artificial intelligence (the person who coined the term, John McCarthy, passed away in late 2011). Turing's test for computerized artificial intelligence is also central to the book and is discussed in detail towards its end, in Chapter 15.

As is well-known, Turing was also instrumental in the code-breaking efforts in the World War II, and most notable is his work on the Enigma code. Chapter 12 is devoted to cryptography, and although Turing's work is not mentioned there explicitly, it played a classical and crucial part in the development of the field. Turing also carried out pioneering work on morphogenesis and the way patterns are formed in the process. In modern terms this work would be considered part of systems biology or bioinformatics, topics that are not discussed in the present book.

On a more personal level, but without getting into any details here, I would like to add that large parts of my own research in the last 38 years can be viewed as very modest continuations and extensions of the work of Turing. Thus, to a large extent I am but a dwarf standing on the shoulders of a true giant; the true giant of our field. I am thus extremely happy that Springer has agreed to publish this new printing of Algorithmics. For me, and Yishai Feldman joins me in this, it is a true celebration by any measure!

In terms of the basic fundamentals of algorithmics (that is, if we exclude the more practical and faster-changing material of Chapters 3, 13, and 14) little in the book has to be changed. None of the central open problems therein have been resolved, none of the basic notions underlying the topics therein have undergone a major modification, and very few of the new notions that have been defined since 2000 seem to deserve a place alongside the fundamental ones that are included. Thus, even had we decided to go for a fully-fledged new edition of the book, rather than merely a new printing, the text would have undergone only relatively minor changes. The next few paragraphs contain very brief discussions about a few of the relevant things that have happened in the last few years (thanks to Uri Feige for helping me compile this list, and, of course, to my co-author Yishai Feldman). The reader can read these now or come back to them after reading the book itself.

Towards the end of Chapter 4 there is a discussion of approximating a network coloring. In recent years, strong new results have been proved, culminating in the fact that approximating a coloring by any constant (not just the 50% factor mentioned in the text) is also as hard as solving the P=NP problem.

Parallelism, as discussed in the first parts of Chapter 10, has become more and more crucial recently, in part because of the change in hardware trends. The exponential increase in single-processor power tapered off several years ago, being replaced by the development of multiple-core chips. Four cores per chip are common these days, and the numbers are expected to increase drastically, at the expense of single-core performance. In order to take advantage of these new processors, new algorithmic and programming techniques are necessary. One popular technique is map-reduce, inspired by functional programming (as described in Chapter 3). It is a way of dividing computation on large amounts of data into parts that are performed on each piece separately, the results being combined using an appropriate accumulation function. This style of programming is not appropriate for every problem, but it lends itself very well to parallelization. Parallelism is thus becoming an even more central and crucial topic in computer science, so that it is probably fair to say that resolving the open problems in this area has become a lot more urgent.

Another topic central to Chapter 10 is quantum computing. Here the main thing to mention is the existence of larger quantum computers. The text mentions that at the time of its publication the largest quantum computer actually built consisted of seven qubits. This number has grown steadily in recent years, and while the jury isn't in on the exact current number (among other things this has to do with whether one has a general purpose or special purpose machine), a company called D-Wave Systems has been working recently with 128-qubit chips, and is said to be developing a 512-qubit one. What relevance this will have to the fundamental issues of quantum computation discussed in the text remains to be seen.

As to randomized and probabilistic algorithms, the topic of Chapter 11, there has been a lot of interesting research done in recent years, much of it around the random classes RP and BPP and their connections to pseudo-random number generators. In fact, the text says that some researchers believe that RP is strictly larger than PTIME. That may still be true, but in recent years you will also find many who don't.

The last chapter of the book, Chapter 15, on artificial intelligence and heuristics, is the one that could really do with a facelift. While the main issues raised there still stand strong (e.g., passing the Turing test is still nowhere in sight), there have been several exciting developments. These include significant improvements in computerized chess, and great improvements in natural language understanding and translation. One well-known example is IBM's Watson machine, which beat the top two human contestants in Jeopardy! (the popular American quiz show) in a 3-part match in early 2011. Watson exhibits an impressive ability to "understand" highly ambiguous language and to deal with situations that have long been associated exclusively with human talent. However, Watson is strongly based on statistical techniques rather than classical knowledge representation, continuing the trend discussed towards the end of Chapter 15. These techniques seem to have great potential for intelligent search in areas such as medicine, law, and others. In general, the tools underlying heavy-duty artificial intelligence applications are becoming more powerful, such as powerful new SAT solvers, which are very successful in practice, though in the worst case they don't do as well.

Finally, I would like to add that one of the most impressive and potentially revolutionary uses of computer science in the last 10-15 years has been in the life sciences. True, bioinformatics has been around for longer than that, but the use of deep techniques from algorithmics and system and software engineering in systems biology and the modeling and analysis of biological systems has grown by an order of magnitude in recent years. I share with others the opinion and belief that computer science will play a role in the science of the 21st century (which is poised to be the century of the life sciences) similar to the role played by mathematics in the physical sciences of the 20th century. Thus, chances are that the subject matter of this book will only become more relevant and of more interest.

David Harel, Rehovot; February, 2012