D. Harel, Algorithmics: The
Spirit of Computing, Addison-Wesley, Reading, MA, 1st
edition, 1987; 2nd edition, 1992. 3rd
edition (with Y. Feldman), 2004. Special printing for the Turing Centennial year, published by Springer for the U.S.-Canada market.
(1st edn.: Dutch, 1989; Hebrew (Open University Press), 1991;
2nd edn.: Polish, 1992, 2001; 3rd edn.: Chinese, 2006; German, 2006; Italian, 2008.)
Table of Contents
Part I. Preliminaries
- Introduction and historical review
or, What's it all about?
- Algorithms and data
or, Getting it done
- Programming languages and paradigms
or, Getting it done by computer
Part II. Methods and Analysis
- Algorithmic methods
or, Getting it done methodically
- The correctness of algorithms
or, Getting it done right
- The efficiency of algorithms
or, Getting it done cheaply
Part III. Limitations and Robustness
- Inefficiency and intractability
or, You can't always get it done cheaply
- Noncomputability and undecidability
or, Sometimes you can't get it done at all!
- Algorithmic universality and its robustness
or, The simplest machines that get it done
Part IV. Relaxing the Rules
- Parallelism, concurrency and alternative models
or, Getting lots of stuff done at once
- Probabilistic algorithms
or, Getting it done by tossing coins
- Cryptography and reliable interaction
or, Getting it done in secret
Part V. The Bigger Picture
- Software engineering
or, Getting it done when it's big
- Reactive systems
or, Getting it to behave properly over time
- Algorithmics and intelligence
or, Are they better at it than us?
Preface
Read this, I pray thee
ISAIAH 29:12
This book tells a story. The story concerns the concepts, ideas, methods
and results fundamental to computer science. It is not specifically about
computer technology, nor is it about computer programming, though obviously
it is heavily influenced by both.
The book is intended to fill a rather disturbing gap in the literature
related to the computer revolution. Scores of excellent books can be found
on computers themselves, with details of their structure, workings, and
operation. There are also numerous books about the act of writing programs
for the computers in any of a growing number of languages. These books
come at a wide range of levels, some aimed at people with no computer-related
background at all, and some aimed at the most computer-literate professionals.
In addition, there are many books on subjects peripheral to the technology,
such as the social and legal aspects of the revolution, as well as books
describing the relevance of computers to a variety of application areas.
All this comes as no surprise. People are curious about computers, and
want to learn how to put them to use. They are typically interested in
specific kinds of computers, and often for specific purposes, too.
Then there are textbooks. Indeed, computer science is a fast-growing
academic discipline, with ever-larger numbers of potential students knocking
at the doors of admission offices. Well-established academic disciplines
have a habit of yielding excellent textbooks, and computer science is
no
exception. Over the years many comprehensive and clearly written textbooks
have appeared, containing detailed technical accounts of the subjects
deemed appropriate to students of computer science. However, despite the
dizzying speed with which some of the technological innovations
become obsolete and are replaced by new ones, the fundamentals of the
science of computation, and hence many of the basic concepts that are
considered important in a computer science curriculum, change slowly,
if at all. Of course, new technologies and new languages require revisions
in
scientific emphasis, which are eventually reflected in the scientific
literature. However, by and large, there is almost universal agreement
on a core
of fundamental topics that computer science students should be taught.
It would appear that anyone associated with computers ought to be aware
of these topics, and not only those who have decided to spend three or
four years getting a particular kind of academic diploma. Moreover, given
that a revolution is indeed taking place before our very eyes, many of
these topics, and the special ways of thinking that go with them, ought
to be available to the enquiring person even if that person is not directly
associated with a computer at all.
Books concerned primarily with computers or programming are intended
to fulfill quite different needs. Computers are made of bits and bytes,
and
programming is carried out using languages with rigid rules of grammar
and punctuation. Consequently, computer books often suffer from the
"bit/byte syndrome" and programming books from the "semicolon
syndrome". In other words, the reader becomes predominantly involved
in the
principles of a particular computer or the syntactic rules of a particular
programming language (or both). It would seem that things cannot be
explained without first describing, in detail, either a machine or a medium
for communicating with one (or both).
Many advanced textbooks do treat the fundamentals, but by their very
nature they concentrate on specific topics, and do so at an advanced
technical level that is usually unsuitable for the general reader. Even
professional programmers and systems analysts might lack the background
or
motivation required to get through books aimed at full-time computer science
students.
Curiously, there appears to be very little written material devoted to
the science of computing and aimed at the technically-oriented general
reader
as well as the computer professional. This fact is doubly curious in view
of the abundance of precisely this kind of literature in most other scientific
areas, such as physics, biology, chemistry, and mathematics, not to mention
humanities and the arts. There appears to be an acute need for a
technically-detailed, expository account of the fundamentals of computer
science; one that suffers as little as possible from the bit/byte or semicolon
syndromes and their derivatives, one that transcends the technological
and linguistic whirlpool of specifics, and one that is useful both to
a
sophisticated layperson and to a computer expert. It seems that we have
all been too busy with the revolution to be bothered with satisfying such
a
need.
This book is an attempt in this direction. Its objective is to present
a readable account of some of the most important and basic topics of computer
science, stressing the fundamental and robust nature of the science in
a form that is virtually independent of the details of specific computers,
languages, and formalisms.
This book grew out of a series of lectures given by the author on "Galei
Zahal", one of Israel's national radio channels, between October
1984 and January 1985. It is about what shall be called algorithmics in
this book, that is, the study of algorithms. An algorithm is an
abstract recipe, prescribing a process that might be carried out by a
human, by a computer, or by other means. It thus represents a very general
concept, with numerous applications. Its principal interest and use, however,
is in those areas where the process is to be carried out by a computer.
The book could be used as the basis of a one-semester introductory course
in computer science or a general computer science literacy course in science
and engineering schools. Moreover, it can be used as supplementary reading
in many kinds of computer-related educational activities, from basic programming
courses to advanced graduate or undergraduate degree programs in computer
science. The material covered herein, while not directly aimed at producing
better programmers or system analysts, can aid people who work with computers
by providing an overall picture of some of the most fundamental issues
relevant to their work.
The preliminary chapters discuss the concept of an algorithmic problem
and the algorithm that solves it, followed by cursory discussions of the
structure of algorithms, the data they manipulate, and the languages in
which they are programmed. With the stage thus set, the first chapter
of Part Two turns to some general methods and paradigms for algorithmic
design. This is followed by two chapters on the analysis of algorithms,
treating, respectively, their correctness and efficiency
(mainly time efficiency), including techniques for establishing the former
and estimating the latter. Part Three of the book is devoted to the inherent
limitations of effectively executable algorithms, and hence of the
computers that implement them. Certain precisely defined problems, including
important and practical ones, are shown to be provably not solvable by
any computers of reasonable size in any reasonable amount of time (say,
the lifetime of a person), and never will be. Worse still, it is shown
that some problems are provably not solvable by computers at all, even
with unlimited time! In Part Four of the book the requirements are relaxed,
for example, by employing concurrent activities or coin tossing,
in order to overcome some of these difficulties. These chapters also discuss
reactive and distributed systems, and cryptography. Finally, the relationship
of computers to human intelligence is discussed, emphasizing the
"soft" heuristic, or intuitive, nature of the latter, and the
problems involved in relating it to the "hard" scientific subject
of algorithmics.
The book is intended to be read or studied sequentially, not to be used
as a reference. It is organized so that each chapter depends on the previous
ones, but with smooth readability in mind. Most of the material in the
preliminary Part One should be familiar to people with a background in
programming. Thus, Chapters 1 and 2 and parts of Chapter 3 can be browsed
through by such readers.
Certain sections contain relatively technical material and can be skipped
by the reader without too much loss of continuity. They are indented,
set in
smaller type and are prefixed by a small square. It is recommended, however,
that even those sections be skimmed, at least to get a superficial
idea of their contents.
Whenever appropriate, brief discussions of the research topics that are
of current interest to computer scientists are included. The text is followed
by Bibliographic Notes for each chapter, with "backward" pointers
connecting the discussions in the text with the relevant literature.
It is hoped that this book will facilitate communication between the various
groups of people who are actively involved in the computer revolution,
and between those groups and those who, for the time being, are observers
only.
David Harel, Pittsburgh, PA; February 1987
New to the Second Edition
See, this is new; but it has already
been.
ECCLESIASTES 1:10
The first edition of this book was intended to be read from beginning
to end; it could also be used as a supplementary reading in a number of
courses. Teaching a course based exclusively on it was possible, but would
have required that the instructor prepare exercises and add examples
and more detail in certain places. The present edition contains numerous
exercises, as well as solutions to about a third of them. The solved
exercises can thus be used to supplement the text.
Three chapters do not have exercises: Chapter 1 is an introduction, the
bulk of Chapter 3 is really just a brief survey of several programming
languages, and Chapter 12 is a nontechnical account of some topics in
artificial intelligence. In a sense, these chapters are not integral parts
of the
topic of the book -- algorithmics -- and hence in teaching a course based
on the book these should probably be assigned as homework reading.
Apart from the inclusion of exercises and solutions, which mark the most
obvious change made in this edition, the text has been revised and
updated. The reader may wonder why a more extensive revision of the text
was not called for. Have computer scientists been idle during the five
years since the first edition was published? Rather than taking this as
a criticism of the field, I think that it shows that the topics selected
for inclusion in the book are really of a fundamental nature, so that no significant changes
had to be made. The issues discussed herein are thus probably basic
and lasting; maybe the term "classical" is most fitting.
David Harel, Rehovot, Israel; May, 1991
New to the Third Edition
they three were of one
measure
EZEKIEL 40:10
This time around, a significant revision was carried out. There are
several important changes in this edition of the book, compared to the
first and second editions, including two brand new chapters, new sections,
and more.
The first noticeable difference is that for this revision I needed real
help..., and was fortunately joined by Yishai Feldman. He has taken part
in all aspects of the revision, but most significantly took upon himself
the thorough revision of the material on programming languages and the
writing of the new chapter on software engineering.
The main changes are as follows:
The book now has five Parts, rather than four. In Part I (Preliminaries)
Chapter 3 has been completely rewritten, and is now titled "Programming
Languages and Paradigms." The list of languages discussed has been revised
and is organized into paradigms, thus giving a far more informative and
updated exposition of the media we use when we program computers.
Discussions of some languages (e.g., APL and Snobol) have been dropped
altogether and those of others (e.g., C, C++ and Java) have been added.
Part II (Methods and Analysis) and Part III (Limitations and
Robustness), i.e., Chapters 4 through 9, required no sweeping changes.
This can again be attributed to the "classical" nature of the topics
chosen for these, as mentioned in the "New to the Second Edition" section
above.
The first chapter of Part IV (Relaxing the Rules) was previously
titled "Parallelism and Concurrency" and is now called "Parallelism,
Concurrency, and Alternative Models." It incorporates new sections on
quantum computing, including Shor's factoring algorithm, and a discussion
of molecular computing. These topics may be considered to be additional
forms of parallelism, albeit more radical ones. The remaining two chapters
of Part IV were constructed by separating out the material on
probabilistic algorithms (Chapter 11) from that on cryptography
(now Chapter 12) -- presented together in a single chapter in the previous
editions -- and extending both by discussions of some of the new
developments in these fields.
Part V (The Bigger Picture) ends with the closing chapter of the
previous editions, "Algorithmics and Intelligence," which is now Chapter 15.
However, it is now preceded by two new chapters: Chapter 13, "Software
Engineering," and Chapter 14, "Reactive Systems." The first of these is an
attempt to provide a general
introduction to the issues and problems arising in the development of
large software systems. The second new chapter zeros in on the particular
difficulties arising in the special case of reactive systems, as a result
of their complex behavior over time.
Besides these more noticeable changes, the entire text has been brought up
to date in many ways, some subtle and some less so. There are discussions on
abstract data types, on the non-approximability of certain NP-complete
problems, on probabilistically checkable proofs, and, of course, on the
brand new AKS polynomial-time algorithm for primality. The final chapter
has been modified in many places too, e.g., with a discussion added on the
Chinese room argument.
While we have left the exercises and solutions essentially as they were in
the second edition, the bibliographic notes were a completely different
story. Twelve years in Computer Science is almost an eternity... The
format of the notes is the same as in the previous editions; i.e., a
general section at the start of each chapter, which lists relevant books
and periodicals, followed by detailed notes that progress with the text of
the chapter itself and point back to its page numbers. In revising them,
we had to prepare new notes for the large amount of newly added
material, of course, but we also had to painstakingly reconsider and
thoroughly revise the entire set of existing notes. Hopefully, the result
of all of this will turn out to be a useful and up-to-date tool linking
the text of this expository book with the accepted archival scientific
literature.
Now that the revision is done, if hard-pressed to give my list of the most
significant developments in pure, "classical" algorithmics (i.e.,
excluding software and systems engineering) in the last dozen or so years,
it would probably contain three: the non-approximability results for
NP-complete problems, Shor's QPTIME factoring algorithm, and the AKS PTIME
primality test. And all I can say about these is this: wouldn't it be
wonderful if the
bulk of the work on the next edition of this book -- if and when, of
course -- were to be spent on results of similar calibre and importance.
David Harel, Rehovot, Israel; August, 2003
Preface written for the 2012 Printing
a threefold cord is not quickly broken
ECCLESIASTES 4:12
Write the vision, and make it plain upon tablets,
that he who reads it may run
HABAKKUK 2:2
The first edition of this book was published 25 years ago, in 1987. Second and third editions were published in 1992 and 2004, respectively (with Yishai Feldman joining the "team" for the 3rd edition). The preface you are now reading accompanies a special 2012 reprint of the book, published to celebrate 25 years of its existence, and, more significantly, the centennial year of Alan M. Turing.
Alan Mathison Turing (b. 23 June 1912, d. 7 June 1954) will, in all probability, be remembered as one of the thinkers with the greatest impact ever, and is definitely one of the most important scientists of the 20th century. This is not the appropriate place to describe in any detail Turing's contributions to computing and to humanity in general. The interested reader can find lots of relevant material in the many articles and books written about him. However, in terms of the Algorithmics, the topic of this book, it would not be an exaggeration to say that Turing is the grand ancestor of several of the key ideas and subtopics thereof. Perhaps most significantly, Turing's insights from the mid-1930s, alongside the work of Alonzo Church and others, formed the foundations of our understanding that the general notion of computing, and thus of algorithms and the actual computers that implement them, are severely limited. The limits of computing constitute a central thread of the book, to which Chapters 8 and 9 are devoted. In that respect, Turing's name is associated with both the Church-Turing thesis and the Turing machine, two of the most fundamental notions discussed in these chapters.
Another of Turing's pioneering contributions to computer science revolves around his deep insights into what later became known as artificial intelligence (the person who coined the term, John McCarthy, passed away in late 2011). Turing's test for computerized artificial intelligence is also central to the book and is discussed in detail towards its end, in Chapter 15.
As is well known, Turing was also instrumental in the code-breaking efforts of World War II, most notably his work on the Enigma code. Chapter 12 is devoted to cryptography, and although Turing's work is not mentioned there explicitly, it played a crucial part in the development of the field. Turing also carried out pioneering work on morphogenesis and the way patterns are formed in the process. In modern terms this work would be considered part of systems biology or bioinformatics, topics that are not discussed in the present book.
On a more personal level, but without getting into any details here, I would like to add that large parts of my own research in the last 38 years can be viewed as very modest continuations and extensions of the work of Turing. Thus, to a large extent I am but a dwarf standing on the shoulders of a true giant; the true giant of our field. I am thus extremely happy that Springer has agreed to publish this new printing of Algorithmics. For me, and Yishai Feldman joins me in this, it is a true celebration by any measure!
In terms of the basic fundamentals of algorithmics (that is, if we exclude the more practical and faster-changing material of Chapters 3, 13, and 14) little in the book has to be changed. None of the central open problems therein have been resolved, none of the basic notions underlying the topics therein have undergone a major modification, and very few of the new notions that have been defined since 2000 seem to deserve a place alongside the fundamental ones that are included. Thus, even had we decided to go for a fully-fledged new edition of the book, rather than merely a new printing, the text would have undergone only relatively minor changes. The next few paragraphs contain very brief discussions about a few of the relevant things that have happened in the last few years (thanks to Uri Feige for helping me compile this list, and, of course, to my co-author Yishai Feldman). The reader can read these now or come back to them after reading the book itself.
Towards the end of Chapter 4 there is a discussion of approximating a network coloring. In recent years, strong new results have been proved, culminating in the fact that approximating an optimal coloring to within any constant factor (not just the 50% factor mentioned in the text) is also as hard as solving the P=NP problem.
Parallelism, as discussed in the first parts of Chapter 10, has become more and more crucial recently, in part because of the change in hardware trends. The exponential increase in single-processor power tapered off several years ago, being replaced by the development of multiple-core chips. Four cores per chip are common these days, and the numbers are expected to increase drastically, at the expense of single-core performance. In order to take advantage of these new processors, new algorithmic and programming techniques are necessary. One popular technique is map-reduce, inspired by functional programming (as described in Chapter 3). It is a way of dividing computation on large amounts of data into parts that are performed on each piece separately, the results being combined using an appropriate accumulation function. This style of programming is not appropriate for every problem, but it lends itself very well to parallelization. Parallelism is thus becoming an even more central and crucial topic in computer science, so that it is probably fair to say that resolving the open problems in this area has become a lot more urgent.
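The map-reduce style described above can be illustrated with a minimal sketch in Python (a hypothetical word-count example; the chunk data and function names are invented for illustration, and a real framework would distribute the map phase across cores or machines):

```python
from functools import reduce
from collections import Counter

def map_phase(chunk):
    # Process one piece of the data independently: count the words in one text chunk.
    return Counter(chunk.split())

def reduce_phase(counts_a, counts_b):
    # The accumulation function: combine two partial results into one.
    return counts_a + counts_b

chunks = ["to be or not to be", "to compute is to be"]
partials = [map_phase(c) for c in chunks]   # each call is independent, so these could run in parallel
total = reduce(reduce_phase, partials)      # fold the partial counts together
print(total["to"])  # 4
```

The key property is that `map_phase` touches only its own chunk and `reduce_phase` is associative, which is what lets the two phases be parallelized freely.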
Another topic central to Chapter 10 is quantum computing. Here the main thing to mention is the existence of larger quantum computers. The text mentions that at the time of its publication the largest quantum computer actually built consisted of seven qubits. This number has grown steadily in recent years, and while the jury is still out on the exact current number (among other things this has to do with whether one has a general purpose or special purpose machine), a company called D-Wave Systems has been working recently with 128-qubit chips, and is said to be developing a 512-qubit one. What relevance this will have to the fundamental issues of quantum computation discussed in the text remains to be seen.
As to randomized and probabilistic algorithms, the topic of Chapter 11, there has been a lot of interesting research done in recent years, much of it around the randomized complexity classes RP and BPP and their connections to pseudo-random number generators. In fact, the text says that some researchers believe that RP is strictly larger than PTIME. That may still be true, but in recent years you will also find many who don't.
The last chapter of the book, Chapter 15, on artificial intelligence and heuristics, is the one that could really do with a facelift. While the main issues raised there still stand strong (e.g., passing the Turing test is still nowhere in sight), there have been several exciting developments. These include significant improvements in computerized chess, and great improvements in natural language understanding and translation. One well-known example is IBM's Watson machine, which beat the top two human contestants in Jeopardy! (the popular American quiz show) in a 3-part match in early 2011. Watson exhibits an impressive ability to "understand" highly ambiguous language and to deal with situations that have long been associated exclusively with human talent. However, Watson is strongly based on statistical techniques rather than classical knowledge representation, continuing the trend discussed towards the end of Chapter 15. These techniques seem to have great potential for intelligent search in areas such as medicine, law, and others. In general, the tools underlying heavy-duty artificial intelligence applications are becoming more powerful, such as powerful new SAT solvers, which are very successful in practice despite their poor worst-case guarantees.
Finally, I would like to add that one of the most impressive and potentially revolutionary uses of computer science in the last 10-15 years has been in the life sciences. True, bioinformatics has been around for longer than that, but the use of deep techniques from algorithmics and system and software engineering in systems biology and the modeling and analysis of biological systems has grown by an order of magnitude in recent years. I share with others the opinion and belief that computer science will play a role in the science of the 21st century (which is poised to be the century of the life sciences) similar to the role played by mathematics in the physical sciences of the 20th century. Thus, chances are that the subject matter of this book will only become more relevant and of more interest.
David Harel, Rehovot; February, 2012