I have been planning to write a short essay on this topic for a couple of years. My concrete motivation for this essay is my strong objection to the spreading tendency to use various quantitative measures (e.g., paper and citation counts) as a (main) basis for scientific evaluation. My main thesis is that professional evaluation of scientific work cannot be reduced to quantitative measures, but rather relies inherently on the conceptual understanding provided by experts.
Discomfort with the subjective nature of "expert opinion" (as well as sheer intellectual laziness) leads some people to seek an objective and simple alternative to "expert opinion", and such an alternative is supposedly offered by various numerical counts (e.g., paper and citation counts). Using such measures is indeed simple, but their objectivity (as a way of evaluating the importance of scientific work) is nothing but an illusion. Most importantly, the question of which quantitative measure is best correlated with the importance of scientific work is highly controversial. In particular, the answer cannot be determined objectively, because one of the two sides of the asserted correlation is an intuitive notion. That is, in order to claim that some quantitative measure is correlated with scientific importance, one has to obtain an evaluation of scientific importance, which is bound to be based on expert opinion.
So we are back to square one, except that one may suggest using expert opinion only to calibrate the quantitative measure that will be used from that point on. However, I claim that such a reliable calibration is infeasible to obtain. The point is that the subjective nature of expert opinion means that we may not reach a consensus on the scientific importance of an individual work, except perhaps in extreme cases. Thus, even if a perfect correlation is found between these extreme cases of quality and some quantitative measure, this cannot guarantee a good correlation on the non-extreme cases (which are the bulk of the evaluation process). Furthermore, even a good correlation does not suffice when what we care about is the evaluation of a specific work or a specific research direction (or a small set of such items). (Correlation would suffice only if all we care about is the average behavior of a large sample...) Thus, if we really care about evaluating the merits of a specific work, a specific research direction, or a specific individual, we cannot use any quantitative measure: there is no shortcut to obtaining the opinions of numerous experts and studying them with great care while applying good judgment.
It follows that the evaluation of the importance of scientific work cannot be performed well by a person lacking an overview of the relevant field. That is, understanding the technical contents of the evaluated work does not suffice for a sound evaluation. One needs to know the context of this work and how it fits into the big picture of the relevant field in order to be able to evaluate the work's contribution to this field.
[Indeed, the above is somewhat related to a statement by ten TCSists (including myself), which addresses the balance between conceptual and technical considerations (and expresses a concern that this balance has recently been violated in the program committees of some TOC conferences).]
I believe that when it comes to subtle human problems, rigid and/or formal rules only create an illusion of coping with the problem. In contrast, an atmosphere that enforces certain norms via informal mechanisms of social acceptability is far more effective and suitable. Specifically, if the relevant scientific community is intolerant of unethical behavior (be it dishonesty or intellectual laziness), then this behavior will become very rare (and will cease to constitute a real problem).
[First posted on Jan. 5, 2009.]
[Revised: Feb. 10, 2009.]