Excerpts from the Preface:
In the seventies and eighties, interest in Computer Vision was concentrated on the development of general purpose seeing machines. There was wide agreement on research priorities, developing "bottom-up" computer algorithms that would organise the raw intensity values in images into a more compact form. The purpose of this was not just to compress the data but also to extract its salient features. Salient features could include corners, edges and surface fragments, to be used in identifying objects and deducing their positions. However, experience suggests strongly that general purpose vision is too difficult a goal for the time being.
If general purpose vision is abandoned, what alternative approach could be taken? One answer is that generality can be abated by introducing some “prior” knowledge — knowledge that is specific to the objects that the computer is expected to see. An extreme form of this approach is exemplified by automatic visual inspection machines of the kind used on factory assembly lines. In that context, it is known in advance precisely what objects are to be inspected — it is rare, after all, for potatoes streaming along a conveyor to give way, without notice, to a crop of spanners or chocolate bars. When computer hardware and software are specialised entirely to deal with one object, phenomenal performance can be obtained. A striking example is the "Niagara" machine (Sortex, UK Ltd) for sorting rice grains which "sees" 70,000 grains every second and almost literally spits out the rejects.
It is a commonly held view that it is hard to make progress in research by building such specialised machines because general principles are lost to engineering detail. That is a fair point but by no means, in our view, outlaws the use of prior knowledge about shape in computer vision research. Instead, we would argue, scientific principles for representing prior knowledge need to be developed. Then, when a new problem area is addressed, the principles can be applied to “compile” a new vision system as rapidly as possible. This includes such issues as how to represent classes of shapes that are defined loosely. Potatoes, for instance, might be characterised as roundish but with substantial size variations, with or without knobs. On the other hand, the class of human faces could be represented in terms of a common basic layout, but with considerable variation in the sizes and separations of features. Modelling classes of shapes, their variability and their motion is one of the principal themes of the book. The use of those models to help interpret moving images is the other central theme.
We have tried to present ideas about shape and motion in a way that will be readable not only by specialists, but also by those who are not regularly immersed in the ideas of machine vision. In particular we would hope that those with backgrounds in graphics or signal processing or neural computing would find the book a useful and accessible guide.